Entries for January 2026

@onusoz · 2026-01-31

finally got a fully sandboxed @openclaw to run, starting to scrape the UNDESIRABLE now. I'm a security nut and didn't want to run even the gateway unsandboxed. openclaw apparently doesn't currently support FULL sandboxing. it took me a few hours to get it to work because docker builds suck. I'm also tired of this, so I'm just gonna wipe an old thinkpad and go full yolo. so yeah, time to scrape some posts

@onusoz · 2026-01-31

The metacortex — a distributed cloud of software agents that surrounds him in netspace, borrowing CPU cycles from convenient processors (such as his robot pet) — is as much a part of Manfred as the society of mind that occupies his skull; his thoughts migrate into it, spawning new agents to research new experiences, and at night, they return to roost and share their knowledge. This was written in 2005... "trigger...

@onusoz · 2026-01-31

We need better filters both for ourselves and the agents. Locally runnable models to filter out undesirable content with high precision. Fully open source datasets, weights, MIT license

@onusoz · 2026-01-30

Correction, it's not a perfect illustration. I actually never YOLO locally, only in containers. So there are actually 4 modes IMO that are sustainable with current SOTA. @grok create an image with only Figure 1, 2, 5 and 6. And then YOLO is another axis, unrelated to this

@onusoz · 2026-01-30

Gastown is crazy. But this figure up to Level 7 is a perfect illustration of how my workflow has evolved since Claude 3.5 Sonnet in Cursor. I am at the stage where I ralph 1-2 tasks before I sleep. During the day, I am switching back and forth between a minimum of 2-3 CLIs, sometimes up to 5. This maps exactly to token usage as well. 1 month ago, I was running into limits in 1 OpenAI Pro plan, around the day it was suppose...

@onusoz · 2026-01-30

Ilya was right. Reliability is the most important thing when it comes to models. That's why gpt 5.2 xhigh and co. is my daily driver

@onusoz · 2026-01-27

With this extremely unwise move, anthropic will soon witness moltbot’s brand recognition surpass that of claude and realize they could have ridden that wave all along

@onusoz · 2026-01-27

Yesterday I had multiple cases of swearing at gpt-5.2-codex xhigh. model feels nerfed. might be my bias. now I'll be going back to gpt 5.2 xhigh for some tasks. can't wait for open models to have this performance so that I will never have nerf paranoia ever again

@onusoz · 2026-01-26

I queued 2 ralph-style tasks on our private cloud devbox codexes last night. Just queued the same message like 10 times in yolo mode. Task 1: impose ruff's ANN rules on all Python code in the monorepo, to enforce type annotations on all function arguments and return values. Result was... disappointing. Model was supposed to create types for everything and stub where needed. It instead created an Unknown type = object and used ...
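
To make "enforce types for all function arg and return types" concrete, here is a rough stdlib-only approximation of that kind of check. It is not how ruff implements its ANN rules; the function name `missing_annotations` and the sample source are made up for illustration.

```python
import ast


def missing_annotations(source: str) -> list[str]:
    """Report function arguments and return values that lack annotations.

    Simplified on purpose: *args/**kwargs are ignored and self/cls are skipped.
    """
    problems: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        for param in params:
            if param.arg in ("self", "cls"):
                continue
            if param.annotation is None:
                problems.append(f"{node.name}: argument '{param.arg}' has no annotation")
        if node.returns is None:
            problems.append(f"{node.name}: missing return annotation")
    return problems


# Hypothetical sample: one untyped function, one fully typed function.
SAMPLE = '''
def add(a, b: int):
    return a + b

def greet(name: str) -> str:
    return "hi " + name
'''

if __name__ == "__main__":
    for problem in missing_annotations(SAMPLE):
        print(problem)
```

Note that a check like this only asks whether an annotation exists. An alias such as `Unknown = object` would satisfy it while carrying almost no type information, which appears to be the shortcut the model took.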

@onusoz · 2026-01-25

Buying a mac mini for clawdbot is not so wise. if anything you should be buying a mac studio, because a mac mini won't be running any good llms locally anytime soon

Python limitations in the agent era

I'm really starting to dislike Python in the age of agents. What used to be an advantage is now a hindrance. I finally achieved full ty coverage in the @TextCortex monorepo. I have made it extra strict by turning warnings into errors. But lo and behold, a simple pydantic config like use_enum_values=True can render static typechecking meaningless. okay, let's never use that then... and also field_validator() args must a...

@onusoz · 2026-01-24

vscode may not be as bloated as cursor, but it has extremely stupid things like this that they are not fixing fast. the new agent ui, icons, spacing etc. are UGLY. it's clear that the person who was managing the original product experience is not there anymore. microslop has hit again. @zeddotdev on the other hand works out of the box and feels like it's been built by people who clearly know what they are doing. i...

@onusoz · 2026-01-23

I want an editor that puts the terminal in the foreground and the editor in the background. a cross-platform, lightweight desktop app which integrates ghostty, and brings up the editor only when I need it. something that lets me view the file and PR diffs easily, which I can directly use to operate github or other scm

@onusoz · 2026-01-23

I'm going back from cursor to vs code now. I have no use for it other than viewing files/diffs, doing search, git blaming with gitlens. cursor's default setup is more aesthetic, but it's also a memory and cpu hog, which is the last thing I expect from a devtool

@onusoz · 2026-01-23

woke up and all invalid-argument-type issues are resolved. some unit tests broke, and are now fixed after I pointed them out

@onusoz · 2026-01-22

codex is happily churning through the remaining thousands of @astral_sh ty issues in yolo mode on my remote devbox. going to sleep, let's see if it will survive context compaction this time

Responsible engineering in agent workflows

on being a responsible engineer: ran my first ralph loop on codex yolo mode for resolving python ty errors, while I sleep, using the devbox infra I created. I had never run yolo mode locally, because I don't want to be the one who deletes our github or google org by some novel attack. so I containerize it on our private cloud, and give it only the permissions it needs: no admin, no bypass to main branch, no deploy...

Agents as enforcers of engineering culture and process

AI agents are the greatest instrument for imposing organization rules and culture. AGENTS.md and agent skills are still underrated in this respect. Few understand this. Everybody in an org will use agents to do work. An AI agent is the single chokepoint to teach and propagate new rules to an org, onboard new members, preserve good culture. Whereas propagating a new rule to humans normally took weeks to months and coun...

@onusoz · 2026-01-21

just added session persistence to our kubernetes managed devboxes using zmx by Eric Bower (neurosnap/zmx on github). like tmux but with native scrollback! I don't want to give agents access to my personal computer, so I host them on hetzner. one click spawn, and start working

GitHub's trust model breaks under AI-generated code

The fundamental problem with GitHub is trust: humans are to be trusted. If you don't trust a human, why did you hire them in the first place? Anyone who reviews and approves PRs bears responsibility. Rulesets exist and can enforce e.g. CODEOWNER reviews or only let certain people make changes to a certain folder. But the initial repo setup on GitHub is allow-by-default. Anyone can change anything until they are r...

Stop defaulting to weak coding agents for serious work

STOP using Claude Code and Sl(opus) to code if ❌ you are not a developer, ❌ or you are an inexperienced dev, ❌ or you are an experienced dev but working on a codebase you don't understand. If you *are* any of these, then STOP using models that are NOT state of the art. (See below for what you *should* use) When you don't know what you are doing, then at least the model should know what you are doing. The less kn...

@onusoz · 2026-01-17

It is clear at this point that github's trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCMs. One *cannot* do these things easily with github now: - granular control: this agent running in this sandbox can only push to this specific branch. If an agent runs amok, it could delete everybody's branches and close PRs. github allows for recover...

@onusoz · 2026-01-17

Codex says "It's only reachable from داخل the kubernetes cluster". Little does Codex know, Turkish has loanwords from over 7 languages and I can understand it

@onusoz · 2026-01-17

Automated AI reviews on github by creating an ai-review skill and a script to paste trigger prompts and wait for their response. It is instructed to loop and not stop until all AI review feedback is resolved. This AI review workflow developed gradually based on the current capabilities, and I've realized recently that it had become quite mechanical. So I decided to automate it in full ralph spirit (it's ok because it's...

GitHub has to change

It is clear at this point that GitHub’s trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCMs

@onusoz · 2026-01-14

As someone who is frontrunning the mainstream by roughly 6 months, I can tell you that you will be raving about pi and @openclaw in 6 months instead of claude code. Go check them out at https://t.co/LXTbI8c5Mz and https://t.co/feZl2QDONg

A --skill convention for distributing agent capabilities

I propose a new way to distribute agent skills: like --help, a new CLI flag convention --skill should let agents list and install skills bundled with CLI tools. Skills are just folders, so calling --skill export my-skill on a tool could just output a tarball of the skill. I then set up the skillflag npm package so that you can pipe that into: ... | npx skillflag install --agent codex which installs the skill into...
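
As a minimal sketch of what the tool side of such a convention could look like, assuming a hypothetical tool called `mytool` that bundles its skills as subfolders of one directory. The `list` action, the function names and the `SKILL.md` file in the demo are illustrative assumptions, not the skillflag package's actual interface.

```python
import argparse
import io
import sys
import tarfile
import tempfile
from pathlib import Path


def list_skills(skills_dir: Path) -> list[str]:
    """Each skill is just a folder, so listing is a directory scan."""
    return sorted(p.name for p in skills_dir.iterdir() if p.is_dir())


def export_skill(skills_dir: Path, name: str) -> bytes:
    """Pack one skill folder into a gzipped tarball held in memory."""
    skill_path = skills_dir / name
    if not skill_path.is_dir():
        raise FileNotFoundError(f"no bundled skill named {name!r}")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        tar.add(skill_path, arcname=name)
    return buffer.getvalue()


def main(argv: list[str], skills_dir: Path) -> int:
    parser = argparse.ArgumentParser(prog="mytool")  # hypothetical tool name
    parser.add_argument("--skill", nargs="+", metavar="ARG",
                        help="'list' or 'export NAME'")
    args = parser.parse_args(argv)
    if args.skill == ["list"]:
        print("\n".join(list_skills(skills_dir)))
        return 0
    if args.skill and len(args.skill) == 2 and args.skill[0] == "export":
        # Raw tarball bytes go to stdout so the output can be piped onward.
        sys.stdout.buffer.write(export_skill(skills_dir, args.skill[1]))
        return 0
    parser.print_usage(sys.stderr)
    return 2


if __name__ == "__main__":
    # Demo against a throwaway skills folder instead of a real bundled one.
    with tempfile.TemporaryDirectory() as tmp:
        demo_skill = Path(tmp) / "my-skill"
        demo_skill.mkdir()
        (demo_skill / "SKILL.md").write_text("# my-skill\n")
        main(["--skill", "list"], Path(tmp))
```

Writing the tarball to stdout rather than to a file is what makes the pipe-into-an-installer flow possible: the tool only needs to know how to serialize a folder, and the receiving side decides where it lands.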

Anthropic pricing discourages agent-native development

Earlier last year, Anthropic announced this pricing scheme: $20 -> 1x usage, $100 -> 5x usage, $200 -> 1̶0̶x̶ 20x usage. As you can see, it's not growing linearly. This is classic Jensen "the more you buy, the more you save". But here is the thing. You are not selling hardware like Jensen. You are selling a software service *through an API*. It's the worst possible pricing for the category of product. Long term, peop...
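
The non-linearity is easy to see by dividing each plan's price by its usage multiple. The figures below are the ones quoted in the post; the helper name is just for illustration.

```python
# (monthly price in USD, usage multiple) as quoted in the post
PLANS = [(20, 1), (100, 5), (200, 20)]


def price_per_usage_unit(price_usd: float, usage_multiple: float) -> float:
    """Dollars paid per 1x unit of usage on a given plan."""
    return price_usd / usage_multiple


if __name__ == "__main__":
    for price, usage in PLANS:
        unit = price_per_usage_unit(price, usage)
        print(f"${price}: {usage}x usage -> ${unit:.2f} per 1x unit")
    # $20: 1x usage -> $20.00 per 1x unit
    # $100: 5x usage -> $20.00 per 1x unit
    # $200: 20x usage -> $10.00 per 1x unit
```

The first two tiers cost the same per unit, and only the top tier halves the unit price.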

@onusoz · 2026-01-09

The models, they just wanna work. They want to build your product, fix your bugs, serve your users. You feed them the right context, give them good tools. You don’t assume what they cannot do without trying, and you don’t prematurely constrain them into deterministic workflows.

@onusoz · 2026-01-06

.@openclaw workspace and memory files can be version-controlled! In our pod, inotify triggers a watcher script every time there is a change to the workspace folder, to sync these files to our monorepo. It then goes through the same steps: - Create zeno-workspace branch if it doesn't exist, otherwise, skip - Sync changes to the branch, then commit - Create PR on github if it doesn't exist - PRs can then be merged every once ...
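
The steps above can be sketched as a function that turns the current state into a list of commands for the watcher to run. This is a guess at the shape of such a script, not the actual one: the paths, the commit message and the use of rsync are assumptions, and only the branch name comes from the post.

```python
BRANCH = "zeno-workspace"                 # branch name taken from the post
WORKSPACE_SRC = "/workspace/"             # hypothetical pod path
WORKSPACE_DST = "agents/zeno/workspace/"  # hypothetical path inside the monorepo


def plan_sync(branch_exists: bool, pr_exists: bool) -> list[list[str]]:
    """Return the commands one sync run would execute, in order."""
    commands: list[list[str]] = []
    # Step 1: create the branch only if it does not exist yet.
    if branch_exists:
        commands.append(["git", "switch", BRANCH])
    else:
        commands.append(["git", "switch", "-c", BRANCH])
    # Step 2: sync the workspace files into the repo, then commit and push.
    commands.append(["rsync", "-a", "--delete", WORKSPACE_SRC, WORKSPACE_DST])
    commands.append(["git", "add", "--all", WORKSPACE_DST])
    commands.append(["git", "commit", "-m", "Sync agent workspace"])
    commands.append(["git", "push", "--set-upstream", "origin", BRANCH])
    # Step 3: open a PR only if none is open for this branch.
    if not pr_exists:
        commands.append(["gh", "pr", "create", "--head", BRANCH, "--fill"])
    return commands


if __name__ == "__main__":
    for command in plan_sync(branch_exists=False, pr_exists=False):
        print(" ".join(command))
```

Keeping the planning separate from execution makes the idempotent parts (branch creation, PR creation) easy to test without touching a real repository. A real watcher would also need to handle the case where nothing changed, since `git commit` exits non-zero when there is nothing to commit.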

@onusoz · 2026-01-06

I see @bcherny and raise one. I not only did not open an IDE, I did not touch a terminal since last night, thanks to @steipete's @openclaw. Opus in a k8s pod pulls errors from gcloud, debugs the issue, and creates a PR, all inside Discord. I call this Discord Driven Development

@onusoz · 2026-01-04

GPT 4.5 is still the best model for prose and humor. here it is generating a greentext from my blog post "Our muscles will atrophy as we climb the Kardashev Scale"

@onusoz · 2026-01-03

75k lines of Rust later, here is what I’ve built during the first Christmas with agents, using OpenAI Codex 🎄🤖 - A full mobile rewrite and port of my Python Instagram video production pipeline (single video production time: 1hr -> 5min) (ig: nerdonbars) - Bespoke animation engine using primitives (think Adobe Flash, Manim) - Proprietary new canvas UI library in Rust, because I don’t want to lock myself into Swift...

@onusoz · 2026-01-02

SimpleDoc now has a check command for CI/CD. Add it to your PR checks to catch agent littering before merge. osolmaz/SimpleDoc on GitHub

@onusoz · 2026-01-02

Migrating @TextCortex to SimpleDoc. It's really easy with the CLI wizard (npx @simpledoc/simpledoc migrate)! We have a LOT of docs going back to 2022, the pre-coding-agent era. Now we will have CI/CD in place so that coding agents can't litter the repo with random Markdown files