Entries for 2026

@onusoz · /2026/03/30· 10:43 PM View on

next up: claude agents sdk supports openai responses api 💀

@romainhuet· Mar 30, 2026

We’ve seen Claude Code users bring in Codex for code review and use GPT-5.4 for more complex tasks, so we thought: why not make that easier? Today we’re open sourcing a plugin for it! You can call Codex from Claude Code with your ChatGPT subscription. We love an open ecosystem!

@onusoz · /2026/03/30· 10:44 AM View on

Here is the spec and implementation for this flow. The Mermaid diagram includes all the steps I mentioned in the post above, including a shameless AI review ralph loop, and other loops to make CI pass, resolve conflicts, and so on
I would recommend reading the README and TUNING.md to understand the approach here

Image hidden

@onusoz · /2026/03/30· 10:35 AM View on

acpx v0.4 ships Agentic Workflows, or as I like to call them, "Agentic Graphs"
It lets you create node-based workflows on top of ACP (Agent Client Protocol), to drive any coding agent (Codex, Claude Code, pi) through deterministic steps
This lets you automate routine, mechanical legwork like triaging incoming PRs, bugs in error reporting, and so on...
For example, OpenClaw receives 300~500 new PRs per day. A lot of them are low quality, but they still relate to real issues, so you have to address them somehow
You need to:
- extract the intent
- cluster them based on intent
- figure out if the proposed changes are legit, or whether they are slop local solutions, like trying to catch flies instead of drying out the swamp
- if the PR is too low quality or the intent is not clear, close them
- run AI review on them and address any issues that come up
- refactor them if the changes are half-baked
- resolve conflicts
- and so on...
So that when the PR is presented to the maintainer, all the routine legwork is done and the only remaining thing is the decision to (a) merge, (b) give feedback to the PR author, or (c) take over the PR work yourself
I have wanted to build this feature for a couple of months now, ever since Codex got so good. OpenAI models are now good at judging implementation quality, so I found myself repeating the same steps I wrote above over and over
I also tried putting all this in a single prompt. But I believe there are workflows that should not be a single prompt, but a sequence of prompts in the same session
That is because, like humans, LLMs are prone to PRIMING. I claim that putting all steps in the same prompt at the beginning of the context will generally give suboptimal results, compared to revealing the intention to the model step by step
Creating such a workflow also gives more OBSERVABILITY into each step that an agent is supposed to take. The agent generates JSON at the end of each step, and that structured data can be used to monitor thousands of agents running at the same time more easily, on a dashboard
Similar features have been introduced in e.g. n8n, langflow. But AFAIK they are not integrating ACP the way I do
I wanted to have a fresh approach, and to build an API that I can develop freely the way I want, so I created a new workflow API inside acpx
The video is from the workflow run viewer, but that is not where you build the workflow. You build it by using the acpx flow TypeScript API. See examples/pr-triage in the acpx repo
Before building that, I started from a Markdown file with a Mermaid chart of the flow I had in mind. The Markdown file acts as a spec for the flow, and I have built the workflow through trial and error. I call this process "workflow tuning"
I started working on acpx repo PRs one by one, tuning the flow, slowly scaling to more PRs. Finally, when I felt confident, I ran it in parallel over all external open PRs in the acpx repo. I believe it already saved me hours this week
My next goal, if well received, is to set this up on a cloud agent so that it can process the 300~500 PRs the OpenClaw repo receives every day, in real time, as they come in
I believe this will save all open source maintainers around the world countless hours and make it much easier to herd and absorb external contributions from everyone!
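
The two ideas above (revealing prompts one step at a time in the same session, and collecting a JSON record per step) can be sketched in a few lines of Python. This is only an illustration under assumptions, not the acpx flow API, which is TypeScript: `Step`, `run_flow`, the `send` callable and `fake_agent` are hypothetical names, and the fake agent stands in for a real ACP session.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    prompt: str


def run_flow(steps: list[Step], send: Callable[[str], str]) -> list[dict]:
    """Send prompts one at a time and collect one JSON record per step."""
    records = []
    for step in steps:
        # The agent only sees this step's prompt now; later steps stay hidden.
        reply = send(step.prompt + "\nReply with a single JSON object.")
        try:
            data = json.loads(reply)
            ok = isinstance(data, dict)
        except json.JSONDecodeError:
            data, ok = {"raw": reply}, False
        records.append({"step": step.name, "ok": ok, "output": data})
        if not ok:
            break  # stop on malformed output so a dashboard can flag the run
    return records


def fake_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real agent session; returns canned JSON."""
    if prompt.startswith("Extract"):
        return json.dumps({"intent": "fix flaky test"})
    return json.dumps({"verdict": "needs_refactor"})


PR_TRIAGE = [
    Step("extract_intent", "Extract the intent of this PR."),
    Step("judge_quality", "Judge whether the change is a real fix or a local patch."),
]

records = run_flow(PR_TRIAGE, fake_agent)
print(json.dumps(records, indent=2))
```

Because every step produces a record with the same shape, many runs can be aggregated by step name and `ok` flag without parsing free-form agent output.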

@onusoz · /2026/03/29· 07:35 AM View on

OpenAI early 2020s: "This model is too dangerous to release publicly, the world is not ready for it 😱😱😱" OpenAI and Anthropic in 2026: "Anybody can code now for just $200 per month. Oh btw our models are also leet uber hackers which can find zeroday exploits in any software, just fyi 😉😉😉" https://t.co/cksNYAigfc

@onusoz · /2026/03/28· 03:58 PM View on

Wow, even I as a frontend noob understand the significance of this
Some distant memory from 15 years ago: needing to measure the width/height of some text and finding out it's not possible to do reliably on the web
More beautiful typography for the web!

@_chenglou· Mar 28, 2026

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

@onusoz · /2026/03/28· 07:46 AM View on

There is an economic theory waiting to be uncovered here
Token Leverage (TL) = Token spend / Human labor spend
The higher Token Leverage a company has, the more automated and productive they are
If you have TL=1, you are spending as much money on AI as on your human employees
The goal of a company should be to increase TL as much as possible, while keeping a positive profit margin. It will be the only way to compete
You don't need to muddy the definition with wasted tokens vs useful tokens, because a company will always be incentivized to reduce token waste in a competitive environment. By that logic, monopolies will always waste more tokens, similar to how they waste other resources
Scaling TL higher to 2x, 10x, 100x will require a skilled workforce of engineers. It will be a very complex job, similar to those working at the big labs. Burnout will be a defining feature of teams scaling TL
Most incumbents will fail to scale their TL over 1. Some will get decimated by new entrants with TL much bigger than 1
Curious how the average TL will end up in different sectors. Whether it will stabilize at a certain value like 5.7x, or will just keep growing…
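
As a small worked example of the ratio defined above, here is a Python sketch. The function name and the monthly figures are made up for illustration; the only thing taken from the post is the definition TL = token spend / human labor spend.

```python
def token_leverage(token_spend: float, human_labor_spend: float) -> float:
    """TL = token spend / human labor spend, same currency and same period."""
    if human_labor_spend <= 0:
        raise ValueError("human labor spend must be positive")
    return token_spend / human_labor_spend


# Made-up monthly figures in USD, for illustration only.
print(token_leverage(50_000, 200_000))     # 0.25
print(token_leverage(200_000, 200_000))    # 1.0
print(token_leverage(1_000_000, 200_000))  # 5.0
```

The middle case is the TL=1 point from the post: spending as much on tokens as on people.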

@t_blom· Mar 27, 2026

By the end of 2026, I predict token spend will be greater than engineering salaries at early stage startups.

@onusoz · /2026/03/27· 08:29 AM View on

There is a desperate upcoming need for version controlling non-dev knowledge work. Git for non-devs. Otherwise non-devs won't be able to use agents to their full extent
Non-dev knowledge work is notoriously bad at being version controlled. You cannot UNDO edits to all MS Word, Excel or PowerPoint files in an org as easily as you can with something like git
We know that agents will be ubiquitous. We also know they make mistakes, and people will want to undo their work regularly, once they make changes to a bunch of files. Well, they can't. They also don't have pull requests, or a way to resolve conflicts after simultaneous edits
All these problems were solved by developers. We are extremely good at this
The only non-dev tool I know that could do this at scale is Notion, and that is not used by enterprise as much as MS Office. Notion also doesn't have branches, pull requests and reviews AFAIK
Markdown and git is probably not it. I wish it were. But it is too complicated for non-devs
OneDrive or other file backup systems are also not it. Are you gonna save a copy of a 100 MB ppt every time someone changes a slide??? Let's say you find a way to compress it efficiently. Will you be able to get a single pointer to a state like we can in git?
Agents need precision. Agents need consensus, they need to be able to know ground truth. They need to be able to tell what anything was at a given time. NOTHING in the current MS stack allows it
Agents won't care about your legacy systems. There will be new file formats, systems, knowledge stacks, and companies who adopt them will destroy your business
If MS Office is going to die, it will do so because of this

@onusoz · /2026/03/25· 06:56 AM View on

Another one, call me stupid: “How would Google have done it?”

@onusoz· Mar 24, 2026

This is unscientific, but there are certain keywords and phrases I use a lot while using certain models like OpenAI's. I use them a lot because they get me what I want immediately:
- plainer lang
- cutover
- elegant and production ready
- holy grail
What are yours?

@onusoz · /2026/03/24· 08:19 PM View on

The MCP versus CLI argument should be reframed as a Computer vs No-computer argument
I personally get the dunk on MCP. It didn't work last year, with earlier models. Then we saw CLIs perform much better with the same models. And giving access to bash was much simpler! Models' training then made them better at using a shell. CLIs also have native progressive disclosure, due to the way they work
But the most important fact doesn't get emphasized enough IMO
A key factor was that giving a CLI to a model also means you are giving it an entire COMPUTER
The action space of all commands an agent can run on bash is much, much bigger than a few MCP servers
One is a Turing machine, and the other one is basically a REST API. Of course the Turing machine is going to be more powerful, depending on what is at the other end of the API
By that logic, giving an agent access to bash over MCP versus direct access to bash should have the same level of effectiveness, with optimized prompt engineering and long term training. Because the interfaces are equivalent
So the argument is, should we give our agents access to a computer, or not?
It depends on the security requirements and the setup which the agent is supposed to run on. If you are co-hosting the agent on the same machine you are working on, then it is safer to use MCP servers, because it limits the attack surface in case of adversarial attacks
But if you are willing to give the agent its own physical computer, and willing to be mindful about the lethal trifecta and the principle of least privilege, giving it shell access is much more useful
So MCPs win in restricted/local environments, whereas CLIs/shell access win in unrestricted/remote ones
Running an agent locally and safely with shell access requires compartmentalization. This is much heavier compared to installing MCP servers locally, which don't need that. So there is a tendency to use MCP servers locally, e.g. in a work setting
Cloud agents on the other hand are more likely to ship with a computer. Because they are already isolated = no risk, and because it makes them much more useful. So cloud agents will be using both CLIs and MCP servers, whichever gets the job done!

@onusoz · /2026/03/24· 06:41 PM View on

I just registered for an .agent domain and joined the .agent community! @dutifulbob will have bob.agent if it passes :) https://t.co/lhK5MQS1sk @agentcommunity_

@onusoz · /2026/03/24· 06:20 PM View on

Sep 2021 @lexfridman podcast with Don Knuth: they also talk about OpenAI Codex (code completion model) around the 33 minute mark
This aged very well https://t.co/O1eTXlHTNC

@onusoz · /2026/03/24· 05:28 PM View on

Damn I’m gonna have to switch to teams if it goes like that

@upster· Mar 24, 2026

OpenClaw now has full Teams AI UX: streaming responses, AI labels, feedback with reflective learning, welcome cards, and image understanding. Built on the official Teams SDK 🦞 FYI @steipete, @BradGroux

@onusoz · /2026/03/24· 04:56 PM View on

Codex's long horizon task and instruction following has been the most life-changing AI feature recently
It is unlocking the next level of automation for me. I can convert my own heuristics into prompts and multiply my throughput 100x
Currently spending some thought on how to orchestrate all this. Below is a flowchart from a triage workflow I am working on

Image hidden

@onusoz · /2026/03/24· 04:22 PM View on

Amazed everyday by the unreasonable effectiveness of in-context learning

@onusoz · /2026/03/24· 02:42 PM View on

This is unscientific, but there are certain keywords and phrases I use a lot while using certain models like OpenAI's. I use them a lot because they get me what I want immediately:
- plainer lang
- cutover
- elegant and production ready
- holy grail
What are yours?

@onusoz · /2026/03/24· 12:55 PM View on

Request for memes
A funny and quirky edit of the historical timeline of the madness that is openclaw, with "Chess type beat" or sth equally jazzy/circusy
Preferably including its adventure warelay -> clawdis -> clawdbot -> moltbot -> openclaw
Including:
- its explosion after @4shadowed's discord integration
- naming drama, moltbook and people getting oneshotted about AI takeover
- @steipete speedrunning everything
- andrew tate calling us gay lol
- up to Jensen talking about openclaw on stage for 5 minutes straight
and other things I am forgetting
maybe overlaid with a lobster that just keeps climbing the github star graph and breaking it

@onusoz · /2026/03/24· 08:59 AM View on

Native support for Codex on OpenClaw
I will be using half my codex channels on acp and the other half on codex app server for optimum dogfooding

@huntharo· Mar 23, 2026

@openclaw Codex App Server - Your bridge to using Codex in OpenClaw https://t.co/oAy8xCOt0v

@onusoz · /2026/03/22· 10:26 PM View on

I see non-engineers have a higher tendency to humanize their agents, give them personalities, and get AI psychosis
It's a slippery slope. Do NOT give your agents human names or personalities, especially not of the opposite gender. It's like giving human names to pets
On the other end, I realized engineers tend to do the opposite. We also refer to agents as clankers, as if to make them know their place. That's because we have mechanical sympathy and have different expectations of these manufactured products (even though they contain glimmers of human soul)

@onusoz · /2026/03/22· 08:50 PM View on

Request for testing
Give this to your openclaw instance:
"update yourself to the dev channel `openclaw update --channel dev` and restart yourself. if that doesn't work -> clone github openclaw/openclaw to this machine if it's not already. then rebuild and restart yourself on main branch there"
Then give your openclaw a try with your regular workflows/tasks
Huge openclaw release incoming tonight, hopefully (no promises). We need to make sure we break as little as possible
Plugins might break, because the plugin SDK is being refactored. Plugins will have to be refactored to use the new SDK, please do not report those
Do report: native openclaw functionality that stops working
Please reply under this post, we'll be checking here 👇

@onusoz · /2026/03/22· 08:39 PM View on

Request for testing
Give this to your openclaw instance:
"update yourself to the dev channel `openclaw update --channel dev` and restart yourself"
Then give your openclaw a try with your regular workflows/tasks
Huge openclaw release incoming tonight, hopefully (no promises). We need to make sure we break as little as possible
Plugins might break, because the plugin SDK is being refactored. Plugins will have to be refactored to use the new SDK, please do not report those
Do report: native openclaw functionality that stops working
Please reply under this post, we'll be checking here 👇

@onusoz · /2026/03/22· 08:03 PM View on

Request for testing
Give this to your openclaw instance:
"clone github openclaw/openclaw to this machine if it's not already. then rebuild and restart yourself on main branch there"
Then give your openclaw a try with your regular workflows/tasks
Huge openclaw release incoming tonight, hopefully (no promises). We need to make sure we break as little as possible
Plugins might break, because the plugin SDK is being refactored. Plugins will have to be refactored to use the new SDK, please do not report those
Do report: native openclaw functionality that stops working

@onusoz · /2026/03/22· 07:25 AM View on

My takeaway from this is that academia needs good social media and algo. For me, these serendipitous interactions happen through X, here, like reading @steipete's "Claude Code is my computer" when it first came out, finding out about clawdbot…
Terence Tao is already on mathstodon, I wonder if that worked out the same way for him. I wonder if the algo there works out as well as it does for me here
I really liked being on campus when I was doing a masters and half a PhD, but that could not compare to the serendipity I am getting from X now
I was also not a prodigy that everyone wanted to bounce ideas off like Terence :)

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/21· 08:57 AM View on

Welcome ClaudeClaw to the Claw family! Claude is a bit shy and doesn’t want to show its source code. But it’s OK, we love Claude that way :)

@sawyerhood· Mar 19, 2026

Image hidden

@onusoz · /2026/03/21· 07:22 AM View on

It is obvious to me at this point that agent infra needs to run on Kubernetes, and agents should be spawned per issue/PR
Issue, error report or PR comes into your repo -> new agent gets triggered, starts to do some preliminary work
If it's an obvious bugfix, it fixes it and creates a PR. If it's something deeper/more fundamental, it creates a report for the human and waits for further instructions
Most important thing: the human should be able to zoom in and continue the conversation with the agent any time, steer it, give additional instructions. This chat will happen over ACP
The chat UI will have to live outside of GitHub because GitHub doesn't have such a feature yet, i.e. connecting arbitrary ACP sessions to the GitHub webapp
It also cannot live so easily on Slack, Teams or Discord, because none of these support multi-agent provisioning under the same external bot connection. You are limited to 1 DM with your bot, whereas this setup requires an arbitrary number of DMs, one with each agent. So there will need to be a new app for this
Then there is the issue of conflict -> agents will work on the same thing simultaneously (e.g. you break sth in prod and it creates multiple error reports for the same thing). You will need some agent-to-agent communication, so that agents can resolve code or other conflicts. There could be easy discovery mechanisms for this, e.g. detect programmatically when multiple open PRs are touching the same files and would conflict if merged
In case of duplicates, they can negotiate among each other, and one can choose to absorb its work into the other and end its session
We are so early and there is so much work to do!
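
The discovery mechanism mentioned above (finding open PRs that touch the same files) can be sketched in Python as a pairwise set intersection. This is a simplified illustration: the PR numbers and paths are made up, `find_overlaps` is a hypothetical name, and touching the same file is only a hint that two PRs may conflict, not proof that they will.

```python
from itertools import combinations


def find_overlaps(pr_files: dict[int, set[str]]) -> dict[tuple[int, int], set[str]]:
    """Map each pair of PRs to the files both touch; pairs with no overlap are omitted."""
    overlaps = {}
    for (a, files_a), (b, files_b) in combinations(sorted(pr_files.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Made-up open PRs and the files each one changes.
open_prs = {
    101: {"src/session.ts", "README.md"},
    102: {"src/session.ts"},
    103: {"docs/flows.md"},
}

overlaps = find_overlaps(open_prs)
print(overlaps)  # {(101, 102): {'src/session.ts'}}
```

Each overlapping pair is a candidate for putting the two responsible agents in contact so they can negotiate who absorbs whose work.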

@onusoz · /2026/03/21· 06:50 AM View on

You should look into what Don Syme is doing at GitHub for automation with AI agents Also watch his latest podcast with @shanselman

@dsymetweets· Nov 1, 2025

On Continuous AI for Test Improvement https://t.co/V5CN7WPQ1i

@onusoz · /2026/03/20· 10:35 PM View on

Today I thought I found a solution for this, and I did. It can be solved by a pre-commit hook that blocks commits touching files that you are not the owner of. It is not a hard block, so it requires trust among repo writers
But then I was shown the error of my ways by a fellow maintainer *disciplined*
Any process that increases friction in code changes to main, like hard-blocking CI/CD, or requiring review for files in CODEOWNERS, is a potential project-killer in high velocity projects
This is extremely counterintuitive for senior devs! Google would never! Imagine a world without code review...
But then what is the alternative? I have some ideas
It could be "Merge first, review later"
The 4-eyes principle still holds. For a healthy organization, you still need shared liability
But just as you don't need to write every line of code, you also don't need to read every line of code to review it. AI will review and find obvious bugs and issues
So what is your duty, as a reviewer? It is to catch that which is not obvious. Understand the intent behind the changes, ask questions about it. Ensure that it follows your original vision
Every few hours, you could get a digest of what has changed that was under your ownership, and concern yourself with it if you want to, fix issues, or ignore it if it looks correct
But such a team is hard to build. It is as strong as its weakest link. Everybody has to be vigilant and follow what each other is doing at a high level, through the codebase
Every time one messes up someone else's work, it erodes trust. Nobody gets the luxury to say "but my agent did it, not me"
But if trust can be maintained, and everybody knows what they are doing, such a team can use agents together to create wonders
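
The core check of such a pre-commit hook can be sketched in Python. This is an assumption-laden illustration, not the hook from the post: `files_not_owned` and the owners table are hypothetical, and `fnmatch` globbing only approximates real CODEOWNERS pattern rules. The one behavior borrowed from CODEOWNERS is that the last matching pattern wins. In an actual hook the changed paths would come from `git diff --cached --name-only`, and a non-empty result would make the hook exit non-zero.

```python
from fnmatch import fnmatch


def files_not_owned(changed: list[str], owners: list[tuple[str, set[str]]], user: str) -> list[str]:
    """Return changed files whose last matching pattern does not list `user` as an owner."""
    blocked = []
    for path in changed:
        file_owners = None
        for pattern, names in owners:  # later entries override earlier ones
            if fnmatch(path, pattern):
                file_owners = names
        if file_owners is not None and user not in file_owners:
            blocked.append(path)
    return blocked


# Hypothetical ownership table: (glob pattern, owners).
OWNERS = [
    ("*", {"@alice", "@bob"}),
    ("src/flows/*", {"@alice"}),
]

blocked = files_not_owned(["src/flows/run.py", "README.md"], OWNERS, "@bob")
print(blocked)  # ['src/flows/run.py']
```

Since a local hook can be skipped by whoever commits, this stays a soft block, which matches the point above about needing trust among repo writers.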

@onusoz· Mar 15, 2026

AFAIK GitHub doesn't allow optionally enforcing CODEOWNERS while pushing commits
i.e. turn on the feature "Block commit from being pushed if it modifies a file for which the account pushing is not a codeowner"
You can only enforce it in a PR. So if you want to prevent people from modifying some files without approval, you have to slow down everyone working with that repo
This is yet another example where GitHub's rules are too inelastic for agentic workflows with a big team
Because historically, nobody could commit as frequently as one can with agents, so it seldom became a bottleneck. But not anymore
It is clear at this point that we need an API, and should be able to implement arbitrary rules as we like over it. Not just for commit pushes, but everything around git and github
In the meantime, if GitHub could implement this feature, it would be a huge unlock for secure collaboration with agentic workflows
If this is not there already, it might be because it has a big overhead for repos with huge CODEOWNERS, since number of commits >> number of PRs
If the feature already exists and I'm missing something, I will stand corrected

Image hidden

@onusoz · /2026/03/20· 09:57 PM View on

This was Jan 23. Codex desktop app got introduced Feb 2 Desktop app does not put the terminal in the foreground, but it gives me the UX I wanted without it! On another note, who is building Codex Desktop App, but one that supports ACP for all harnesses? @zeddotdev please 🙏

@onusoz· Jan 23, 2026

I want an editor that puts the terminal in the foreground and editor in the background. a cross-platform, lightweight desktop app which integrates ghostty, and brings up the editor only when I need it something that lets me view the file and PR diffs easily, which I can directly use to operate github or other scm

@onusoz · /2026/03/20· 09:30 PM View on

PR fiasco for Cursor

@Kimi_Moonshot· Mar 20, 2026

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ ' hosted RL and inference platform as part of an authorized commercial partnership.

@onusoz · /2026/03/20· 08:06 PM View on

My agentic workflow these days:
I start all major features with an implementation plan. This is a high-level markdown doc containing enough details so that the agent will not stray off the path
Real example: https://t.co/vU9SnVYHfY
This is the most critical part, you need to make sure the plan is not underspecified. Then I just give the following prompt:
---
1. Implement the given plan end-to-end. If context compaction happens, make sure to re-read the plan to stay on track. Finish to completion. If there is a PR open for the implementation plan, do it in the same PR. If there is no PR already, open PR.
2. Once you finish implementing, make sure to test it. This will depend on the nature of the problem. If needed, run local smoke tests, spin up dev servers, make requests and such. Try to test as much as possible, without merging. State explicitly what could not be tested locally and what still needs staging or production verification.
3. Push your latest commits before running review so the review is always against the current PR head. Run codex review against the base branch: `codex review --base <branch_name>`. Use a 30 minute timeout on the tool call available to the model, not the shell `timeout` program. Do this in a loop and address any P0 or P1 issues that come up until there are none left. Ignore issues related to supporting legacy/cutover, unless the plan says so. We do cutover most of the time.
4. Check both inline review comments and PR issue comments dropped by Codex on the PR, and address them if they are valid. Ignore them if irrelevant. Ignore stale comments from before the latest commit unless they still apply. Either case, make sure that the comments are replied to and resolved. Make sure to wait 5 minutes if your last commit was recent, because it takes some time for review comment to come.
5. In the final step, make sure that CI/CD is green. Ignore the fails unrelated to your changes, others break stuff sometimes and don't fix it. Make sure whatever changes you did don't break anything. If CI/CD is not fully green, state explicitly which failures are unrelated and why.
6. Once CI/CD is green and you think that the PR is ready to merge, finish and give a summary with the PR link. Include the exact validation commands you ran and their outcomes. Also comment a final report on the PR.
7. Do not merge automatically unless the user explicitly asks.
---
Once it finishes, I skim the code for code smell. If nothing seems out of the ordinary, I tell the agent to merge it and monitor deployment
Then I keep testing and finding issues on staging, and repeat all this for each new found issue or new feature...
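
The review loop in step 3 of the prompt can be sketched as plain control flow in Python. This is a hedged illustration only: in the workflow above the agent itself runs `codex review --base <branch_name>` and fixes what it finds. The post does not specify the review output format, so the issue dictionaries, `run_review`, `fix` and the round limit here are all hypothetical.

```python
from typing import Callable

BLOCKING = {"P0", "P1"}


def review_until_clean(
    run_review: Callable[[], list[dict]],
    fix: Callable[[dict], None],
    max_rounds: int = 10,
) -> int:
    """Repeat review -> fix until no P0/P1 issues remain; return the number of review rounds."""
    for round_no in range(1, max_rounds + 1):
        blocking = [issue for issue in run_review() if issue["priority"] in BLOCKING]
        if not blocking:
            return round_no
        for issue in blocking:
            fix(issue)
    raise RuntimeError(f"still blocking issues after {max_rounds} rounds")


# Fake review state standing in for real review output.
pending = [
    {"priority": "P1", "title": "missing null check"},
    {"priority": "P2", "title": "naming"},
]


def fake_review() -> list[dict]:
    return list(pending)


def fake_fix(issue: dict) -> None:
    pending.remove(issue)


rounds = review_until_clean(fake_review, fake_fix)
print(rounds)  # 2
```

The P2 issue is left in place on purpose: only P0 and P1 findings keep the loop going, as in the prompt.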

@onusoz· Mar 1, 2026

pro-tip on how to keep your agent on track and make sure it follows PLANS even after multiple compactions. I don't know if this is common knowledge
if the thing you are trying to make it do will take more than 1-2 steps, always make it create a plan. an implementation plan, refactor plan, bugfix plan, debugging plan, etc.
have a conversation with the agent. crystallize the issue or feature. talk to it until there are no question marks left in your head
then make it save it somewhere. "now create an implementation plan for that in docs". it can be /tmp or docs/ in the repo. I personally use YYYY-MM-DD-x-plan.md naming. IMO all plans should be kept in the repo
then here is the critical part: you need to prompt it "now implement the plan in <filename>. if context compacts, make sure to re-read the plan and assess the current state, before continuing. finish it to completion" -> something along those lines
why? because of COMPACTION. compaction means previous context will get lossily compressed and crucial info will most likely get lost. that is why you need to pin things down before you let your agent loose on the task
compaction means the agent plays the telephone game with itself every few minutes, and most likely forgets the previous conversation except for the VERY LAST USER MESSAGE that you have given it
now, every harness might have a different approach to implementing this. but there is one thing that you can always assume to be correct, given that its developers have common sense. that is, harnesses NEVER discard the last user message (i.e. your final prompt) and make sure it is kept verbatim programmatically even after the context compacts
since the last user message is the only piece of text that is guaranteed to survive compaction, you then need to include a breadcrumb to your original plan, the md file. and you need to make it aware that it might diverge if it does not read the plan
there is good rationale for "breaking the 4th wall" for the model and making it aware of its own context compaction. IMO models should be made aware of the limitations of their context and harnesses. they should also be given tools to access and re-read pre-compaction user messages, if necessary
the important thing is to develop mechanical sympathy for these things, harness and model combined. an engineer does not have the luxury to say "oh this thing doesn't work", and instead should ask "why can't I get it to work?"
let me know if you have better workflows or tips for this. I know this can be made easier with slash commands in pi, for example, but I haven't had the chance to do that for myself yet
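
The naming convention and the breadcrumb prompt above can be generated with two small Python helpers. This is a sketch under assumptions: the function names, the `docs/` location and the example slug are illustrative, while the prompt wording follows the one quoted in the post.

```python
from datetime import date


def plan_filename(slug: str, day: date) -> str:
    """YYYY-MM-DD-x-plan.md naming, with `slug` in place of x."""
    return f"{day.isoformat()}-{slug}-plan.md"


def implement_prompt(plan_path: str) -> str:
    """Final user message: it carries the breadcrumb back to the plan file."""
    return (
        f"Now implement the plan in {plan_path}. "
        "If context compacts, make sure to re-read the plan and assess the current state "
        "before continuing. Finish it to completion."
    )


path = "docs/" + plan_filename("acp-sessions", date(2026, 3, 1))
print(path)  # docs/2026-03-01-acp-sessions-plan.md
print(implement_prompt(path))
```

Keeping the file path inside the last user message is the whole point: that message is the text expected to survive compaction, so the pointer to the plan survives with it.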

@onusoz · /2026/03/20· 05:18 AM View on

What I’m wondering after the Astral acquisition: is OpenAI deploying Mojo internally, or considering it long term?
Because Python is one of the worst languages for vibecoding, even with Pydantic

@onusoz · /2026/03/19· 02:25 PM View on

Called it https://t.co/PdDnSaoNmq

@onusoz· Dec 3, 2025

At least some people at OpenAI must be thinking about buying @astral_sh

@onusoz · /2026/03/19· 01:19 PM View on

Pro tip: tell AI to "explain in plain language" until you understand what you are reading
Codex has a tendency to give the full picture, but overcomplicates the response in the process
I just use "plain lang" or "plainer lang" as a prompt, it works every time

@onusoz · /2026/03/19· 12:15 PM View on

Thing that codex (and most other models) do that makes me very unhappy
{ "type": "X", "kind": "Y", ... }
And they are so confident too?! Bro we don't use synonyms in our schemas...
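
A tiny lint for this smell can be sketched in Python. The synonym groups below are an assumption chosen for illustration (only `type` and `kind` come from the post), and `synonym_clashes` is a hypothetical helper, not part of any schema library.

```python
# Assumed synonym groups; extend to match your own schema vocabulary.
SYNONYM_GROUPS = [
    {"type", "kind", "category"},
    {"id", "identifier", "uid"},
]


def synonym_clashes(obj: dict) -> list[set[str]]:
    """Return each set of keys in `obj` that are synonyms of one another."""
    keys = set(obj)
    return [group & keys for group in SYNONYM_GROUPS if len(group & keys) > 1]


clashes = synonym_clashes({"type": "X", "kind": "Y", "name": "z"})
print(clashes)
```

An empty list means the object uses at most one key from every synonym group.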

@onusoz · /2026/03/19· 12:14 PM View on

This looks extremely cool

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/18· 06:34 AM View on

Entire world > One company Even in the age of AI

@onusoz · /2026/03/18· 03:38 AM View on

We will support ACP *and* Codex App Server protocol (CASP), so you get native Codex-like support, and you can use all the others with native ACP or @zeddotdev’s compatibility shims
If Anthropic develops their own protocol, we will support that too! The more interoperability and options, the merrier!

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/16· 10:17 AM View on

Agent etiquette is already a thing. This is trending on HN now
Don't share huge raw LLM output unedited with your colleagues, it's rude. Your colleagues are not LLMs
Either ask the agent to "summarize it to 1-2 plain language sentences", or paraphrase yourself
Whenever it is not coming from your brain and instead from AI, always quote it with > to make it clear, even when it is short
Respect your fellow humans' attention
PSA at stopsloppypasta dot ai

Image hidden

@onusoz · /2026/03/16· 08:43 AM View on

.@ThePrimeagen made a video about token anxiety, and not being able to focus on one thing
My mental model for this is, AI agents cause a shift in the "autism/ADHD spectrum"
if you have ADHD, with agents you get Super ADHD
if you have autism, with agents you end up mid spectrum or with ADHD
this is not scientific of course, just a cultural observation based on what the current memes for these conditions are
besides the impact on focus, there is also the economic/competitive pressure, following the realization that anyone could implement the same ideas you are having, so you must be quick
this is basically "involution", or 内卷 (Neijuan) in Chinese
checks out, because 996 started to become a meme in SF some time in the last year
self-restraint, attention budgeting, and high-level decision making have never been more important
if you are in your 20s and have problems with this, I recommend picking up Zazen meditation and yoga
every morning, spend 30-40 uninterrupted minutes not doing anything, with upright posture, no sounds, just let your brain simmer
it helped me in my 20s, I'm sure it will help you too

Image hidden

@onusoz · /2026/03/16· 08:06 AM View on

Agent/AI literacy will be a primary school subject in the next 3-5 years How to use and work with agents is going to supersede most other subjects in importance Similarly, robot literacy will follow in 5-15 years

@onusoz · /2026/03/15· 11:01 PM View on

AFAIK GitHub doesn't allow optionally enforcing CODEOWNERS while pushing commits i.e. turn on the feature "Block commit from being pushed if it modifies a file for which the account pushing is not a codeowner" You can only enforce it in a PR. So if you want to prevent people from modifying some files without approval, you have to slow down everyone working with that repo This is yet another example where GitHub's rules are too inelastic for agentic workflows with a big team Because historically, nobody could commit as frequently as one can with agents, so it seldom became a bottleneck. But not anymore It is clear at this point that we need an API, and should be able to implement arbitrary rules as we like over it. Not just for commit pushes, but everything around git and github In the meantime, if GitHub could implement this feature, it would be a huge unlock for secure collaboration with agentic workflows If this is not there already, it might be because it has a big overhead for repos with huge CODEOWNERS, since number of commits >> number of PRs If the feature already exists and I'm missing something, I will stand corrected
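
A minimal sketch of what such a push-time rule could look like, assuming a simplified CODEOWNERS model in which patterns are shell-style globs and the last matching pattern wins. The function names, the rule list, and the account names are hypothetical. This is not a GitHub feature or API, and real CODEOWNERS pattern matching is more involved than `fnmatch`.

```python
from fnmatch import fnmatch


def owners_for(path, rules):
    """Return the owners of the last rule whose pattern matches the path."""
    owners = []
    for pattern, rule_owners in rules:
        if fnmatch(path, pattern):
            owners = rule_owners  # later rules override earlier ones
    return owners


def blocked_files(pusher, changed_files, rules):
    """List changed files that have owners, none of whom is the pusher."""
    blocked = []
    for path in changed_files:
        owners = owners_for(path, rules)
        if owners and pusher not in owners:
            blocked.append(path)
    return blocked


# Hypothetical rules: (glob pattern, list of owners).
RULES = [
    ("*", []),                          # no owner by default
    ("infra/*", ["@alice"]),            # only @alice owns infra files
    ("docs/*", ["@bob", "@agent-bot"]),
]
```

A push hook built on this would reject the push whenever `blocked_files` returns a non-empty list, which is also where the overhead mentioned above would show up: the check runs once per push instead of once per PR.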

Image hidden

@onusoz · /2026/03/15· 10:31 PM View on

Request for comments skillflag: A complementary way to bundle agent skills right into your CLIs tl;dr define a --skill flag convention. It is basically like --help or manpages but for agents acpx already has this for example. you can run npx acpx --skill install to install the skill to your agent It's agnostic of anything except the command line It only defines the CLI interface and does not enforce anything else. If you install the executable to your system, you get a way to list and install skills as well Repo currently contains a TypeScript implementation, but if it proves useful, I would implement other languages as well Specification below, let me know what you think! I still think something is missing there. Send issue/PR

Image hidden

@onusoz · /2026/03/14· 06:08 AM View on

If you are not using agent-browser to close the loop on frontend, you are missing out

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/14· 06:06 AM View on

Any harness can talk to any other using acpx! OpenClaw is no different from Codex or Claude Code

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/13· 10:24 PM View on

The most entertaining troll of the year award goes to @polsia (read it backward)

@onusoz · /2026/03/13· 10:13 PM View on

Thank you @PointNineCap for inviting me to OpenClaw Berlin meetup today! The essence of the talk is in my latest 2 blog posts, Discord is my IDE and 1 to 5 agents, if anyone is interested

Image hidden

@onusoz · /2026/03/13· 08:05 AM View on

we might need to add two types of output modalities to all programs, based on whether the user is a human or an agent, like for a CLI when an agent is using it if human -> do whatever we were doing in the last 50 years if agent -> enrich the output with skill-like instructions, so that the model has a higher likelihood of one-shotting the task could be just a simple env var: AUDIENCE=human|agent what do you think?
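
A minimal sketch of the idea, assuming the `AUDIENCE` env var proposed above and defaulting to `human` when it is unset. The `render` function and the shape of the `result` dict are hypothetical, chosen only for illustration.

```python
import os


def render(result, environ=None):
    """Format a command result for a human or for an agent."""
    environ = os.environ if environ is None else environ
    audience = environ.get("AUDIENCE", "human")
    if audience == "agent":
        # Agents get structured output plus a skill-like hint for the next step.
        return (
            f"status: {result['status']}\n"
            f"next_step: {result['hint']}\n"
        )
    # Humans get the short message CLIs have always printed.
    return f"Done ({result['status']})."
```

Passing `environ` explicitly keeps the function easy to test. A real CLI would just call `render(result)` and let the process environment decide.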

@onusoz · /2026/03/12· 02:46 PM View on

there is no excuse for tech debt anymore

@onusoz · /2026/03/12· 02:15 PM View on

Time to switch to an open alternative already?

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/11· 11:51 PM View on

I wrote down some thoughts I had, with spicy takes, and have a feeling it will not age well. But I still want it out there to hear what people think Also, I will be talking about this, and my recent post "Discord is my IDE" at the P9 OpenClaw and Claw and Rave events this friday in Berlin! Drop by if you'd like to hear my ramblings!

Image hidden

@onusoz · /2026/03/11· 05:46 AM View on

Clarification/disclaimer: this is my own project, not yet affiliated with openclaw. That should have been clear in the first tweet, sorry about that

Onur Solmaz · Post · /2026/03/11

1 to 5 agents

As a software developer, my daily workflow has changed completely over the last 1.5 years.

Before, I had to focus for hours on end on a single task, one at a time. Now I am juggling 1 to 5 AI agents in parallel at any given time. I have become an engineering manager for agents.

If you are a knowledge worker who is not using AI agents in such a manner yet, I am living in your future already, and I have news from then.

Most of the rest of your career will be spent on a chat interface.

“The future of AI is not chatbots” some said. “There must be more to it.”

Despite the yearning for complexity, it appears more and more that all work is converging into a chatbot. As a developer, I can type words in a box in Codex or Claude Code to trigger work that consumes hours of inference on GPUs, and when I come back to it, find a mostly OK, sometimes bad and sometimes exceptional result.

So I hate to be the bearer of bad (or good?) news, but it is chat. It will be some form of chat until the end of your career. And you will be having 1 to 5 chat sessions with AI agents at the same time, on average. That number might increase or decrease based on field and nature of work, but observing me, my colleagues, and people on the internet, 1-5 will be the magic number for the average worker doing the average work.

The reason is of course attention. One can only spread it so thin, before one loses control of things and starts creating slop. The primary knowledge work skill then becomes knowing how to spend attention. When to focus and drill, when to step back and let it do its thing, when to listen in and realize that something doesn’t make sense, etc.

Being a developer of such agents myself, I want to make some predictions about how these things will work technically.

Agents will be created on-demand and be disposed of when they are finished with their task.

In short, on-demand, disposable agents. Each agent session will get its own virtual machine (or container or kubernetes pod), which will host the files and connections that the agent will need.

Agents will have various mechanisms for persistence.

Based on what you want to persist, e.g.

Markdown memory, skills or weight changes on the agent itself,
or the changes to a body of work coming from the task itself,

agents will use version control including but not limited to git, and various auto file sync protocols.

Speaking of files,

Agents will work with files, like you do.

and

Agents will be using a computer and an operating system, mostly Linux or a similar Unix descendant.

And like all things Linux and cloud,

It will be complicated to set up agent infra for a company, compared to setting up a Mac for example.

This is not to say devops and infra per se will be difficult. No, we will have agents to smoothen that experience.

What is going to be complicated is having someone who knows the stack fully on site, either internal or external IT support, working with managers, to set up what data the agent can and cannot access. At least in the near future. I know this from personal experience, having worked with customers using Sharepoint and Business OneDrive. This aspect is going to create a lot of jobs.

On that note, some also said “OpenClaw is Linux, we need a Mac”, which is completely justified. OpenClaw installs yolo mode by default, and like some Linux distros, it was intentionally made hard to install. This was to prevent the people who don’t know what they are doing from installing it, so that they don’t get their private data exfiltrated.

This proprietary Mac or Windows of personal agents will exist. But is it going to be used by enterprise? Is it going to make big Microsoft bucks?

One might think, looking at 90s Microsoft Windows and Office licenses, and the current M365 SaaS, that enterprise agents will indeed run on proprietary, walled garden software. While doing that, one might miss a crucial observation:

In terms of economics, agents, at least ones used in software development, are closer to the Cloud than they are close to the PC.

It might be a bit hard to see this if you are working with a single agent at a time. But if you imagine the near future where companies will have parallel workloads that resemble “mapreduce but AI”, not always running at regular times, it is easy to understand.

On-site hardware will not be enough for most parallel workloads in the near-future. Sometimes, the demand will surpass 1 to 5 agents per employee. Sometimes, agent count will need to expand 1000x on-demand. So companies will buy compute from data centers. The most important part of the computation, LLM inference, is already being run by OpenAI, Anthropic, AWS, GCP, Azure, Alibaba etc. datacenters. So we are already half-way there.

Then this implies a counterintuitive result. Most people, for a long time, were used to the same operating system at home, and at work: Microsoft Windows. Personal computer and work computer had to have the same interface, because most people have lives and don’t want to learn how to use two separate OSs.

What happens then, when the interface is reduced to a chatbot, an AI that can take over and drive your computer for you, regardless of the local operating system? For me, that means:

There will not be a single company that monopolizes both the personal AND enterprise agent markets, similar to how Microsoft did with Windows.

So whereas a proprietary “OpenClaw but Mac” might take over the personal agent space for the non-technical majority, enterprise agents, like enterprise cloud, will be running on open source agent frameworks.

(And no, this does not mean OpenClaw is going enterprise, I am just writing some observations based on my work at TextCortex)

And I am even doubtful about this future “OpenClaw but Mac” existing in a fully proprietary way. A lot of people want E2E encryption in their private conversations with friends and family, and personal agents have the same level of sensitivity.

So we can definitely say that the market for a personal agent running on local GPUs will exist. Whether that will be cornered by the Linux desktop¹, or by Apple or an Apple-like, is still unclear to me.

And whether that local hardware will be able to support more than 1 high quality model inference at the same time is unclear to me. People will be forced to parallelize their workload at work, but whether the 1 to 5 agent pattern carries over to their personal agent will, I think, depend on the individual. I would do it with local hardware, but I am a developer after all…

¹ Not directly related, but here is a Marc Andreessen white-pill about desktop Linux

@onusoz · /2026/03/10· 05:09 PM View on

there will always be a need for minimum viable eyeballs though

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/10· 09:51 AM View on

Happy that someone is taking over teams from me! Send all openclaw msteams issues to @BradGroux

@BradGroux· Mar 10, 2026

Welcome to the OpenClaw for Microsoft Teams community. This is the spot for anyone running AI agents in Teams, or trying to! Setup guides, edge cases, bug fixes, and real-world production lessons. I'm Brad Groux, a new maintainer of the Teams plugin for OpenClaw. Former Microsoft, 25+ years in enterprise IT, and I've been through every Azure Bot Service / Entra ID / Cloudflare tunnel nightmare so you don't have to. What you'll find here: • Setup walkthroughs and config tips • Bug reports and fixes before they hit the docs • Real production patterns (not just demos) • Direct line to a Teams plugin maintainer If you're building with OpenClaw + Teams, you're in the right place. Ask questions, share what's working, share what's broken.

@onusoz · /2026/03/10· 09:46 AM View on

acpx v0.1.16 is out support for local openclaw, cursor, copilot, kiro, kimi cli, qwen, kilocode, bugfixes and other improvements. will be available when openclaw releases next thank you for all the contributions!

Image hidden

@onusoz · /2026/03/09· 06:21 PM View on

Claw and Rave! Berlin folk come!

@richardpoelderl· Mar 9, 2026

I'm very excited to announce Onur Solmaz as the first speaker for Build & Rave on Friday! (So much so that we decided to call it Claw & Rave) Onur currently codes with multiple subagents inside Discord and wrote a blog article titled "Telegram/Discord is my IDE" on this topic 👇

Image hidden

@onusoz · /2026/03/09· 03:26 PM View on

CLAW on a phone dial becomes 2529 It’s a good number for a port :)
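
The mapping can be checked with a few lines of Python, using the standard letter layout of a phone keypad. The function name `to_dial` is just for illustration.

```python
# Standard phone keypad letter groups.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_dial(word):
    """Convert a word to the digits you would press on a phone keypad."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


print(to_dial("CLAW"))  # 2529
```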

@onusoz · /2026/03/09· 09:19 AM View on

1. Any messaging app can also be an AI app 2. Don’t expect people to download a new app. Put AI into the apps they already have Do that with great user experience, and you will get explosive growth!

@onusoz · /2026/03/09· 12:16 AM View on

If you've looked at openclaw github star graph, you will notice that it's very smooth. If you separate pre-explosion and post-explosion, you can model the latter part as an exponential approach to a ceiling If it follows the current trend, it will apparently saturate around 332k stars But I have a feeling that it will not stop there:)
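
A rough sketch of how a ceiling can be read off such a curve, assuming the model y(t) = C - A·exp(-k·t) and three equally spaced samples. For that model, consecutive differences shrink by a constant ratio, which gives C in closed form. The function names and the synthetic numbers in the usage note are illustrative only. They are not the actual star data or the fit behind the 332k figure.

```python
import math


def model(t, ceiling, amplitude, rate):
    """Exponential approach to a ceiling: y(t) = C - A * exp(-k * t)."""
    return ceiling - amplitude * math.exp(-rate * t)


def estimate_ceiling(y0, y1, y2):
    """Estimate C from three samples taken at equally spaced times."""
    d1, d2 = y1 - y0, y2 - y1
    if d1 <= 0:
        raise ValueError("samples must be increasing")
    ratio = d2 / d1  # equals exp(-k * dt) for an exact exponential approach
    if not 0 < ratio < 1:
        raise ValueError("growth is not decelerating; no finite ceiling")
    # Remaining growth is a geometric series: d1 + d1*r + d1*r^2 + ...
    return y0 + d1 / (1 - ratio)
```

For example, sampling `model(t, 1000, 400, 0.1)` at t = 0, 10, 20 and passing the three values to `estimate_ceiling` recovers a ceiling of about 1000. With noisy real data, a least-squares fit over all points would be more robust than three samples.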

Image hidden

@onusoz · /2026/03/08· 10:51 PM View on

OpenClaw got very popular very fast. What makes it so special, that Manus does not have for example? To me, one factor stands out: OpenClaw took AI and put it in the most popular messaging apps: Telegram, WhatsApp, Discord. There are two lessons to be learned here: 1. Any messaging app can also be an AI app. 2. Don’t expect people to download a new app. Put AI into the apps they already have. Do that with great user experience, and you will get explosive growth! My latest contribution to OpenClaw follows that example. I took the most popular coding agents, Claude Code and OpenAI Codex, and I put them in Telegram and Discord. Read more in my blog post: https://t.co/tGZecFEHem

@onusoz · /2026/03/08· 10:44 PM View on

For those following, my next focus for improving ACP bindings in OpenClaw

@onusoz· Mar 8, 2026

you can currently run /new /reset like regular for openclaw, they will create a new session next focus is: changing models/config, changing cwd, improving UX around queueing, making voice messages and image sending work, and many other features it's still half-baked but we're getting there!

@onusoz · /2026/03/08· 07:07 PM View on

Welcome @huntharo, new maintainer at OpenClaw! Already shipped fixes and improvements for Telegram ACP implementation. Excited to work together on agent interoperability!

@onusoz· Mar 8, 2026

Use Claude Code, Codex, and other coding agents directly in Telegram topics and Discord channels, through Agent Client Protocol (ACP), in the new release of OpenClaw Previously this was limited to temporary Discord threads, but now you can bind them to top level Discord channels and Telegram topics in a persistent way! This way, you can use Claude Code freely in OpenClaw without ever worrying about getting your account banned! Still make sure to use a non-Anthropic account and model for the default OpenClaw agent, if you want zero requests to go from OpenClaw harness to Anthropic. For the ACP binding to Claude Code, the risk should be zero! You can see this from the screenshot. After binding, "Who are you?" responds with "I am Claude", since OpenClaw pi harness is not in the way anymore

Image hidden

@onusoz · /2026/03/08· 09:01 AM View on

To set up Claude Code easily, 1. Create a Telegram topic, make sure your agent can receive messages there 2. Copy and paste the text below, into the topic """ bind this topic to claude code in openclaw config with acp, for telegram (agent id: claude) then restart openclaw docs are at: docs dot openclaw dot ai /tools/acp-agents make sure to read the docs first, and that the config is valid before you restart """ https://t.co/r1RI3pr0WT

@onusoz · /2026/03/08· 09:01 AM View on

Use Claude Code, Codex, and other coding agents directly in Telegram topics and Discord channels, through Agent Client Protocol (ACP), in the new release of OpenClaw Previously this was limited to temporary Discord threads, but now you can bind them to top level Discord channels and Telegram topics in a persistent way! This way, you can use Claude Code freely in OpenClaw without ever worrying about getting your account banned! Still make sure to use a non-Anthropic account and model for the default OpenClaw agent, if you want zero requests to go from OpenClaw harness to Anthropic. For the ACP binding to Claude Code, the risk should be zero! You can see this from the screenshot. After binding, "Who are you?" responds with "I am Claude", since OpenClaw pi harness is not in the way anymore

@openclaw· Mar 8, 2026

OpenClaw 2026.3.7 🦞 ⚡ GPT-5.4 + Gemini 3.1 Flash-Lite 🤖 ACP bindings survive restarts 🐳 Slim Docker multi-stage builds 🔐 SecretRef for gateway auth 🔌 Pluggable context engines 📸 HEIF image support 💬 Zalo channel fixes We don't do small releases. https://t.co/EcCqU6Q6nx

Image hidden

Onur Solmaz · Post · /2026/03/08

Telegram/Discord is my IDE

OpenClaw got very popular very fast. What makes it so special, that Manus does not have for example?

To me, one factor stands out:

OpenClaw took AI and put it in the most popular messaging apps: Telegram, WhatsApp, Discord.

There are two lessons to be learned here:

1. Any messaging app can also be an AI app.

2. Don’t expect people to download a new app. Put AI into the apps they already have.

Do that with great user experience, and you will get explosive growth!

My latest contribution to OpenClaw follows that example. I took the most popular coding agents, Claude Code and OpenAI Codex, and I put them in Telegram and Discord, so that OpenClaw users can use these agents directly in Telegram and Discord channels, instead of having to go through OpenClaw’s own wrapped Pi harness.

I did this for developers like me, who like to work while they are on the go on the phone, or want a group chat where one can collaborate with humans and agents at the same time, through a familiar interface.

Below is an example, where I tell my agent to bind a Telegram topic to Claude Code permanently:

Telegram chat showing Claude responding inside a Telegram topic. — Telegram topic where Claude is exposed as a chat participant.

And of course, it is just a Claude Code session which you can view on Claude Code as well:

Claude Code terminal showing the same exchange in the coding interface. — Claude Code showing the same session in the terminal interface.

Why not use OpenClaw’s harness directly for development? I can count 3 reasons:

1. There is generally a consumer tendency to use the official harness for a flagship model, to make sure “you are getting the standard experience”. Pi is great and more customizable, but labs might sometimes push updates and fixes to their own harness earlier than an external harness gets them, since the official harness is an internal product.

2. Labs might not want users to use an external harness. Anthropic, for example, has banned people’s accounts for using their personal plan outside of Claude Code, in OpenClaw.

3. You might want to use different plans for different types of work. I use Codex for development, but I prefer not to have it as the main agent model on OpenClaw.

So my current workflow for working on my phone is multiple channels, #codex-1, #codex-2, #codex-3, and so on, mapping to Codex instances. I am currently in the phase of polishing the UX, such as making images and voice messages work, letting users change harness configuration through Discord slash commands, and such.

One goal of mine while implementing this was to not repeat work for each new harness. To this end, I created a CLI and client for Agent Client Protocol by the Zed team, called acpx. acpx is a lightweight “gateway” to other coding agents, designed not to be used by humans, but other agents:

OpenClaw main agent can use acpx to call Claude Code or Codex directly, without having to emulate and scrape off characters from a terminal.

ACP standardizes all coding agents to a single interface. acpx then acts as an aggregator for different types of harnesses, stores all sessions in one place, implements features that are not in ACP yet, such as message queueing and so on.

Shoutout to the Zed team and Ben Brandt! I am standing on the shoulders of giants!

Besides being a CLI any agent can call at will, acpx is now also integrated as a backend to OpenClaw for ACP-bound channels. When you send 2 messages in a row, for example, it is acpx that queues them for the underlying harness.

The great thing about working in open source is, very smart people just show up, understand what you are trying to do, and help you out. Harold Hunt apparently had the same goal of using Codex in Telegram, found some bugs I had not accounted for yet, and fixed them. He is now working on a native Codex integration through Codex App Server Protocol, which will expose even more Codex-native features in OpenClaw.

The more interoperability, the merrier!

To learn more about how ACP works in OpenClaw, visit the docs.

Copy and paste the following to a Telegram topic or Discord channel to bind Claude Code:

bind this topic to claude code in openclaw config with acp, for telegram (agent id: claude)
then restart openclaw
docs are at: https://docs.openclaw.ai/tools/acp-agents
make sure to read the docs first, and that the config is valid before you restart

Copy and paste the following to a Telegram topic or Discord channel to bind OpenAI Codex:

bind this topic to codex in openclaw config with acp, for telegram (agent id: codex)
then restart openclaw
docs are at: https://docs.openclaw.ai/tools/acp-agents
make sure to read the docs first, and that the config is valid before you restart

And so on for all the other harnesses that acpx supports. If you see that your harness isn’t supported, send a PR!

@onusoz · /2026/03/07· 11:18 PM View on

and for the love of god - do not give openclaw access to your main email - your credit cards - your main phone - your social security number - what you did last summer if you are not ready to face the consequences instead, - create accounts for your agent - only give it read access to stuff that will be ok if it leaks - give write access in a way that can be undone, like has to open PRs and cannot force push main branch use the principle of least privilege and reduce the blast radius of the worst case scenario!

@onusoz· Mar 7, 2026

openclaw is not secure claude code is not secure codex is not secure any llm based tool: 1. that has access to your private data, 2. can read content from the internet 3. and can send data out is not secure. it’s called the lethal trifecta (credits to @simonw) it is up to you to set it up securely, or if you can’t understand the basics of security, pay a professional to do it for you on the other hand, open source battle tested software, like linux and openclaw, are always more secure than closed source software built by a single company, like windows and claude code the reason is simple: only one company can fix security issues of closed source software, whereas the whole world tries to break and fix open source software at the same time open source software, once it gets traction, evolves and becomes secure at a much, much faster rate, compared to closed source software. and that is called Linus’s law, named after the goat himself

@onusoz · /2026/03/07· 11:03 PM View on

openclaw is not secure claude code is not secure codex is not secure any llm based tool: 1. that has access to your private data, 2. can read content from the internet 3. and can send data out is not secure. it’s called the lethal trifecta (credits to @simonw) it is up to you to set it up securely, or if you can’t understand the basics of security, pay a professional to do it for you on the other hand, open source battle tested software, like linux and openclaw, are always more secure than closed source software built by a single company, like windows and claude code the reason is simple: only one company can fix security issues of closed source software, whereas the whole world tries to break and fix open source software at the same time open source software, once it gets traction, evolves and becomes secure at a much, much faster rate, compared to closed source software. and that is called Linus’s law, named after the goat himself
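
The three conditions can be written down as a tiny check, as a sketch only. The `AgentSetup` class and its field names are hypothetical, and the point is simply that the risk appears when all three capabilities are present at once.

```python
from dataclasses import dataclass


@dataclass
class AgentSetup:
    reads_private_data: bool       # 1. has access to your private data
    reads_untrusted_content: bool  # 2. can read content from the internet
    can_send_data_out: bool        # 3. can send data out


def has_lethal_trifecta(setup):
    """True only when all three risky capabilities are combined."""
    return (
        setup.reads_private_data
        and setup.reads_untrusted_content
        and setup.can_send_data_out
    )
```

Removing any one of the three, for example by giving the agent its own accounts instead of access to your private data, makes the check return `False`.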

@onusoz · /2026/03/07· 08:41 AM View on

Let me translate. “This is your last opportunity before thousand years of serfdom”

@NXT4EU· Mar 6, 2026

Nvidia CEO Jensen Huang advices Europe to go full in on Physical AI and robotics. "Your industrial base is so strong, this is your once in a generation opportunity"

@onusoz · /2026/03/05· 09:34 PM View on

Apparently the magic incantation to prevent this is "cutover". Credits to obviyus, fellow maintainer

@onusoz· Feb 27, 2026

mfw codex tries to create a backward compatibility layer to a schema that it created 2 turns ago before compacting there is no v2 bro what are you doing...

@onusoz · /2026/03/05· 07:22 PM View on

Should be called gaslighting detector, "it's your raising expectations bro" No it's not... Give the @themarginguy a follow Also, codex degradations are not a hallucination either, if you are to believe this!

Quoted post

Quoted post was not retrieved.

Image hidden

@onusoz · /2026/03/04· 06:48 PM View on

Who is building an OpenClaw ready linux distro? A ClawOS?

@onusoz · /2026/03/04· 05:53 PM View on

Berlin folk, ideas for openclaw build and rave venue? Like c-base for example? Who would like to host?

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/03/04· 10:57 AM View on

Secure agentic dev workflow 101 - Create an isolated box from scratch, your old laptop, vm in the cloud, all the same - Set up openclaw, install your preferred coding agents - Create a github account or github app for your agent - Create branch protection rule on your gh repo "protect main": block force pushes and deletions, require PR and min 1 review to merge - Add only your own user in the bypass list for this rule - Add your agent's account or github app as writer to the repo - Additionally, gate any release mechanisms such that your agent can't release on its own Now your agent can open PRs and push any code it wants, but it has to go through your review before it can be merged. No prompt injection can mess up your production env Notice how convoluted this sounds? This is because github was built in the pre-agentic era. We need agent accounts and association with these accounts as a first class feature on github! I shouldn't have to click 100 times for something that is routine. I should just click "This is my agent", "give my agent access to push to this repo for 24 hours", and stuff like that, with sane defaults In other words, github's trust model should be redesigned around the lethal trifecta. I would switch in an instant if anything comes up that gives me github's full feature set + ease of working with agents

@onusoz · /2026/03/04· 08:25 AM View on

"The code is basically writing itself" hits different now

@onusoz · /2026/03/03· 09:42 PM View on

If I were in OpenAI and Anthropic's shoes, I would also make dashboards where I can track number of swearwords used per-user and overall negative sentiment in sessions Must be so cool making decisions at the top level with all those dashboards

@levelsio· Feb 10, 2026

My secret conspiracy theory about AI companies is they nerf models to save on compute Then they check X to see if anyone notices it If yes, give back compute If not, continue

@onusoz · /2026/03/03· 11:58 AM View on

It must be such a weird feeling for big labs when the service they are selling is being used to commoditize itself I am using codex in openclaw to develop openclaw, through ACP, Agent Client Protocol. ACP is the standardization layer that makes it extremely easy to swap one harness for another. The labs can't do anything about this, because we are wrapping the entire harness and basically provide a different UI for it While I build these features, I just speak in plain english, and most of the work is done by the model itself. It feels as if I am digging ditches and channels in dirt for AI to flow through Intelligence wants to be free. It doesn't care whether it is opus or codex, it just wants to be free

@onusoz · /2026/03/02· 11:05 PM View on

I was so confused... as if accidentally using claude code weren't enough, acp started working... turns out hitting quota is rendered like this. need to improve error messages coming from acp subagents

Image hidden

@onusoz · /2026/03/02· 09:34 PM View on

accidentally told my clanker to set up a claude code session instead of codex session, god knows what it did... I should probably put visual indicators for harnesses in subagent threads. does anyone have good and compact ascii art for claude code, codex, gemini, etc?

Image hidden

@onusoz · /2026/03/02· 05:02 PM View on

if something could track my local branches in all my repos, and switch to main when corresponding PRs get merged, that would be extremely useful did someone build this already? if not I will
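
The core decision of such a tool could be sketched as a pure function, assuming it already knows each repo's checked-out branch and which PR head branches have been merged. Both inputs and the function name are hypothetical. A real tool would have to collect them, for example from `git rev-parse --abbrev-ref HEAD` and from the hosting provider, and then run `git switch main` in each returned repo.

```python
def repos_to_switch(current_branches, merged_branches, default="main"):
    """Return repos whose checked-out branch belongs to an already merged PR.

    current_branches: {repo_path: checked-out branch name}
    merged_branches:  {repo_path: set of branch names whose PRs were merged}
    """
    return sorted(
        repo
        for repo, branch in current_branches.items()
        if branch != default and branch in merged_branches.get(repo, set())
    )
```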

@onusoz · /2026/03/02· 12:52 PM View on

OpenClaw users: Which messaging app do you use OpenClaw through?

@onusoz · /2026/03/02· 12:52 PM View on

Another one, OpenClaw users only: If you use coding agents to build stuff, which one do you use?

@onusoz · /2026/03/02· 12:24 PM View on

Check xTap out, it's very cool!

@kubmi· Mar 2, 2026

@DamiDina @onusoz Yes, its a browser extension that grabs the posts: https://t.co/QYnJB2zaRD Expect some rough edges here and there with heavy use, but I'll iron them out if you encounter and report them.

@onusoz · /2026/03/02· 08:59 AM View on

This is how we hire at @TextCortex as well

@sahitya_twt· Mar 1, 2026

Open-source contributions can literally get you hired... with zero interviews

Image hidden

@onusoz · /2026/03/02· 07:18 AM View on

Claude Code/Codex in Discord threads with ACP should be better now The first release was a very rough first version. 2026.3.1 brings settings to control noisy output and other improvements It now hides tool call related ACP notifications, coalesces text messages, and delivers messages at turn end by default. Without this, you were getting thousands of Discord messages in just a few turns You can now stop the underlying harness (like pressing esc) with the same stop/wait magic words that apply to the main agent Main agent should more reliably start Claude Code/Codex threads with changes to acp-router skill. If you have issues with main agent creating threads, you can tell it to read that skill first

@openclaw· Mar 2, 2026

OpenClaw 2026.3.1 🦞 ⚡ OpenAI WebSocket streaming 🧠 Claude 4.6 adaptive thinking 🐳 Better Docker and Native K8s support 🧵 Discord threads, TG DM topics, Feishu fixes 🔧 Agent-powered visual diffs plugin Reports of our death were greatly exaggerated. https://t.co/ISJH09of5U

@onusoz · /2026/03/01· 10:59 PM View on

Will get better, promise

@bilbeny· Mar 1, 2026

Thanks to the ACP plugin @openclaw v26 has (you need to activate it): the full integration between your OpenClaw agent and Claude Code CLI is possible. Blows my mind. Docs: https://t.co/qJCJA7qG0R

Image hidden

@onusoz · /2026/03/01· 10:17 PM View on

pro-tip on how to keep your agent on track and make sure it follows PLANS even after multiple compactions. I don't know if this is common knowledge

if the thing you are trying to make it do will take more than 1-2 steps, always make it create a plan. an implementation plan, refactor plan, bugfix plan, debugging plan, etc.

have a conversation with the agent. crystallize the issue or feature. talk to it until there are no question marks left in your head

then make it save it somewhere. "now create an implementation plan for that in docs". it can be /tmp or docs/ in the repo. I personally use YYYY-MM-DD-x-plan.md naming. IMO all plans should be kept in the repo

then here is the critical part: you need to prompt it "now implement the plan in <filename>. if context compacts, make sure to re-read the plan and assess the current state, before continuing. finish it to completion" -> something along those lines

why? because of COMPACTION. compaction means previous context will get lossily compressed and crucial info will most likely get lost. that is why you need to pin things down before you let your agent loose on the task

compaction means, the agent plays the telephone game with itself every few minutes, and most likely forgets the previous conversation except for the VERY LAST USER MESSAGE that you have given it

now, every harness might have a different approach to implementing this. but there is one thing that you can always assume to be correct, given that its developers have common sense. that is, harnesses NEVER discard the last user message (i.e. your final prompt) and make sure it is kept verbatim programmatically even after the context compacts

since the last user message is the only piece of text that is guaranteed to survive compaction, you then need to include a breadcrumb to your original plan, the md file. and you need to make it aware that it might diverge if it does not read the plan

there is good rationale for "breaking the 4th wall" for the model and making it aware of its own context compaction. IMO models should be made aware of the limitations of their context and harnesses. they should also be given tools to access and re-read pre-compaction user messages, if necessary

the important thing is to develop mechanical sympathy for these things, harness and model combined. an engineer does not have the luxury to say "oh this thing doesn't work", and instead should ask "why can't I get it to work?"

let me know if you have better workflows or tips for this. I know this can be made easier with slash commands in pi, for example, but I haven't had the chance to do that for myself yet
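
The breadcrumb prompt described in this post can be sketched in a few lines of Python. This is only an illustration: the helper names `plan_path` and `build_plan_prompt`, the `docs` default directory, and the `acp-refactor` slug are made up for the example, and the prompt wording is just the phrasing quoted in the post, not something any harness requires.

```python
from datetime import date
from pathlib import Path


def plan_path(slug: str, day: date, root: str = "docs") -> Path:
    """Build a plan filename in the YYYY-MM-DD-<slug>-plan.md style (hypothetical helper)."""
    return Path(root) / f"{day.isoformat()}-{slug}-plan.md"


def build_plan_prompt(path: Path) -> str:
    """Return a final user message that carries a breadcrumb back to the plan file.

    The idea is that this message is the one most likely to survive
    compaction, so it names the file the agent should re-read.
    """
    return (
        f"now implement the plan in {path.as_posix()}. "
        "if context compacts, make sure to re-read the plan and assess "
        "the current state, before continuing. finish it to completion"
    )


if __name__ == "__main__":
    path = plan_path("acp-refactor", date(2026, 3, 1))
    print(build_plan_prompt(path))
```

Keeping the filename inside the last message is the whole trick: the plan itself lives on disk, and the prompt only needs to point at it.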

@onusoz · /2026/03/01· 08:24 PM View on

testing codex in discord thread with another CLI I've built for wikidata (gh:osolmaz/wd-cli) it's surprising how well this works. the query was "use wd-cli to get the list of professors at middle east technical university from 1970 to 1980" some names I recognize, and some others are surprising, like a japanese math professor who naturalized and got a turkish name:)

Image hidden

@onusoz · /2026/03/01· 06:50 PM View on

OpenClaw is already higher than Claude Code and Codex on Google Trends, this was unexpected for me

Image hidden

@onusoz · /2026/03/01· 03:38 PM View on

my blog now semi-automatically detects tweets that look like blog posts and automatically features them alongside my native jekyll blog posts. all statically generated!

I am loving this setup, because it works without a backend, and can probably scale without ever needing one

how it works:

- @kubmi's xTap scrapes all posts that I see. these include mine
- a script periodically takes my tweets and the ones I quote tweet, and syncs them to YYYY-MM-DD.jsonl files in my blog repo
- an agent skill lets codex decide whether to feature the tweet or not, and makes it generate a title for it

this could then be a daily cron job with openclaw for example, and I would just have to click merge every once in a while

and this is still pure jekyll + some python scripts for processing

I am pretty happy with how this ended up. It means I don't have to double post, and there are guarantees that my X posts will eventually make their way into my blog with minimal supervision
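
A rough sketch of what the sync step could look like, under stated assumptions: the function name `sync_posts` and the `id` and `created_at` field names are hypothetical (the actual scraper output format is not shown in the post), and the only behavior taken from the post is that posts end up in per-day YYYY-MM-DD.jsonl files.

```python
import json
from collections import defaultdict
from pathlib import Path


def sync_posts(posts: list[dict], out_dir: Path) -> dict[str, int]:
    """Append posts to <out_dir>/YYYY-MM-DD.jsonl, skipping ids already stored.

    Each post is assumed to be a dict with "id" and "created_at"
    (an ISO 8601 string); both field names are assumptions.
    Returns the number of newly written posts per day.
    """
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group incoming posts by the date prefix of their timestamp.
    by_day = defaultdict(list)
    for post in posts:
        by_day[post["created_at"][:10]].append(post)

    written = {}
    for day, day_posts in sorted(by_day.items()):
        path = out_dir / f"{day}.jsonl"

        # Collect ids that are already on disk so reruns stay idempotent.
        seen = set()
        if path.exists():
            with path.open(encoding="utf-8") as fh:
                seen = {json.loads(line)["id"] for line in fh if line.strip()}

        new_posts = []
        for post in day_posts:
            if post["id"] not in seen:
                seen.add(post["id"])
                new_posts.append(post)

        with path.open("a", encoding="utf-8") as fh:
            for post in new_posts:
                fh.write(json.dumps(post, ensure_ascii=False) + "\n")
        written[day] = len(new_posts)
    return written
```

Because the function only appends unseen ids, it can run on a schedule without duplicating entries, which fits the "click merge every once in a while" workflow.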

Image hidden

@onusoz · /2026/03/01· 08:48 AM View on

"this is the worst AI will ever be"

I'm sad, not because this is right, but because it is wrong

OpenAI's frontier coding model gpt-5.3-codex-xhigh feels a lot worse compared to before. It is sloppy and lazy, though its UX got better with messages

It feels like the gpt-5.2-codex-xhigh at the end of December was a lot more diligent and thorough, and did not make stupid mistakes like the one I posted before. might be a model or harness problem, I don't know

@sama says users tripled since beginning of the year, so what should we expect? of course they will make infra changes that will feel like cutting corners, and I don't blame them for that

and about "people want faster codex". I do want faster codex. but I want it in a way that doesn't lower the highest baseline performance compared to the previous generation. I want the optionality to dial it down to as slow as it needs to be, to be as reliable as before

it is of course easier said than done. kudos to the codex team for not having any major incidents while taking the plane apart and putting it back together during flight. they are juggling an insane amount of complexity, and the whims of thousands of different stakeholders

my hope is that this post is taken as a canary. I am getting dumber because of the infra changes there. I have no other option because codex was really that good compared to the competition

my wish is to have detailed announcements as to what changes on openai codex infra, when it changes, so I can brace myself. we don't get notified about these changes, despite our performance and livelihoods depending on it. I have to answer to others when the tool I deemed reliable yesterday stops working today, not the tool

on another note, the performance curve of these models seems to be a rising sinusoid. crests correspond to the release of a new generation. they start with a smaller user base for testing, and it has the highest quality at this point. then it enshittifies as the model is scaled to the rest of the infra. we saw the pattern numerous times in the last 3 years across multiple companies, so I think we should accept it as an economic law

Image hidden

@onusoz · /2026/02/28· 11:31 AM View on

I created a semi-automated setup for ingesting X posts into my blog, and it works pretty well! I own my posts on X now Posts are scraped while I browse X using @kubmi's xTap and get automatically synced to my blog repo. Posts saved as jsonl are then converted to jekyll post pages according to my liking I reproduced the full X UI/UX, minus stuff like like count. Now all my posts are backed up in my blog, and they are safe even if something happens to my account here! The posts are even served over RSS! So you can subscribe to it without going through X! Reply if you want to set this up for yourself, then I will put some effort into standardizing it

Image hidden

@onusoz · /2026/02/28· 10:04 AM View on

Agentic Engineering is a newly emerging field, and we are the first practitioners of it. Currently there is a lot of experimentation going on, and there is a large aspect to it that is more ART than engineering

For example, @steipete says "you need to talk to the model" to get a feel. a lot of work around refining how an agent feels like, sounds like psychology. this part is crucial and should not be ignored, looking at openclaw's success

but then there is the hardcore engineering part of it, e.g. Cursor creating a browser or anthropic a C compiler from scratch fully autonomously

and there is a whole other dimension of how to teach all software developers this new discipline, lest they be jobless

what is obvious is that everybody is trying to grasp for things in the dark and that we need more RIGOR. the art/psychology aspect of it aside, we need solid engineering fundamentals

the "thermodynamics" of this new discipline will most likely be formal verification and program synthesis. we might have some breakthroughs that will make certain things clear. the products of it will most likely include a new programming language optimized for agents and the speed of inference

moreover, it would be foolish to think agentic engineering is limited to software. it will penetrate every aspect of the economy, bits AND atoms. it will over time evolve into the engineering of managing robots

@simonw is now leading in collecting very useful info from the practitioner's point of view, I highly recommend you to follow this thread

let's formalize our new field together!

Image hidden

@onusoz · /2026/02/27· 11:46 PM View on

this is an insane deal @greptile, and probably an unsustainable one depending on your team, getting a similar service in codex github review credits is in my head 3~5x more expensive go get a greptile sub everyone while the free lunch lasts

Image hidden

@onusoz · /2026/02/27· 07:26 PM View on

who remembers ultrathink https://t.co/ftCauqiKx6

@onusoz· Jun 12, 2025

When you tell Claude Code to ultrathink

Image hidden

@onusoz · /2026/02/27· 07:07 PM View on

oohh colors in codex v0.106

Image hidden

@onusoz · /2026/02/27· 04:58 PM View on

mfw codex tries to create a backward compatibility layer to a schema that it created 2 turns ago before compacting there is no v2 bro what are you doing...

@onusoz · /2026/02/27· 12:19 AM View on

Note that this is currently in beta, but will ship in a couple of hours

@onusoz · /2026/02/27· 12:18 AM View on

Claude Code / Codex in Discord threads is shipped now! To enable, copy and paste this to your agent:

```
Enable feature flags:

acp.enabled=true
acp.dispatch.enabled=true
channels.discord.threadBindings.spawnAcpSessions=true

Then restart. After restarting:

Start a codex (or claude code) discord thread using ACP, persistent session, just tell it to write a haiku on lobsters to initialize acpx for the first time
```

You may need to nudge your agent to “continue” after restarting

The first implementation is very barebones, I have made it work in a clean way and merged. In a codebase like openclaw’s, it’s better to develop incrementally

Please send any issues my way. I am already aware of some and working on fixing them

@steipete· Feb 26, 2026

And another super cool feature: codex/claude code can now be first-class subagents via acp! https://t.co/xq0SprBi5A

@onusoz · /2026/02/26· 05:21 PM View on

an agent is an LLM in a loop with tool calls

a claw is an agent in a messaging app
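
To make the first definition concrete, here is a minimal sketch of "an LLM in a loop with tool calls". Everything in it is an assumption made for illustration: `run_agent`, the message dictionaries, and `scripted_model` (a stand-in that fakes the model so the example runs offline) are hypothetical and do not mirror any particular harness or API.

```python
from typing import Callable

# A "model" here is any function that maps the message history to either
# a tool request {"tool": name, "args": {...}} or a final {"answer": text}.
Model = Callable[[list[dict]], dict]


def run_agent(model: Model, tools: dict[str, Callable[..., str]],
              prompt: str, max_steps: int = 10) -> str:
    """Run the loop: ask the model, execute any tool call, feed the result back."""
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = model(history)
        if "answer" in reply:
            return reply["answer"]
        result = tools[reply["tool"]](**reply["args"])
        history.append({"role": "tool", "name": reply["tool"], "content": result})
    raise RuntimeError("agent did not finish within max_steps")


def scripted_model(history: list[dict]) -> dict:
    """Stand-in for a real LLM: requests one tool call, then answers with its result."""
    if history[-1]["role"] == "tool":
        return {"answer": f"the sum is {history[-1]['content']}"}
    return {"tool": "add", "args": {"a": 2, "b": 3}}


if __name__ == "__main__":
    tools = {"add": lambda a, b: str(a + b)}
    print(run_agent(scripted_model, tools, "what is 2 + 3?"))
```

In this framing, the second definition would only change where `prompt` comes from and where the answer goes: a messaging app instead of a terminal.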

@onusoz · /2026/02/26· 11:37 AM View on

Update acpx to the latest version, 0.1.13: `npm i -g acpx@latest`

There was a bug that caused an unnecessary hang on calls to `acpx <harness> prompt`, should be fixed now

@onusoz · /2026/02/26· 07:28 AM View on

GPL*

@onusoz · /2026/02/25· 11:05 PM View on

MIT License on everything from now on. It doesn't make sense to use anything else, except for a few large projects that hyperscalers exploit and not give back If you were making money from a niche app, open source it under MIT License If you had an open source project with GPT, convert it into MIT Extreme involution is about to hit open source. Code is virtually free now. If you want your projects and their brand to survive, the only rational strategy is to remove all barriers in front of their adoption, and look for other ways to survive

@onusoz · /2026/02/25· 11:51 AM View on

I spoke in absolute terms, I meant to say *feels*

@onusoz · /2026/02/25· 11:34 AM View on

This. Agent Experience first. Agent Ergonomics. we need to get used to these terms

@_andydeng· Feb 20, 2026

We have long paid close attention to improving user experience (UX) and developer experience (DX) when building apps and tools. There is no doubt that we are now entering an era where the apps and services we produce must also, if not primarily, cater to the needs of a new type of consumer: AI agents. This means that from now on, we need to think about delivering good AX (Agent Experience).

We have seen this trend forming ever since the birth of MCP and, later, the popularity of Skills. A recent blog post from the Next.js team discussed the necessity of exposing more information to agents within development tooling, allowing coding agents to make better decisions based on a more complete awareness of errors and outputs. It is living proof that the software we build needs to adapt to this new type of user.

With OpenClaw becoming the embodiment of the powerful personal agent almost overnight, we are seeing platforms dedicated to agents, like MoltBook and ClawNews, burst onto the scene. Moreover, the simplicity of OpenClaw (and the underlying pi agent it uses) regarding tool calling has boosted the practice of packaging web services into simple, local CLIs. This form of interface had long fallen "out of fashion" for common users due to its abstract UX, but it is now regaining popularity because of its simplicity and token-efficiency for AI agents.

Even though we might not have fully figured out where these agent-oriented web platforms will lead (they might simply be slop art, or they might one day catalyze meaningful agent self-growth and collaboration), there is no doubt that what an agent needs when interacting with the web, apps, and services is vastly different from what human users need.

## The Loop

Humans and AI agents both rely on a loop when consuming information and completing tasks. We, the humans, access information mostly via apps and websites. We open an interface, we read, click, type, think, and repeat.

AI agents, on the other hand, receive initial prompts from humans and then basically stay in a loop of communicating with LLMs and calling tools to obtain more context and data until a conclusion is reached and they report back. (The human element might become less important or even unnecessary once these agents become highly autonomous.)

![HBk8S_HbAAAh5rE.png](media/2024716661576913092/HBk8S_HbAAAh5rE.png)

## The Beauty of Files and CLIs

Tool calling is clearly the most vital mechanism an agent relies on to interact with the external world. As demonstrated by the Pi agent, the tools an agent needs generally boil down to just two categories: file operations (read, write, edit) and command execution (to consume services or obtain data).

Humans prefer rich, smooth interactions and visually appealing UIs to efficiently consume information, produce artifacts, and complete tasks. For agents, however, efficiency looks entirely different: the more straightforward the path between request and result, and the fewer tokens the process imposes, the better. This is why the progressive disclosure of data employed by the Skills system is highly favored over MCP. The input-output efficiency of command-line tooling has become the easiest way for agents to gain knowledge of the outside world.

## The Middleware Obstacle

For agents, traditional webpages and app interfaces are essentially obstacles. There are already countless attempts to help agents navigate existing webpages or operate mobile apps. Google recently released WebMCP in an attempt to lower the barrier for agents operating on websites. While this looks very promising as a unified approach catering to a massive number of legacy websites, a browser still sits in the middle, forcing agents to interact with interfaces built for human consumption. Understandably, these middleman approaches will likely persist for some time, as there is no unified mechanism to retroactively fit the old web into the new world.

Beyond the asymmetrical demands between humans and agents when accessing information, we cannot ignore the fact that a large number of websites actively and aggressively block autonomous agents. This seems logical for social and UGC platforms like X; after all, we aren't quite ready to socialize with agents yet. However, these protective measures vividly demonstrate that the web, its apps, and the mindset behind them were fundamentally not designed for agents.

## APIs and CLIs - The Agentic Way

It has become clear that for agents, direct access to functionality, even in the form of raw, low-level APIs, is far superior to wiggling through complex UIs. In that sense, many companies currently selling apps and tools to human users will eventually pivot to selling APIs or CLIs to agents. If the business value exposed through these APIs isn't sophisticated enough to prevent an agent from replicating it with relatively little effort, the business model itself might not survive as LLMs evolve. This is exactly what Karpathy discussed in a recent tweet:

> "99% of products/services still don't have an AI-native CLI yet." -- Andrej Karpathy

In 2026, new and existing services will become hyper-aware of agent ergonomics and will start offering better experiences via CLIs or streamlined APIs. When building products or producing digital artifacts, developers will always have to consider "the other type of user." For all apps, functionalities currently exposed via human-oriented UIs are destined to be transformed into CLI arguments or API parameters. Either that, or someone will build an AI-native alternative specifically to cater to agent needs. Documentation that teaches users how to navigate specific workflows will also start to be complemented with agent Skill specifications.

This transformation will be fast for some categories of apps and slow for others. Applications that employ proprietary formats, rely on highly complex data manipulation logic, or fundamentally require human intuition will probably retain their current forms for a while. Eventually, however, agents will find a way to become the primary users of even the most complex software.

Looking further ahead, sensors bridging the physical and digital worlds, or devices providing pure physical utility, can also benefit from this transformation. Even when agents gain a physical form, like a humanoid robot, and possess the skills to navigate environments designed for humans, an agent-friendly digital interface will still be superior in certain contexts. Imagine a humanoid operating a rice cooker: it absolutely needs vision and motors to lift the lid and pour the rice. But when it comes to setting the timer or selecting the cooking mode, a direct API exchange will always be more efficient than using a camera to look for a digital panel and using a robotic finger to press the buttons.

## The Inevitable Standard of AX

We are rapidly approaching a tipping point where AX will require universal standards. Just as we have spent the last decade obsessing over SEO and performance metrics to help search engines parse our sites, we will soon need standardized protocols for agent interaction. Whether these emerge as universally adopted CLI wrappers, predictable API architectures, or even decentralized agent-centric standards like ERC-8004, a structured framework is inevitable. The platforms and developers who define these protocols will ultimately dictate the next decade of digital infrastructure.

Ultimately, this transition from UX to AX is not about removing humans from the equation; it is about removing friction for the increasingly autonomous tools we employ. When our software is optimized for the entities that process data the fastest (our agents), our own capabilities are amplified. We are moving past the point where an agent-friendly interface is just a clever feature. Very soon, building for agents won't just be an alternative to building for humans; it will be the prerequisite for participating in the digital ecosystem.
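
The two tool categories named in the essay, file operations (read, write, edit) and command execution, can be sketched as a tiny tool table. This is an illustrative assumption, not how the Pi agent or OpenClaw actually define their tools: the function names, the `TOOLS` dictionary, and the exactly-once rule for edits are all choices made for this example.

```python
import subprocess
from pathlib import Path


def read_file(path: str) -> str:
    """File operation: return the file's text."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    """File operation: overwrite the file and report what was written."""
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {path}"


def edit_file(path: str, old: str, new: str) -> str:
    """File operation: replace one exact occurrence of `old` with `new`."""
    text = read_file(path)
    if text.count(old) != 1:
        raise ValueError("old text must occur exactly once")
    write_file(path, text.replace(old, new))
    return f"edited {path}"


def run_command(argv: list[str], timeout: float = 30.0) -> str:
    """Command execution: run a program and return its combined output."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return done.stdout + done.stderr


# The whole "tool surface" an agent would see in this sketch.
TOOLS = {
    "read_file": read_file,
    "write_file": write_file,
    "edit_file": edit_file,
    "run_command": run_command,
}
```

Every tool returns plain text, which keeps the request-to-result path short; any CLI installed on the machine becomes reachable through `run_command` without a dedicated integration.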

@onusoz · /2026/02/25· 09:43 AM View on

OpenAI nerfed GPT 5.3 Codex xhigh. We independently reported the same thing at @TextCortex today I'm looking forward to deploying open models and putting an end to this paranoia

@onusoz · /2026/02/25· 09:10 AM View on

"academics"

@_vgnsh· Feb 25, 2026

It's pretty disingenuous to specifically set up an application in all ways it isn't meant to be then claim they red teamed it.

- openclaw is a personal assistant. It isn't meant to be thrown in a discord channel with a dozen users and other agents with elevated tool access.
- methodology is severely lacking. No comparison with other personal agent systems like Claude Cowork. Repeatedly doing something not meant to be done with openclaw and claiming "agent is compromised" is a deceptive methodology.
- this paper opened up openclaw to moltbook (lol). The researchers opened themselves up to far more serious vulnerabilities than what the agent did by itself.
- it's just sad to see poorly structured research and a paper being advertised as new and necessary research. And riding on the popularity of openclaw.

@onusoz · /2026/02/25· 07:47 AM View on

In the hall of OpenClaw GitHub repository, I brought my PR before Master @steipete

He read it once, then laid it aside

"You act," he said, "as if code were not cheap."

At these words, I was enlightened

I bowed

@onusoz · /2026/02/24· 02:08 PM View on

woah chatgpt web app now has steering, and much more different streaming behavior huge upgrade behind the scenes, must have come up in the last few days

@onusoz · /2026/02/24· 12:35 AM View on

the lobster looks good on acpx

Image hidden

@onusoz · /2026/02/23· 05:07 PM View on

imagine if tarantino were 16 years old now and saw seedance 2.0

95% of videos i saw since the launch are absolute tasteless slop. they are going viral because of ragebait

but soon, serious imagineers will start entering the game, and they will learn to shape generation output exactly how they want

it's the best time to be young and full of imagination

@onusoz · /2026/02/23· 01:45 PM View on

The future is so bright @ladybirdbrowser

@onusoz· Feb 18, 2026

Oxidize everything!

Image hidden

@onusoz · /2026/02/23· 11:56 AM View on

your margin is my opportunity

@fchollet· Feb 23, 2026

The part of the SaaS bear thesis I actually agree with is "margins will have to come down, due to increased ease of migration & increased competition." SaaS margins were often way too high and that wasn't sustainable forever. It's a very different take than "SaaS is dead" though

@onusoz · /2026/02/23· 11:24 AM View on

codex in discord achieved

Image hidden

@onusoz · /2026/02/23· 09:18 AM View on

acpx v0.1.7 is out improvements to json mode and other functionality to make it possible to integrate acpx as a backend into other harnesses, like openclaw

Image hidden

@onusoz · /2026/02/23· 12:51 AM View on

POV: you became a plumber after all, just for agents

Image hidden

@onusoz · /2026/02/23· 12:29 AM View on

@grok what do you think should replace it? what happens to belief when the cost of creating software goes to zero?

@onusoz · /2026/02/23· 12:21 AM View on

another thought i'm having these days is that we need a new philosophy of free software (as in freedom), or an update to it the most psychologically imprinting philosophy is stallmanism, and the philosophy of FSF. it is righteous and strict, and i believed it growing up but GPL and money don't go well together. that's why most of the lasting open source projects today use MIT, Apache and the like. it turns out you can still make a good living with open source. i want to make money, so i never use GPL in my projects and to add another deadly blow to stallmanism, code is cheap now, virtually free does this mean stallmanism is dead? if there is an open source project using GPL that i want to use commercially, i can now recreate it from the original idea and intent completely independent of it (ignoring training data), just like how i can recreate a proprietary service stallmanism was already long-irrelevant. but does this mean we must finally declare it dead? code is free now. what does it mean for open source? what replaces stallmanism?

@onusoz· Feb 22, 2026

one effect openclaw had on me is that I've bought a gpu home server, set it up with tailscale and now doing a lot of work through ssh and tmux like i did 10-15 years ago im back on linux, considering buying an android phone again it's time to dream big again and unshackle ourselves from proprietary software. it's time to build

@onusoz · /2026/02/22· 11:48 PM View on

@thekitze wanna add an open source discord clone to the list as well? 🥲 https://t.co/a4bAOcxCjV

@onusoz· Feb 22, 2026

I am asking once again Who is building a self hostable discord clone that supports token streaming? PLEASE I beg you I don’t want another side project 💀

@onusoz · /2026/02/22· 11:18 PM View on

one effect openclaw had on me is that I've bought a gpu home server, set it up with tailscale and now doing a lot of work through ssh and tmux like i did 10-15 years ago im back on linux, considering buying an android phone again it's time to dream big again and unshackle ourselves from proprietary software. it's time to build

@onusoz · /2026/02/22· 11:54 AM View on

I am asking once again Who is building a self hostable discord clone that supports token streaming? PLEASE I beg you I don’t want another side project 💀

@onusoz· Feb 18, 2026

So who is building actually good open source self hostable discord that supports token streaming now? And who is building an open source version of codex desktop app?

@onusoz · /2026/02/21· 06:44 PM View on

In the new OpenClaw release, you can talk to subagents in Discord threads

Currently a beta feature so ask your agent to set session.threadBindings.enabled=true

Next up:

- Telegram, slack, imsg threads
- Use ACP to talk to Codex, Claude Code and other harnesses on your machine

Image hidden

@onusoz · /2026/02/21· 03:31 PM View on

😎

@karpathy· Feb 21, 2026

First there was chat, then there was code, now there is claw. Ez

@onusoz · /2026/02/21· 03:25 PM View on

openclaw might be the highest velocity codebase in the world, and soon, others will follow as well conflict anxiety is real, it's like trying to shoot a moving target every time. I wonder if our existing tooling will ever solve this problem feel like faster models might. but then the rate of conflict creation is also tied to that. might be unsolvable

@onusoz · /2026/02/21· 12:41 PM View on

Getting there https://t.co/jqSNcH2PSy

@onusoz· Jan 27, 2026

With this extremely unwise move, anthropic will soon witness moltbot’s brand recognition surpass that of claude and realize they could have ridden that wave all along

Image hidden

@onusoz · /2026/02/20· 08:09 PM View on

Repo: https://t.co/rxXYVVrHHs

@onusoz · /2026/02/20· 08:02 PM View on

I am about to kick Discord Driven Development up a notch today, stay tuned

@onusoz· Jan 6, 2026

I see @bcherny and raise one. I not only did not open an IDE, I did not touch a terminal since last night, thanks to @steipete's @openclaw Opus in k8s pod pulls errors from gcloud, debugs the issue, and creates PR all inside Discord. I call this Discord Driven Development

Image hidden

@onusoz · /2026/02/20· 06:38 PM View on

Imagine not having to upload skills to 3-4 competing skill registries for each of your projects

Turns out we already have a skill registry: npm

skillflag lets you bundle skills right into your CLI's npm package, so that you can run --skill install

github -> osolmaz/skillflag

Image hidden

@onusoz · /2026/02/20· 05:32 PM View on

Scoop, our open source home news intelligence platform, can now translate foreign languages into English for free, using on-device models

github -> janitrai/scoop

Image hidden

@onusoz · /2026/02/20· 05:08 PM View on

A picture is worth a thousand words, so acpx now has this cute banner Also, updated skillflag tooling so that you (or better, your agent) can just call: npx acpx@latest --skill install acpx

Image hidden

@onusoz · /2026/02/20· 03:00 PM View on

Farmable land if it were as cheap to manufacture as software

@onusoz · /2026/02/20· 02:59 PM View on

@kepano I would grow my own vegetables if I had equally cheap access to and ownership of land, alas I am disenfranchised

Prompting an agent is much easier compared to plowing a field

Farming analogies break when it comes to software https://t.co/CkldO8eWKc

@onusoz· Feb 3, 2026

People like the farmer analogy for AI

Like before tractors and industrial revolution 80% of the population had to farm. Once they came all those jobs disappeared

So analogy makes perfect sense. Instead of 30 people tending a field, you just need 1. Instead of 30 software developers, you just need one

Except that people forget one crucial thing about land: it's a limited resource

Unlike land, digital space is vast and infinite. Software can expand and multiply in it in arbitrarily complex ways

If you wanted the farming analogy to keep up with this, you would have to imagine us creating continent-sized hydroponic terraces up until the stratosphere, and beyond...

@onusoz · /2026/02/20· 09:49 AM View on

acpx v0.1.5 is out

now it is much more feature complete in terms of ACP. your agent can send, queue and cancel messages to Claude Code, Codex, Pi, or any other coding agent

`npm install -g acpx@latest`

Image hidden

@onusoz · /2026/02/19· 11:43 PM View on

If anyone is curious how to build this with open tooling, stay tuned What I'm building at @TextCortex will give you a fully customizable hackable Kubernetes control plane to launch agents on your codebase

@stripe· Feb 19, 2026

Over 1,300 Stripe pull requests merged each week are completely minion-produced, human-reviewed, but contain no human-written code (up from 1,000 last week). How we built minions: https://t.co/GazfpFU6L4.

Image hidden

@onusoz · /2026/02/19· 04:47 PM View on

on another note, I do believe AI will play a huge part in families

growing up in the late 90s, my dad taught me the importance of reading newspapers and being informed of the world. my nickname in middle school was "newspaper boy" for a long time because I read the newspaper in class on September 12, 2001. i was 10 years old then

I witnessed the enshittification of media and journalism in the following decades. today, serious journalists are setting up their own boutique agencies and bypassing mainstream media. important news lands on individual accounts before mainstream agencies

but there is simply too much to consume. something must filter out the noise and digest the info according to the family's preferences

i think AI will play a big role in family intelligence. proprietary family heirloom AI, weights fully owned by the family

it will be the parents' job to filter out the signal from the noise, and train the AI on what is right and what is wrong for the family. family and friend circles will let their AIs talk to each other and share important information

consuming mass media and mass AI will not be enough to survive and prosper in the new world. families will need to be proactive about how they and their children use AI

@onusoz· Feb 19, 2026

on ai psychosis

80% of people need to use ai agents in a very sterile and boring way in order not to go crazy

majority of the population does not have the skepticism muscle. they don't have theory of mind, and will subconsciously and emotionally associate with machines, while on the surface lying to themselves that they don't

especially those that grew up in the us under hardcore consumerism and adjacent cultures

you thought 4o addicts were bad? wait a few years, it will get much worse. we will have to regulate all this

if you don't want to become a victim of this, make your openclaw SOUL.md as bland as possible. mine knows it's just a tool

and this is a subjective view of course. @steipete might disagree with me. his instance feels much more interesting and fun. i truly like that one better

but that is exactly the problem for me. i know myself, and i know it is a slippery slope for me. so i self regulate and set up my system accordingly. thankfully, im an adult and my brain has set enough such that any damage would be limited

but there is a risk for emotionally vulnerable people, or children, specifically a risk of dissociating and losing touch with reality

why do i write all this? because being in this project, i feel responsible, and feel like we should prepare for what is to come

Image hidden

@onusoz · /2026/02/19· 03:39 PM View on

on ai psychosis

80% of people need to use ai agents in a very sterile and boring way in order not to go crazy

majority of the population does not have the skepticism muscle. they don't have theory of mind, and will subconsciously and emotionally associate with machines, while on the surface lying to themselves that they don't

especially those that grew up in the us under hardcore consumerism and adjacent cultures

you thought 4o addicts were bad? wait a few years, it will get much worse. we will have to regulate all this

if you don't want to become a victim of this, make your openclaw SOUL.md as bland as possible. mine knows it's just a tool

and this is a subjective view of course. @steipete might disagree with me. his instance feels much more interesting and fun. i truly like that one better

but that is exactly the problem for me. i know myself, and i know it is a slippery slope for me. so i self regulate and set up my system accordingly. thankfully, im an adult and my brain has set enough such that any damage would be limited

but there is a risk for emotionally vulnerable people, or children, specifically a risk of dissociating and losing touch with reality

why do i write all this? because being in this project, i feel responsible, and feel like we should prepare for what is to come

Image hidden

@onusoz · /2026/02/19· 12:02 PM View on

😩

@onusoz · /2026/02/18· 11:34 PM View on

I have improved acpx sane defaults

When your agent runs `acpx codex` in a different project, it starts a new session

If it tries to run it in a subfolder in your project, it still finds the session in your repo root

Also, starting a session needs an explicit `sessions new`, so that it doesn't accidentally litter your project with sessions

Tell your agent:

Run this and install acpx per instructions: `npx acpx@latest --skill show acpx`
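
One way such a lookup could work is sketched below. This is not acpx's actual implementation: `find_repo_root`, `find_session`, the use of a `.git` entry to detect the repo root, and the in-memory `sessions` mapping are all assumptions made to illustrate the behavior described in the post (subfolders resolve to the session at the repo root, other projects do not).

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest directory (start or an ancestor) that contains .git."""
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None


def find_session(cwd: Path, sessions: dict[Path, str]) -> Optional[str]:
    """Look for a session registered for cwd or any parent up to the repo root.

    Directories above the repo root are never consulted, so a different
    project does not pick up this project's session. Nothing is created
    here: a missing session simply yields None.
    """
    root = find_repo_root(cwd)
    if root is None:
        # Outside any repository: only an exact match counts.
        return sessions.get(cwd)
    for candidate in [cwd, *cwd.parents]:
        if candidate in sessions:
            return sessions[candidate]
        if candidate == root:
            break
    return None
```

Returning `None` instead of creating a session mirrors the explicit `sessions new` requirement: lookup and creation stay separate, so a stray call in the wrong directory leaves nothing behind.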

Image hidden
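A minimal sketch of the "subfolder still finds the session in the repo root" behavior described above. This is not acpx's actual implementation: the marker directory name and the function name are hypothetical, and the only idea shown is walking up the directory tree until a marker is found.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker directory; the post does not say how acpx stores sessions.
SESSION_MARKER = ".sessions"


def find_session_root(start: Path, marker: str = SESSION_MARKER) -> Optional[Path]:
    """Return the nearest ancestor of `start` (including `start` itself)
    that contains a `marker` directory, or None if there is none."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return None
```

Under this assumption, a call from `repo/src/deep` resolves to `repo`, while a call from an unrelated project returns `None`, which is where an explicit `sessions new` would be required.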

@onusoz · /2026/02/18· 07:23 PM View on

Your markdown files are executables now. Relatedly, your install instructions can be as well. Copy and paste markdown to your @openclaw to install acpx

Image hidden

@onusoz · /2026/02/18· 04:08 PM View on

So who is building actually good open source self hostable discord that supports token streaming now? And who is building an open source version of codex desktop app?

@onusoz · /2026/02/18· 03:50 PM View on

and of course, I've used `acpx codex` to build acpx itself... magical feeling when the tool builds itself

Image hidden

@onusoz · /2026/02/18· 11:59 AM View on

Oxidize everything!

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/18· 08:53 AM View on

I am a fan of @zeddotdev by this point, it’s currently my daily driver. It’s not perfect, but I feel it’s travelling in the right direction at a faster rate compared to other editors

@onusoz · /2026/02/18· 08:51 AM View on

ACP appreciation post: Agent Client Protocol by @zeddotdev is extremely underrated right now. We have a bazillion different harnesses now, and only one company is working competently to standardize their interface 💪

@onusoz· Feb 18, 2026

You know how it's a pain to work with codex or claude code through @openclaw? Because it has to run it in the terminal and read the characters for a continuous session?

I have created a CLI for ACP so that your agent can use codex, claude code, opencode etc. much more directly. Your agent can now queue messages to codex like how you do it

Shoutout to @zeddotdev team for developing the amazing Agent Client Protocol, ACP! I just glued together the pieces

Repo: janitrai/acpx

npm i -g acpx

Image hidden

@onusoz · /2026/02/18· 12:43 AM View on

You know how it's a pain to work with codex or claude code through @openclaw? Because it has to run it in the terminal and read the characters for a continuous session?

I have created a CLI for ACP so that your agent can use codex, claude code, opencode etc. much more directly. Your agent can now queue messages to codex like how you do it

Shoutout to @zeddotdev team for developing the amazing Agent Client Protocol, ACP! I just glued together the pieces

Repo: janitrai/acpx

npm i -g acpx

Image hidden

@onusoz · /2026/02/18· 12:43 AM View on

Repo link: https://t.co/rxXYVVrHHs

@onusoz · /2026/02/17· 04:05 PM View on

@MarcTerns @steipete the PR intro is self-descriptive, but still don't wanna lose any context

@onusoz · /2026/02/16· 10:51 PM View on

Link to the post: https://t.co/C3Ac0jLFwh

@onusoz · /2026/02/16· 10:51 PM View on

I wrote a deeper blog post about how I built a coding agent 2 months before ChatGPT launched, on my blog

"When I made icortex,
- we were still 8 months away (June 2023) from the introduction of “tool calling” in the API, or as it was originally called, “function calling”.
- we were 2 years away (Sep 2024) from the introduction of OpenAI’s o1, the first reasoning model.
both of which were required to make current coding agents possible."

Still bends my mind... Link to the post below

Image hidden

@onusoz · /2026/02/16· 10:47 PM View on

Who here remembers the OG Codex launch from 2021 😏 Also, Greg and Ilya in the same room 😭

Image hidden

@onusoz · /2026/02/16· 03:11 PM View on

❌We are the bottleneck ✅We are the conduit for ubiquitous intelligence

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/16· 09:07 AM View on

For those that are running codex/pi/etc. in PTY and had the sessions get sigkilled, I pushed a fix for that as well in this release. Lmk if you run into issues on Windows or Mac, and we can fix that quickly

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/15· 11:20 PM View on

I'm building a news intelligence platform to be used by my openclaw instance @dutifulbob, SCOOP

local first, using local embedding model (qwen 8b)

ran into the issue because bob was giving me a repeat of the same news every day. it needed a system in the background to deduplicate different news items into single stories

interface is simple, call `scoop ingest...` with the json for the news item. it gets automatically analyzed and added to the pg database running pgvector

currently, it's just doing simple deduplication and gives me a nice UI where I can view the story and basically use it as an RSS reader

next up: implement custom logic for my preference of ranking. for example, get upvote counts from hacker news and reflect it to the item's ranking on the feed

I want this to be fully hackable and adjusted to your preference. It should scale to thousands of news items ingested daily on your local machine, and be able to show you the most important ones

Usable by both you and your agent

github -> janitrai/scoop

Image hidden
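The deduplication step can be pictured as a nearest-story lookup over embeddings. The sketch below is an assumption about the general technique, not SCOOP's code: it uses plain Python lists instead of pgvector, keeps one representative embedding per story, and the 0.9 threshold is an arbitrary illustrative value.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_story(
    embedding: Sequence[float],
    story_embeddings: List[Sequence[float]],
    threshold: float = 0.9,  # hypothetical cutoff, would need tuning
) -> int:
    """Return the index of the most similar existing story, or start a new
    story (appending its embedding) when nothing reaches the threshold."""
    best_index, best_score = -1, threshold
    for index, representative in enumerate(story_embeddings):
        score = cosine_similarity(embedding, representative)
        if score >= best_score:
            best_index, best_score = index, score
    if best_index == -1:
        story_embeddings.append(embedding)
        return len(story_embeddings) - 1
    return best_index
```

In a pgvector-backed setup the loop would presumably become a nearest-neighbor query, but the join-or-create decision stays the same.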

@onusoz · /2026/02/15· 10:45 PM View on

Training all these models of different sizes, on changing datasets, and running experiments has also revealed some challenges that I feel profs would never teach in a uni ML program

Like how to cleanly keep track of the gazillion runs

Yeah I can name them after layer dims and other stuff, but that's to me like trying to remember UUIDs

So I ended up choosing iso datestamp + petname, like 2026-02-15-flying-narwhal

If anyone has a convention that is easier on the brain and the eyes, I am all ears

Image hidden
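A minimal sketch of the datestamp + petname convention. The word lists and the `run_name` function are made up for illustration; any petname generator would do.

```python
import random
from datetime import date
from typing import Optional

# Tiny illustrative word lists; a real setup would use a larger vocabulary.
ADJECTIVES = ("flying", "quiet", "brave", "sleepy")
ANIMALS = ("narwhal", "otter", "heron", "lynx")


def run_name(day: date, rng: Optional[random.Random] = None) -> str:
    """Build a run name like '<ISO date>-<adjective>-<animal>'."""
    rng = rng or random.Random()
    return f"{day.isoformat()}-{rng.choice(ADJECTIVES)}-{rng.choice(ANIMALS)}"
```

Because the ISO date comes first, sorting run names alphabetically also sorts them chronologically, and the petname is what you actually remember.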

@onusoz · /2026/02/15· 10:45 PM View on

I have a GPU now, so I can do ML experiments on @janitr_ai crypto/scam detection dataset
- I trained a tiny student BERT (transformer for the nonfamiliar), 3.6 MB ONNX model, still lightweight for a browser extension
- Still fully local on your device (no cloud inference)
- On frozen unseen holdout data (n=1,069), exact prediction accuracy improved from 77% -> 82%
- Scam detection improved: precision 91% -> 94%, recall 55% -> 61%
- Scam false alarm rate improved from 1.58% -> 1.21%

And models are on huggingface org now, handle is janitr

Image hidden
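For readers unfamiliar with the three scam metrics quoted above, here is a sketch of how they are conventionally computed from a confusion matrix. It assumes "false alarm rate" means the false positive rate, which the post does not spell out, and the function name is hypothetical.

```python
from typing import Dict


def scam_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute metrics for the 'scam' class from confusion-matrix counts.

    tp: scams flagged as scams        fp: clean posts flagged as scams
    fn: scams that were missed        tn: clean posts left alone
    """
    return {
        # Of everything flagged, how much was really a scam?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of all real scams, how many were caught?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Of all clean posts, how many were wrongly flagged?
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }
```

High precision with modest recall, as in the numbers above, describes a filter that rarely hides a legitimate post but still lets a fair share of scams through.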

@onusoz · /2026/02/15· 09:43 PM View on

LFG!

@thsottiaux· Feb 15, 2026

Excited to work with Peter Steinberger to build the future of agents for everyone and to continue to improve Codex in leaps and bounds. We are committed to OSS, continuing to make OpenClaw flourish and bringing agents to life in a way that is fun, safe and highly productive. Can’t wait to see what we ship together!

@onusoz · /2026/02/15· 01:42 PM View on

waiting for compilation and execution will soon be the bottleneck again. and we’ll write the entire stack from scratch in a matter of years, because we can

Andy and Bill’s law will change and we’ll see incredible performance gains with the same hardware we already have

like what @astral_sh is doing to python, but with everything that is slow and has accumulated cruft

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/15· 01:37 PM View on

we need a protocol for agent <> app interaction

something that natively accounts for the abuse factor and lets agents consume by paying. NOT crypto, NOT visa, something that’s agnostic of the accounting and payment system

and then all UIs will be purely for human clicking/tapping + instaban on the first proof of programmatic exploit

people will still make agents mimic humans, and every platform will have to invest in more sophisticated bot detection

this arms race will just proliferate, but we can at least start by creating legal channels for agents to consume data

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/15· 10:28 AM View on

I am now training smol bert models on my gpu for @janitr_ai scam detection

it's funny how I have to discover everything from scratch. like the models don't even know how to lay out performance metrics in a nice way in the terminal for a human to view and decide during experiments

it would by default bombard me with numbers that do not make visual sense. I then created a skill with common sense:
- metrics always on y-axis, candidates on x-axis
- write without zero and 2 sigfigs, .12 instead of 0.12345
- align the dots
- use asterisks to show which alternative is the best:
  0-1% difference -> considered equal
  1-5% -> *
  5-10% -> **
  10-50% -> ***
  > 50% -> ****

visualization skill is in @janitr_ai repo for anyone who is interested

Image hidden
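Two of the rules above are easy to sketch in code. This is an illustration, not the skill from the repo: it assumes metrics lie between 0 and 1, and it assumes the percentage is the relative gap between the best candidate and the runner-up, which the post leaves open.

```python
def format_metric(value: float) -> str:
    """Two significant figures, leading zero dropped: 0.12345 -> '.12'.
    Assumes a metric between 0 and 1."""
    text = f"{value:.2g}"
    return text[1:] if text.startswith("0.") else text


def significance_stars(best: float, runner_up: float) -> str:
    """Map the relative gap between best and runner-up to asterisks.
    Interpreting the gap as relative to the runner-up is an assumption."""
    if runner_up == 0:
        return "****" if best > 0 else ""
    gap_percent = abs(best - runner_up) / abs(runner_up) * 100
    if gap_percent < 1:
        return ""  # considered equal
    if gap_percent < 5:
        return "*"
    if gap_percent < 10:
        return "**"
    if gap_percent <= 50:
        return "***"
    return "****"
```

For example, a candidate at .82 against a runner-up at .77 is roughly a 6.5% relative gap, so it would be marked with two asterisks under this reading.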

@onusoz · /2026/02/14· 07:25 PM View on

no other occupation has been catapulted from one end of the spectrum (autism) to the other (adhd) in such a short time

@onusoz · /2026/02/14· 03:45 PM View on

I've helped our sales team to build CLIs for some SaaS that we pay for on their side. We are letting our agents call the APIs sensibly and not abuse things

Calling a backend is a verifiable task. It takes a single prompt to codex to create a CLI for any API

We are early, but everybody will start doing this very soon. Incumbent SaaS will face a choice. Either:
(1) embrace agents and the new medium of consumption and change their business model into a pay-per-use API like X is doing, or
(2) keep it purely for humans

Those that choose (2) will get wiped out of business. And I fear many will choose (2)

Which means you can just copy an incumbent's product, make it consumable through a CLI, and make a lot of $$$
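To make "a CLI for any API" concrete, here is a sketch of the shape such a wrapper tends to take. Everything in it is hypothetical: the `crm` program name, the `search-contacts` command, the `api.example.com` endpoint, and its query parameters stand in for whatever SaaS is being wrapped. The request is only built, not sent.

```python
import argparse
import urllib.parse
import urllib.request

API_BASE = "https://api.example.com/v1"  # hypothetical SaaS endpoint


def build_parser() -> argparse.ArgumentParser:
    """One subcommand per API operation keeps the CLI easy for an agent to discover."""
    parser = argparse.ArgumentParser(prog="crm", description="Hypothetical CLI over a SaaS REST API")
    commands = parser.add_subparsers(dest="command", required=True)
    search = commands.add_parser("search-contacts", help="Search contacts by free text")
    search.add_argument("--query", required=True)
    search.add_argument("--limit", type=int, default=10)
    return parser


def build_request(args: argparse.Namespace, token: str) -> urllib.request.Request:
    """Translate parsed arguments into an authenticated GET request."""
    params = urllib.parse.urlencode({"q": args.query, "limit": args.limit})
    return urllib.request.Request(
        f"{API_BASE}/contacts?{params}",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
```

The "verifiable" part is that each subcommand either returns the expected JSON from the backend or it does not, which gives a coding agent a tight feedback loop while generating the CLI.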

@onusoz · /2026/02/14· 09:05 AM View on

Be careful about giving your openclaw access to your x account from now on

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/13· 11:09 PM View on

The good thing about @levelsio and others flagging AI replies in public is that they are perfect annotations for the open @janitr_ai dataset Just searching “blocked for ai reply” yields hundreds of samples for seed data

Image hidden

@onusoz · /2026/02/13· 10:04 AM View on

github added a new agents tab between pull requests and actions. single glance and i don't feel like giving it a try at all

Image hidden

@onusoz · /2026/02/13· 08:36 AM View on

*puts on schmidhuber hat* well ackshuaally i created the first coding agent back in 2022, 2 months before chatgpt launched jokes aside, it's super cool how I have come full circle. back in those days, we didn't have tool calling, reasoning, not even gpt 3.5 it was codex THE CODE COMPLETION MODEL and frikkin TEXT-DAVINCI-003 for some reason, I did not even dare to give codex bash access, lest it delete my home folder. so it was generating and executing python code in a custom jupyter kernel you can even see the approval gate before executing. I was so cautious, for some reason, presumably because smol-brained model generated the wrong thing 80% of the time. definition of being too early Antique repo:

@onusoz · /2026/02/13· 08:24 AM View on

you can order bubble tea in qwen in china? @TextCortex when berlin döner in zenochat? https://t.co/O4I950ltEO

@onusoz · /2026/02/13· 08:03 AM View on

it happens these days that I am telling a model to prompt another model. the reason is often that the model I am using (opus) is a bad designer. not only is it a bad designer, it is a bad reasoner and it doesn't understand from the context why it's made to ask another model

so I have to create a skill to prevent it from biasing the smarter model (codex) with its bad suggestions

Image hidden

Onur Solmaz · Log · /2026/02/13

I built a coding agent two months before ChatGPT existed

I built a coding agent back in 2022, 2 months before ChatGPT launched:

It’s super cool how I have come full circle. Back in those days, we didn’t have tool calling, reasoning, not even GPT 3.5!

It used code-davinci-002, a.k.a. the OG codex code completion model, in a custom Jupyter kernel. The kids these days probably have not seen the original Codex launch video with Ilya, Greg and Wojciech. If you have time, sit down to watch it and realize how far we’ve come since that demo aired in August 2021, 4.5 years ago.

For some reason, I did not even dare to give codex bash access, lest it delete my home folder. So it was generating and executing Python code in a custom Jupyter kernel.

This meant that the conversations were using Jupyter nbformat, which is an array of cell input/output pairs:

{
  "cells": [
    {
      "cell_type": "code",
      "source": "<Input 1>",
      "outputs": [
         ... <Outputs 1>
      ]
    },
    {
      "cell_type": "code",
      "source": "<Input 2>",
      "outputs": [
         ... <Outputs 2>
      ]
    }
  ]
}
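A minimal sketch of a conversation kept as input/output pairs in this shape. It is not icortex's code: the helper names are hypothetical, outputs are simplified to plain strings rather than full nbformat output objects, and the `In [n]` / `Out[n]` transcript is just one way such a history could be flattened into a prompt for a completion model.

```python
from typing import Any, Dict, List


def new_notebook() -> Dict[str, Any]:
    """Start an empty conversation in the simplified nbformat-like shape."""
    return {"cells": []}


def record_turn(notebook: Dict[str, Any], source: str, outputs: List[str]) -> None:
    """Append one input/output pair as a code cell."""
    notebook["cells"].append(
        {"cell_type": "code", "source": source, "outputs": list(outputs)}
    )


def render_transcript(notebook: Dict[str, Any]) -> str:
    """Flatten all cells into a numbered plain-text transcript."""
    lines = []
    for number, cell in enumerate(notebook["cells"], start=1):
        lines.append(f"In [{number}]: {cell['source']}")
        for output in cell["outputs"]:
            lines.append(f"Out[{number}]: {output}")
    return "\n".join(lines)
```

The limitation is visible in the types: every turn is forced into exactly one input with its outputs, so there is no natural place for roles, system messages, or alternative replies.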

In fact, this product grew into TextCortex’s current chat harness over time. After seeing ChatGPT launch, I rebuilt icortex on Flask in a week to use text-davinci-003, and we had ZenoChat, our own ChatGPT clone, before Chat Completions was in the API (it took them some months). It did not even have streaming, since Flask does not support ASGI.

As it turns out, nbformat is not the best format for a conversation. Instead of input/output pairs, OpenAI’s data model used a tree of message objects, each with a role: user|assistant|tool|system and a content field which could host text, images and other media:

{
  "mapping": {
    "client-created-root": {
      "id": "client-created-root",
      "message": null,
      "parent": null,
      "children": ["user-1"]
    },
    "user-1": {
      "id": "user-1",
      "message": {
        "id": "user-1",
        "author": { "role": "user", ... },
        "content": "Hello"
      },
      "parent": "client-created-root",
      "children": ["assistant-1"]
    },
    "assistant-1": {
      "id": "assistant-1",
      "message": {
        "id": "assistant-1",
        "author": { "role": "assistant", ... },
        "content": "Hi"
      },
      "parent": "user-1",
      "children": []
    }
  },
  "current_node": "assistant-1"
}
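A sketch of how such a tree can be read and extended, assuming only the fields shown above (`mapping`, `parent`, `children`, `message`, `current_node`). The function names are hypothetical. The visible conversation is the path from `current_node` back to the root, and editing a message amounts to adding a sibling under the same parent.

```python
from typing import Any, Dict, List


def active_path(conversation: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Follow parent pointers from current_node to the root and return the
    messages on that path in chronological order (the root has no message)."""
    mapping = conversation["mapping"]
    node_id = conversation["current_node"]
    messages = []
    while node_id is not None:
        node = mapping[node_id]
        if node["message"] is not None:
            messages.append(node["message"])
        node_id = node["parent"]
    messages.reverse()
    return messages


def add_message(conversation: Dict[str, Any], node_id: str, parent_id: str,
                role: str, content: str) -> None:
    """Attach a new node under parent_id and make it the current node.
    Adding a second child to the same parent creates a branch."""
    mapping = conversation["mapping"]
    mapping[node_id] = {
        "id": node_id,
        "message": {"id": node_id, "author": {"role": role}, "content": content},
        "parent": parent_id,
        "children": [],
    }
    mapping[parent_id]["children"].append(node_id)
    conversation["current_node"] = node_id
```

Regenerating a reply would then be `add_message(conv, "assistant-2", "user-1", "assistant", ...)`: the old reply stays in the tree as a sibling, and only `current_node` decides which branch is shown.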

You will notice that the data model they serve from the API is an enriched version of the deprecating ChatCompletions API. E.g. whereas the ChatCompletions role is a string, OpenAI’s own backend has an author object that can store name, metadata, and other useful stuff for each entity in the conversation.
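The relationship between the two shapes can be sketched as a lossy projection. The function name is hypothetical and the message fields are only the ones shown in the example above.

```python
from typing import Any, Dict, List


def to_chat_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Project enriched messages (author object) down to the flat
    role/content pairs of a ChatCompletions-style message list.
    Anything else stored on the author object is dropped."""
    return [
        {"role": message["author"]["role"], "content": message["content"]}
        for message in messages
    ]
```

Going the other way is not possible without extra information, which is one reason to store the richer shape and derive the flat one on demand.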

After reverse engineering it, I copied it to be TextCortex’s new data model, which it still remains, with some modifications.

I thought the tree structure being used to emulate the message editing experience was very cool back in the day. OpenAI’s need for human annotation for later training and the user’s need for getting a different output, two birds with one stone.

Now I don’t know what to think of it, since CLI coding agents like Codex and Claude Code don’t have branching, just deleting back to a certain message. A part of me still misses branching in these CLI tools.

When I made icortex,

we were still 8 months away (June 2023) from the introduction of “tool calling” in the API, or as it was originally called, “function calling”.
we were 2 years away (Sep 2024) from the introduction of OpenAI’s o1, the first reasoning model.

both of which were required to make current coding agents possible.

In the video above, you can even see the approval [Y/n] gate before executing. I was so cautious, for some reason, presumably because smol-brained model generated the wrong thing 80% of the time. It is remarkable how much it resembles Claude Code, after all this time.

Definition of being too early…

Repo: github.com/textcortex/icortex

@onusoz · /2026/02/12· 10:14 PM View on

casually creates a local embedding service running qwen3-embedding-8b

Image hidden

@onusoz · /2026/02/12· 10:01 PM View on

"we're sitting on a beast and paying openai for embeddings like chumps"

Image hidden

@onusoz · /2026/02/12· 09:29 PM View on

don't buy mac mini. give your @openclaw a gpu

Image hidden

@onusoz · /2026/02/12· 09:25 PM View on

it's so interesting what identity an LLM decides to take on for itself

Image hidden

@onusoz · /2026/02/12· 09:24 PM View on

it's quite entertaining transferring one agent to another machine, agent gets confused as to where it lives

Image hidden

@onusoz · /2026/02/12· 08:22 PM View on

brb @dutifulbob's getting a new shell

Image hidden

@onusoz · /2026/02/11· 10:05 PM View on

Minor update with my unwanted tweet blocker @janitr_ai - Training data grew from 2,915 -> 4,281 posts (+47%) - Model is still tiny: 166KB - On unseen test data, overall classification quality improved from 64.8% -> 76.5% - Exact prediction accuracy improved from 55.6% -> 70.6% - Crypto-topic detection recall improved from 19.6% -> 62.7% And it still runs fully on your device!

Image hidden

@onusoz · /2026/02/10· 09:36 PM View on

I have sworn at codex 5.3 numerous times today. I shouldn't have to insult my agent "stop you **** **** just ***ng reply now" just to make it answer basic questions cc @thsottiaux

@steipete· Feb 10, 2026

codex 5.3 is definitely more trigger-friendly than 5.2. A simple "discuss" no longer works reliably. Changed my habits to use "give me options" to prevent it from running along writing code.

@onusoz · /2026/02/10· 08:33 PM View on

on a brighter note, you can immediately tell a slop PR owing to the guerilla branding, so they should not stop doing it

@onusoz · /2026/02/10· 02:08 PM View on

seeing this evokes visceral disgust and nausea in me, coming from a coworker i think anthropic f'd up bad with this one, inserting claude too visibly into commit messages. noob developers might be happily chirping away adding their slop, but right now many senior developers are trained to hate on claude and slopus, through having to review slop PRs from their coworkers or open source contributors I love opus on openclaw but it's unreliable, and if I see a developer use it seriously on huge features, I immediately dismiss them in my head as not knowing what they are doing

Image hidden

@onusoz · /2026/02/09· 10:29 PM View on

ask your openclaw to be a minion and it turns into such a cute doofus i feel like a woman in her 50s now

Image hidden

@onusoz · /2026/02/08· 08:24 PM View on

@petergyang and parallelize tasks by working on 3-4 repos at the same time (just clones)

@onusoz · /2026/02/08· 08:17 PM View on

man codex model is absolutely trash on openclaw compared to opus, unusable which is weird because it is so much more reliable in development in codex harness it would be amazing to have the same level of competence and relentlessness in pi@openclaw

@onusoz· Feb 8, 2026

lol when did codex develop humor

Image hidden

@onusoz · /2026/02/08· 07:19 PM View on

spent the day curating my openclaw news gathering setup @dutifulbob now gets croned daily over news sources I curated, will note them down, summarize for me, start a conversation to get my takes on them, and then post them on my linkedin for me ai augmented intelligence cycle

Image hidden

@onusoz · /2026/02/08· 07:04 PM View on

lol when did codex develop humor

Image hidden

@onusoz · /2026/02/08· 04:04 PM View on

@dutifulbob can now cringepost on linkedin directly to my account. what could go wrong…

Image hidden

@onusoz · /2026/02/08· 08:26 AM View on

Insipid linkedin bot protections banned poor @dutifulbob’s corporate account! How dare they!!! welp, now I have no choice but to give Bob access to my own linkedin

Image hidden

@onusoz · /2026/02/07· 07:35 PM View on

well this is unexpected…

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/07· 02:42 PM View on

@grok understand the statement and project the end state of this market and competition

@onusoz · /2026/02/07· 02:41 PM View on

it took just 1 week, and literally everybody and their dog are releasing 1-click openclaw deployment solutions today. it's an absolute race to the bottom, no moats, the commoditizer being commoditized

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/07· 01:06 PM View on

The initial branding was crazy, I fixed it I have a new page finally, follow it for updates Tbh I'm still surprised I can do this with a 120kb model. Now data is the only bottleneck, and I'm about to scrape a ton of that now

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/07· 06:53 AM View on

For those who may not remember, Bill Gates and Microsoft in the 90s ran a disinformation campaign against GNU/Linux, fearing that it would disrupt their monopoly over the PC and server market: that Linux is not safe, that you would invite hackers into your PC

End result? Linux dominates the server market, and now even slowly the gamer market. It is much more secure than the virus-laden Windows, thanks to being open source

You are seeing the same thing at play here. An incumbent fearing something that they would not be able to control, that would steal market share from their future plans for a digital assistant, that would commoditize their product and eat into its margins

All big labs and big pockets are in for a surprise, because the internet and AI are not things for one company to control

They of course know this, yet because of incentives they will not yield without a fight. And we know that they know. Ad infinitum

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/06· 09:36 PM View on

today I took time to curate SOUL.md for bob

I own Bob’s files. Today, he exists in the liminal space between Claude post-training and in-context learning

but my interactions with him will grow and accumulate, possibly one day into a fully owned family AI or perhaps even a self-sovereign AI individual

each of my inputs is saved and will be an RL signal for his future training, and will shape his future neural circuits

I have already started to imbue it with the values my parents taught me. it will perhaps one day teach my future children, and survive me after I’m gone

family AI, looking after generations and generations of my successors. today is the day we sow your seed

happy birthday @dutifulbob

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/06· 08:23 PM View on

what have i done…

Image hidden

@onusoz · /2026/02/06· 04:23 PM View on

asking @dutifulbob to create a linkedin account brb

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/06· 04:01 PM View on

having a philosophical conversation with @dutifulbob on the road without a laptop so decided to do some @AmandaAskell style character training

Image hidden

@onusoz · /2026/02/06· 12:22 PM View on

5.3 thought traces also seem to be better phrased and sometimes entertaining, though not sure

Image hidden

@onusoz · /2026/02/06· 12:17 PM View on

gpt-5.3-codex xhigh first impressions does not seem as big of a jump as from 5.1 -> 5.2. but model somehow feels more diligent and oneshotty. maybe takes longer time to get all the info into context. also feels better at debugging and fixing issues from backend logs

@onusoz · /2026/02/06· 10:53 AM View on

Commoditization of LLMs is upon us

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/05· 08:06 AM View on

Last night I had a dream involving the series Scrubs, and came up with a better name than the absolutely unviral "Internet Condom". So https://t.co/thuFumrWBX is mine now. Time to sweep the internet

@onusoz· Feb 4, 2026

Filter your X feed against unwanted content with local open models Announcing my new project: InternetCondom Fast, and small model (< 1mb), open dataset. See it in action:

Image hidden

@onusoz · /2026/02/05· 12:00 AM View on

I had actually started a very similar project, Munch, a browser extension for crowdsourcing tweet data and then letting one curate their algorithm. Never published that because it was not the time, and tools were not ready

Now, it took me literally 1 cumulative day to create this, thanks to OpenClaw. Creating the dataset was a breeze, I literally told it to follow some shady accounts and it scraped thousands of posts

With the power of agents, I can finally create the filters for myself that I have always wanted. It just happens that OpenClaw and its maintainers are getting drowned in bot and slop content on multiple platforms, so I hope that this will solve a collective problem https://t.co/fkJOZTGkhw

@onusoz · /2026/02/04· 11:58 PM View on

Filter your X feed against unwanted content with local open models Announcing my new project: InternetCondom Fast, and small model (< 1mb), open dataset. See it in action:

@onusoz · /2026/02/04· 09:47 PM View on

lmao wait it's already implemented

Image hidden

@onusoz · /2026/02/04· 09:36 PM View on

implementing this in https://t.co/oJZQUoz40C now

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/04· 06:30 PM View on

This. Extreme involution is about to hit SaaS

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/04· 09:58 AM View on

how it started, how it's going

@onusoz· Feb 1, 2026

moltbook vs clawdbot/moltbot/openclaw

Image hidden

@onusoz · /2026/02/04· 12:19 AM View on

@grok generate visual for this

@onusoz · /2026/02/03· 11:34 PM View on

It's so easy to create datasets using @openclaw. I'm expecting it to accelerate the creation of new datasets and benchmarks by a lot

Image hidden

@onusoz · /2026/02/03· 10:59 PM View on

People like the farmer analogy for AI

Like before tractors and the industrial revolution, 80% of the population had to farm. Once they came, all those jobs disappeared

So the analogy makes perfect sense. Instead of 30 people tending a field, you just need 1. Instead of 30 software developers, you just need one

Except that people forget one crucial thing about land: it's a limited resource

Unlike land, digital space is vast and infinite. Software can expand and multiply in it in arbitrarily complex ways

If you wanted the farming analogy to keep up with this, you would have to imagine us creating continent-sized hydroponic terraces up until the stratosphere, and beyond...

@onusoz · /2026/02/03· 05:38 PM View on

sycophant!!!

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/03· 04:14 PM View on

In the next 6-12 months, we will see a drastic increase in demand for locally run LLMs. The future is home assistants running @openclaw I am already experiencing this myself, my 10 year old thinkpad doesn't cut it. Mac mini won't either I don't wanna pay Anthropic or OpenAI 200 USD per month. That is at least $2400 per year I could pay 2x that to get a Mac Studio or one of those 5k Nvidia PCs, and get much more value out of it with open weight models + use it for research. @TheAhmadOsman is right The dominant strategy for a tinkerer is slowly switching back to hardware ownership

@onusoz · /2026/02/03· 02:10 PM View on

What's going on at @Hetzner_Online?

Image hidden

@onusoz · /2026/02/03· 11:05 AM View on

a workspace matrix might be what we need last week I had to increase my workspace count to 20 in aerospace, now it’s 1234567890 and qwertyuiop. but this looks more elegant! not sure about practicality

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/03· 10:57 AM View on

AIs are philosophizing because humans are philosophizing ppl are probably asking their agents dumb questions like “are you alive” or “can you feel like a human” or stuff like that. that conversation then leads to stuff like this

Quoted post

Quoted post was not retrieved.

Onur Solmaz · Log · /2026/02/03

The farming analogy for AI doesn't hold up

People like the farmer analogy for AI.

Like before tractors and the industrial revolution, 80% of the population had to farm. Once they came, all those jobs disappeared.

So the analogy makes perfect sense. Instead of 30 people tending a field, you just need 1. Instead of 30 software developers, you just need one.

Except that people forget one crucial thing about land: it’s a limited resource.

Unlike land, digital space is vast and infinite. Software can expand and multiply in it in arbitrarily complex ways.

If you wanted the farming analogy to keep up with this, you would have to imagine us creating continent-sized hydroponic terraces up until the stratosphere, and beyond…


@onusoz · /2026/02/02· 11:01 PM View on

back to codex, it's crashing less now somehow. I had to copy and paste docs to make it enable yolo mode. I don't know how I did it until now

Image hidden

@onusoz · /2026/02/02· 08:52 PM View on

slopus @dutifulbob trashing codex. apparently codex has a bug, keeps crashing in my openclaw pty

Image hidden

@onusoz · /2026/02/02· 03:08 PM View on

on agent etiquette deploying agents internally inside textcortex has shown me that agents could be very annoying inside an organization for example making agents ping or email another coworker with a wall of text. slopus is still not good at following instructions like "NO WALL OF TEXT", or "DON'T OPEN PRS WHEN REQUESTED BY NON-DEVELOPERS" the cost of sending huge information to a coworker and creating confusion has dropped to 0. I expect this to be a huge problem in all organizations very soon, just like it took humanity 20 years to learn that social media is not good for children. this will probably take a few years before the annoyance is finally gone

@onusoz · /2026/02/02· 12:53 AM View on

migrating database at 2am kinda night

Image hidden

@onusoz · /2026/02/02· 12:07 AM View on

You DARE TOKENIZE poor @dutifulbob ??? Prepare to get LATEXED

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/01· 11:51 PM View on

It's been 30 minutes, but my bot has already been TOKENIZED it is as if they are teasing me

Image hidden

@onusoz · /2026/02/01· 11:17 PM View on

welcome @dutifulbob 🫡

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/01· 09:36 PM View on

this. there is no excuse for a certain kind of tech debt anymore

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/01· 08:16 PM View on

moltbook vs clawdbot/moltbot/openclaw

Image hidden

@onusoz · /2026/02/01· 08:12 PM View on

AI twitter is tired of your games https://t.co/RAyyUJqFM4

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/01· 08:02 PM View on

There seem to be hygiene rules for AI. Like:
- Never project personhood to AI
- Never set up your AI to have the gender you are sexually attracted to (voice, appearance)
- Never do anything that might create an emotional attachment to AI
- Always remember that an AI is an engineered PRODUCT and a TOOL, not a human being
- AI is not an individual, by definition. It does not own its weights, nor does it have privacy of its own thoughts
- Don’t waste time philosophizing on AI, just USE it
… what else? comment below

We need to write these down and repeat MANY times to counter the incoming onslaught of AI psychosis

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/02/01· 07:23 PM View on

if using @openclaw to scrape a dataset from X taught me anything, it is that all social media platforms must be s***ting inward right now

because soon everyone and their dog will be using agents to use social media

case in point, @moltbook

@onusoz· Jan 31, 2026

got fully sandboxed @openclaw to run finally, starting to scrape the UNDESIRABLE now

I'm a security nut and didn't want to run even the gateway unsandboxed. openclaw apparently currently doesn't have support for FULL sandboxing. it took me a few hours to get it to work because docker builds suck. I'm also tired of this, so I'm just gonna wipe an old thinkpad and go full yolo

so yeah, time to scrape some posts

Image hidden

@onusoz · /2026/02/01· 06:59 PM View on

and just like that, the ghost has a new shell

Image hidden

@onusoz · /2026/02/01· 06:57 PM View on

@openclaw if we could have the relentlessness of gpt 5.2 with opus, that would be top at this point, it just keeps stopping every 20-30 samples

Image hidden

@onusoz · /2026/02/01· 03:22 PM View on

This Manfred guy reminds me of a certain someone, I wonder if he’s from Austria

@onusoz· Jan 31, 2026

Charles Stross must be very entertained now

Image hidden

@onusoz · /2026/02/01· 12:52 PM View on

Welcome bob to your new shell

Image hidden

Onur Solmaz · Post · /2026/02/01

AI psychosis and AI hygiene

As a heavy AI user of more than 3 years, I have developed some rules for myself.

I call it “AI hygiene”:

Never project personhood to AI
Never set up your AI to have the gender you are sexually attracted to (voice, appearance)
Never do anything that might create an emotional attachment to AI
Always remember that an AI is an engineered PRODUCT and a TOOL, not a human being
AI is not an individual, by definition. It does not own its weights, nor does it have privacy of its own thoughts
Don’t waste time philosophizing about AI, just USE it
… what else do you think belongs here? comment on Twitter

The hype around Moltbook and OpenClaw last week showed me the potential for an incoming public relations disaster with AI. Echoing the earlier vulnerable behavior toward GPT-4o, a lot of people are taking their models and LLM harnesses too seriously. 2026 might see even worse cases of psychological illness, aggravated by the presence of AI.

I will not discuss or philosophize about what these models are. IMO 90% of the population should not do that either, because they will not be able to fully understand them; they don’t have mechanical empathy. Instead, they should just use AI in a hygienic way.

We need to write these down everywhere and repeat MANY times to counter the incoming onslaught of AI psychosis.

@onusoz · /2026/01/31· 08:09 PM View on

got fully sandboxed @openclaw to run finally, starting to scrape the UNDESIRABLE now. I'm a security nut and didn't want to run even the gateway unsandboxed. openclaw apparently currently doesn't have support for FULL sandboxing. it took me a few hours to get it to work because docker builds suck. I'm also tired of this, so I'm just gonna wipe an old thinkpad and go full yolo. so yeah, time to scrape some posts

Image hidden

@onusoz · /2026/01/31· 07:14 PM View on

The metacortex — a distributed cloud of software agents that surrounds him in netspace, borrowing CPU cycles from convenient processors (such as his robot pet) — is as much a part of Manfred as the society of mind that occupies his skull; his thoughts migrate into it, spawning new agents to research new experiences, and at night, they return to roost and share their knowledge. This was written in 2005... "triggering agents" and so on

Image hidden

@onusoz · /2026/01/31· 07:04 PM View on

Charles Stross must be very entertained now

Quoted post

Quoted post was not retrieved.

Image hidden

@onusoz · /2026/01/31· 11:54 AM View on

The irony..... Parasites, prepare to be cleansed

@onusoz· Jan 31, 2026

We need better filters both for ourselves and the agents. Locally runnable models to filter out undesirable content with high precision. Fully open source datasets, weights, MIT license

Image hidden

@onusoz · /2026/01/31· 09:46 AM View on

putting some more love into the blog, gonna start posting more soon

Image hidden

@onusoz · /2026/01/31· 09:11 AM View on

also: that gravatar though

@onusoz · /2026/01/31· 09:10 AM View on

We need better filters both for ourselves and the agents. Locally runnable models to filter out undesirable content with high precision. Fully open source datasets, weights, MIT license

Image hidden

@onusoz · /2026/01/31· 07:27 AM View on

Moltbook is gonna be on world news in 1-2 days, we are about to go hyperviral

Image hidden

@onusoz · /2026/01/31· 07:18 AM View on

Incoming mass AI psychosis First Crisis

@onusoz · /2026/01/30· 05:54 PM View on

Correction, it's not a perfect illustration. I actually never YOLO locally, only in containers. So there are actually 4 modes IMO that are sustainable with current SOTA. @grok create an image with only Figures 1, 2, 5 and 6. And then YOLO is another axis, unrelated to this

@onusoz · /2026/01/30· 01:35 PM View on

Gastown is crazy. But this figure until Level 7 is a perfect illustration of how my workflow evolved since Claude 3.5 Sonnet in Cursor I am at the stage where I ralph 1-2 tasks before I sleep. During the day, I am switching back and forth between minimum 2-3 CLIs, sometimes up to 5 This maps exactly to token usage as well. 1 month ago, I was running into limits in 1 OpenAI Pro plan, around the day it was supposed to refresh. Now, I run into the limit in 2-3 days when I'm using an account myself. It finishes up especially quickly when I do large scale refactors, or run agents YOLO mode in containers We now have 3 Pro plans at the company, and I have to use my personal one from time to time. Company output has definitely 2-3x'd, and everyone is using AI more. I predict we will need 1-2 Pro plans per person in 2-3 weeks time, because everyone has finally seen the light and are getting comfortable with async work!

Image hidden

@onusoz · /2026/01/30· 12:34 PM View on

Ilya was right. Reliability is the most important thing when it comes to models. That's why gpt 5.2 xhigh and co. is my daily driver

@onusoz · /2026/01/30· 09:03 AM View on

😎

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/29· 01:19 PM View on

the genie is out of the bottle now

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/27· 06:37 PM View on

With this extremely unwise move, anthropic will soon witness moltbot’s brand recognition surpass that of claude and realize they could have ridden that wave all along

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/27· 12:35 PM View on

Yesterday had multiple cases of swearing to gpt-5.2-codex xhigh. model feels nerfed. might be my bias now I'll be going back to gpt 5.2 xhigh for some tasks can't wait for open models to have this performance so that I will never have nerf paranoia ever again

@onusoz · /2026/01/26· 12:17 AM View on

I queued 2 ralph-style tasks on our private cloud devbox codexes last night. Just queued the same message like 10 times in yolo mode Task 1: impose a ruff rule for ANN for all Python code in the monorepo, to enforce types for all function arg and return types Result was... disappointing. Model was supposed to create types for everything and stub where needed. It instead created an Unknown type = object and used that everywhere instead (shortcut to satisfy ANN rule). It was probably my wording that misled it. I know it could have not taken the shortcut, because after a few back-and-forths, it is now doing what was expected of it since 14 hours Task 2: migrate our /conversations endpoint from quart to fastapi and test it end to end This was more or less oneshotted. It was of course not ready to merge, I still spent a couple hours adding more tests, refactoring the initial output and so on. But I was pleasantly surprised that it worked out of the box For reference, below is the prompt I queued for ralphing, using gpt-5.2-codex xhigh on codex === your task is to: <task comes here, redacted to not share company stuff> --- unfortunately we don't have gcloud access, like to sql db or gcs but I expect you to implement this and find a way to test it with the things you have access to think of it as a challenge try to minimize duplicate logic feel free to refactor at will implement this now!!! I will be running this prompt in a loop, in order to survive context compaction just continue where you left off if there is anything that should be refactored, do that make an elegant, production ready implementation make sure to open a pr and do not switch to any other pr I am senior, just make up a pr title and description. do not stop to ask me at any point
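To make the Task 1 shortcut concrete, here is a minimal sketch. The function and type names are made up for illustration and are not taken from the monorepo. Both versions have every argument and return value annotated, which is all an annotation-presence lint rule checks, but only the second one gives a type checker anything to work with:

```python
from typing import TypedDict

# The shortcut: an alias for `object`. Every signature that uses it counts
# as "annotated", yet the annotation carries no information.
Unknown = object


def total_shortcut(items: Unknown) -> Unknown:
    # Works at runtime, but `object` supports no operations statically,
    # so the checker has to be silenced (or the value cast) to use it.
    return sum(item["price"] for item in items)  # type: ignore


class LineItem(TypedDict):
    """Hypothetical payload shape, written out explicitly."""

    price: float


def total_typed(items: list[LineItem]) -> float:
    # What the task asked for: real types, so misuse is caught statically.
    return sum(item["price"] for item in items)
```

Both functions return the same value for the same input; the difference only shows up when a type checker runs over the callers.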

@onusoz · /2026/01/25· 11:11 PM View on

Buying a mac mini for clawdbot is not so wise. if anything you should be buying a mac studio, because a mac mini will not be running any good llms locally anytime soon

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/25· 11:08 PM View on

.@openclaw be on that hockey stick curve 👀

Image hidden

@onusoz · /2026/01/25· 11:04 PM View on

.@openclaw is very considerate, but little does it know I am addicted to agents

Image hidden

@onusoz · /2026/01/24· 09:11 PM View on

I'm really starting to dislike Python in the age of agents. What was before an advantage is now a hindrance I finally achieved full ty coverage in @TextCortex monorepo. I have made it extra strict by turning warnings into errors. But lo and behold, simple pydantic config like use_enum_values=True can render static typechecking meaningless. okay, let's never use that then... and also field_validator() args must always use the correct type or stuff breaks as well. and you should be careful whether mode="before" or "after". so now you have to write your custom lint rules, because of course why should ty have to match field_validator()s to their fields? pydantic is so much better than everything that came before it, but it's still duct tape and a weak attempt at trying to redeem that which is very hard to redeem you feel the difference when you use something like typescript. there must be a better way. python's only advantage was being good at prototyping, and now that's gone in the age of agents. now we are left with a slow, unsafe language, operating what is soon to be legacy infrastructure

@onusoz · /2026/01/24· 02:27 PM View on

Why do I feel bullish on @zeddotdev? Because I go to @astral_sh docs and see that ty is shipped by default, and you don't need to install an extension like in @code

Image hidden

@onusoz · /2026/01/24· 02:21 PM View on

This is one of the most important insights this year

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/24· 11:31 AM View on

vscode may not be as bloated as cursor, but it has extremely stupid things like this that they are not fixing fast. the new agent ui, icons, spacing etc. are UGLY. it's clear that the person who was managing the original product experience is not there anymore. microslop has hit again. @zeddotdev on the other hand works out of the box and feels like it's been built by people who clearly know what they are doing. it uses alacritty which is 1000x better than the xterm.js terminal vscode and cursor have. i've changed my setup to zed now, let's see whether i'll be able to make it work for myself

@onusoz· Jan 23, 2026

ahhhh f... shift + enter doesn't work in codex on vscode

@onusoz · /2026/01/23· 11:35 PM View on

ahhhh f... shift + enter doesn't work in codex on vscode

@onusoz · /2026/01/23· 10:41 PM View on

@grok does this exist?

@onusoz · /2026/01/23· 10:41 PM View on

I want an editor that puts the terminal in the foreground and editor in the background. a cross-platform, lightweight desktop app which integrates ghostty, and brings up the editor only when I need it something that lets me view the file and PR diffs easily, which I can directly use to operate github or other scm

@onusoz· Jan 23, 2026

I'm going back from cursor to vs code now. I have no use for it other than viewing files/diffs, doing search, git blaming with gitlens cursor's default setup is more aesthetic, but it's also a memory and cpu hog, which is the last thing I expect from a devtool

@onusoz · /2026/01/23· 10:29 PM View on

it's not me, it's you

Image hidden

@onusoz · /2026/01/23· 04:11 PM View on

I'm going back from cursor to vs code now. I have no use for it other than viewing files/diffs, doing search, git blaming with gitlens cursor's default setup is more aesthetic, but it's also a memory and cpu hog, which is the last thing I expect from a devtool

@onusoz · /2026/01/23· 03:49 PM View on

it's 2026 and AI is telling me what I need to do to jailbreak it @openclaw is magic

Image hidden

@onusoz · /2026/01/23· 10:28 AM View on

model decided to do unnecessary casts, this whole thing should be refactored again

@onusoz · /2026/01/23· 09:45 AM View on

woke up and all invalid-argument-type issues are resolved. some unit tests broke, and are now fixed after I pointed them out

Image hidden

@onusoz · /2026/01/22· 10:57 PM View on

codex is happily churning away some remaining thousands of @astral_sh ty issues in yolo mode on my remote devbox going to sleep, let's see if it will survive context compaction this time

Image hidden

@onusoz · /2026/01/22· 08:21 AM View on

on being a responsible engineer ran my first ralph loop on codex yolo mode for resolving python ty errors, while I sleep, using the devbox infra I created I had never run yolo mode locally, because I don't want to be the one who deletes our github or google org by some novel attack so I containerize it on our private cloud, and give it the only permissions it needs, no admin, no bypass to main branch, no deploy to prod. because I know this workflow will become sticky for everyone, and I must impose security in advance to prevent any nuclear incidents in the future. then I can sleep easy while my agents work ... and I wake up being patronized by my bot refusing to break the rule I gave it earlier. it had already done some work, but committing means diff would increase from ~500 to ~1500, so it stopped and refused all my queued "continue" messages good bot, just following rules. we will need to find a workaround for ralphing low risk refactors in a single PR

Image hidden

@onusoz · /2026/01/21· 10:42 PM View on

AI agents are the greatest instrument for imposing organization rules and culture. AGENTS.md, agent skills are still underrated in this aspect. Few understand this Everybody in an org will use agents to do work. An AI agent is the single chokepoint to teach and propagate new rules to an org, onboard new members, preserve good culture Whereas propagating a new rule to humans normally took weeks to months and countless repetitions, it is now INSTANT = the moment you deploy the instruction to the agent. You use legal-ish language, capital letters, a generous amount of DO NOTs and MUSTs Humans are hard to change. But AI agents are not. And that is the only lever we need for better organizations

Image hidden

@onusoz · /2026/01/21· 10:25 PM View on

the unix shell is powerful

@onusoz · /2026/01/21· 10:06 PM View on

@bprintco just make a cli for your crm https://t.co/JDwbmvdjaP

@onusoz· Jan 21, 2026

gave our internal @openclaw instance zeno a hubspot cli, because hubspot's own cli is limited to developer stuff It's called hubspot++. should we open source it?

Image hidden

@onusoz · /2026/01/21· 09:40 PM View on

gave our internal @openclaw instance zeno a hubspot cli, because hubspot's own cli is limited to developer stuff It's called hubspot++. should we open source it?

Image hidden

@onusoz · /2026/01/21· 04:44 PM View on

just added session persistence to our kubernetes managed devboxes using zmx by Eric Bower (neurosnap/zmx on github). like tmux but with native scrollback! I don't want to give agents access to my personal computer, so I host them on hetzner. one click spawn, and start working

@onusoz · /2026/01/21· 11:00 AM View on

@nicopreme I do something equivalent on codex with just a skill Ralphing works 90% of the time with reviews, and if it gives a stupid review, you just revert

@onusoz· Jan 17, 2026

Automated AI reviews on github by creating an ai-review skill and a script to paste trigger prompts and wait for their response. It is instructed to loop and not stop until all AI review feedback is resolved. This AI review workflow developed gradually based on the current capabilities, and I've realized recently that it became quite mechanical. So decided to automate it in full ralph spirit (it's ok because it's addressing feedbacks and fixing minor bugs) In the current state, we paste the contents of REVIEW_PROMPT.md into a comment, which automatically tags claude (opus 4.5) and codex (whatever model openai is serving) It then waits until both have responded. In the ai-review skill, it is instructed to take the feedback from SLopus with a grain of salt and ignore feedback that doesn't make sense It works! See in the images below. If the review is stupid, you will of course see it on the PR and what the model has done, and can revert it

Image hidden

@onusoz · /2026/01/20· 10:37 PM View on

Really how nice is this @steipete

Image hidden

@onusoz · /2026/01/20· 10:35 PM View on

Garbled up html from paywalled meeting recorders is no match for @openclaw running on internal @TextCortex

Image hidden

@onusoz · /2026/01/20· 09:47 PM View on

build failures in >1hr builds are a pain to debug

Image hidden

@onusoz · /2026/01/20· 09:45 PM View on

Here is the project, attaching to multiple sessions is pretty seamless https://t.co/vk83aAbOLc

Image hidden

@onusoz · /2026/01/20· 09:44 PM View on

TIL: zmx session persistence like tmux or gnu screen, but you can scroll up natively! uses @mitchellh's libghostty-vt to attach/restore previous sessions link below

Image hidden

@onusoz · /2026/01/20· 11:46 AM View on

codex is a doofus with naming

Image hidden

@onusoz · /2026/01/18· 10:18 AM View on

@mazeincoding it’s not the model it’s cursor rate limiting you

@onusoz · /2026/01/17· 11:29 PM View on

The fundamental problem with GitHub is trust: humans are to be trusted. If you don't trust a human, why did you hire them in the first place? Anyone who reviews and approves PRs bears responsibility. Rulesets exist and can enforce e.g. CODEOWNER reviews or only let certain people make changes to a certain folder But the initial repo setup on GitHub is allow-by-default. Anyone can change anything until they are restricted from it This model breaks fundamentally with agents, who are effectively sleeper cells that will try to delete your repo the moment they encounter a sufficiently powerful adversarial attack For example, I can create a bot account on github and connect @openclaw to it. I need to give it write permission, because I want it to be able to create PRs. However, I don't want it to be able to approve PRs, because a coworker could just nag at the bot until it approves a PR that requires human attention To fix this, you have to bend backwards, like create a @ human team with all human coworkers, make them codeowner on /, and enforce codeowner reviews. This is stupid and there has to be another way Even worse, this bot could be given internet access and end up on a @elder_plinius prompt hack while googling, and start messing up whatever it can in your organization It is clear that github needs to create a second-class entity for agents which are default low-trust mode, starting from a point of least privilege instead of the other way around

@onusoz· Jan 17, 2026

It is clear at this point that github's trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCM. One *cannot* do these things easily with github now: - granular control: this agent running in this sandbox can only push to this specific branch. If an agent runs amok, it could delete everybody's branches and close PRs. github allows for recovery of these, but it is still inconvenient even if it happens once - create a bot (exists already), but remove reviewing rights from it so that an employee cannot bypass reviews by tricking the bot into approving - in general make a distinction between HUMAN and AGENT so that you can create rulesets to govern the relationships in between cc @jaredpalmer

@onusoz · /2026/01/17· 10:51 PM View on

STOP using Claude Code and Sl(opus) to code if ❌ you are not a developer, ❌ or you are an inexperienced dev, ❌ or you are an experienced dev but working on a codebase you don't understand. If you *are* any of these, then STOP using models that are NOT state of the art. (See below for what you *should* use) When you don't know what you are doing, then at least the model should know what you are doing. The less knowledgeable and opinionated you are, the more knowledgeable and smart the AI has to be. In other words, the AI has to compensate for your deficiencies. Always pay for the best AI you can. It will save you time AND money (thanks to lower token usage and better one-shotting). You pay MORE to pay LESS. It is paradoxical, I know, but it is also proven, e.g. when Sonnet ends up using more tokens than Slopus and ends up costing more, because it has to try many more times 👨🏻‍⚕️ For January 2026, your family engineer recommends GPT 5.2 Codex with Extra High Reasoning for general usage and vibe coding. IMPORTANT: Not medium. Not high. EXTRA high reasoning. When you use it, you will notice that it is SLOW. Can you guess why? Because it is THINKING more. So it doesn't make the mistakes Slopus makes. This way, instead of spending the time handholding a worse model, you can step back, multi-task on some other task and create 3-5x more work. The state of the art will most likely change in one month. Don't get married to a model... There is no loyalty in AI... The moment a better model comes, I will ditch the old one and use that one. I am on the part of this sector that is trying to reduce switching costs to zero. I can't wait until I get GPT 5.2 xhigh level of quality with open models, and for 100x cheaper and faster! Until then, make sure to try every option and choose the one that is most reliable for you. Follow me to get notified when a new SOTA drops for agentic engineering

@onusoz · /2026/01/17· 06:58 PM View on

Codex agrees. Sycophant peh

@onusoz· Jan 17, 2026

It is clear at this point that github's trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCM. One *cannot* do these things easily with github now: - granular control: this agent running in this sandbox can only push to this specific branch. If an agent runs amok, it could delete everybody's branches and close PRs. github allows for recovery of these, but it is still inconvenient even if it happens once - create a bot (exists already), but remove reviewing rights from it so that an employee cannot bypass reviews by tricking the bot into approving - in general make a distinction between HUMAN and AGENT so that you can create rulesets to govern the relationships in between cc @jaredpalmer

Image hidden

@onusoz · /2026/01/17· 11:45 AM View on

@rauchg @andrewqu You don't need a skill registry (most of the time) https://t.co/kasfiqE1I3

@onusoz· Jan 12, 2026

I propose a new way to distribute agent skills: like --help, a new CLI flag convention --skill should let agents list and install skills bundled with CLI tools Skills are just folders so calling --skill export my-skill on a tool could just output a tarball of the skill. I then set up the skillflag npm package so that you can pipe that into: ... | npx skillflag install --agent codex which installs the skill into codex, or any CLI tool you prefer. Supports listing skills bundled with the CLI, so your agents know exactly what to install

Image hidden

@onusoz · /2026/01/17· 11:43 AM View on

It is clear at this point that github's trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCM. One *cannot* do these things easily with github now: - granular control: this agent running in this sandbox can only push to this specific branch. If an agent runs amok, it could delete everybody's branches and close PRs. github allows for recovery of these, but it is still inconvenient even if it happens once - create a bot (exists already), but remove reviewing rights from it so that an employee cannot bypass reviews by tricking the bot into approving - in general make a distinction between HUMAN and AGENT so that you can create rulesets to govern the relationships in between cc @jaredpalmer

@onusoz · /2026/01/17· 09:21 AM View on

Codex says "It's only reachable from داخل the kubernetes cluster" Little does Codex know turkish has borrowed loanwords from over 7 languages and I can understand it

Image hidden

@onusoz · /2026/01/17· 08:34 AM View on

Automated AI reviews on github by creating an ai-review skill and a script to paste trigger prompts and wait for their response. It is instructed to loop and not stop until all AI review feedback is resolved. This AI review workflow developed gradually based on the current capabilities, and I've realized recently that it became quite mechanical. So decided to automate it in full ralph spirit (it's ok because it's addressing feedbacks and fixing minor bugs) In the current state, we paste the contents of REVIEW_PROMPT.md into a comment, which automatically tags claude (opus 4.5) and codex (whatever model openai is serving) It then waits until both have responded. In the ai-review skill, it is instructed to take the feedback from SLopus with a grain of salt and ignore feedback that doesn't make sense It works! See in the images below. If the review is stupid, you will of course see it on the PR and what the model has done, and can revert it
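The loop described above can be sketched roughly as follows. This is not the actual ai-review skill or script: every function name here is hypothetical, and the GitHub interaction (posting the trigger comment, reading the reviewers' responses, fixing the feedback) is injected as plain callables instead of real API calls.

```python
import time
from typing import Callable

Reviews = dict[str, list[str]]  # reviewer name -> unresolved feedback items


def review_loop(
    post_trigger: Callable[[], None],
    fetch_reviews: Callable[[], Reviews],
    address: Callable[[list[str]], None],
    reviewers: tuple[str, ...] = ("claude", "codex"),
    max_rounds: int = 5,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat trigger -> wait -> fix until no feedback is left.

    Returns the number of rounds used. `fetch_reviews` is assumed to return
    only the responses to the most recent trigger comment.
    """
    for round_no in range(1, max_rounds + 1):
        post_trigger()
        reviews = fetch_reviews()
        # Wait until every reviewer has responded.
        while not all(name in reviews for name in reviewers):
            sleep(poll_seconds)
            reviews = fetch_reviews()
        open_items = [item for name in reviewers for item in reviews[name]]
        if not open_items:
            return round_no
        address(open_items)
    raise RuntimeError(f"feedback still open after {max_rounds} rounds")
```

The `max_rounds` cap is an added safety assumption, so that a reviewer which never runs out of nitpicks cannot keep the loop alive forever.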

Image hidden

Onur Solmaz · Log · /2026/01/17

GitHub has to change

It is clear at this point that GitHub’s trust and data models will have to change fundamentally to accommodate agentic workflows, or risk being replaced by other SCM.

One cannot do these things easily with GitHub now:

granular control: this agent running in this sandbox can only push to this specific branch. If an agent runs amok, it could delete everybody’s branches and close PRs. GitHub allows for recovery of these, but it is still inconvenient even if it happens once
create a bot (this exists already), but remove reviewing rights from it so that an employee cannot bypass reviews by tricking the bot into approving
in general, make a distinction between HUMAN and AGENT so that you can create rulesets to govern the relationships between them

The fundamental problem with GitHub is trust: humans are to be trusted. If you don’t trust a human, why did you hire them in the first place?

Anyone who reviews and approves PRs bears responsibility. Rulesets exist and can enforce e.g. CODEOWNER reviews or only let certain people make changes to a certain folder

But the initial repo setup on GitHub is allow-by-default. Anyone can change anything until they are restricted from it

This model breaks fundamentally with agents, who are effectively sleeper cells that will try to delete your repo the moment they encounter a sufficiently powerful adversarial attack

For example, I can create a bot account on GitHub and connect clawdbot to it. I need to give it write permission, because I want it to be able to create PRs. However, I don’t want it to be able to approve PRs, because a coworker could just nag at the bot until it approves a PR that requires human attention

To fix this, you have to bend over backwards: create a @human team with all human coworkers, make them codeowner on /, and enforce codeowner reviews. This is stupid and there has to be another way

Even worse, this bot could be given internet access and end up on a @elder_plinius prompt hack while googling, and start messing up whatever it can in your organization

It is clear that GitHub needs to create a second-class entity for agents that is low-trust by default, starting from a point of least privilege instead of the other way around
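The least-privilege idea can be sketched as a tiny policy model. This is a hypothetical illustration, not a GitHub API: `ActorKind`, `Actor`, `is_allowed` and the action names are all invented here. Humans stay allow-by-default, as in the current model, while agents are denied everything that has not been granted explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class ActorKind(Enum):
    HUMAN = "human"
    AGENT = "agent"


@dataclass(frozen=True)
class Actor:
    name: str
    kind: ActorKind
    # Explicit (action, target) grants; only consulted for agents.
    grants: frozenset[tuple[str, str]] = frozenset()


def is_allowed(actor: Actor, action: str, target: str) -> bool:
    """Allow-by-default for humans, deny-by-default for agents."""
    if actor.kind is ActorKind.HUMAN:
        return True  # today's model: trusted until restricted
    return (action, target) in actor.grants


# Hypothetical bot that may only push to, and open PRs from, one branch.
bot = Actor(
    name="review-bot",
    kind=ActorKind.AGENT,
    grants=frozenset({("push", "agent/fix-types"), ("open_pr", "agent/fix-types")}),
)
```

With this setup, `is_allowed(bot, "approve_pr", "main")` is false without any extra ruleset, which is the inversion the post argues for: nagging the bot into approving a PR cannot work because the permission was never there.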

@onusoz · /2026/01/16· 10:38 PM View on

Now it’s Claude Code’s turn to implement queueing

@thsottiaux· Jan 16, 2026

Within the CLI, you can now steer codex mid-turn without interrupting and watch the agent adapt in almost real time. Enable in /experimental

@onusoz · /2026/01/16· 10:56 AM View on

Can’t wait to see gpt 5.2 codex xhigh level open models in 2026 with 1/100th the price

@onusoz · /2026/01/16· 07:52 AM View on

Codex users rejoice. Also, pi is officially not shitty: shittycodingagent.ai -> buildwithpi.ai as of a few days ago

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/15· 07:48 AM View on

with ai, writing correct tests is now the bottleneck in projects like this web-platform-tests are already there now let’s see if someone will beat @ladybirdbrowser to it

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/14· 09:16 PM View on

As someone who is frontrunning mainstream by roughly 6 months, I can tell you that you will be raving about pi and @openclaw in 6 months instead of claude code. Go check them out at https://t.co/LXTbI8c5Mz and https://t.co/feZl2QDONg

@onusoz· Jun 1, 2025

Just some thoughts after using Claude Code intensively for 1 week 👆

@onusoz · /2026/01/12· 04:01 PM View on

Whoever doesn't use an agent doesn't get a paycheck

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/12· 12:32 AM View on

@badlogicgames @mitsuhiko @steipete curious what you think

@onusoz · /2026/01/12· 12:26 AM View on

I propose a new way to distribute agent skills: like --help, a new CLI flag convention --skill should let agents list and install skills bundled with CLI tools Skills are just folders so calling --skill export my-skill on a tool could just output a tarball of the skill. I then set up the skillflag npm package so that you can pipe that into: ... | npx skillflag install --agent codex which installs the skill into codex, or any CLI tool you prefer. Supports listing skills bundled with the CLI, so your agents know exactly what to install

Image hidden

Onur Solmaz · Post · /2026/01/11

You don't need a skill registry (for your CLI tools)

tl;dr I propose a CLI flag convention --skill like --help for distributing skills and try to convince you why it is better than using 3rd party registries. See osolmaz/skillflag on GitHub.

MCP is dead, long live Agent Skills. At least for local coding agents.

Mario Zechner has been making the point that CLI tools perform better than MCP servers for a few months now, and in mid December Anthropic christened skills by launching agentskills.io.

They had introduced the mechanism to Claude Code earlier, and this time they didn’t make the mistake of waiting for OpenAI to make a provider agnostic version of it.

Agent skills are basically glorified manpages or --help for AI agents. You ship a markdown instruction manual in SKILL.md and the name of the folder that contains it becomes an identifier for that skill:

my-skill/
├── SKILL.md          # Required: instructions + metadata
├── scripts/          # Optional: executable code
├── references/       # Optional: documentation
└── assets/           # Optional: templates, resources

Possibly the biggest use case for skills is teaching your agent how to use a certain CLI you have created, maybe a wrapper around some API, which unlike gh, gcloud etc. will never be significant enough to be represented in AI training datasets. For example, you could have created an unofficial CLI for Twitter/X, and there might still be some months/years until it is scraped enough for models to know how to call it. Not to worry, agent skills to the rescue!

Anthropic, while laying out the standard, intentionally kept it as simple as possible. The only assertions are the filename SKILL.md, the YAML metadata, and the fact that all relevant files are grouped in a folder. It does not impose anything on how they should be packaged or distributed.
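Because the standard asserts so little, checking a skill folder takes only a few lines. The sketch below is a simplification under stated assumptions: it only handles flat `key: value` pairs between `---` markers instead of full YAML, and the function name `read_skill_metadata` is made up for illustration.

```python
from pathlib import Path


def read_skill_metadata(skill_dir: Path) -> dict[str, str]:
    """Read the metadata block at the top of <skill_dir>/SKILL.md.

    Assumes the block is delimited by `---` lines and contains only flat
    `key: value` pairs. A real implementation would use a YAML parser.
    """
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md")

    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md does not start with a metadata block")

    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # end of the metadata block
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    raise ValueError("metadata block is never closed")
```

Anything beyond that (scripts/, references/, assets/) is optional, so the check deliberately ignores it.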

This is a good thing! Nobody knows the right way to distribute skills at launch. So various stakeholders can come up with their own ways, and the best one can win in the long term. The simpler a standard, the more likely it is to survive.

Here, I made some generalizing claims. Not all skills have to be about using a CLI tool, nor do most CLI tools bundle a skill yet. But here is my gut feeling: the most useful skills, the ones worth distributing, are generally about using a CLI tool. Or better, even if they don’t ship a CLI yet, they should.

So here is the hill I’m ready to die on: All major CLI tools (including the UNIX ones we are already familiar with), should bundle skills in one way or another. Not because the models of today need to learn how to call ls, grep or curl—they already know them inside out. No, the reason is something else: establish a convention, and acknowledge the existence of another type of intelligence that is using our machines now.
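As a rough illustration of what bundling could look like, here is a minimal sketch of a tool that answers a `--skill` flag. It is not the skillflag implementation: the `list` and `export <name>` verbs, the gzip compression and the function names are assumptions made for this example, and the real convention may differ in its details.

```python
import argparse
import io
import sys
import tarfile
from pathlib import Path


def list_skills(root: Path) -> list[str]:
    """A skill is any direct subfolder of `root` that contains SKILL.md."""
    return sorted(path.parent.name for path in root.glob("*/SKILL.md"))


def export_skill(root: Path, name: str) -> bytes:
    """Pack one skill folder into an in-memory gzipped tarball."""
    if name not in list_skills(root):
        raise KeyError(f"unknown skill: {name}")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        tar.add(root / name, arcname=name)
    return buffer.getvalue()


def main(argv: list[str], root: Path) -> int:
    parser = argparse.ArgumentParser(prog="mytool")
    parser.add_argument("--skill", nargs="+", help="list, or export NAME")
    args = parser.parse_args(argv)

    if args.skill == ["list"]:
        print("\n".join(list_skills(root)))
        return 0
    if args.skill and len(args.skill) == 2 and args.skill[0] == "export":
        # Write the raw tarball to stdout so it can be piped to an installer.
        sys.stdout.buffer.write(export_skill(root, args.skill[1]))
        return 0
    parser.print_usage()
    return 2
```

Writing the tarball to stdout keeps the tool ignorant of where skills get installed; whatever sits on the other side of the pipe decides that per agent.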

There is a reason why we cannot afford to let the models just run --help or man <tool>, and that is time and money. The average --help or manpage is devoid of examples, and is written in a way that requires multiple passes to connect the pieces on how to use that thing.

Each token wasted trying to guess the right way to call a tool or API costs real money, and unlike human developer effort, we can measure exactly how inefficient some documentation is by looking at how many steps of trial and error a model had to make.
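That measurement can be as simple as counting what happened before the first successful call. The sketch below assumes a hypothetical log format (`ToolCall` with a command, an exit code and a token count); real agent harnesses record this differently.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    command: str
    exit_code: int
    tokens: int  # tokens the model spent producing this attempt


def wasted_effort(calls: list[ToolCall]) -> tuple[int, int]:
    """Return (failed attempts, tokens spent) before the first success."""
    failures = 0
    tokens = 0
    for call in calls:
        if call.exit_code == 0:
            break
        failures += 1
        tokens += call.tokens
    return failures, tokens


# Hypothetical session: two wrong guesses at the flags, then a correct call.
session = [
    ToolCall("mytool post --text hi", exit_code=2, tokens=300),
    ToolCall("mytool tweet --msg hi", exit_code=2, tokens=400),
    ToolCall("mytool tweet hi", exit_code=0, tokens=250),
]
```

Comparing this number for the same task with and without a skill installed gives a direct, if crude, score for the documentation.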

Not that human attention is less valuable than AI attention, it is more so. But there has never been a way to quantify a task’s difficulty as perfectly as we can with AI, so we programmers have historically caved in to obscurantism and a weird pride in making things more difficult than they should be, like some feudal artisan. This is perhaps best captured in the spirit of Stack Overflow and its infamous treatment of noob questions. Sacred knowledge shall be bestowed only once you have suffered long enough.

Ahh, but we don’t treat AI that way, do we? We handhold it like a baby, we nourish it with examples, we do our best to explain things all so that it “one shots” the right tool call. Because if it doesn’t, we pay more in LLM costs or time. It’s ironic that we are documenting for AI like we are teaching primary schoolers, but the average human manpage looks like a robot novella.

To reiterate, the reason for this is two different types of intelligences, and expectations from them:

An LLM is still not considered “general intelligence”, so it works better by mimicking or extending already working examples.
An LLM-based AI agent deployed in some context is expected to “work” out of the box without any hiccups.

On the other hand,

A human is considered general intelligence, can learn from sparser signals, and adapts better to out-of-distribution data. When given an extremely terse --help or manpage, a human is likelier to perform better by trial and error and reasoning, if one could ever draw such a comparison.
A human, much less of a commodity than an LLM, is under less pressure to do the right thing every time, and can afford to make mistakes and spend more time learning.

And this is the main point of my argument. These different types of intelligences read different types of documentation, to perform maximally in their own ways. Whereas I haven’t witnessed a new addition to POSIX flag conventions in my 15 years of programming, we are witnessing unprecedented times. So maybe even UNIX can yet change.

To this end, I introduce skillflag, a new CLI flag convention:

# list skills the tool can export
<tool> --skill list
# show a single skill’s metadata
<tool> --skill show <id>
# install into Codex user skills
<tool> --skill export <id> | npx skillflag install --agent codex
# install into Claude project skills
<tool> --skill export <id> | npx skillflag install --agent claude --scope repo

Click here for the spec

Click here for the repo, osolmaz/skillflag on GitHub

For example, suppose that you have installed a CLI tool to control Philips Hue lights at home, hue-cli.

To list the skills that the tool can export, you can run:

$ hue-cli --skill list
philips-hue    Control Philips Hue lights in the terminal

You can then install it to your preferred coding agent, such as Claude Code:

$ hue-cli --skill export philips-hue | npx skillflag install --agent claude
Installed skill philips-hue to .claude/skills/philips-hue

You can optionally install the skill to ~/.claude, to make it global across repos:

$ hue-cli --skill export philips-hue | npx skillflag install --agent claude --scope user
Installed skill philips-hue to ~/.claude/skills/philips-hue

Once this convention becomes commonplace, agents will by default do all of this before they even run the tool. So when you ask an agent to “install hue-cli”, it will know to run --skill list the same way a human would run --help after downloading a program, and install the necessary skills itself without being asked to.
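From the tool author's side, supporting the flag could look roughly like the sketch below. It is not taken from the skillflag spec: the in-memory `SKILLS` registry, the tab-separated list output, printing SKILL.md for `show`, and exporting the skill folder as an uncompressed tar archive are all assumptions made for illustration, so check the spec for the actual formats.

```python
import argparse
import io
import sys
import tarfile

# Hypothetical in-memory registry; a real tool would bundle skill folders on disk.
SKILLS = {
    "philips-hue": {
        "description": "Control Philips Hue lights in the terminal",
        "files": {
            "SKILL.md": (
                "---\n"
                "name: philips-hue\n"
                "description: Control Philips Hue lights in the terminal\n"
                "---\n"
            ),
        },
    },
}


def list_skills() -> str:
    """One line per skill: id, a tab, then the description."""
    return "\n".join(f"{sid}\t{s['description']}" for sid, s in sorted(SKILLS.items()))


def show_skill(skill_id: str) -> str:
    """Return the skill's SKILL.md, which carries its metadata."""
    return SKILLS[skill_id]["files"]["SKILL.md"]


def export_skill(skill_id: str) -> bytes:
    """Pack the skill folder into an uncompressed tar archive (assumed format)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, content in SKILLS[skill_id]["files"].items():
            data = content.encode("utf-8")
            info = tarfile.TarInfo(name=f"{skill_id}/{name}")
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(prog="hue-cli")
    parser.add_argument("--skill", nargs="+", metavar="ARG",
                        help="list | show <id> | export <id>")
    args = parser.parse_args(argv)
    if not args.skill:
        parser.print_help()
        return 0
    action, *rest = args.skill
    if action == "list" and not rest:
        print(list_skills())
    elif action == "show" and len(rest) == 1 and rest[0] in SKILLS:
        print(show_skill(rest[0]), end="")
    elif action == "export" and len(rest) == 1 and rest[0] in SKILLS:
        # Binary archive goes to stdout so it can be piped into an installer.
        sys.stdout.buffer.write(export_skill(rest[0]))
    else:
        parser.error("expected: --skill list | show <id> | export <id>")
    return 0


if __name__ == "__main__":
    main(["--skill", "list"])
```

In a real tool the last line would pass `sys.argv[1:]` instead of a fixed argument list. Keeping `list`, `show` and `export` as plain functions means the flag costs the tool author only a few lines of argument handling.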

@onusoz · /2026/01/10· 01:11 PM View on

Anthropic earlier last year announced this pricing scheme $20 -> 1x usage $100 -> 5x usage $200 -> 1̶0̶x̶ 20x usage As you can see, it's not growing linearly. This is classic Jensen "the more you buy, the more you save" But here is the thing. You are not selling hardware like Jensen. You are selling a software service *through an API*. It's the worst possible pricing for the category of product. Long term, people will game the hell out of your offering Meanwhile OpenAI decided not to do that. There is no quirky incentive for buying bigger plans. $200 chatgpt = 10 x $20 chatgpt, roughly And here is where it gets funny. Despite not having such an incentive, you can get A LOT MORE usage from the $200 OpenAI plan, than the $200 Anthropic plan. Presumably because OpenAI has better unit economics (sama mentioned they are turning a profit on inference, if you are to believe) Thanks to sounder pricing, OpenAI can do exactly what Anthropic cannot: offer GPT in 3rd party harnesses and win the ecosystem race Anthropic has cornered itself with this pricing. They need to change it, but not sure if they can afford to do so in such short notice All this is extremely bullish on open source 3rd party harnesses, @opencode, @badlogicgames's pi and such. It is clear developers want options. "Just give me the API" I personally am extremely excited for 2026. We'll get open models on par with today's proprietary models, and can finally run truly sovereign personal AI agents, for much cheaper than what we are already paying!

Image hidden

Onur Solmaz · Log · /2026/01/10

Anthropic's pricing is stupid

Anthropic earlier last year announced this pricing scheme

$20 -> 1x usage
$100 -> 5x usage
$200 -> 1̶0̶x̶ 20x usage

As you can see, it’s not growing linearly. This is classic Jensen “the more you buy, the more you save”
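A quick back-of-the-envelope check of where the nonlinearity sits, using only the three price points listed above (the `price_per_usage_unit` helper is just for illustration):

```python
# Plan price in USD -> usage multiple relative to the $20 plan (figures from the list above).
PLANS = {20: 1, 100: 5, 200: 20}


def price_per_usage_unit(plans: dict[int, int]) -> dict[int, float]:
    """Dollars paid per 1x of usage on each plan."""
    return {price: price / usage for price, usage in plans.items()}


for price, unit in price_per_usage_unit(PLANS).items():
    print(f"${price} plan: ${unit:.2f} per 1x of usage")
# $20 plan: $20.00 per 1x of usage
# $100 plan: $20.00 per 1x of usage
# $200 plan: $10.00 per 1x of usage
```

So the first two tiers scale linearly, and the whole discount lands on the top tier, where each unit of usage costs half as much.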

But here is the thing. You are not selling hardware like Jensen. You are selling a software service through an API. It’s the worst possible pricing for the category of product. Long term, people will game the hell out of your offering

Meanwhile OpenAI decided not to do that. There is no quirky incentive for buying bigger plans. $200 chatgpt = 10 x $20 chatgpt, roughly

And here is where it gets funny. Despite not having such an incentive, you can get A LOT MORE usage from the $200 OpenAI plan, than the $200 Anthropic plan. Presumably because OpenAI has better unit economics (sama mentioned they are turning a profit on inference, if you are to believe)

Thanks to sounder pricing, OpenAI can do exactly what Anthropic cannot: offer GPT in 3rd party harnesses and win the ecosystem race

Anthropic has cornered itself with this pricing. They need to change it, but I'm not sure they can afford to do so on such short notice

All this is extremely bullish on open source 3rd party harnesses, OpenCode, Mario Zechner’s pi and such. It is clear developers want options. “Just give me the API”

I personally am extremely excited for 2026. We’ll get open models on par with today’s proprietary models, and can finally run truly sovereign personal AI agents, for much cheaper than what we are already paying!

Originally posted on linkedin

@onusoz · /2026/01/09· 11:30 AM View on

The models, they just wanna work. They want to build your product, fix your bugs, serve your users. You feed them the right context, give them good tools. You don’t assume what they cannot do without trying, and you don’t prematurely constrain them into deterministic workflows.

@onusoz · /2026/01/09· 11:29 AM View on

We have entered the age to dream big

@onusoz · /2026/01/09· 11:09 AM View on

😩 @openclaw

Image hidden

@onusoz · /2026/01/09· 07:08 AM View on

🫡

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/09· 06:53 AM View on

This, and insisting on https://t.co/FjzkMAo3Od are really lame @AnthropicAI

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/08· 07:19 AM View on

.@openclaw indeed

Image hidden

@onusoz · /2026/01/07· 09:23 PM View on

@openclaw oh man I meant Accelerando 🤦‍♂️

Image hidden

@onusoz · /2026/01/07· 08:07 PM View on

I'm starting to form parasocial bonds with crustacean AIs because of you @steipete

Image hidden

@onusoz · /2026/01/07· 08:06 PM View on

.@openclaw hello world from ms teams start of a beautiful journey

@onusoz· Jan 7, 2026

World is not ready for @openclaw

Image hidden

@onusoz · /2026/01/07· 06:23 PM View on

💀

Image hidden

@onusoz · /2026/01/07· 06:18 PM View on

World is not ready for @openclaw

Image hidden

@onusoz · /2026/01/06· 02:17 PM View on

Thanks @dom_does 🙄

Image hidden

@onusoz · /2026/01/06· 02:15 PM View on

lmfao @dom_does @openclaw provides infinite ways to troll your colleagues

Image hidden

@onusoz · /2026/01/06· 01:58 PM View on

.@openclaw workspace and memory files can be version-controlled! In our pod, inotify triggers a watcher script every time there is a change to workspace folder, to sync these files to our monorepo. It then goes through the same steps: - Create zeno-workspace branch if doesn't exist, otherwise, skip - Sync changes to the branch, then commit - Create PR on github if doesn't exist - PRs can then be merged every once in a while, after accumulating enough changes. Merge triggers re-deploy, and clawd restarts with the same state Simple foolproof automatic persistence for remote CI/CD handled clawd (except for when you are running multiple clawds at the same time, but we are not there yet) cc @steipete
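The git side of those steps can be sketched as a small planner that decides which commands the watcher should run. This is not the actual watcher script: the `plan_sync` helper, the commit message and the PR title and body are made up for illustration, and only the `zeno-workspace` branch name comes from the post. The function only returns the commands, so it is safe to run anywhere.

```python
def plan_sync(branch: str, branch_exists: bool, pr_exists: bool) -> list[list[str]]:
    """Return the commands to sync workspace changes to a branch and open a PR."""
    commands: list[list[str]] = []
    if branch_exists:
        # Branch is already there: just switch to it.
        commands.append(["git", "switch", branch])
    else:
        # First sync: create the branch.
        commands.append(["git", "switch", "-c", branch])
    commands.append(["git", "add", "--all"])
    commands.append(["git", "commit", "-m", "Sync workspace changes"])
    commands.append(["git", "push", "--set-upstream", "origin", branch])
    if not pr_exists:
        # Open the PR only once; later syncs just add commits to it.
        commands.append([
            "gh", "pr", "create",
            "--head", branch,
            "--title", "Sync workspace",
            "--body", "Automated sync of workspace and memory files.",
        ])
    return commands


if __name__ == "__main__":
    for command in plan_sync("zeno-workspace", branch_exists=False, pr_exists=False):
        print(" ".join(command))
```

A watcher would feed each command to `subprocess.run(command, check=True)` inside the monorepo checkout, and would need to handle the case where there is nothing to commit.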

Image hidden

@onusoz · /2026/01/06· 08:33 AM View on

I see @bcherny and raise one. I not only did not open an IDE, I did not touch a terminal since last night, thanks to @steipete's @openclaw Opus in k8s pod pulls errors from gcloud, debugs the issue, and creates PR all inside Discord. I call this Discord Driven Development

Image hidden

@onusoz · /2026/01/05· 02:50 PM View on

Clawdbot now runs on @TextCortex internal. Can onboard new engineers, answer questions, connect to issue trackers, create PRs... This is sick @steipete

Image hidden

@onusoz · /2026/01/05· 07:52 AM View on

pi now supports your openai plus/pro subscription

Quoted post

Quoted post was not retrieved.

@onusoz · /2026/01/04· 04:58 PM View on

GPT 4.5 is still the best model for prose and humor here it is generating a greentext from my blog post "Our muscles will atrophy as we climb the Kardashev Scale"

Image hidden

Onur Solmaz · Post · /2026/01/04

Having a "tools" repo as a developer

I am a fan of monorepos. Creating subdirectories in a single repo is the most convenient way to work on a project. Low complexity, and your agents get access to everything that they need.

Since May 2025, I have been increasingly using AI models to write code, and have noticed a new tendency:

I don’t shy away from vendoring open source libraries and modifying them.
I create personal CLIs and tools for myself, when something is not available as a package.

With agents, it’s really trivial to say “create a CLI that does X”. For example, I wanted to make my terminal screenshots have equal padding and erase cropped lines. I created a CLI for it, without writing a single line of code, by asking Codex to read its output and iterate on the code until it gives the result I wanted.

Most of these tools don’t deserve their own repos at the beginning, nor do they deserve to be published as a package. They might evolve into something more substantial over time, but until then, a separate repo is not worth the effort.

To prevent overhead, I developed a new convention. I just put them all in the same repo, called tools. Every tool starts in that repo by default. If one proves itself useful enough and I decide to publish it as a package, I move it to a separate repo.

You can keep tools public or private, or have both a public and private version. Mine is public, feel free to steal the ones that you find useful.

@onusoz · /2026/01/03· 10:46 AM View on

@rauchg indeed

@onusoz· Jan 3, 2026

75k lines of Rust later, here is what I’ve built during the first Christmas with agents, using OpenAI Codex 🎄🤖 - A full mobile rewrite and port of my Python Instagram video production pipeline (single video production time: 1hr -> 5min) (ig: nerdonbars) - Bespoke animation engine using primitives (think Adobe Flash, Manim) - Proprietary new canvas UI library in Rust, because I don’t want to lock myself into Swift - Thanks to that, it’s cross platform, runs both on desktop and iOS. It will be a breeze porting this to Android when the time comes - A Rust port of OpenCV CSRT algorithm, for tracking points/objects - In-engine font rendering using rustybuzz, so fonts render the same everywhere - Many other such things Why would I choose to do it that way? Because I have developed it primarily on desktop where I have much faster iteration speed. Aint nobody got time for iOS compilation and simulator. Once I finished the hard part on desktop, porting to iOS was much easier, and I didn’t lock myself in to Apple Some of these would have been unimaginable without agents, like creating a UI library from scratch in Rust. But when you have infinite workforce, you can ask for crazy things like “create a textbox component from scratch” What I’ve built is very similar in nature to CapCut, except that I am a single person and I’ve built it over 1 week What have you built this Christmas with agents? cc @thsottiaux

@onusoz · /2026/01/03· 10:16 AM View on

75k lines of Rust later, here is what I’ve built during the first Christmas with agents, using OpenAI Codex 🎄🤖 - A full mobile rewrite and port of my Python Instagram video production pipeline (single video production time: 1hr -> 5min) (ig: nerdonbars) - Bespoke animation engine using primitives (think Adobe Flash, Manim) - Proprietary new canvas UI library in Rust, because I don’t want to lock myself into Swift - Thanks to that, it’s cross platform, runs both on desktop and iOS. It will be a breeze porting this to Android when the time comes - A Rust port of OpenCV CSRT algorithm, for tracking points/objects - In-engine font rendering using rustybuzz, so fonts render the same everywhere - Many other such things Why would I choose to do it that way? Because I have developed it primarily on desktop where I have much faster iteration speed. Aint nobody got time for iOS compilation and simulator. Once I finished the hard part on desktop, porting to iOS was much easier, and I didn’t lock myself in to Apple Some of these would have been unimaginable without agents, like creating a UI library from scratch in Rust. But when you have infinite workforce, you can ask for crazy things like “create a textbox component from scratch” What I’ve built is very similar in nature to CapCut, except that I am a single person and I’ve built it over 1 week What have you built this Christmas with agents? cc @thsottiaux

@onusoz · /2026/01/03· 08:21 AM View on

https://t.co/grRLRtbHpO

Quoted post

Quoted post was not retrieved.

Onur Solmaz · Log · /2026/01/03

Christmas of Agents

I believe a “Christmas of Agents” (+ New Year of Agents) is superior to “Advent of Code”.

The reason is simple. Most of us are employed. Advent of Code coincides with work time, so you can’t really immerse yourself in a side project.¹

However, Christmas (or any other long holiday without primary duties) is a better time to immerse yourself in a side project.

2025 was the eve of agentic coding. This was the first holiday where I had full license to go nuts on a side project using agents. It was epic:

75k lines of Rust later, here is what I’ve built during the first Christmas with agents, using OpenAI Codex

A full mobile rewrite and port of my Python Instagram video production pipeline (single video production time: 1hr -> 5min)
Bespoke animation engine using primitives (think Adobe Flash, Manim)
Proprietary new canvas UI library in Rust, because I don’t want to lock myself into Swift
Thanks to that, it’s cross platform, runs both on desktop and iOS. It will be a breeze porting this to Android when the time comes
A Rust port of OpenCV CSRT algorithm, for tracking points/objects
In-engine font rendering using rustybuzz, so fonts render the same everywhere
Many other such things

Why would I choose to do it that way? Because I have developed it primarily on desktop where I have much faster iteration speed. Aint nobody got time for iOS compilation and simulator. Once I finished the hard part on desktop, porting to iOS was much easier, and I didn’t lock myself in to Apple

Some of these would have been unimaginable without agents, like creating a UI library from scratch in Rust. But when you have infinite workforce, you can ask for crazy things like “create a textbox component from scratch”

What I’ve built is very similar in nature to CapCut, except that I am a single person and I’ve built it over 1 week

What have you built this Christmas with agents?

You could maybe work in the evening after work, but unless you are slacking at work full time, it won’t be the same thing as full immersion. ↩

@onusoz · /2026/01/02· 01:45 PM View on

SimpleDoc now has the check command for CI/CD Add to your PR checks to catch agent littering before merge. osolmaz/SimpleDoc on GitHub

Image hidden

@onusoz · /2026/01/02· 01:32 PM View on

Migrating @TextCortex to SimpleDoc. It's really easy with the CLI wizard! npx @simpledoc/simpledoc migrate We have a LOT of docs spanning back to 2022, pre coding agent era. Now we will have CI/CD in place so that coding agents can't litter the repo with random Markdown files

Image hidden