Entries for December 31, 2025

@onusoz · /2025/12/31· 04:37 PM View on

Anyone created an agent skill for splitting PRs for good review culture?

@onusoz · /2025/12/31· 03:17 PM View on

GPT 5.2 xhigh feels like a much more careful architecter and debugger, when it comes to complex systems But most people here think Opus 4.5 is the best model in that category There are 2 reasons AFAIS: - xhigh reasoning consumes significantly more tokens. You need to pay for ChatGPT Pro (200 usd) to be able to use it as a daily driver - It takes like 5x longer to finish a task, and most people lack the patience to wait for it. (But then it's more correct/doesn't need fixing) Opus 4.5 is good too, I think better in e.g. frontend design. But if you think it beats GPT 5.2 in every category, you are either too poor/stingy or have ADHD

Quoted post

Quoted post was not retrieved.

@onusoz · /2025/12/31· 11:44 AM View on

Just 5 months ago, I was swearing at Claude 4 Sonnet like a Balkan uncle Models one-shotted the right thing only 20-30% of the time but did really stupid things the rest, and had to be handheld tightly Today they are much, much better. My psychology is a lot more at ease, and instead of swearing, I want to kiss them on the forehead most of the time Now I trust agents so much that I queue up 5-10 tasks before going to sleep. They work the whole night while I sleep and I wake up to resolved issues GPT 5.2 xhigh and Claude 4.5 Opus are already goated (GPT more so), can't wait for them to get even faster

Image hidden

@onusoz · /2025/12/31· 09:58 AM View on

Codex does not have support for subagents. I tried to use Claude Code to launch 8 Codex instances in parallel on separate tasks, but Opus 4.5 had difficulty following instructions So created a CLI tool to scan pending TODOs from a markdown file, and let me launch as many harnesses as I want (osolmaz/spawn on github) I currently use this for relatively read-only tasks like planning and finding root causes of bugs, because it's launching all the agents on the same repo and they might conflict Ideas: - Use @mitsuhiko's gh-issue-sync and run parallel agents directly on github issues - Create any new clones or worktrees for each task. I currently don't do this because I don't dare duplicate rust target dir 10x on my measly macbook air - Support modes other than tmux, e.g. launching a terminal like Ghostty - TUI for easy selection of issues/TODOs Other ideas are welcome!

Image hidden