Post
"this is the worst AI will ever be" I'm sad, not because this is right, but because it is wrong OpenAI's frontier coding model gpt-5.3-codex-xhigh feels a lot worse compared to before. It is sloppy and lazy, though it's UX got better with messages It feels like the gpt-5.2-codex-xhigh at the end of December was a lot more diligent and thorough, and did not make stupid mistakes like the one I posted before. might be a model or harness problem, I don't know @sama says users tripled since beginning of the year, so what should we expect? of course they will make infra changes that will feel like cutting corners, and I don't blame them for them and about "people want faster codex". I do want faster codex. but I want it in a way that doesn't lower the highest baseline performance compared to the previous generation. I want the optionality to dial it down to as slow as it needs to be, to be as reliable as before it is of course easier said than done. kudos to the codex team for not having any major incidents while taking the plane apart and putting it back together during flight. they are juggling an insane amount of complexity, and the whims of thousands of different stakeholders my hope is that this post is taken as a canary. I am getting dumber because of the infra changes there. I have no other option because codex was really that good compared to the competition my wish is to have detailed announcements as to what changes on openai codex infra, when it changes, so I can brace myself. we don't get notified about these changes, despite our performance and livelihoods depending on it. I have to answer to others when the tool I deemed reliable yesterday stops working today, not the tool on another note, performance curve of these models seem to be a rising sinusoidal. crests correspond to release of a new generation. they start with a smaller user base for testing, and it has the highest quality at this point. then it enshittifies as the model is scaled to the rest of the infra. we saw the pattern numerous times in the last 3 years across multiple companies, so I think we should accept it as an economic law