Agentic coding tools should give more control over message queueing
Below: why agentic coding tools like Cursor, Claude Code, and OpenAI Codex should implement more ways of letting users queue messages.
See Peter Steinberger’s tweet where he queues `continue` 100 times to nudge the GPT-5-Codex model not to stop while working on a predictable, boring, long-running refactor task:
This is necessary when working with a model like GPT-5-Codex. The reason is that the model has a tendency to stop generating at certain checkpoints, due to the way it has been trained, even when you instruct it to FINISH IT UNTIL COMPLETION!!1!. So the only way to get it to finish something is to use the message queue.[^1]
But this isn’t the only use case for queued messages. For example, you can have the model pull the relevant files into its context before kicking off a related task. Say you want to find the root cause of a <bug in component X>. Then you can queue:
- `Explain how <component X> works in plain language. Do not omit any details.`
- `Find the root cause of <bug> in <component X>.`
With that context about the component already loaded, the model generally finds the root cause more easily, or at least makes more accurate guesses about it.
Another example: After exploring a design in a dialogue, you can queue the next steps to implement it.
<Prior conversation exploring how to design a new feature>
- `Create an implementation plan for that in the docs/ folder. Include all the details we discussed.`
- `Commit and push the doc.`
- `Implement the feature according to the plan.`
- `Continue implementing the feature until it is done. Ignore this if the task is already completed.`
- `Continue implementing the feature until it is done. Ignore this if the task is already completed.`
- … you get the idea.
I generally queue like this when the feature is already specified well enough in the conversation. If it’s underspecified, the model will make up stuff.
When I first moved from Claude Code to Codex, the way it implemented queued messages was annoying (more on the difference below). But as I grew accustomed to it, it started to feel a lot like something I saw elsewhere before: chess premoves.
Chess???
A premove is a relatively recent invention in chess, made possible by online chess platforms. When the feature is turned on, you don’t need to wait for your opponent to finish their move; instead, you can queue your next move. It then gets executed automatically if it is still valid after your opponent’s move:
If you are fast enough, this lets you move without using up your time in bullet chess, and even lets you queue up entire mate-in-N sequences, resulting in highly entertaining cases like the video above.
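The mechanics are simple enough to sketch. Below is a toy version using the python-chess library (`pip install chess`); the premove rule here is my paraphrase of how sites like Lichess behave, not their actual implementation:

```python
import chess

board = chess.Board()
# Black queues ...e5 as a premove while White is still thinking.
premove = chess.Move.from_uci("e7e5")

# The opponent's move arrives.
board.push(chess.Move.from_uci("e2e4"))

# The premove is played only if it is still legal in the new position;
# otherwise it is silently discarded.
if premove in board.legal_moves:
    board.push(premove)
    print(f"premove played: {board.peek().uci()}")
else:
    print("premove discarded: no longer legal")
```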
I tend to think of message queueing as the same thing: when applied effectively, it saves you a lot of time when you can already predict the next move.
In other words, you should queue (or premove) when your next choice is decision-insensitive to the information you will receive in the next turn: waiting wouldn’t change what you do, it would only delay doing it.
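A loose way to formalize this (my framing, nothing the tools actually compute): let $O$ be the set of plausible outcomes of the current turn and $a^*(o) = \arg\max_a U(a \mid o)$ your best next action given outcome $o$. Queuing is safe exactly when $a^*$ is constant on $O$, i.e. a single action is optimal no matter how the turn ends.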
With this perspective, some obvious candidates for queuing in agentic coding are rote tasks that come before and after “serious work”, e.g.:
- making the agent explain the codebase,
- creating implementation plans,
- fixing linting errors,
- updating documentation mid-task before moving on to a subsequent step,
- committing and pushing,
- and so on.
Different ways CLI agents implement queued messages
As I mentioned above, Claude Code implements queued messages differently from OpenAI Codex. In fact, I can think of three main approaches in this design space, distinguished by when a user’s new input takes effect (a minimal sketch of all three follows the list):
- **Post-turn queuing (FIFO[^2]):** User messages wait until the current turn finishes completely before they’re handled. Example: OpenAI Codex CLI.
- **Boundary-aware queuing (Soft Interrupt):** New messages are inserted at natural breakpoints, such as after a tool call finishes, an assistant reply ends, or a task in the TODO list is checked off. This changes the model’s course of action smoothly, without stopping ongoing generation. Examples: Claude Code, Cursor.
- **Immediate queuing (Hard Interrupt):** New user messages immediately stop the current action/generation, discarding ongoing work and restarting the assistant’s generation from scratch. I have not seen any tool that implements this yet, but it could be an option for the impatient.
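To make the differences concrete, here is a minimal toy sketch of the three policies. Everything in it (`Agent`, `Policy`, the “steps”) is hypothetical scaffolding I made up for illustration, not how any of these tools is actually implemented:

```python
from collections import deque
from enum import Enum, auto

class Policy(Enum):
    POST_TURN = auto()       # FIFO: handle messages only after the turn ends
    BOUNDARY_AWARE = auto()  # soft interrupt: inject at the next breakpoint
    IMMEDIATE = auto()       # hard interrupt: abort the turn right away

class Agent:
    def __init__(self, policy: Policy):
        self.policy = policy
        self.queue: deque[str] = deque()

    def enqueue(self, message: str) -> None:
        self.queue.append(message)

    def run_turn(self, steps: list[str]) -> None:
        """Each 'step' stands in for one tool call, reply, or TODO item."""
        for step in steps:
            if self.policy is Policy.IMMEDIATE and self.queue:
                print(f"aborting turn, restarting with: {self.queue.popleft()}")
                return
            print(f"executing: {step}")
            # A step boundary: where soft interrupts get injected.
            if self.policy is Policy.BOUNDARY_AWARE and self.queue:
                print(f"injecting at boundary: {self.queue.popleft()}")
        # Only now, with the turn over, does post-turn queuing deliver.
        while self.policy is Policy.POST_TURN and self.queue:
            print(f"handling after turn: {self.queue.popleft()}")

agent = Agent(Policy.BOUNDARY_AWARE)
agent.enqueue("Commit and push the doc.")
agent.run_turn(["plan the feature", "edit files", "run tests"])
```

The only real difference between the three is where the queue check sits relative to the work: after everything, between steps, or before each step.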
Why not implement all of them?
And here is the argument behind this post’s title: when I move away from Claude Code, I miss boundary-aware queueing; when I move away from OpenAI Codex, I miss FIFO queueing.
I don’t see a reason why we could not implement all of them in all agentic tools. It could be controlled by a key combo like Ctrl+Enter, a submenu, or a button, depending on whether you are in the terminal or not.
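As a sketch of what that could look like (the key combos and policy names below are invented for illustration, not taken from any real tool):

```python
# Hypothetical bindings mapping submit shortcuts to queuing policies.
SUBMIT_KEYMAP = {
    "enter":       "post-turn",       # FIFO: handle once the turn is over
    "ctrl+enter":  "boundary-aware",  # soft interrupt at the next breakpoint
    "shift+enter": "immediate",       # hard interrupt: abort and restart
}

def submit(message: str, key: str) -> None:
    """Dispatch a message under the policy bound to the pressed key."""
    policy = SUBMIT_KEYMAP.get(key, "post-turn")
    print(f"queueing {message!r} as {policy}")

submit("Fix the remaining lint errors.", "ctrl+enter")
```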
Having the option would definitely make a difference in agentic workflows where you are running 3-4 agents in parallel.
So if you are reading this and are implementing an agentic coding tool, I would be happy if you took all this into consideration!
[^1]: Pro tip: Don’t just queue `continue` by itself, because the model might get loose from its leash and start to make up and execute random tasks, especially after context compaction. Always specify what you want it to continue on, e.g. `Continue handling the linting errors until none remain. Ignore this if the task is already completed.`

[^2]: First-in, first-out.