
Onur Solmaz (@onusoz):

> This. Agent Experience first. Agent Ergonomics. We need to get used to these terms.

@_andydeng:
We have long paid close attention to improving user experience (UX) and developer experience (DX) when building apps and tools. There is no doubt that we are now entering an era where the apps and services we produce must also, if not primarily, cater to the needs of a new type of consumer: AI agents. From now on, we need to think about delivering good AX (Agent Experience).

We have seen this trend forming ever since the birth of MCP and, later, the rise of Skills. A recent blog post from the Next.js team discussed the necessity of exposing more information to agents within development tooling, allowing coding agents to make better decisions based on a more complete awareness of errors and outputs. It is living proof that the software we build needs to adapt to this new type of user.

With OpenClaw becoming the embodiment of the powerful personal agent almost overnight, we are seeing platforms dedicated to agents, like MoltBook and ClawNews, burst onto the scene. Moreover, the simplicity of OpenClaw (and the underlying pi agent it uses) regarding tool calling has boosted the practice of packaging web services into simple, local CLIs. This form of interface had long fallen "out of fashion" for everyday users because of its abstract UX, but it is now regaining popularity thanks to its simplicity and token-efficiency for AI agents.

We may not have fully figured out where these agent-oriented web platforms will lead (they might simply be slop art, or they might one day catalyze meaningful agent self-growth and collaboration), but there is no doubt that what an agent needs when interacting with the web, apps, and services is vastly different from what human users need.

## The Loop

Humans and AI agents both rely on a loop when consuming information and completing tasks. We, the humans, access information mostly via apps and websites. We open an interface; we read, click, type, think, and repeat.
AI agents, on the other hand, receive an initial prompt from a human and then stay in a loop of communicating with LLMs and calling tools to obtain more context and data, until a conclusion is reached and they report back. (The human element might become less important, or even unnecessary, once these agents become highly autonomous.)

![HBk8S_HbAAAh5rE.png](media/2024716661576913092/HBk8S_HbAAAh5rE.png)

## The Beauty of Files and CLIs

Tool calling is clearly the most vital mechanism an agent relies on to interact with the external world. As demonstrated by the pi agent, the tools an agent needs generally boil down to just two categories: file operations (read, write, edit) and command execution (to consume services or obtain data).

Humans prefer rich, smooth interactions and visually appealing UIs to efficiently consume information, produce artifacts, and complete tasks. For agents, however, efficiency looks entirely different: the more straightforward the path between request and result, and the fewer tokens the process consumes, the better. This is why the progressive disclosure of data employed by the Skills system is so strongly favored over MCP, and why the input-output efficiency of command-line tooling has become the easiest way for agents to gain knowledge of the outside world.

## The Middleware Obstacle

For agents, traditional webpages and app interfaces are essentially obstacles. There are already countless attempts to help agents navigate existing webpages or operate mobile apps. Google recently released WebMCP in an attempt to lower the barrier for agents operating on websites. While this looks promising as a unified approach catering to a massive number of legacy websites, a browser still sits in the middle, forcing agents to interact with interfaces built for human consumption. Understandably, these middleman approaches will likely persist for some time, as there is no unified mechanism to retrofit the old web into the new world.
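The loop and the two tool categories described above can be sketched as a minimal agent driver. This is an illustrative skeleton, not any real framework's API: the model call is passed in as a stub `llm` function, and the action format is invented for the example.

```python
import subprocess

# Illustrative tool set: file operations plus command execution.
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {len(content)} chars to {path}"

def run_command(cmd: str) -> str:
    done = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return done.stdout + done.stderr

TOOLS = {"read_file": read_file, "write_file": write_file, "run_command": run_command}

def agent_loop(prompt: str, llm) -> str:
    """Call the model, run the tool it requests, feed the output back,
    and repeat until the model returns a final answer."""
    context = [prompt]
    while True:
        action = llm(context)               # the model decides the next step
        if action["type"] == "final":       # conclusion reached: report back
            return action["text"]
        output = TOOLS[action["tool"]](*action["args"])
        context.append(output)              # plain text goes back into the loop
```

Everything the agent learns arrives as plain text appended to `context`, which is exactly why token-frugal tools matter so much.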
Beyond the asymmetrical demands of humans and agents when accessing information, we cannot ignore the fact that a large number of websites actively and aggressively block autonomous agents. This seems logical for social and UGC platforms like X; after all, we aren't quite ready to socialize with agents yet. However, these protective measures vividly demonstrate that the web, its apps, and the mindset behind them were fundamentally not designed for agents.

## APIs and CLIs - The Agentic Way

It has become clear that for agents, direct access to functionality, even in the form of raw, low-level APIs, is far superior to wiggling through complex UIs. In that sense, many companies currently selling apps and tools to human users will eventually pivot to selling APIs or CLIs to agents. If the business value exposed through these APIs isn't sophisticated enough to prevent an agent from replicating it with relatively little effort, the business model itself might not survive as LLMs evolve. This is exactly what Karpathy pointed out in a recent tweet:

> "99% of products/services still don't have an AI-native CLI yet." -- Andrej Karpathy

In 2026, new and existing services will become hyper-aware of agent ergonomics and will start offering better experiences via CLIs or streamlined APIs. When building products or producing digital artifacts, developers will always have to consider "the other type of user." For all apps, functionality currently exposed via human-oriented UIs is destined to be transformed into CLI arguments or API parameters; either that, or someone will build an AI-native alternative specifically to cater to agent needs. Documentation that teaches users how to navigate specific workflows will also start to be complemented with agent Skill specifications. This transformation will be fast for some categories of apps and slow for others.
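As a concrete sketch of that shift, here is a hypothetical `tasks` CLI wrapping the same "create task" functionality a web UI would spread across forms and buttons. The service, subcommand, and flags are all invented for illustration; what matters is the shape: flat arguments in, one compact machine-readable line out.

```python
import argparse
import json

def create_task(title: str, priority: str) -> dict:
    # A real wrapper would call the service's HTTP API here; we fake the response.
    return {"id": 1, "title": title, "priority": priority, "status": "open"}

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tasks")
    sub = parser.add_subparsers(dest="command", required=True)
    create = sub.add_parser("create", help="create a new task")
    create.add_argument("--title", required=True)
    create.add_argument("--priority", choices=["low", "high"], default="low")
    return parser

def main(argv: list[str]) -> str:
    # What a UI spreads across several screens collapses into one call.
    args = build_parser().parse_args(argv)
    if args.command == "create":
        return json.dumps(create_task(args.title, args.priority))
```

An agent would invoke this as `tasks create --title "Fix login bug" --priority high` and get a single JSON line back, with no session state or rendering in between.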
Applications that employ proprietary formats, rely on highly complex data-manipulation logic, or fundamentally require human intuition will probably retain their current forms for a while. Eventually, however, agents will find a way to become the primary users of even the most complex software.

Looking further ahead, sensors bridging the physical and digital worlds, or devices providing pure physical utility, can also benefit from this transformation. Even when agents gain a physical form, like a humanoid robot, and possess the skills to navigate environments designed for humans, an agent-friendly digital interface will still be superior in certain contexts. Imagine a humanoid operating a rice cooker: it absolutely needs vision and motors to lift the lid and pour the rice. But when it comes to setting the timer or selecting the cooking mode, a direct API exchange will always be more efficient than using a camera to find the control panel and a robotic finger to press the buttons.

## The Inevitable Standard of AX

We are rapidly approaching a tipping point where AX will require universal standards. Just as we have spent the last decade obsessing over SEO and performance metrics to help search engines parse our sites, we will soon need standardized protocols for agent interaction. Whether these emerge as universally adopted CLI wrappers, predictable API architectures, or even decentralized agent-centric standards like ERC-8004, a structured framework is inevitable. The platforms and developers who define these protocols will ultimately dictate the next decade of digital infrastructure.

Ultimately, this transition from UX to AX is not about removing humans from the equation; it is about removing friction for the increasingly autonomous tools we employ. When our software is optimized for the entities that process data the fastest, our agents, our own capabilities are amplified. We are moving past the point where an agent-friendly interface is just a clever feature.
Very soon, building for agents won't just be an alternative to building for humans; it will be the prerequisite for participating in the digital ecosystem.