OpenClaw has quietly shipped one of its most significant architectural changes to date: the agent execution layer is now pluggable. Instead of a single built-in runtime handling every agent turn, the platform now maintains a harness registry where extensions can register their own execution backends. And the first one to ship is a Codex extension that delegates agent turns to OpenAI's app-server.
What Changed, Exactly
Previously, OpenClaw's “PI harness” was hardcoded as the only path for executing agent turns. Every model — whether it came from OpenAI, Anthropic, Google, or a local Ollama instance — ran through the same internal pipeline. Provider plugins could customize how API calls were made, but the actual agent execution loop was fixed.
Now, a new plugin SDK surface at openclaw/plugin-sdk/agent-harness lets extensions register alternative harnesses. Each harness declares which provider/model pairs it supports, and OpenClaw's selection policy determines which one runs. The built-in PI harness remains the default fallback, but it's no longer the only option.
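To make the registry-plus-selection-policy idea concrete, here is a minimal sketch of how such a mechanism could look. Every name below (`AgentHarness`, `HarnessRegistry`, `supportsModel`, `select`) is an assumption for illustration, not the actual `openclaw/plugin-sdk/agent-harness` API:

```typescript
// Illustrative sketch only: these interfaces and names are assumptions,
// not the real OpenClaw plugin SDK surface.

interface AgentHarness {
  id: string;
  // Which provider/model pairs this harness claims to support.
  supportsModel(modelRef: string): boolean;
}

class HarnessRegistry {
  private harnesses: AgentHarness[] = [];

  // The built-in PI harness stays registered as the default fallback.
  constructor(private fallback: AgentHarness) {}

  register(harness: AgentHarness): void {
    this.harnesses.push(harness);
  }

  // Selection policy: first registered harness that claims the model
  // wins; otherwise the built-in default runs the turn.
  select(modelRef: string): AgentHarness {
    return this.harnesses.find(h => h.supportsModel(modelRef)) ?? this.fallback;
  }
}

// Stand-in for the built-in PI harness: accepts everything.
const piHarness: AgentHarness = {
  id: "pi",
  supportsModel: () => true,
};

// An extension-registered harness that only claims codex/* models.
const codexHarness: AgentHarness = {
  id: "codex",
  supportsModel: (ref) => ref.startsWith("codex/"),
};

const registry = new HarnessRegistry(piHarness);
registry.register(codexHarness);

console.log(registry.select("codex/gpt-5.4").id);   // "codex"
console.log(registry.select("anthropic/claude").id); // "pi"
```

The key design point survives even in this toy version: registration is additive, and the built-in harness never disappears from the selection path unless an operator explicitly disables fallback.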
Codex as the Proof of Concept
The Codex extension is the first real-world implementation of the new harness API, and it's not a toy. When you reference codex/gpt-5.4 or codex/gpt-5.2 as your agent model, the extension spins up a JSON-RPC connection to the Codex app-server — either as a local subprocess over stdio or via remote WebSocket — and delegates the entire agent turn to it.
This isn't just “a different API client.” The Codex app-server manages its own threads, its own compaction, its own tool execution sandbox. OpenClaw's role shifts from executor to orchestrator: it still handles channel delivery, model selection, approval gating, media artifacts, and transcript mirroring, but the actual agent reasoning loop happens inside Codex's runtime.
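For a sense of what "delegates the entire agent turn over JSON-RPC" means at the wire level, here is a sketch of the message framing. The method name `turn/run` and its parameters are invented for illustration; the real Codex app-server protocol is not documented in this article:

```typescript
// Illustrative JSON-RPC 2.0 framing only. The method name "turn/run"
// and its params are hypothetical, not the actual app-server protocol.

type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 0;

// Frame an agent-turn delegation as a JSON-RPC 2.0 request. OpenClaw
// hands over the thread reference and user input; the app-server runs
// the reasoning loop, tool calls, and compaction on its side.
function frameTurnRequest(threadId: string, userInput: string): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: ++nextId,
    method: "turn/run", // hypothetical method name
    params: { threadId, input: userInput },
  };
}

// Over stdio the request would be serialized per message; over the
// remote WebSocket transport, sent as a text frame.
const wire = JSON.stringify(frameTurnRequest("thread-123", "refactor this module"));
console.log(wire);
```

Whether the transport is a local subprocess's stdin/stdout or a remote WebSocket, the orchestrator/executor split is the same: OpenClaw sends the turn, the app-server owns everything that happens until the result comes back.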
“Do not register a harness just to add a new LLM API. Standard HTTP or WebSocket implementations should use provider plugins instead.”
— OpenClaw SDK documentation
Why This Distinction Matters
The documentation is explicit about when to use a harness versus a provider plugin. Provider plugins handle API transport — they turn OpenClaw's internal representation into the right HTTP calls. Harnesses replace the entire execution loop. You'd build a harness when a model family has “its own native session runtime and the normal OpenClaw provider transport is the wrong abstraction.”
The examples given in the documentation paint a clear picture of the intended use cases: native coding-agent servers that manage their own threads and compaction, local CLI daemons that stream plan/reasoning/tool events in their own format, and model runtimes that need their own resume identifiers alongside the OpenClaw session transcript.
In other words, harnesses are for runtimes that are opinionated enough to need their own execution model. Codex fits that description. Claude's computer-use mode might fit it too. A standard OpenRouter-compatible API does not.
The Fallback Question
One of the most interesting design decisions is the fallback policy. By default, if a registered harness fails or can't handle a particular turn, OpenClaw falls back to the built-in PI harness. But operators can now disable this with embeddedHarness.fallback: "none", which creates hard failures instead of silent degradation.
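The two policies can be sketched as a small decision around the turn execution. The `runTurnWithPolicy` shape and the `TurnResult` type below are illustrative assumptions, not OpenClaw internals, but they capture the documented behavior: fall back by default, fail closed when fallback is `"none"`:

```typescript
// Illustrative sketch; types and function names are assumptions.

type FallbackPolicy = "builtin" | "none";

type TurnResult =
  | { status: "ok"; output: string }
  | { status: "blocked"; reason: string }; // explicit blocked-state response

function runTurnWithPolicy(
  primary: () => string,   // the registered harness (e.g. Codex)
  builtin: () => string,   // the built-in PI harness
  policy: FallbackPolicy,
): TurnResult {
  try {
    return { status: "ok", output: primary() };
  } catch (err) {
    if (policy === "builtin") {
      // Default: degrade to the built-in PI harness.
      return { status: "ok", output: builtin() };
    }
    // fallback: "none" — fail closed instead of running a different runtime.
    return { status: "blocked", reason: String(err) };
  }
}

const failingCodex = () => { throw new Error("codex app-server unreachable"); };
const pi = () => "handled by PI harness";

console.log(runTurnWithPolicy(failingCodex, pi, "builtin").status); // "ok"
console.log(runTurnWithPolicy(failingCodex, pi, "none").status);    // "blocked"
```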
This is a deliberate choice. If you're deploying Codex-only agents that rely on the app-server's native tool sandbox, silently falling back to a different runtime that doesn't have that sandbox would be worse than failing loudly. The documentation calls this out: “fails closed with an explicit blocked-state response.”
What This Means for the Platform
This is the second major extensibility milestone for OpenClaw in recent months, following the plugin SDK boundary enforcement completed in March. The pattern is consistent: decouple core infrastructure from specific implementations, then let extensions own their vertical.
The Codex extension ships as a bundled plugin, meaning it comes with OpenClaw out of the box, but the SDK surface is documented and exported. Third-party harness installation is still labeled “experimental,” and the documentation recommends provider plugins for most use cases. Still, the door is open.
For operators running multi-model deployments, the per-agent configuration is worth noting. Different agents in the same OpenClaw instance can use different harnesses. Your customer-facing agent can run through the PI harness on Claude, while your code-generation agent runs through the Codex harness on GPT-5.4. Same platform, different execution paths, unified transcript and channel delivery.
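A per-agent setup along those lines might look like the fragment below. The key names (`agents`, `model`, `harness`) are hypothetical, written only to show the shape of a mixed deployment, not OpenClaw's actual configuration schema:

```json
{
  "agents": {
    "support-bot": {
      "model": "anthropic/claude",
      "harness": "pi"
    },
    "code-gen-bot": {
      "model": "codex/gpt-5.4",
      "harness": "codex"
    }
  }
}
```

Both agents would still share transcript storage and channel delivery; only the execution path behind each turn differs.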
The Codex extension also introduces several operational commands: /codex status for connectivity checks, /codex models for available model discovery, /codex threads for session management, and /codex resume for attaching to existing Codex threads. These suggest a level of operational maturity beyond a proof-of-concept integration.
The minimum supported app-server version is 0.118.0, which tells you something about how tightly the extension tracks upstream Codex development. This isn't a loose compatibility layer — it's a specific version contract.