OpenClaw Resets Codex Code Mode Defaults, Ships Global Skill Installation, and Gives Telegram Agents Disappearing Progress Updates

Codex Was Silently Breaking Every Dynamic Tool Call

A hardcoded code_mode_only=true default prevented OpenClaw's dynamic tools from reaching agents through the Codex app-server bridge

Three Fixes to Undo One Bad Default

OpenClaw's Codex integration hit a wall that took three separate fixes to dismantle. The root cause was a single hardcoded configuration: features.code_mode_only=true, set globally in the Codex extension. The flag was supposed to keep Codex in its native code execution mode. What it actually did was shut the door on every dynamic tool call that needed to flow through the app-server bridge.

The first fix reversed the default. Codex threads now ship with code_mode=true and code_mode_only=false, preserving native code capabilities while reopening the bridge for dynamic tools. Operators who genuinely want code-mode-only behavior can still opt in through explicit configuration, and restricted or denied tool policies continue to force native code mode off when appropriate.
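The new defaults can be sketched as a small resolver. This is an illustration under stated assumptions, not OpenClaw's code: the function name, its two parameters, and the returned dictionary shape are hypothetical, and the sketch simplifies "when appropriate" to "whenever a restricted or denied tool policy is in effect". Only the flag names code_mode and code_mode_only and their default values come from the change described above.

```python
from typing import Dict


def resolve_codex_thread_flags(
    code_mode_only_opt_in: bool = False,   # hypothetical: operator's explicit opt-in
    tool_policy_restricted: bool = False,  # hypothetical: restricted/denied tool policy active
) -> Dict[str, bool]:
    """Return the code-mode flags a Codex thread would start with (sketch)."""
    if tool_policy_restricted:
        # Restricted or denied tool policies force native code mode off.
        return {"code_mode": False, "code_mode_only": False}
    # New default: native code mode stays on, and the bridge stays open for
    # dynamic tools unless the operator explicitly opts in to code-mode-only.
    return {"code_mode": True, "code_mode_only": code_mode_only_opt_in}
```

With no arguments the sketch returns code_mode on and code_mode_only off, which mirrors the reversed default; the old behavior corresponds to passing the opt-in explicitly.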

The second fix went further by introducing appServer.codeModeOnly as a proper configuration option in the Codex plugin. Instead of a hidden global flag, the behavior is now an explicit, documented choice. Dynamic tool registration is preserved through the app-server bridge even when code-mode-only is enabled, so nested tool calls still return correctly. Live harness testing confirmed that tools like sessions_list execute successfully under both configurations.

The third fix addressed a downstream consequence. Codex-backed direct chats were suppressing final assistant replies because the system defaulted to message-tool-only delivery without explicit configuration. The fix restores automatic final delivery for direct chats, with a new hierarchical resolution chain that checks one-turn overrides, session overrides, channel overrides, harness fallbacks, and configured defaults in that order. Wildcard channel model overrides now apply to direct chats before harness defaults activate.
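A first-match-wins lookup is one way to picture that resolution chain. The sketch below is hypothetical: the layer labels, the function name, and the example values "automatic" and "message_tool_only" are assumptions made for illustration. Only the order of the five layers is taken from the description above.

```python
from typing import Dict, Optional, Tuple

# Order taken from the description above; the labels themselves are illustrative.
RESOLUTION_ORDER = (
    "one_turn_override",
    "session_override",
    "channel_override",
    "harness_fallback",
    "configured_default",
)


def resolve_reply_delivery(
    layers: Dict[str, Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (layer, value) for the first layer that supplies a value."""
    for layer in RESOLUTION_ORDER:
        value = layers.get(layer)
        if value is not None:
            return layer, value
    # No layer supplied a value.
    return None, None
```

In this model a channel override, including a wildcard one, is consulted before the harness fallback, which is why a direct chat with a channel-level setting never reaches the harness default.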

The combined effect: Codex agents can once again use OpenClaw's full tool ecosystem without stalling, and their replies actually reach the user.

Skills Can Now Be Installed Globally Instead of Per-Workspace

A new --global flag routes skill installs and updates to ~/.openclaw/skills

One Flag, Shared Skills Everywhere

OpenClaw's skill system has operated on a per-workspace model since launch. Every skill installation landed in the current agent's workspace directory. If you wanted the same skill available across five agents, you installed it five times.

The new --global flag on both openclaw skills install and openclaw skills update routes installations to the shared managed directory at ~/.openclaw/skills. Skills installed there are available to every agent on the machine without duplication. The flag is mutually exclusive with --agent, which continues to target a specific workspace. Default behavior remains workspace-based when neither flag is specified.
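The flag rules can be modeled with a mutually exclusive argument group. This is a stand-alone sketch built on Python's argparse, not the OpenClaw CLI: the parser, the positional skill argument, and the scope labels are assumptions, while the --global and --agent flag names and the ~/.openclaw/skills path come from the text above.

```python
import argparse
import os
from typing import List

# Shared managed directory named in the article, expanded for the current user.
GLOBAL_SKILLS_DIR = os.path.expanduser("~/.openclaw/skills")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="skills-install-sketch")
    parser.add_argument("skill")  # hypothetical positional argument
    target = parser.add_mutually_exclusive_group()
    # "global" is a Python keyword, so the value is stored under another name.
    target.add_argument("--global", dest="global_install", action="store_true")
    target.add_argument("--agent")
    return parser


def install_scope(argv: List[str]) -> str:
    """Return which install target the flags select (illustrative labels)."""
    args = build_parser().parse_args(argv)
    if args.global_install:
        return "global"
    if args.agent:
        return f"agent:{args.agent}"
    # Neither flag given: the default stays workspace-based.
    return "workspace"
```

Passing both flags makes argparse exit with a usage error, which is the behavior "mutually exclusive" implies; how OpenClaw itself reports the conflict is not stated in the source.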

It's a small change with practical ergonomic impact. Operators managing multiple agents no longer need to track which skills are installed where, and skill updates propagate to every agent in one command instead of requiring per-workspace maintenance.

Telegram DMs Get Ephemeral Progress Previews During Tool-Heavy Turns

Native draft messages show transient tool progress without cluttering the conversation history

Progress Updates That Disappear When They Should

Anyone running tool-heavy agents through Telegram DMs knows the noise problem. Every intermediate status update — “searching the web,” “reading file,” “executing command” — arrives as a visible message edit, cluttering the conversation alongside the actual answer. The chat becomes a log file.

OpenClaw now supports Telegram's native sendMessageDraft as an optional delivery path for transient tool progress. When enabled, progress updates appear as ephemeral UI elements — visible in real time but absent from the persistent chat history. Final answers, reasoning, media, buttons, and approval prompts continue using the normal persistent delivery path. The result is a clean conversation where you see the agent thinking in real time but only the actual responses survive.

The feature ships default-off with tight safety boundaries. An optional allowlist limits rollout by Telegram user or chat ID. It's DM-only — groups and other chat types are excluded. If native drafts aren't available in the client, the system falls back silently to the existing edited preview behavior.
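Those boundaries amount to a short chain of guards. The sketch below is hypothetical throughout: the config class, field names, and return labels are invented for illustration, and it assumes that an empty allowlist means "no restriction" and that every excluded case uses the existing edited-preview path. The "private" chat type is the Telegram Bot API's value for one-on-one chats.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DraftPreviewConfig:
    enabled: bool = False                    # the feature ships default-off
    allowlist: FrozenSet[int] = frozenset()  # assumption: empty means unrestricted


def progress_delivery_path(
    config: DraftPreviewConfig,
    chat_type: str,
    chat_id: int,
    user_id: int,
    native_drafts_available: bool,
) -> str:
    """Pick the delivery path for transient tool progress (sketch)."""
    if not config.enabled:
        return "edited_preview"
    if chat_type != "private":
        # DM-only: groups and other chat types are excluded.
        return "edited_preview"
    if config.allowlist and not ({chat_id, user_id} & config.allowlist):
        # An allowlist is set and neither the user nor the chat is on it.
        return "edited_preview"
    if not native_drafts_available:
        # Silent fallback to the existing edited-preview behavior.
        return "edited_preview"
    return "native_draft"
```

Only transient progress would pass through a function like this; final answers, reasoning, media, buttons, and approval prompts stay on the persistent path regardless.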

Browser Evaluate Command Gets a Timeout Flag

--timeout-ms exposes evaluate timeout control through the CLI for the first time

Long-Running Page Scripts No Longer Race the Clock Blind

OpenClaw's browser automation lets agents execute JavaScript on controlled pages. The underlying evaluate function already supported timeout configuration, but that capability was locked behind the programmatic API — CLI users had no way to extend the timeout for legitimate long-running page operations.

The new --timeout-ms flag on openclaw browser evaluate fixes that. Pass a value in milliseconds, and both the evaluate action budget and the outer request timeout adjust accordingly. The server-side path still clamps the maximum at 120 seconds, preserving existing safety guardrails. But for scripts that need 30 or 60 seconds instead of the default, operators no longer need workarounds.
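The clamp itself is a one-line calculation. The sketch below assumes a function and constant name of its own invention and treats non-positive values as an error, which the source does not specify; the 120-second ceiling is the figure stated above.

```python
MAX_EVALUATE_TIMEOUT_MS = 120_000  # 120-second server-side ceiling from the article


def clamp_evaluate_timeout_ms(requested_ms: int) -> int:
    """Return the timeout that would take effect for a requested value (sketch)."""
    if requested_ms <= 0:
        # Assumption: non-positive timeouts are rejected rather than clamped.
        raise ValueError("timeout must be a positive number of milliseconds")
    return min(requested_ms, MAX_EVALUATE_TIMEOUT_MS)
```

A request for 30,000 or 60,000 ms passes through unchanged, while a request for five minutes is reduced to the ceiling.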

Image Generation Stops Confusing Different Prompts for Duplicates

Duplicate detection now matches by prompt content, not just session key and task type

Two Different Images, Two Different Tasks

OpenClaw's image generation tool had a duplicate detection mechanism that was too aggressive. When an agent requested two different images in rapid succession — say, a logo and a banner — the system checked for active tasks by session key and task type alone. Since both requests were “image generation” in the same session, the second request returned the status of the first instead of creating a new task.

The fix scopes duplicate detection by prompt content. Same prompt in the same session still reuses the active task, which is the correct behavior for retries. Different prompts now create independent background tasks. Music and video generation are unaffected — the change is specific to image generation where parallel requests with distinct prompts are a common workflow.
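The difference comes down to what goes into the lookup key. The following sketch is a simplified model, not OpenClaw's task registry: the class, method, and task-type strings are assumptions, and tasks are never marked finished here. It shows image requests keyed by session, type, and prompt, while other task types keep the session-and-type key.

```python
import itertools
from typing import Dict, Tuple


class ActiveTasks:
    """Toy registry of in-flight generation tasks (illustrative only)."""

    def __init__(self) -> None:
        self._active: Dict[Tuple[str, ...], int] = {}
        self._ids = itertools.count(1)

    def start_or_reuse(self, session_key: str, task_type: str, prompt: str) -> int:
        if task_type == "image":
            # Image generation: the prompt is part of the duplicate key.
            key: Tuple[str, ...] = (session_key, task_type, prompt)
        else:
            # Music and video: unchanged, keyed by session and task type only.
            key = (session_key, task_type)
        if key in self._active:
            return self._active[key]  # retry of the same request reuses the task
        task_id = next(self._ids)
        self._active[key] = task_id
        return task_id
```

Requesting a logo and then a banner in the same session yields two task IDs, and repeating the logo prompt returns the first ID again.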

Update Recovery Guidance Now Tells You to Stop the Gateway First

EACCES permission failures during npm updates had incomplete recovery steps that risked corrupting a running gateway

The Recovery Step Everyone Was Skipping

On Linux machines with root-owned global npm installations, OpenClaw updates can fail with EACCES permission errors. The existing recovery guidance told users to manually replace the package tree using sudo. What it didn't mention: if the managed gateway is running while npm replaces core files, the gateway can attempt to load partially written modules mid-swap.

The updated CLI recovery hints and documentation now include the full safe sequence: openclaw gateway stop, then sudo npm install, then openclaw gateway install --force, then openclaw gateway restart. The gateway stop comes first. The forced reinstall ensures the service definition matches the new package version. The restart brings everything back cleanly. It's the kind of procedural detail that seems obvious in retrospect but was missing from every recovery path until now.

QA Lab Adds Share-Safe Diagnostics for Personal Agents

A new test scenario validates that personal data stays redacted during diagnostic operations

Testing What Happens When an Agent Handles Your Data

OpenClaw's QA Lab — the project's internal testing infrastructure for validating agent behavior — gains a new scenario focused on personal data handling. The share-safe diagnostics pack runs a deterministic read/read/write/reply flow through a mock provider, verifying that personal information stays properly redacted when agents perform diagnostic operations.

This isn't a bug fix. It's infrastructure investment. As personal agent use cases grow — agents managing calendars, reading emails, handling financial data — the testing surface needs to keep pace with the privacy surface. The scenario joins the existing personal agent benchmark pack, giving maintainers a repeatable gate for validating data handling before changes ship.

Documentation Changes at a Glance

docs/plugins/codex-harness.md (major): New appServer.codeModeOnly config option documented; code_mode_only default reversed to false

docs/plugins/codex-harness-runtime.md (major): Visible reply delivery defaults updated with hierarchical resolution chain

docs/channels/groups.md (updated): Codex source reply behavior in group and direct chats clarified

docs/gateway/config-channels.md (updated): Wildcard channel model override behavior for direct chats documented

docs/cli/skills.md (major): --global flag for install and update commands with mutual exclusivity rules

docs/tools/skills.md (updated): Global skill directory path and resolution semantics added

docs/help/faq.md (updated): FAQ entry for global vs workspace skill installations

docs/channels/telegram.md (major): Native DM draft preview configuration with allowlist and fallback behavior

docs/cli/browser.md (updated): --timeout-ms flag added to browser evaluate command reference

docs/tools/browser-control.md (updated): Evaluate timeout configuration and server-side clamp behavior documented

docs/automation/tasks.md (updated): Image generation duplicate detection scoped to prompt content

docs/install/updating.md (major): Full EACCES recovery sequence with gateway stop-first requirement

docs/concepts/personal-agent-benchmark-pack.md (updated): Share-safe diagnostics scenario registered in personal agent pack

When a Default Breaks Everything

The Codex code-mode story is the headliner here, and it's a familiar pattern in fast-moving projects. A seemingly reasonable default — keep Codex in its native code execution mode — had a side effect nobody anticipated: it severed the bridge that every other tool in the ecosystem needed. Three engineers filed three separate fixes before the full picture emerged. The first reversed the flag. The second made it a proper config option. The third fixed the reply delivery that broke downstream.

The rest of Sunday's batch follows a different theme: surfaces that were overdue for direct user control. Skills needed a global install path. Browser evaluate needed a timeout flag. Telegram needed a way to show progress without polluting the chat. Image generation needed prompt-aware duplicate detection. Update recovery needed a safe shutdown sequence. None of these are architecturally ambitious. All of them address friction that operators hit repeatedly and worked around silently.

The QA Lab addition is the quietest change and arguably the most forward-looking. Personal agents are the growth frontier for self-hosted AI — agents that touch your email, your files, your calendar. Building the test infrastructure to validate data handling before it becomes a crisis is the kind of investment that doesn't generate headlines but prevents them.
