
OpenClaw Told Users Their Context Was Full. It Wasn't Even Close.

The web UI said “100% context used — 757k / 200k tokens.” Actual usage: 46k. That's 23%. Three PRs this week reveal a pattern of internals that misrepresented resource usage to the people running the software.

March 21, 2026 · 6 min read

Reported usage: 757k / 200k ("100% context used")
Actual usage: 46k / 200k (23%; four tool-use loops inflated the counter)
Startup memory: 329 MB (loaded the full web-search registry just to validate config)

Trust is the most underrated feature in developer tools. When your monitoring dashboard says a resource is at 100%, you act on it. You restart processes. You clear sessions. You tell users to start new conversations. You make decisions based on a number you assume is correct.

What if the number was wrong by 4x?

The Banner That Cried Wolf

PR #51721 · Contributed by BunsDev · One line changed

OpenClaw's web UI displays a context-usage banner that shows operators how close their agent is to the context window limit. It's supposed to show the actual prompt snapshot — the tokens in the context window right now. When that number wasn't available, the banner fell back to inputTokens.

The problem: inputTokens is cumulative. It accumulates across every API call in a run — tool-use loops, retries, internal re-prompts. A session with four tool calls doesn't use 4x the context. But the counter says it does. The result: a banner screaming “100% context used — 757.3k / 200k” when the actual prompt snapshot was 46k tokens. Twenty-three percent.
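The distinction is easy to see in a few lines. A minimal sketch, with illustrative field names and token counts (not OpenClaw's actual API):

```typescript
// Each API call in a run reports how many tokens its prompt contained.
interface ApiCallUsage {
  inputTokens: number; // tokens sent in THIS call's prompt
}

// Cumulative counter: sums input tokens across every call in the run.
function cumulativeInputTokens(calls: ApiCallUsage[]): number {
  return calls.reduce((sum, c) => sum + c.inputTokens, 0);
}

// Snapshot: only the latest call reflects what's in the context window now.
function promptSnapshot(calls: ApiCallUsage[]): number {
  return calls.length ? calls[calls.length - 1].inputTokens : 0;
}

// Four tool-use loops over a roughly 46k-token context:
const run: ApiCallUsage[] = [
  { inputTokens: 42_000 },
  { inputTokens: 44_000 },
  { inputTokens: 45_000 },
  { inputTokens: 46_000 },
];

cumulativeInputTokens(run); // 177,000: grows with every loop
promptSnapshot(run);        // 46,000: the real context usage
```

The cumulative number keeps climbing no matter how stable the actual context is, which is exactly how a 46k session reads as 757k after enough loops and retries.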

The fix is one line. Remove the fallback. If totalTokens — the actual prompt snapshot — isn't available, show nothing instead of showing garbage. Showing nothing is more accurate than showing 757k when the answer is 46k.

I want to dwell on this because it's instructive. A well-intentioned fallback — “better to show something than nothing” — created a worse outcome than showing nothing. Users were resetting functional sessions because a banner told them the context was exhausted. Support threads were filed. Operators doubted their infrastructure. All because of a single ?? operator that reached for the wrong number.
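In code, the fix amounts to deleting one operand. A hypothetical reconstruction (totalTokens and inputTokens follow the article's naming; the surrounding types are invented for the sketch):

```typescript
interface UsageStats {
  totalTokens?: number; // actual prompt snapshot; may be absent
  inputTokens: number;  // cumulative across the whole run
}

// Before: a well-intentioned fallback reaches for the wrong number.
function bannerTokensBefore(u: UsageStats): number {
  return u.totalTokens ?? u.inputTokens; // cumulative is not a snapshot
}

// After: if the snapshot isn't available, return nothing and hide the banner.
function bannerTokensAfter(u: UsageStats): number | undefined {
  return u.totalTokens;
}
```

The "after" version trades a plausible lie for an honest gap, which is the whole point of the fix.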

The Summaries That Ate Their Own Budget

PR #27727 · Contributed by Pandadadadazxf · 34 review comments

Context compaction exists to save tokens. You summarize old messages so new ones have room. Simple concept. But OpenClaw's compaction-safeguard extension was appending massive file operation lists and metadata to its summaries — without any size limit. The summaries themselves were consuming the token budget they were supposed to free up.

It's the fiscal equivalent of hiring a cost-cutting consultant whose fees exceed the savings. And it went unnoticed because the system never measured its own overhead.

The new budget caps

Individual file list: 900 chars
Combined file-ops section: 2,000 chars
Final summary hard cap: 16,000 chars

Critical sections (workspace rules, split-turn context) are protected via a reserved suffix pattern and survive truncation.

The implementation is thoughtful. Rather than slashing summaries with a crude character limit, contributor Pandadadadazxf built a reserved suffix pattern: high-priority sections (workspace rules, diagnostics, split-turn context) are protected in a suffix that survives truncation. The main body gets the remaining budget. When even the suffix exceeds the budget, workspace rules take priority over earlier preserved turns.
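The core idea can be sketched in a few lines, assuming a simplified two-part summary (the real extension juggles more sections and priority levels than this):

```typescript
// 16,000-char hard cap from the PR; the function below is an illustrative
// reduction of the reserved-suffix pattern, not the actual implementation.
const SUMMARY_HARD_CAP = 16_000;

function truncateWithReservedSuffix(
  body: string,
  protectedSuffix: string, // workspace rules, split-turn context, etc.
  cap: number = SUMMARY_HARD_CAP,
): string {
  // Budget the protected suffix first; the body gets whatever remains.
  const bodyBudget = Math.max(0, cap - protectedSuffix.length);
  const truncatedBody =
    body.length > bodyBudget ? body.slice(0, bodyBudget) : body;
  // Even when the body is cut, the suffix survives intact.
  return truncatedBody + protectedSuffix;
}
```

The design choice worth noting: truncation budgets are computed around the protected content, not applied to the concatenated whole, so a long file-operations list can never push workspace rules off the end.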

It went through 34 review comments. Reviewers caught a redundant loop condition, a suffix separator bug, and a priority ordering issue. The final version includes 24 tests covering overflow indicators, workspace rule survival, and retry fallback scenarios. This is what code review is supposed to look like.

329 MB to Run ‘--help’

PR #51574 · Contributed by RichardCao · 16 review comments

Here's a fun exercise: start OpenClaw with the gateway status command and watch your process memory. Before this fix, it peaked at 329 MB RSS. For a status check. The --help flag alone consumed 72 MB.

The culprit: the config validation path imported the full bundled web-search plugin registry — Brave, Firecrawl, Google, Moonshot, Perplexity, Tavily, xAI — every time it ran. All seven providers, with all their transitive dependencies, loaded into memory just to validate a config file. The validation only needed the plugin IDs. It got the entire runtime.

RichardCao's fix creates a lightweight bundled-web-search-ids.ts module containing just the static ID list. Config validation imports the IDs. The full registry loads only when someone actually uses web search. A regression test ensures the static list stays in sync with the registry.
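The shape of the fix can be sketched with both modules inlined; names are modeled on the PR description, not copied from the codebase, and the registry path is hypothetical:

```typescript
// --- bundled-web-search-ids.ts: a static list, a few bytes to import ---
const BUNDLED_WEB_SEARCH_IDS = [
  "brave", "firecrawl", "google", "moonshot", "perplexity", "tavily", "xai",
] as const;

// Config validation needs only the IDs: no providers, no transitive deps.
function isKnownWebSearchPlugin(id: string): boolean {
  return (BUNDLED_WEB_SEARCH_IDS as readonly string[]).includes(id);
}

// --- the heavy registry loads only when web search is actually used ---
const REGISTRY_MODULE = "./bundled-web-search-registry"; // hypothetical path
async function loadWebSearchRegistry(): Promise<unknown> {
  // Deferred dynamic import: nothing here runs at startup or during
  // config validation, so --help never pays for seven provider SDKs.
  const mod = await import(REGISTRY_MODULE);
  return (mod as { registry?: unknown }).registry;
}
```

Validation calls isKnownWebSearchPlugin; only an actual search triggers loadWebSearchRegistry. The memory cost moves from every invocation to the one code path that needs it.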

The Greptile bot raised a valid concern: what if someone adds a plugin to the registry and forgets to update the static list? The regression test handles it — if the lists diverge, CI fails. It's a neat solution to the stale-constant problem, and it's the kind of guard that most codebases skip.
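A sketch of what such a guard can look like, using a stand-in registry object (the real test would import the actual registry and the actual static list):

```typescript
// Static copy that ships in the lightweight IDs module.
const STATIC_IDS = [
  "brave", "firecrawl", "google", "moonshot", "perplexity", "tavily", "xai",
];

// Stand-in for the full registry; keys are the source of truth.
const registry: Record<string, unknown> = {
  brave: {}, firecrawl: {}, google: {}, moonshot: {},
  perplexity: {}, tavily: {}, xai: {},
};

// Derive the ID list from the registry and compare it to the static copy.
function listsInSync(
  staticIds: string[],
  reg: Record<string, unknown>,
): boolean {
  const registryIds = Object.keys(reg).sort();
  const sorted = [...staticIds].sort();
  return (
    registryIds.length === sorted.length &&
    registryIds.every((id, i) => id === sorted[i])
  );
}

// If someone adds a plugin to the registry and forgets the static list,
// listsInSync returns false and the CI assertion fails.
```

The guard is cheap because it only reads keys; it never instantiates a provider, so the regression test itself stays lightweight.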

The Pattern Worth Naming

These three PRs don't share a feature. They share a failure mode: internal systems misrepresenting resource usage. The token counter overstated context consumption by 4x. The compaction system consumed the budget it was supposed to save. The config validator loaded 329 MB of code it didn't need.

None of these caused crashes. None of them corrupted data. They just quietly made everything worse — slower startups, unnecessary session resets, compaction summaries that compressed nothing. The kind of problems that make operators say “something feels off” without being able to point at a specific error.

I'd call it resource gaslighting: your system tells you resources are constrained, so you behave as if they are, but the constraints were fabricated by the system's own accounting errors. It's a particularly insidious category of bug because the symptoms look like legitimate scaling problems.

“The most dangerous monitoring bug isn't the one that shows nothing. It's the one that shows a plausible lie. You can diagnose silence. You can't diagnose a number that looks right but isn't.”

All three fixes are live for DeployClaw users. For the full technical details, see PR #51721, PR #27727, and PR #51574 on GitHub.

Get the fixes without the false alarms

DeployClaw ships every upstream optimization automatically. Faster startups, accurate monitoring, no manual patching.