Source-grounded rewrite of 529 published docs pages with per-unit information-loss verification: 1,713 factual corrections cited to src/**, generated surfaces regenerated, frontmatter titles preserved for i18n, release notes pages untouched. All docs gates green. Closes #100141
26 KiB
summary, read_when, title
| summary | read_when | title | |||
|---|---|---|---|---|---|
| Deep dive: session store + transcripts, lifecycle, and (auto)compaction internals |
|
Session management deep dive |
A single Gateway process owns session state end-to-end. UIs (macOS app, web Control UI, TUI) query the Gateway for session lists and token counts. In remote mode, session files live on the remote host, so checking your local Mac's files will not reflect what the Gateway is using.
Overview docs first: Session management, Compaction, Memory overview, Memory search, Session pruning, Transcript hygiene, full config reference at Agent config.
Two persistence layers
- Session store (
sessions.json) - key/value mapsessionKey -> SessionEntry. Small, mutable, safe to edit or delete entries. Tracks metadata: current session id, last activity, toggles, token counters. - Transcript (
<sessionId>.jsonl) - append-only, tree-structured (entries haveid+parentId). Stores the conversation, tool calls, and compaction summaries; rebuilds model context for future turns. Compaction checkpoints are metadata over the compacted successor transcript - a new compaction does not write a second.checkpoint.*.jsonlcopy.
Gateway history readers avoid materializing the whole transcript unless the surface needs arbitrary historical access. First-page history, embedded chat history, restart recovery, and token/usage checks use bounded tail reads. Full transcript scans go through the async transcript index, cached by file path plus mtimeMs/size and shared across concurrent readers.
On-disk locations
Per agent, on the Gateway host (resolved via src/config/sessions.ts):
- Store:
~/.openclaw/agents/<agentId>/sessions/sessions.json - Transcripts:
~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl- Telegram topic sessions:
.../<sessionId>-topic-<threadId>.jsonl
- Telegram topic sessions:
Store maintenance and disk controls
session.maintenance controls automatic maintenance for sessions.json, transcript artifacts, and trajectory sidecars:
| Key | Default | Notes |
|---|---|---|
mode |
"enforce" |
or "warn" (report only, no mutation) |
pruneAfter |
"30d" |
stale-entry age cutoff |
maxEntries |
500 |
cap on entries in sessions.json |
resetArchiveRetention |
same as pruneAfter |
retention for *.reset.<timestamp> transcript archives; false disables cleanup |
maxDiskBytes |
unset | optional sessions-directory budget |
highWaterBytes |
80% of maxDiskBytes |
target after budget cleanup |
Gateway model-run probe sessions (keys matching agent:*:explicit:model-run-<uuid>) get a separate, fixed 24h retention. This pruning is pressure-gated: it only runs when session-entry maintenance/cap pressure is reached, and only before the global stale-entry cleanup/cap step. Other explicit sessions do not use this retention.
Enforcement order for disk-budget cleanup (mode: "enforce"):
- Remove oldest archived, orphan transcript, or orphan trajectory artifacts first.
- If still above target, evict oldest session entries and their transcript/trajectory files.
- Repeat until usage is at or below
highWaterBytes.
mode: "warn" reports potential evictions without mutating the store or files.
Run maintenance on demand:
openclaw sessions cleanup --dry-run
openclaw sessions cleanup --enforce
Maintenance keeps durable external conversation pointers such as group sessions and thread-scoped chat sessions, but synthetic runtime entries (cron, hooks, heartbeat, ACP, sub-agents) can still be removed once they exceed the configured age, count, or disk budget. Isolated cron runs use a separate cron.sessionRetention control, independent of model-run probe retention.
Normal Gateway writes flow through a per-store session writer that serializes in-process mutations without taking a runtime file lock. Hot-path patch helpers borrow the validated mutable cache while holding that writer slot, so large sessions.json files are not cloned or reread for every metadata update. Prefer updateSessionStore(...) / updateSessionStoreEntry(...) in runtime code; direct whole-store saves are for compatibility and offline maintenance tools. When a Gateway is reachable, non-dry-run openclaw sessions cleanup and openclaw agents delete delegate store mutations to the Gateway so cleanup joins the same writer queue; --store <path> is the explicit offline repair path for direct file maintenance and always stays local (as does --dry-run). maxEntries cleanup is batched for production-sized stores, so a store may briefly exceed the configured cap before the next high-water cleanup rewrites it down. Reads never prune or cap entries during Gateway startup - only writes or openclaw sessions cleanup --enforce do, and the latter also applies the cap immediately and prunes old unreferenced transcript, checkpoint, and trajectory artifacts even with no disk budget configured.
OpenClaw no longer creates automatic sessions.json.bak.* rotation backups during Gateway writes. The legacy session.maintenance.rotateBytes key is ignored and openclaw doctor --fix removes it from older configs.
Transcript mutations use a session write lock on the transcript file:
| Setting | Default | Env override |
|---|---|---|
session.writeLock.acquireTimeoutMs |
60000 |
OPENCLAW_SESSION_WRITE_LOCK_ACQUIRE_TIMEOUT_MS |
session.writeLock.staleMs |
1800000 |
OPENCLAW_SESSION_WRITE_LOCK_STALE_MS |
session.writeLock.maxHoldMs |
300000 |
OPENCLAW_SESSION_WRITE_LOCK_MAX_HOLD_MS |
acquireTimeoutMs is how long a lock wait surfaces a busy-session error before giving up; raise it only when legitimate prep, cleanup, compaction, or transcript mirror work contends longer on slow machines. staleMs is when an existing lock can be reclaimed as stale. maxHoldMs is the in-process watchdog release threshold.
Cron sessions and run logs
Isolated cron runs create their own session entries/transcripts with dedicated retention:
cron.sessionRetention(default"24h") prunes old isolated cron run sessions from the store;falsedisables.cron.runLog.keepLinesprunes retained SQLite run-history rows per cron job (default2000).cron.runLog.maxBytesis accepted only for compatibility with older file-backed run logs.
When cron force-creates a new isolated run session, it sanitizes the previous cron:<jobId> session entry before writing the new row: it carries safe preferences (thinking/fast/verbose/reasoning settings, labels, display name) and explicit user-selected model/auth overrides, but drops ambient conversation context (channel/group routing, send/queue policy, elevation, origin, ACP runtime binding) so a fresh isolated run cannot inherit stale delivery or runtime authority from an older run.
Session keys (sessionKey)
A sessionKey identifies which conversation bucket you are in (routing + isolation). Canonical rules: /concepts/session.
| Pattern | Example |
|---|---|
| Main/direct chat (per agent) | agent:<agentId>:<mainKey> (default main) |
| Group | agent:<agentId>:<channel>:group:<id> |
| Room/channel (Discord/Slack) | agent:<agentId>:<channel>:channel:<id> or ...:room:<id> |
| Cron | cron:<job.id> |
| Webhook | hook:<uuid> (unless overridden) |
Session ids (sessionId)
Each sessionKey points at a current sessionId (the transcript file continuing the conversation). Decision logic lives in initSessionState() in src/auto-reply/reply/session.ts.
- Reset (
/new,/reset) creates a newsessionIdfor thatsessionKey. - Daily reset (default 4:00 AM local time on the gateway host) creates a new
sessionIdon the next message after the reset boundary. - Idle expiry (
session.reset.idleMinutes, or legacysession.idleMinutes) creates a newsessionIdwhen a message arrives after the idle window. If daily and idle are both configured, whichever expires first wins. - Control UI reconnect resume preserves the currently visible session for one reconnect send when the Gateway receives the matching
sessionIdfrom an operator UI client. This is a one-shot signal; ordinary stale sends still create a newsessionId. - System events (heartbeat, cron wakeups, exec notifications, gateway bookkeeping) may mutate the session row but never extend daily/idle reset freshness. Reset rollover discards queued system-event notices for the previous session before the fresh prompt is built.
- Parent fork policy uses OpenClaw's active branch when creating a thread or subagent fork. If that branch is too large (over a fixed internal cap, currently 100K tokens), OpenClaw starts the child with isolated context instead of failing or inheriting unusable history. Sizing is automatic and not configurable; legacy
session.parentForkMaxTokensconfig is removed byopenclaw doctor --fix.
Session store schema (sessions.json)
The value type is SessionEntry in src/config/sessions.ts. Key fields (not exhaustive):
sessionId: current transcript id (filename derives from this unlesssessionFileis set)sessionStartedAt: start timestamp for the currentsessionId; daily reset freshness uses this. Legacy rows may derive it from the JSONL session header.lastInteractionAt: last real user/channel interaction timestamp; idle reset freshness uses this so heartbeat, cron, and exec events do not keep sessions alive. Legacy rows without this field fall back to the recovered session start time.updatedAt: last store-row mutation timestamp, used for listing/pruning/bookkeeping - not the daily/idle freshness authority.archivedAt: optional archive timestamp. Archived sessions stay in the store with their transcript intact and are excluded from normal active listings.pinnedAt: optional pin timestamp. Active pinned sessions sort ahead of unpinned sessions; archiving a session clears its pin.- Codex thread interop: both fields follow the Codex thread-management shape - the
archived/pinnedbooleans on the wire are always derived from the timestamp and stamped server-side, matching Codexthreads.archived_atsemantics and camelCase serialization. OpenClaw timestamps are epoch milliseconds while Codex uses epoch seconds, so bridges convert at the codex plugin seam. Codex has no pin API yet (thread/archive/thread/unarchiveonly); pinned state stays OpenClaw-side until one exists, at which point the matching shape lets bound sessions round-trip pin state mechanically. sessionFile: optional explicit transcript path overridechatType:direct | group | roomprovider,subject,room,space,displayName: group/channel labeling metadata- Toggles:
thinkingLevel,verboseLevel,reasoningLevel,elevatedLevel,sendPolicy(per-session override) - Model selection:
providerOverride,modelOverride,authProfileOverride - Token counters (best-effort/provider-dependent):
inputTokens,outputTokens,totalTokens,contextTokens compactionCount: how many times auto-compaction completed for this session keymemoryFlushAt/memoryFlushCompactionCount: timestamp and compaction count of the last pre-compaction memory flush
The store is safe to edit, but the Gateway is the authority: it may rewrite or rehydrate entries as sessions run.
Transcript structure (*.jsonl)
Transcripts are managed by SessionManager (openclaw/plugin-sdk/agent-sessions). The file is JSONL:
- First line: session header -
type: "session",id,cwd,timestamp, optionalparentSession. - Then: entries with
id+parentId(tree structure).
Notable entry types:
message: user/assistant/toolResult messagescustom_message: extension-injected message that does enter model context (rendered in the TUI whendisplay: true, hidden entirely whendisplay: false)custom: extension state that does not enter model context (for persisting extension state across reloads)compaction: persisted compaction summary withfirstKeptEntryIdandtokensBeforebranch_summary: persisted summary when navigating a tree branch
OpenClaw intentionally does not "fix up" transcripts; the Gateway uses SessionManager to read/write them.
Context windows vs tracked tokens
Two different concepts:
- Model context window: hard cap per model (tokens visible to the model). Comes from the model catalog and can be overridden via config.
- Session store counters: rolling stats written into
sessions.json(used for/statusand dashboards).contextTokensis a runtime estimate/reporting value - do not treat it as a strict guarantee.
More on limits: /reference/token-use.
Compaction: what it is
Compaction summarizes older conversation into a persisted compaction entry in the transcript and keeps recent messages intact. After compaction, future turns see the compaction summary plus messages after firstKeptEntryId. Compaction is persistent, unlike session pruning - see /concepts/session-pruning.
AGENTS.md section reinjection after compaction is opt-in via agents.defaults.compaction.postCompactionSections; when unset or [], OpenClaw does not append AGENTS.md excerpts on top of the compaction summary.
Chunk boundaries and tool pairing
When splitting a long transcript into compaction chunks, OpenClaw keeps assistant tool calls paired with their matching toolResult entries:
- If the token-share split would land between a tool call and its result, OpenClaw shifts the boundary to the assistant tool-call message instead of separating the pair.
- If a trailing tool-result block would otherwise push the chunk over target, OpenClaw preserves that pending tool block and keeps the unsummarized tail intact.
- Aborted/error tool-call blocks do not hold a pending split open.
When auto-compaction happens
Two triggers in the embedded OpenClaw agent:
- Overflow recovery: the model returns a context-overflow error (
request_too_large,context length exceeded,input exceeds the maximum number of tokens,input token count exceeds the maximum number of input tokens,input is too long for the model,ollama error: context length exceeded, and other provider-shaped variants) - compact, then retry. When the provider reports the attempted token count, OpenClaw forwards that observed count into overflow-recovery compaction; if the provider confirms overflow but exposes no parseable count, OpenClaw passes a minimally over-budget synthetic count to compaction engines and diagnostics. If overflow recovery still fails, OpenClaw surfaces explicit guidance and preserves the current session mapping instead of silently rotating to a fresh session id - retry the message, run/compact, or run/new. - Threshold maintenance: after a successful turn, when
contextTokens > contextWindow - reserveTokens, wherecontextWindowis the model's context window andreserveTokensis headroom reserved for prompts plus the next model output.
Two additional guards run outside these two triggers:
- Preflight local compaction: set
agents.defaults.compaction.maxActiveTranscriptBytes(bytes or a string like"20mb") to trigger local compaction before opening the next run once the active transcript file reaches that size. This is a file-size guard for local reopen cost, not raw archival - normal semantic compaction still runs, and it requirestruncateAfterCompactionso the compacted summary becomes a new successor transcript. - Mid-turn precheck: set
agents.defaults.compaction.midTurnPrecheck.enabled: true(defaultfalse) to add a tool-loop guard. After a tool result is appended and before the next model call, OpenClaw estimates prompt pressure using the same preflight budget logic used at turn start. If context no longer fits, the guard does not compact inline - it raises a structured mid-turn precheck signal, stops the current prompt submission, and lets the outer run loop use the existing recovery path (truncate oversized tool results when that is enough, or trigger the configured compaction mode and retry). Works with bothdefaultandsafeguardcompaction modes, including provider-backed safeguard compaction. Independent ofmaxActiveTranscriptBytes: the byte-size guard runs before a turn opens, mid-turn precheck runs later, after new tool results are appended.
Compaction settings
{
agents: {
defaults: {
compaction: {
enabled: true,
reserveTokens: 16384,
keepRecentTokens: 20000,
},
},
},
}
OpenClaw also enforces a safety floor for embedded runs: if compaction.reserveTokens is below reserveTokensFloor (default 20000), OpenClaw bumps it up; if already higher, it is left alone. Set agents.defaults.compaction.reserveTokensFloor: 0 to disable the floor. The floor itself is automatically capped to a safe fraction of the model's context window, so small-context models (for example a 16K-token local model) are not starved of prompt budget - without that cap, the 20000-token default floor could exceed the whole window and put every prompt into an overflow-compaction loop. Why a floor at all: leave enough headroom for multi-turn "housekeeping" (like the memory flush, below) before compaction becomes unavoidable. Implementation: applyAgentCompactionSettingsFromConfig() in src/agents/agent-settings.ts, called from embedded-runner turn and compaction setup paths.
Manual /compact honors an explicit agents.defaults.compaction.keepRecentTokens and keeps the runtime's recent-tail cut point. Without an explicit keep budget, manual compaction is a hard checkpoint and rebuilt context starts from the new summary.
When truncateAfterCompaction is enabled, OpenClaw rotates the active transcript to a compacted successor JSONL after compaction. Branch/restore checkpoint actions use that compacted successor; legacy pre-compaction checkpoint files remain readable while referenced.
Pluggable compaction providers
Plugins register a compaction provider via registerCompactionProvider() on the plugin API. When agents.defaults.compaction.provider is set to a registered provider id, the safeguard extension delegates summarization to that provider instead of the built-in summarizeInStages pipeline.
provider: id of a registered compaction provider plugin. Leave unset for default LLM summarization. Setting aproviderforcesmode: "safeguard".- Providers receive the same compaction instructions and identifier-preservation policy as the built-in path, and the safeguard still preserves recent-turn and split-turn suffix context after provider output.
- Built-in safeguard summarization re-distills prior summaries with new messages instead of preserving the full previous summary verbatim.
- Safeguard mode enables summary quality audits by default; set
qualityGuard.enabled: falseto skip retry-on-malformed-output behavior. - If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization automatically. Abort/timeout signals the caller explicitly triggered are re-thrown, not swallowed, so cancellation is always respected.
Source: src/plugins/compaction-provider.ts, src/agents/agent-hooks/compaction-safeguard.ts.
User-visible surfaces
/statusin any chat sessionopenclaw status(CLI)openclaw sessions/openclaw sessions --json- Gateway logs (
pnpm gateway:watchoropenclaw logs --follow):embedded run auto-compaction start+complete - Verbose mode:
🧹 Auto-compaction completeplus the compaction count
Silent housekeeping (NO_REPLY)
OpenClaw supports "silent" turns for background tasks where the user should not see intermediate output.
- The assistant starts its output with the exact silent token
NO_REPLY/no_replyto mean "do not deliver a reply to the user." OpenClaw strips/suppresses this in the delivery layer. - Exact silent-token suppression is case-insensitive:
NO_REPLYandno_replyboth count when the whole payload is just the silent token. - As of
2026.1.10, OpenClaw also suppresses draft/typing streaming when a partial chunk begins withNO_REPLY, so silent operations do not leak partial output mid-turn. - This is for true background/no-delivery turns only - it is not a shortcut for ordinary actionable user requests.
Pre-compaction memory flush
Before auto-compaction happens, OpenClaw can run a silent agentic turn that writes durable state to disk (for example memory/YYYY-MM-DD.md in the agent workspace) so compaction cannot erase critical context. It monitors session context usage, and once it crosses a soft threshold below the compaction threshold, it sends a silent "write memory now" directive using the exact silent token NO_REPLY / no_reply so the user sees nothing.
Config (agents.defaults.compaction.memoryFlush), full reference at /gateway/config-agents:
| Key | Default | Notes |
|---|---|---|
enabled |
true |
|
model |
unset | exact provider/model override for the flush turn only, for example ollama/qwen3:8b |
softThresholdTokens |
4000 |
gap below the compaction threshold that triggers a flush |
forceFlushTranscriptBytes |
unset (disabled) | force a flush once the transcript file reaches this byte size (or string like "2mb"), even if token counters are stale; 0 disables |
prompt |
built-in | user message for the flush turn |
systemPrompt |
built-in | extra system prompt appended for the flush turn |
Notes:
- The default prompt/system prompt include a
NO_REPLYhint to suppress delivery. - When
modelis set, the flush turn uses that model without inheriting the active session's fallback chain, so local-only housekeeping does not silently fall back to a paid conversation model on failure. - The flush runs once per compaction cycle (tracked in
sessions.json). - The flush runs only for embedded OpenClaw sessions; CLI backends and heartbeat turns skip it.
- The flush is skipped when the session workspace is read-only (
workspaceAccess: "ro"or"none"). - See Memory for the workspace file layout and write patterns.
OpenClaw exposes a session_before_compact hook in the extension API, but the flush logic above lives on the Gateway side (src/auto-reply/reply/memory-flush.ts, src/auto-reply/reply/agent-runner-memory.ts), not on that hook.
Troubleshooting checklist
- Session key wrong? Start with /concepts/session and confirm the
sessionKeyin/status. - Store vs transcript mismatch? Confirm the Gateway host and the store path from
openclaw status. - Compaction spam? Check the model's context window (too small forces frequent compaction),
reserveTokens(too high for the model window causes earlier compaction), and tool-result bloat (tune session pruning). - Every prompt seems to overflow on a small local model? The
reserveTokensFloordefault (20000) auto-caps to a safe fraction of the context window, but an explicitreserveTokensset higher than the window itself is not capped - lower it or unset it. - Silent turns leaking? Confirm the reply starts with the exact silent token
NO_REPLY(case-insensitive) and you are on a build that includes the streaming-suppression fix (2026.1.10+).