openclaw-openclaw/docs/reference/session-management-compaction.md at main

mirror of https://github.com/openclaw/openclaw.git synced 2026-07-05 19:35:50 +08:00

Files

Peter Steinberger f7d7148cf0 docs: rewrite published docs grounded in current source (#100142 )

Source-grounded rewrite of 529 published docs pages with per-unit information-loss verification: 1,713 factual corrections cited to src/**, generated surfaces regenerated, frontmatter titles preserved for i18n, release notes pages untouched. All docs gates green.

Closes #100141

2026-07-05 00:32:47 -04:00

26 KiB

Raw Permalink Blame History

summary, read_when, title

summary

read_when

title

Deep dive: session store + transcripts, lifecycle, and (auto)compaction internals

You need to debug session ids, transcript JSONL, or sessions.json fields

You are changing auto-compaction behavior or adding "pre-compaction" housekeeping

You want to implement memory flushes or silent system turns

Session management deep dive

A single Gateway process owns session state end-to-end. UIs (macOS app, web Control UI, TUI) query the Gateway for session lists and token counts. In remote mode, session files live on the remote host, so checking your local Mac's files will not reflect what the Gateway is using.

Overview docs first: Session management, Compaction, Memory overview, Memory search, Session pruning, Transcript hygiene, full config reference at Agent config.

Two persistence layers

Session store (sessions.json) - key/value map sessionKey -> SessionEntry. Small, mutable, safe to edit or delete entries. Tracks metadata: current session id, last activity, toggles, token counters.
Transcript (<sessionId>.jsonl) - append-only, tree-structured (entries have id + parentId). Stores the conversation, tool calls, and compaction summaries; rebuilds model context for future turns. Compaction checkpoints are metadata over the compacted successor transcript - a new compaction does not write a second .checkpoint.*.jsonl copy.

Gateway history readers avoid materializing the whole transcript unless the surface needs arbitrary historical access. First-page history, embedded chat history, restart recovery, and token/usage checks use bounded tail reads. Full transcript scans go through the async transcript index, cached by file path plus mtimeMs/size and shared across concurrent readers.

On-disk locations

Per agent, on the Gateway host (resolved via src/config/sessions.ts):

Store: ~/.openclaw/agents/<agentId>/sessions/sessions.json
Transcripts: ~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl
- Telegram topic sessions: .../<sessionId>-topic-<threadId>.jsonl

Store maintenance and disk controls

session.maintenance controls automatic maintenance for sessions.json, transcript artifacts, and trajectory sidecars:

Key	Default	Notes
`mode`	`"enforce"`	or `"warn"` (report only, no mutation)
`pruneAfter`	`"30d"`	stale-entry age cutoff
`maxEntries`	`500`	cap on entries in `sessions.json`
`resetArchiveRetention`	same as `pruneAfter`	retention for `*.reset.<timestamp>` transcript archives; `false` disables cleanup
`maxDiskBytes`	unset	optional sessions-directory budget
`highWaterBytes`	80% of `maxDiskBytes`	target after budget cleanup

Gateway model-run probe sessions (keys matching agent:*:explicit:model-run-<uuid>) get a separate, fixed 24h retention. This pruning is pressure-gated: it only runs when session-entry maintenance/cap pressure is reached, and only before the global stale-entry cleanup/cap step. Other explicit sessions do not use this retention.

Enforcement order for disk-budget cleanup (mode: "enforce"):

Remove oldest archived, orphan transcript, or orphan trajectory artifacts first.
If still above target, evict oldest session entries and their transcript/trajectory files.
Repeat until usage is at or below highWaterBytes.

mode: "warn" reports potential evictions without mutating the store or files.

Run maintenance on demand:

openclaw sessions cleanup --dry-run
openclaw sessions cleanup --enforce

Maintenance keeps durable external conversation pointers such as group sessions and thread-scoped chat sessions, but synthetic runtime entries (cron, hooks, heartbeat, ACP, sub-agents) can still be removed once they exceed the configured age, count, or disk budget. Isolated cron runs use a separate cron.sessionRetention control, independent of model-run probe retention.

Normal Gateway writes flow through a per-store session writer that serializes in-process mutations without taking a runtime file lock. Hot-path patch helpers borrow the validated mutable cache while holding that writer slot, so large sessions.json files are not cloned or reread for every metadata update. Prefer updateSessionStore(...) / updateSessionStoreEntry(...) in runtime code; direct whole-store saves are for compatibility and offline maintenance tools. When a Gateway is reachable, non-dry-run openclaw sessions cleanup and openclaw agents delete delegate store mutations to the Gateway so cleanup joins the same writer queue; --store <path> is the explicit offline repair path for direct file maintenance and always stays local (as does --dry-run). maxEntries cleanup is batched for production-sized stores, so a store may briefly exceed the configured cap before the next high-water cleanup rewrites it down. Reads never prune or cap entries during Gateway startup - only writes or openclaw sessions cleanup --enforce do, and the latter also applies the cap immediately and prunes old unreferenced transcript, checkpoint, and trajectory artifacts even with no disk budget configured.

OpenClaw no longer creates automatic sessions.json.bak.* rotation backups during Gateway writes. The legacy session.maintenance.rotateBytes key is ignored and openclaw doctor --fix removes it from older configs.

Transcript mutations use a session write lock on the transcript file:

Setting	Default	Env override
`session.writeLock.acquireTimeoutMs`	`60000`	`OPENCLAW_SESSION_WRITE_LOCK_ACQUIRE_TIMEOUT_MS`
`session.writeLock.staleMs`	`1800000`	`OPENCLAW_SESSION_WRITE_LOCK_STALE_MS`
`session.writeLock.maxHoldMs`	`300000`	`OPENCLAW_SESSION_WRITE_LOCK_MAX_HOLD_MS`

acquireTimeoutMs is how long a lock wait surfaces a busy-session error before giving up; raise it only when legitimate prep, cleanup, compaction, or transcript mirror work contends longer on slow machines. staleMs is when an existing lock can be reclaimed as stale. maxHoldMs is the in-process watchdog release threshold.

Cron sessions and run logs

Isolated cron runs create their own session entries/transcripts with dedicated retention:

cron.sessionRetention (default "24h") prunes old isolated cron run sessions from the store; false disables.
cron.runLog.keepLines prunes retained SQLite run-history rows per cron job (default 2000). cron.runLog.maxBytes is accepted only for compatibility with older file-backed run logs.

When cron force-creates a new isolated run session, it sanitizes the previous cron:<jobId> session entry before writing the new row: it carries safe preferences (thinking/fast/verbose/reasoning settings, labels, display name) and explicit user-selected model/auth overrides, but drops ambient conversation context (channel/group routing, send/queue policy, elevation, origin, ACP runtime binding) so a fresh isolated run cannot inherit stale delivery or runtime authority from an older run.

Session keys (`sessionKey`)

A sessionKey identifies which conversation bucket you are in (routing + isolation). Canonical rules: /concepts/session.

Pattern	Example
Main/direct chat (per agent)	`agent:<agentId>:<mainKey>` (default `main`)
Group	`agent:<agentId>:<channel>:group:<id>`
Room/channel (Discord/Slack)	`agent:<agentId>:<channel>:channel:<id>` or `...:room:<id>`
Cron	`cron:<job.id>`
Webhook	`hook:<uuid>` (unless overridden)

Session ids (`sessionId`)

Each sessionKey points at a current sessionId (the transcript file continuing the conversation). Decision logic lives in initSessionState() in src/auto-reply/reply/session.ts.

Reset (/new, /reset) creates a new sessionId for that sessionKey.
Daily reset (default 4:00 AM local time on the gateway host) creates a new sessionId on the next message after the reset boundary.
Idle expiry (session.reset.idleMinutes, or legacy session.idleMinutes) creates a new sessionId when a message arrives after the idle window. If daily and idle are both configured, whichever expires first wins.
Control UI reconnect resume preserves the currently visible session for one reconnect send when the Gateway receives the matching sessionId from an operator UI client. This is a one-shot signal; ordinary stale sends still create a new sessionId.
System events (heartbeat, cron wakeups, exec notifications, gateway bookkeeping) may mutate the session row but never extend daily/idle reset freshness. Reset rollover discards queued system-event notices for the previous session before the fresh prompt is built.
Parent fork policy uses OpenClaw's active branch when creating a thread or subagent fork. If that branch is too large (over a fixed internal cap, currently 100K tokens), OpenClaw starts the child with isolated context instead of failing or inheriting unusable history. Sizing is automatic and not configurable; legacy session.parentForkMaxTokens config is removed by openclaw doctor --fix.

Session store schema (`sessions.json`)

The value type is SessionEntry in src/config/sessions.ts. Key fields (not exhaustive):

sessionId: current transcript id (filename derives from this unless sessionFile is set)
sessionStartedAt: start timestamp for the current sessionId; daily reset freshness uses this. Legacy rows may derive it from the JSONL session header.
lastInteractionAt: last real user/channel interaction timestamp; idle reset freshness uses this so heartbeat, cron, and exec events do not keep sessions alive. Legacy rows without this field fall back to the recovered session start time.
updatedAt: last store-row mutation timestamp, used for listing/pruning/bookkeeping - not the daily/idle freshness authority.
archivedAt: optional archive timestamp. Archived sessions stay in the store with their transcript intact and are excluded from normal active listings.
pinnedAt: optional pin timestamp. Active pinned sessions sort ahead of unpinned sessions; archiving a session clears its pin.
Codex thread interop: both fields follow the Codex thread-management shape - the archived/pinned booleans on the wire are always derived from the timestamp and stamped server-side, matching Codex threads.archived_at semantics and camelCase serialization. OpenClaw timestamps are epoch milliseconds while Codex uses epoch seconds, so bridges convert at the codex plugin seam. Codex has no pin API yet (thread/archive/thread/unarchive only); pinned state stays OpenClaw-side until one exists, at which point the matching shape lets bound sessions round-trip pin state mechanically.
sessionFile: optional explicit transcript path override
chatType: direct | group | room
provider, subject, room, space, displayName: group/channel labeling metadata
Toggles: thinkingLevel, verboseLevel, reasoningLevel, elevatedLevel, sendPolicy (per-session override)
Model selection: providerOverride, modelOverride, authProfileOverride
Token counters (best-effort/provider-dependent): inputTokens, outputTokens, totalTokens, contextTokens
compactionCount: how many times auto-compaction completed for this session key
memoryFlushAt / memoryFlushCompactionCount: timestamp and compaction count of the last pre-compaction memory flush

The store is safe to edit, but the Gateway is the authority: it may rewrite or rehydrate entries as sessions run.

Transcript structure (`*.jsonl`)

Transcripts are managed by SessionManager (openclaw/plugin-sdk/agent-sessions). The file is JSONL:

First line: session header - type: "session", id, cwd, timestamp, optional parentSession.
Then: entries with id + parentId (tree structure).

Notable entry types:

message: user/assistant/toolResult messages
custom_message: extension-injected message that does enter model context (rendered in the TUI when display: true, hidden entirely when display: false)
custom: extension state that does not enter model context (for persisting extension state across reloads)
compaction: persisted compaction summary with firstKeptEntryId and tokensBefore
branch_summary: persisted summary when navigating a tree branch

OpenClaw intentionally does not "fix up" transcripts; the Gateway uses SessionManager to read/write them.

Context windows vs tracked tokens

Two different concepts:

Model context window: hard cap per model (tokens visible to the model). Comes from the model catalog and can be overridden via config.
Session store counters: rolling stats written into sessions.json (used for /status and dashboards). contextTokens is a runtime estimate/reporting value - do not treat it as a strict guarantee.

Compaction: what it is

Compaction summarizes older conversation into a persisted compaction entry in the transcript and keeps recent messages intact. After compaction, future turns see the compaction summary plus messages after firstKeptEntryId. Compaction is persistent, unlike session pruning - see /concepts/session-pruning.

AGENTS.md section reinjection after compaction is opt-in via agents.defaults.compaction.postCompactionSections; when unset or [], OpenClaw does not append AGENTS.md excerpts on top of the compaction summary.

Chunk boundaries and tool pairing

When splitting a long transcript into compaction chunks, OpenClaw keeps assistant tool calls paired with their matching toolResult entries:

If the token-share split would land between a tool call and its result, OpenClaw shifts the boundary to the assistant tool-call message instead of separating the pair.
If a trailing tool-result block would otherwise push the chunk over target, OpenClaw preserves that pending tool block and keeps the unsummarized tail intact.
Aborted/error tool-call blocks do not hold a pending split open.

When auto-compaction happens

Two triggers in the embedded OpenClaw agent:

Overflow recovery: the model returns a context-overflow error (request_too_large, context length exceeded, input exceeds the maximum number of tokens, input token count exceeds the maximum number of input tokens, input is too long for the model, ollama error: context length exceeded, and other provider-shaped variants) - compact, then retry. When the provider reports the attempted token count, OpenClaw forwards that observed count into overflow-recovery compaction; if the provider confirms overflow but exposes no parseable count, OpenClaw passes a minimally over-budget synthetic count to compaction engines and diagnostics. If overflow recovery still fails, OpenClaw surfaces explicit guidance and preserves the current session mapping instead of silently rotating to a fresh session id - retry the message, run /compact, or run /new.
Threshold maintenance: after a successful turn, when contextTokens > contextWindow - reserveTokens, where contextWindow is the model's context window and reserveTokens is headroom reserved for prompts plus the next model output.

Two additional guards run outside these two triggers:

Preflight local compaction: set agents.defaults.compaction.maxActiveTranscriptBytes (bytes or a string like "20mb") to trigger local compaction before opening the next run once the active transcript file reaches that size. This is a file-size guard for local reopen cost, not raw archival - normal semantic compaction still runs, and it requires truncateAfterCompaction so the compacted summary becomes a new successor transcript.
Mid-turn precheck: set agents.defaults.compaction.midTurnPrecheck.enabled: true (default false) to add a tool-loop guard. After a tool result is appended and before the next model call, OpenClaw estimates prompt pressure using the same preflight budget logic used at turn start. If context no longer fits, the guard does not compact inline - it raises a structured mid-turn precheck signal, stops the current prompt submission, and lets the outer run loop use the existing recovery path (truncate oversized tool results when that is enough, or trigger the configured compaction mode and retry). Works with both default and safeguard compaction modes, including provider-backed safeguard compaction. Independent of maxActiveTranscriptBytes: the byte-size guard runs before a turn opens, mid-turn precheck runs later, after new tool results are appended.

Compaction settings

{
  agents: {
    defaults: {
      compaction: {
        enabled: true,
        reserveTokens: 16384,
        keepRecentTokens: 20000,
      },
    },
  },
}

OpenClaw also enforces a safety floor for embedded runs: if compaction.reserveTokens is below reserveTokensFloor (default 20000), OpenClaw bumps it up; if already higher, it is left alone. Set agents.defaults.compaction.reserveTokensFloor: 0 to disable the floor. The floor itself is automatically capped to a safe fraction of the model's context window, so small-context models (for example a 16K-token local model) are not starved of prompt budget - without that cap, the 20000-token default floor could exceed the whole window and put every prompt into an overflow-compaction loop. Why a floor at all: leave enough headroom for multi-turn "housekeeping" (like the memory flush, below) before compaction becomes unavoidable. Implementation: applyAgentCompactionSettingsFromConfig() in src/agents/agent-settings.ts, called from embedded-runner turn and compaction setup paths.

Manual /compact honors an explicit agents.defaults.compaction.keepRecentTokens and keeps the runtime's recent-tail cut point. Without an explicit keep budget, manual compaction is a hard checkpoint and rebuilt context starts from the new summary.

When truncateAfterCompaction is enabled, OpenClaw rotates the active transcript to a compacted successor JSONL after compaction. Branch/restore checkpoint actions use that compacted successor; legacy pre-compaction checkpoint files remain readable while referenced.

Pluggable compaction providers

Plugins register a compaction provider via registerCompactionProvider() on the plugin API. When agents.defaults.compaction.provider is set to a registered provider id, the safeguard extension delegates summarization to that provider instead of the built-in summarizeInStages pipeline.

provider: id of a registered compaction provider plugin. Leave unset for default LLM summarization. Setting a provider forces mode: "safeguard".
Providers receive the same compaction instructions and identifier-preservation policy as the built-in path, and the safeguard still preserves recent-turn and split-turn suffix context after provider output.
Built-in safeguard summarization re-distills prior summaries with new messages instead of preserving the full previous summary verbatim.
Safeguard mode enables summary quality audits by default; set qualityGuard.enabled: false to skip retry-on-malformed-output behavior.
If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization automatically. Abort/timeout signals the caller explicitly triggered are re-thrown, not swallowed, so cancellation is always respected.

Source: src/plugins/compaction-provider.ts, src/agents/agent-hooks/compaction-safeguard.ts.

User-visible surfaces

/status in any chat session
openclaw status (CLI)
openclaw sessions / openclaw sessions --json
Gateway logs (pnpm gateway:watch or openclaw logs --follow): embedded run auto-compaction start + complete
Verbose mode: 🧹 Auto-compaction complete plus the compaction count

Silent housekeeping (`NO_REPLY`)

OpenClaw supports "silent" turns for background tasks where the user should not see intermediate output.

The assistant starts its output with the exact silent token NO_REPLY / no_reply to mean "do not deliver a reply to the user." OpenClaw strips/suppresses this in the delivery layer.
Exact silent-token suppression is case-insensitive: NO_REPLY and no_reply both count when the whole payload is just the silent token.
As of 2026.1.10, OpenClaw also suppresses draft/typing streaming when a partial chunk begins with NO_REPLY, so silent operations do not leak partial output mid-turn.
This is for true background/no-delivery turns only - it is not a shortcut for ordinary actionable user requests.

Pre-compaction memory flush

Before auto-compaction happens, OpenClaw can run a silent agentic turn that writes durable state to disk (for example memory/YYYY-MM-DD.md in the agent workspace) so compaction cannot erase critical context. It monitors session context usage, and once it crosses a soft threshold below the compaction threshold, it sends a silent "write memory now" directive using the exact silent token NO_REPLY / no_reply so the user sees nothing.

Config (agents.defaults.compaction.memoryFlush), full reference at /gateway/config-agents:

Key	Default	Notes
`enabled`	`true`
`model`	unset	exact provider/model override for the flush turn only, for example `ollama/qwen3:8b`
`softThresholdTokens`	`4000`	gap below the compaction threshold that triggers a flush
`forceFlushTranscriptBytes`	unset (disabled)	force a flush once the transcript file reaches this byte size (or string like `"2mb"`), even if token counters are stale; `0` disables
`prompt`	built-in	user message for the flush turn
`systemPrompt`	built-in	extra system prompt appended for the flush turn

Notes:

The default prompt/system prompt include a NO_REPLY hint to suppress delivery.
When model is set, the flush turn uses that model without inheriting the active session's fallback chain, so local-only housekeeping does not silently fall back to a paid conversation model on failure.
The flush runs once per compaction cycle (tracked in sessions.json).
The flush runs only for embedded OpenClaw sessions; CLI backends and heartbeat turns skip it.
The flush is skipped when the session workspace is read-only (workspaceAccess: "ro" or "none").
See Memory for the workspace file layout and write patterns.

OpenClaw exposes a session_before_compact hook in the extension API, but the flush logic above lives on the Gateway side (src/auto-reply/reply/memory-flush.ts, src/auto-reply/reply/agent-runner-memory.ts), not on that hook.

Troubleshooting checklist

Session key wrong? Start with /concepts/session and confirm the sessionKey in /status.
Store vs transcript mismatch? Confirm the Gateway host and the store path from openclaw status.
Compaction spam? Check the model's context window (too small forces frequent compaction), reserveTokens (too high for the model window causes earlier compaction), and tool-result bloat (tune session pruning).
Every prompt seems to overflow on a small local model? The reserveTokensFloor default (20000) auto-caps to a safe fraction of the context window, but an explicit reserveTokens set higher than the window itself is not capped - lower it or unset it.
Silent turns leaking? Confirm the reply starts with the exact silent token NO_REPLY (case-insensitive) and you are on a build that includes the streaming-suppression fix (2026.1.10+).

26 KiB Raw Permalink Blame History