Files
openclaw-openclaw/docs/reference/session-management-compaction.md
2026-07-03 21:59:16 -07:00

26 KiB

summary, read_when, title
summary read_when title
Deep dive: session store + transcripts, lifecycle, and (auto)compaction internals
You need to debug session ids, transcript events, or session row fields
You are changing auto-compaction behavior or adding "pre-compaction" housekeeping
You want to implement memory flushes or silent system turns
Session management deep dive

OpenClaw manages sessions end-to-end across these areas:

  • Session routing (how inbound messages map to a sessionKey)
  • Session store (per-agent SQLite) and what it tracks
  • Transcript persistence (per-agent SQLite transcript rows) and its structure
  • Transcript hygiene (provider-specific fixups before runs)
  • Context limits (context window vs tracked tokens)
  • Compaction (manual and auto-compaction) and where to hook pre-compaction work
  • Silent housekeeping (memory writes that should not produce user-visible output)

If you want a higher-level overview first, start with:


Source of truth: the Gateway

OpenClaw is designed around a single Gateway process that owns session state.

  • UIs (macOS app, web Control UI, TUI) should query the Gateway for session lists and token counts.
  • In remote mode, session state is on the remote host; "checking your local Mac files" won't reflect what the Gateway is using.

Two persistence layers

OpenClaw persists sessions in two layers:

  1. Session rows (per-agent SQLite)

    • Key/value map: sessionKey -> SessionEntry
    • Mutable runtime state owned by the Gateway
    • Tracks session metadata (current session id, last activity, toggles, token counters, etc.)
  2. Transcript events (per-agent SQLite)

    • Append-only transcript with tree structure (entries have id + parentId)
    • Stores the actual conversation + tool calls + compaction summaries
    • Used to rebuild the model context for future turns
    • Compaction checkpoints are metadata over the compacted successor transcript. New compactions do not write a second .checkpoint.*.jsonl copy.

Older installs may still have sessions.json files under the agent sessions/ directory. Treat those files as legacy session-row migration inputs or explicit offline-maintenance targets. Gateway startup and openclaw doctor --fix import hot legacy rows and transcript history into the per-agent SQLite store automatically. Run openclaw doctor --session-sqlite inspect --session-sqlite-all-agents, then follow the Doctor migration sequence, when you need explicit inspection or validation evidence. If a migration fails after legacy transcript artifacts were archived, use the Doctor recovery mode from that sequence. Recovery uses migration manifests, restores only the affected archived support artifacts, prepares a sanitized GitHub issue report when requested, and does not make active runtime read JSONL files again.

Gateway history readers should avoid materializing the whole transcript unless the surface explicitly needs arbitrary historical access. First-page history, embedded chat history, restart recovery, and token/usage checks use bounded tail reads from SQLite. Full transcript scans go through the async transcript index and are shared across concurrent readers.


On-disk locations

Per agent, on the Gateway host:

  • Runtime session row store: ~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite
  • Runtime transcript rows: ~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite
  • Legacy/archive transcript artifacts: ~/.openclaw/agents/<agentId>/sessions/
  • Legacy row migration input: ~/.openclaw/agents/<agentId>/sessions/sessions.json

OpenClaw resolves these via src/config/sessions.ts.


Store maintenance and disk controls

Session persistence has automatic maintenance controls (session.maintenance) for SQLite session rows, SQLite transcript rows, archive artifacts, and trajectory sidecars:

  • mode: enforce (default) or warn
  • pruneAfter: stale-entry age cutoff (default 30d)
  • maxEntries: cap session entries (default 500)
  • Short-lived gateway model-run probe retention is fixed at 24h, but it is pressure-gated: it only removes stale strict probe rows when session-entry maintenance/cap pressure is reached. This applies only to strict explicit probe keys matching agent:*:explicit:model-run-<uuid> and runs before global stale-entry cleanup/capping when it runs.
  • resetArchiveRetention: retention for *.reset.<timestamp> transcript archives (default: same as pruneAfter; false disables cleanup)
  • maxDiskBytes: optional row/artifact budget. Active SQLite enforcement measures session-row JSON plus transcript-event JSON bytes per session; legacy offline-maintenance enforcement measures files in the selected sessions directory.
  • highWaterBytes: optional target after cleanup (default 80% of maxDiskBytes)

Normal Gateway writes flow through the session accessor, which serializes per-agent SQLite mutations through the runtime writer path. Runtime code should prefer the accessor helpers in src/config/sessions/session-accessor.ts; legacy sessions.json helpers are migration and offline-maintenance tools. When a Gateway is reachable, non-dry-run openclaw sessions cleanup and openclaw agents delete delegate store mutations to the Gateway so cleanup joins the same writer queue; --store <path> is the explicit offline repair path for a selected legacy store. maxEntries cleanup is still batched for production-sized caps, so a store may briefly exceed the configured cap before the next high-water cleanup rewrites it back down. Session store reads do not prune or cap entries during Gateway startup; writes and explicit cleanup runs are the paths that apply configured session-entry maintenance. Legacy artifact pruning remains tied to archived legacy transcript files and explicit offline-maintenance targets.

Maintenance keeps durable external conversation pointers such as group sessions and thread-scoped chat sessions, but synthetic runtime entries for cron, hooks, heartbeat, ACP, and sub-agents can still be removed when they exceed the configured age, count, or disk budget. Gateway model-run probe sessions use the separate 24h model-run retention only when their key exactly matches agent:*:explicit:model-run-<uuid>; other explicit sessions are not part of that retention. The model-run cleanup is applied only under session-entry cap pressure. Isolated cron runs keep their own cron.sessionRetention control, independent of model-run probe retention.

OpenClaw no longer creates automatic sessions.json.bak.* rotation backups during Gateway writes. The legacy session.maintenance.rotateBytes key is ignored and openclaw doctor --fix removes it from older configs.

Transcript mutations use the session write queue for the SQLite transcript target. Lock acquisition waits up to session.writeLock.acquireTimeoutMs before surfacing a busy-session error; the default is 60000 ms. Raise this only when legitimate prep, cleanup, compaction, or transcript mirror work contends longer on slow machines. session.writeLock.staleMs controls when an existing lock can be reclaimed as stale; the default is 1800000 ms. session.writeLock.maxHoldMs controls the in-process watchdog release threshold; the default is 300000 ms. Emergency env overrides are OPENCLAW_SESSION_WRITE_LOCK_ACQUIRE_TIMEOUT_MS, OPENCLAW_SESSION_WRITE_LOCK_STALE_MS, and OPENCLAW_SESSION_WRITE_LOCK_MAX_HOLD_MS.

Enforcement order for disk budget cleanup (mode: "enforce"):

  1. Remove oldest archived transcript artifacts, orphan legacy artifacts, or orphan trajectory artifacts first.
  2. If still above the target, evict oldest session entries and their transcript rows or trajectory artifacts.
  3. Keep going until usage is at or below highWaterBytes.

In mode: "warn", OpenClaw reports potential evictions but does not mutate the store/files.

Run maintenance on demand:

openclaw sessions cleanup --dry-run
openclaw sessions cleanup --enforce

Downgrading After The SQLite Flip

Restore archived legacy transcript artifacts before running an older file-backed OpenClaw version:

openclaw doctor --session-sqlite restore --session-sqlite-all-agents

The migration leaves legacy sessions.json files in place for support and rollback, but hot transcript JSONL files that were imported into SQLite are renamed into session-sqlite-import-archive/. Older file-backed runtimes follow the sessionFile paths in sessions.json, so they need those artifacts restored before startup. Restore uses migration manifests, moves only recorded archived artifacts whose original paths are missing, and leaves the SQLite database in place for forward recovery.

Sessions created after the SQLite flip are SQLite-only and will not appear to an older file-backed runtime. If you re-upgrade after a downgrade, run the Doctor inspection and validation sequence again so OpenClaw can verify restored legacy artifacts before importing.


Cron sessions and run logs

Isolated cron runs also create session entries/transcripts, and they have dedicated retention controls:

  • cron.sessionRetention (default 24h) prunes old isolated cron run sessions from the session store (false disables).
  • cron.runLog.keepLines prunes retained SQLite run-history rows per cron job (default: 2000). cron.runLog.maxBytes remains accepted for older file-backed run logs.

When cron force-creates a new isolated run session, it sanitizes the previous cron:<jobId> session entry before writing the new row. It carries safe preferences such as thinking/fast/verbose settings, labels, and explicit user-selected model/auth overrides. It drops ambient conversation context such as channel/group routing, send or queue policy, elevation, origin, and ACP runtime binding so a fresh isolated run cannot inherit stale delivery or runtime authority from an older run.


Session keys (sessionKey)

A sessionKey identifies which conversation bucket you're in (routing + isolation).

Common patterns:

  • Main/direct chat (per agent): agent:<agentId>:<mainKey> (default main)
  • Group: agent:<agentId>:<channel>:group:<id>
  • Room/channel (Discord/Slack): agent:<agentId>:<channel>:channel:<id> or ...:room:<id>
  • Cron: cron:<job.id>
  • Webhook: hook:<uuid> (unless overridden)

The canonical rules are documented at /concepts/session.


Session ids (sessionId)

Each sessionKey points at a current sessionId (the SQLite transcript identity that continues the conversation).

Rules of thumb:

  • Reset (/new, /reset) creates a new sessionId for that sessionKey.
  • Daily reset (default 4:00 AM local time on the gateway host) creates a new sessionId on the next message after the reset boundary.
  • Idle expiry (session.reset.idleMinutes or legacy session.idleMinutes) creates a new sessionId when a message arrives after the idle window. When daily + idle are both configured, whichever expires first wins.
  • Control UI reconnect resume can preserve the currently visible session for one reconnect send when the Gateway receives the matching sessionId from an operator UI client. Ordinary stale sends still create a new sessionId.
  • System events (heartbeat, cron wakeups, exec notifications, gateway bookkeeping) may mutate the session row but do not extend daily/idle reset freshness. Reset rollover discards queued system-event notices for the previous session before the fresh prompt is built.
  • Parent fork policy uses OpenClaw's active branch when creating a thread or subagent fork. If that branch is too large, OpenClaw starts the child with isolated context instead of failing or inheriting unusable history. The sizing policy is automatic; legacy session.parentForkMaxTokens config is removed by openclaw doctor --fix.

Implementation detail: the decision happens in initSessionState() in src/auto-reply/reply/session.ts.


Session store schema

The runtime store keeps SessionEntry values in per-agent SQLite. The value type is defined in src/config/sessions.ts.

Key fields (not exhaustive):

  • sessionId: current transcript id used to address SQLite transcript rows
  • sessionStartedAt: start timestamp for the current sessionId; daily reset freshness uses this. Legacy rows may derive it from the JSONL session header.
  • lastInteractionAt: last real user/channel interaction timestamp; idle reset freshness uses this so heartbeat, cron, and exec events do not keep sessions alive. Legacy rows without this field fall back to the recovered session start time for idle freshness.
  • updatedAt: last store-row mutation timestamp, used for listing, pruning, and bookkeeping. It is not the authority for daily/idle reset freshness.
  • sessionFile: legacy marker retained for migration/archive compatibility; active runtime uses SQLite identity
  • chatType: direct | group | room (helps UIs and send policy)
  • provider, subject, room, space, displayName: metadata for group/channel labeling
  • Toggles:
    • thinkingLevel, verboseLevel, reasoningLevel, elevatedLevel
    • sendPolicy (per-session override)
  • Model selection:
    • providerOverride, modelOverride, authProfileOverride
  • Token counters (best-effort / provider-dependent):
    • inputTokens, outputTokens, totalTokens, contextTokens
  • compactionCount: how often auto-compaction completed for this session key
  • memoryFlushAt: timestamp for the last pre-compaction memory flush
  • memoryFlushCompactionCount: compaction count when the last flush ran

The Gateway is the authority: it may rewrite or rehydrate entries as sessions run. For legacy file-backed installs, migrate with openclaw doctor --session-sqlite import --session-sqlite-all-agents instead of editing sessions.json and expecting runtime to keep reading that file.


Transcript event structure

Transcripts are managed by the OpenClaw session accessor and exposed to runtime code through identity-based helpers.

The event stream is append-only:

  • First line: session header (type: "session", includes id, cwd, timestamp, optional parentSession)
  • Then: session entries with id + parentId (tree)

Notable entry types:

  • message: user/assistant/toolResult messages
  • custom_message: extension-injected messages that do enter model context (can be hidden from UI)
  • custom: extension state that does not enter model context
  • compaction: persisted compaction summary with firstKeptEntryId and tokensBefore
  • branch_summary: persisted summary when navigating a tree branch

OpenClaw intentionally does not "fix up" transcripts; the Gateway uses SessionManager to read/write them.


Context windows vs tracked tokens

Two different concepts matter:

  1. Model context window: hard cap per model (tokens visible to the model)
  2. Session store counters: rolling stats written into the session row (used for /status and dashboards)

If you're tuning limits:

  • The context window comes from the model catalog (and can be overridden via config).
  • contextTokens in the store is a runtime estimate/reporting value; don't treat it as a strict guarantee.

For more, see /token-use.


Compaction: what it is

Compaction summarizes older conversation into a persisted compaction entry in the transcript and keeps recent messages intact.

After compaction, future turns see:

  • The compaction summary
  • Messages after firstKeptEntryId

AGENTS.md section reinjection after compaction is opt-in via agents.defaults.compaction.postCompactionSections; when unset or [], OpenClaw does not append AGENTS.md excerpts on top of the compaction summary.

Compaction is persistent (unlike session pruning). See /concepts/session-pruning.

Compaction chunk boundaries and tool pairing

When OpenClaw splits a long transcript into compaction chunks, it keeps assistant tool calls paired with their matching toolResult entries.

  • If the token-share split lands between a tool call and its result, OpenClaw shifts the boundary to the assistant tool-call message instead of separating the pair.
  • If a trailing tool-result block would otherwise push the chunk over target, OpenClaw preserves that pending tool block and keeps the unsummarized tail intact.
  • Aborted/error tool-call blocks do not hold a pending split open.

When auto-compaction happens (OpenClaw runtime)

In the embedded OpenClaw agent, auto-compaction triggers in two cases:

  1. Overflow recovery: the model returns a context overflow error (request_too_large, context length exceeded, input exceeds the maximum number of tokens, input token count exceeds the maximum number of input tokens, input is too long for the model, ollama error: context length exceeded, and similar provider-shaped variants) → compact → retry. When the provider reports the attempted token count, OpenClaw forwards that observed count into overflow recovery compaction. If the provider confirms overflow but does not expose a parseable count, OpenClaw passes a minimally over-budget synthetic count to compaction engines and diagnostics. If overflow recovery still fails, OpenClaw surfaces explicit guidance to the user and preserves the current session mapping instead of silently rotating the session key to a fresh session id. The next step is operator-controlled: retry the message, run /compact, or run /new when a fresh session is preferred.
  2. Threshold maintenance: after a successful turn, when:

contextTokens > contextWindow - reserveTokens

Where:

  • contextWindow is the model's context window
  • reserveTokens is headroom reserved for prompts + the next model output

These are OpenClaw runtime semantics.

OpenClaw can also trigger a preflight local compaction before opening the next run when transcript history exceeds the configured local threshold. This is a reopen-cost guard for oversized legacy/imported history, not raw archival: OpenClaw still runs normal semantic compaction and stores the successor transcript in SQLite.

For embedded OpenClaw runs, agents.defaults.compaction.midTurnPrecheck.enabled: true adds an opt-in tool-loop guard. After a tool result is appended and before the next model call, OpenClaw estimates the prompt pressure using the same preflight budget logic used at turn start. If the context no longer fits, the guard does not compact inside OpenClaw runtime's transformContext hook. It raises a structured mid-turn precheck signal, stops the current prompt submission, and lets the outer run loop use the existing recovery path: truncate oversized tool results when that is enough, or trigger the configured compaction mode and retry. The option is disabled by default and works with both default and safeguard compaction modes, including provider-backed safeguard compaction. This is independent of maxActiveTranscriptBytes: the byte-size guard runs before a turn opens, while mid-turn precheck runs later in the embedded OpenClaw tool loop after new tool results have been appended.


Compaction settings (reserveTokens, keepRecentTokens)

OpenClaw runtime's compaction settings live in agent settings:

{
  compaction: {
    enabled: true,
    reserveTokens: 16384,
    keepRecentTokens: 20000,
  },
}

OpenClaw also enforces a safety floor for embedded runs:

  • If compaction.reserveTokens < reserveTokensFloor, OpenClaw bumps it.
  • Default floor is 20000 tokens.
  • Set agents.defaults.compaction.reserveTokensFloor: 0 to disable the floor.
  • If it's already higher, OpenClaw leaves it alone.
  • Manual /compact honors an explicit agents.defaults.compaction.keepRecentTokens and keeps OpenClaw runtime's recent-tail cut point. Without an explicit keep budget, manual compaction remains a hard checkpoint and rebuilt context starts from the new summary.
  • Set agents.defaults.compaction.midTurnPrecheck.enabled: true to run the optional tool-loop precheck after new tool results and before the next model call. This is a trigger only; summary generation still uses the configured compaction path. It is independent of maxActiveTranscriptBytes, which is a turn-start active-transcript byte-size guard.
  • Set agents.defaults.compaction.maxActiveTranscriptBytes to a byte value or string such as "20mb" to run local compaction before a turn when the active transcript gets large. This guard is active only when truncateAfterCompaction is also enabled. Leave it unset or set 0 to disable.
  • When agents.defaults.compaction.truncateAfterCompaction is enabled, OpenClaw rotates the active transcript to a compacted successor after compaction. Branch/restore checkpoint actions use that compacted successor; legacy pre-compaction checkpoint files remain readable while referenced.

Why: leave enough headroom for multi-turn "housekeeping" (like memory writes) before compaction becomes unavoidable.

Implementation: applyAgentCompactionSettingsFromConfig() in src/agents/agent-settings.ts (called from embedded-runner turn and compaction setup paths).


Pluggable compaction providers

Plugins can register a compaction provider via registerCompactionProvider() on the plugin API. When agents.defaults.compaction.provider is set to a registered provider id, the safeguard extension delegates summarization to that provider instead of the built-in summarizeInStages pipeline.

  • provider: id of a registered compaction provider plugin. Leave unset for default LLM summarization.
  • Setting a provider forces mode: "safeguard".
  • Providers receive the same compaction instructions and identifier-preservation policy as the built-in path.
  • The safeguard still preserves recent-turn and split-turn suffix context after provider output.
  • Built-in safeguard summarization re-distills prior summaries with new messages instead of preserving the full previous summary verbatim.
  • Safeguard mode enables summary quality audits by default; set qualityGuard.enabled: false to skip retry-on-malformed-output behavior.
  • If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization automatically.
  • Abort/timeout signals are re-thrown (not swallowed) to respect caller cancellation.

Source: src/plugins/compaction-provider.ts, src/agents/agent-hooks/compaction-safeguard.ts.


User-visible surfaces

You can observe compaction and session state via:

  • /status (in any chat session)
  • openclaw status (CLI)
  • openclaw sessions / sessions --json
  • Gateway logs (pnpm gateway:watch or openclaw logs --follow): embedded run auto-compaction start + complete
  • Verbose mode: 🧹 Auto-compaction complete + compaction count

Silent housekeeping (NO_REPLY)

OpenClaw supports "silent" turns for background tasks where the user should not see intermediate output.

Convention:

  • The assistant starts its output with the exact silent token NO_REPLY / no_reply to indicate "do not deliver a reply to the user".
  • OpenClaw strips/suppresses this in the delivery layer.
  • Exact silent-token suppression is case-insensitive, so NO_REPLY and no_reply both count when the whole payload is just the silent token.
  • This is for true background/no-delivery turns only; it is not a shortcut for ordinary actionable user requests.

As of 2026.1.10, OpenClaw also suppresses draft/typing streaming when a partial chunk begins with NO_REPLY, so silent operations don't leak partial output mid-turn.


Pre-compaction "memory flush" (implemented)

Goal: before auto-compaction happens, run a silent agentic turn that writes durable state to disk (e.g. memory/YYYY-MM-DD.md in the agent workspace) so compaction can't erase critical context.

OpenClaw uses the pre-threshold flush approach:

  1. Monitor session context usage.
  2. When it crosses a "soft threshold" (below OpenClaw runtime's compaction threshold), run a silent "write memory now" directive to the agent.
  3. Use the exact silent token NO_REPLY / no_reply so the user sees nothing.

Config (agents.defaults.compaction.memoryFlush):

  • enabled (default: true)
  • model (optional exact provider/model override for the flush turn, for example ollama/qwen3:8b)
  • softThresholdTokens (default: 4000)
  • prompt (user message for the flush turn)
  • systemPrompt (extra system prompt appended for the flush turn)

Notes:

  • The default prompt/system prompt include a NO_REPLY hint to suppress delivery.
  • When model is set, the flush turn uses that model without inheriting the active session fallback chain, so local-only housekeeping does not silently fall back to a paid conversation model.
  • The flush runs once per compaction cycle (tracked in the session row).
  • The flush runs only for embedded OpenClaw sessions (CLI backends skip it).
  • The flush is skipped when the session workspace is read-only (workspaceAccess: "ro" or "none").
  • See Memory for the workspace file layout and write patterns.

OpenClaw also exposes a session_before_compact hook in the extension API, but OpenClaw's flush logic lives on the Gateway side today.


Troubleshooting checklist

  • Session key wrong? Start with /concepts/session and confirm the sessionKey in /status.
  • Store vs transcript mismatch? Confirm the Gateway host and the store path from openclaw status.
  • Compaction spam? Check:
    • model context window (too small)
    • compaction settings (reserveTokens too high for the model window can cause earlier compaction)
    • tool-result bloat: enable/tune session pruning
  • Silent turns leaking? Confirm the reply starts with NO_REPLY (case-insensitive exact token) and you're on a build that includes the streaming suppression fix.