chore: bump version to 13.8.1

Alex Newman
2026-06-24 16:34:28 -07:00
parent 16b2c72d57
commit 3fe0725a97
14 changed files with 711 additions and 15 deletions

@@ -9,7 +9,7 @@
"plugins": [
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}

@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"description": "Memory compression system for Claude Code - persist context across sessions",
"author": {
"name": "Alex Newman"

@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"description": "Memory compression system for Claude Code - persist context across sessions",
"author": {
"name": "Alex Newman",

@@ -3,7 +3,7 @@
"name": "Claude-Mem (Persistent Memory)",
"description": "OpenClaw plugin for Claude-Mem. Records observations from embedded runner sessions and streams them to messaging channels.",
"kind": "memory",
-"version": "13.8.0",
+"version": "13.8.1",
"license": "Apache-2.0",
"author": "thedotmack",
"homepage": "https://claude-mem.ai",

@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",

@@ -0,0 +1,358 @@
# Codex Restart Handoff — claude-mem Recovery Release
You are Codex in the `thedotmack/claude-mem` repo. Continue from the current
working tree; do not restart analysis from scratch and do not revert user or
previous-agent changes.
## User Intent
The user wants `claude-mem` working correctly in Codex first, then wants the
recovery-release plan executed to bring users back. Ignore the Gemini-generated
artifacts unless the user explicitly asks for them again.
Immediate user context:
- The user is going to restart Codex to confirm `claude-mem` works.
- The plugin cache was broken and has been patched locally.
- After confirming Codex works, start executing the recovery plan, beginning
with preserving/landing the Codex compatibility fix and then Phase 0/Phase 1
of the release plan.
## Repo And Working Tree
Current repo/worktree:
```text
/Users/alexnewman/.superset/worktrees/df8069a7-eb08-4626-9d3d-918d1e12eb9f/night-parsnip
```
Expected relevant working tree changes:
```text
M plugin/hooks/codex-hooks.json
M plugin/scripts/worker-service.cjs
M scripts/build-hooks.js
M src/services/worker-service.ts
M tests/infrastructure/plugin-distribution.test.ts
M tests/infrastructure/worker-json-status.test.ts
?? plans/2026-06-24-release-recovery-plan.md
?? plans/2026-06-24-codex-restart-handoff.md
```
The untracked release plan is intentional. Preserve it.
## What Was Fixed Already
Codex compatibility root cause:
1. `plugin/hooks/codex-hooks.json` had an unsupported root-level
`description` key. Codex 0.140+ rejects unknown hook-config root keys, so
hooks looked installed/enabled but did not load correctly.
2. Codex hook startup paths could emit Claude-style `suppressOutput`, which
Codex rejects under its current hook output contract.
Implemented repo changes:
- `plugin/hooks/codex-hooks.json`
- Removed root `description`; root keys are now only `["hooks"]`.
- Added `CLAUDE_MEM_CODEX_HOOK=1` to every Codex hook command.
- `scripts/build-hooks.js`
- Codex hook generation now injects `CLAUDE_MEM_CODEX_HOOK=1`.
- Build verification now fails if Codex hooks contain unsupported root keys.
- `src/services/worker-service.ts`
- `buildStatusOutput()` includes `suppressOutput` by default for Claude.
- `worker-service start` omits `suppressOutput` when
`CLAUDE_MEM_CODEX_HOOK=1`.
- `plugin/scripts/worker-service.cjs`
- Regenerated bundle carrying the worker-service fix.
- Tests updated in:
- `tests/infrastructure/plugin-distribution.test.ts`
- `tests/infrastructure/worker-json-status.test.ts`
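The worker-status change above can be pictured with a minimal sketch. It assumes a simplified `buildStatusOutput(status, env)` signature; the real function in `src/services/worker-service.ts` may take different arguments and carry more fields.

```javascript
// Hypothetical sketch: the signature and field handling are assumptions,
// not the code in src/services/worker-service.ts.
function buildStatusOutput(status, env = process.env) {
  const output = { continue: true, status };
  // Codex hook commands set CLAUDE_MEM_CODEX_HOOK=1, and Codex rejects
  // the Claude-style suppressOutput field, so it is only added otherwise.
  if (env.CLAUDE_MEM_CODEX_HOOK !== '1') {
    output.suppressOutput = true;
  }
  return output;
}

console.log(JSON.stringify(buildStatusOutput('ready', { CLAUDE_MEM_CODEX_HOOK: '1' })));
console.log(JSON.stringify(buildStatusOutput('ready', {})));
```

Keying the behavior on an environment variable keeps a single worker bundle usable by both clients; only the hook command line differs.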
Local installed plugin copies were also patched so the user's restarted Codex
session should work immediately:
```text
/Users/alexnewman/.claude/plugins/marketplaces/thedotmack/plugin
/Users/alexnewman/.codex/plugins/cache/claude-mem-local/claude-mem/13.8.0
```
Both local copies were verified:
```text
rootKeys=["hooks"]
commandCount=7
missingCodexEnv=0
```
Installed-cache smoke results:
```text
Codex env:
{"continue":true,"status":"ready"}
Default/Claude env:
{"continue":true,"status":"ready","suppressOutput":true}
```
`codex doctor --summary --ascii` after the patch:
```text
Configuration config loaded
16 ok | 1 idle | 2 notes | 1 warn | 0 fail degraded
```
The remaining doctor warning was unrelated stale thread state:
```text
threads: rollout files are missing from the state DB
```
## Verification Already Run
These passed after the Codex fix:
```bash
bun test tests/infrastructure/plugin-distribution.test.ts tests/infrastructure/worker-json-status.test.ts tests/hook-lifecycle.test.ts
npm run typecheck:root
npm run lint:spawn-env
npm run lint:hook-io
```
Focused test result:
```text
124 pass
3 skip
0 fail
403 expect() calls
```
## Primary Plan File
Use this as the release execution source of truth:
```text
plans/2026-06-24-release-recovery-plan.md
```
That plan cross-references the PostHog report and GitHub issues/PRs. It defines
the recovery-release blockers:
1. Setup/dependency preflight and graceful degradation.
2. Chroma launch/lifecycle reliability.
3. Observer output loop fix.
4. Codex hook compatibility.
5. Gemini request-shape fix.
6. Platform session identity fix.
7. Chroma backfill JSON tolerance.
8. Telemetry UUID compatibility.
9. Upgrade/install survival for partial dependency installs.
The Codex hook compatibility blocker is already implemented locally and should
be treated as the first completed release slice, subject to final review/commit.
## Start Here After Restart
First confirm the restarted Codex session can load the plugin:
```bash
codex doctor --summary --ascii
codex plugin list
```
Expected:
- `claude-mem@claude-mem-local` is installed and enabled.
- `codex doctor` shows config loaded and no plugin/hook config failure.
- Any stale-thread warning is unrelated unless it changes.
Then verify the local cache still has the patched hook file:
```bash
node - <<'NODE'
const fs = require('fs');
const paths = [
'/Users/alexnewman/.claude/plugins/marketplaces/thedotmack/plugin/hooks/codex-hooks.json',
'/Users/alexnewman/.codex/plugins/cache/claude-mem-local/claude-mem/13.8.0/hooks/codex-hooks.json',
];
function commands(hooks) {
return Object.values(hooks).flatMap(groups =>
groups.flatMap(group => (group.hooks || []).map(hook => hook.command || ''))
);
}
for (const p of paths) {
const json = JSON.parse(fs.readFileSync(p, 'utf8'));
const cmds = commands(json.hooks);
console.log(p);
console.log('rootKeys=' + JSON.stringify(Object.keys(json)));
console.log('commandCount=' + cmds.length);
console.log('missingCodexEnv=' + cmds.filter(c => !c.includes('CLAUDE_MEM_CODEX_HOOK=1')).length);
}
NODE
```
Then smoke the installed worker-service output shape:
```bash
node - <<'NODE'
const { spawnSync } = require('child_process');
const runner = '/Users/alexnewman/.codex/plugins/cache/claude-mem-local/claude-mem/13.8.0/scripts/bun-runner.js';
const worker = '/Users/alexnewman/.codex/plugins/cache/claude-mem-local/claude-mem/13.8.0/scripts/worker-service.cjs';
for (const [label, env] of [
['codex', { ...process.env, CLAUDE_MEM_CODEX_HOOK: '1' }],
['default', { ...process.env }],
]) {
const result = spawnSync(process.execPath, [runner, worker, 'start'], { env, encoding: 'utf8' });
console.log(label + ': exit=' + result.status + ' stdout=' + result.stdout.trim() + ' stderr=' + result.stderr.trim());
}
NODE
```
Expected:
```text
codex: exit=0 stdout={"continue":true,"status":"ready"} stderr=
default: exit=0 stdout={"continue":true,"status":"ready","suppressOutput":true} stderr=
```
## Execution Plan
### Step 1 — Preserve The Codex Compatibility Fix
Review the six modified files, including the generated bundle diff. Do not throw away
`plugin/scripts/worker-service.cjs`; it is the distributed artifact for users.
Run:
```bash
git diff --stat
git diff -- plugin/hooks/codex-hooks.json scripts/build-hooks.js src/services/worker-service.ts tests/infrastructure/plugin-distribution.test.ts tests/infrastructure/worker-json-status.test.ts
```
Then rerun:
```bash
bun test tests/infrastructure/plugin-distribution.test.ts tests/infrastructure/worker-json-status.test.ts tests/hook-lifecycle.test.ts
npm run typecheck:root
npm run lint:spawn-env
npm run lint:hook-io
```
If the user wants a commit, commit only the Codex compatibility fix, plus the
handoff and recovery plan files if the user wants those included. Do not include
unrelated generated churn.
Suggested commit message:
```text
fix(codex): ship strict plugin hooks and Codex-safe worker status
```
### Step 2 — Start Release Branch Discipline
Use the plan file:
```text
plans/2026-06-24-release-recovery-plan.md
```
Target branch from the plan:
```text
release/recovery-2026-06-24
```
Before creating or switching branches, inspect current branch and status. Do
not drop local changes.
```bash
git branch --show-current
git status --short
```
If continuing in this worktree, keep the release branch scoped to recovery
blockers only.
### Step 3 — Execute Remaining Release Phases
Codex compatibility is plan-19 / Phase 1A and is already implemented locally.
Next priorities from the recovery plan:
1. Phase 0: branch/freeze and route GitHub issues/PRs to recovery blockers.
2. Phase 1: setup/install survival.
- Add dependency health for Claude CLI, Bun, uv/uvx, plugin hard deps, and
provider key state.
- Runtime missing Claude CLI becomes `setup_required`, not retry spam.
- Runtime missing `uvx` disables vector search, but SQLite capture/search
continues.
- Replace `Bun.randomUUIDv5` in `src/services/telemetry/backfill.ts`.
3. Phase 2: Chroma lifecycle reliability.
- Prefer `uvx --from chroma-mcp==<pin> chroma-mcp`.
- Split prewarm timeout from MCP handshake timeout.
- Capture bounded stderr on connect failure.
- Treat backoff/unavailable as "not synced yet", not user-flow throws.
4. Phase 3: observer output and quota pause.
- Drop non-XML/prose instead of poison-respawn.
- Pause on quota/weekly-limit messages without losing pending work.
5. Phase 4: Gemini request envelopes and platform-namespaced session identity.
6. Phase 5: Chroma backfill malformed JSON tolerance.
Do not spend time on new providers, broad refactors, or feature bundles unless
they directly unblock one of the recovery blockers.
## GitHub / Report Context
The plan was built from:
```text
Attached PostHog report:
/Users/alexnewman/.superset/host/e7c5cb1f-3f94-4b7b-b6b7-37a97d3b4a51/attachments/08a4bcfe-650a-4094-a534-815c15b67701/08a4bcfe-650a-4094-a534-815c15b67701.json
GitHub snapshots:
/tmp/claude_mem_open_issues_full.json
/tmp/claude_mem_open_prs.json
```
High-impact report categories:
- Claude executable not found.
- `uvx` not found.
- `Bun.randomUUIDv5` not a function.
- Chroma 30s timeout.
- MCP `-32000 Connection closed`.
- Chroma backoff throws into sync.
- Gemini 400 bad request.
- Platform source conflict.
- JSON parse error with Chinese/non-JSON strings.
- Observer poison/respawn loop.
Codex-specific blockers:
- #2972 / #2947: Codex refuses to load hooks config.
- #2975 / #2871: Codex rejects hook output.
- #2962 / #2941 / #2914: Codex/Windows spawn contract regressions.
PRs to consolidate from the plan:
- #3039 partial dependency install survival.
- #3033 UTF-8 BOM settings readers.
- #3018 proxy env preservation.
- #3028 observer poisoned respawn fix.
- #2920 Chroma uvx prewarm.
- #2880 Chroma `uvx --from`.
- #2887 bundle zod.
- #2849 SQLite busy timeout.
- #2953 Codex compatibility, if it still rebases cleanly.
- #2945 Windows hook install spawn/PATH fixes, if still needed.
## Operating Constraints
- Keep edits scoped to recovery blockers.
- Preserve user changes and untracked plan files.
- Use `rg` for search.
- Use `apply_patch` for manual file edits.
- Do not use destructive git commands.
- Verify with focused tests before broad tests.
- For generated bundles, ensure only required bundle artifacts remain modified.

@@ -0,0 +1,338 @@
# 2026-06-24 Release Recovery Plan
## Goal
Ship a reliability-first recovery release that removes the largest June 2026 error sources, stops observer/chroma churn, and gives users a setup path that surfaces failures during setup rather than only after their first session starts.
This plan cross-references the attached PostHog error report with the live `thedotmack/claude-mem` GitHub backlog as of 2026-06-24:
- Open GitHub issues: 89
- Open GitHub PRs: 123
- Attached report: 10 high-priority PostHog error categories, about 987k occurrences / 21k affected users in 30 days
## Source Evidence
- Attached report: `/Users/alexnewman/.superset/host/e7c5cb1f-3f94-4b7b-b6b7-37a97d3b4a51/attachments/08a4bcfe-650a-4094-a534-815c15b67701/08a4bcfe-650a-4094-a534-815c15b67701.json`
- GitHub snapshots created from `gh`:
- `/tmp/claude_mem_open_issues_full.json`
- `/tmp/claude_mem_open_prs.json`
- Local code surfaces:
- `src/shared/find-claude-executable.ts`
- `src/services/sync/ChromaMcpManager.ts`
- `src/services/sync/ChromaSync.ts`
- `src/services/worker/GeminiProvider.ts`
- `src/services/worker/OpenAICompatibleProvider.ts`
- `src/services/sqlite/SessionStore.ts`
- `src/services/telemetry/backfill.ts`
- `src/services/worker/agents/ResponseProcessor.ts`
- `plugin/hooks/codex-hooks.json`
- `scripts/build-hooks.js`
- `src/cli/adapters/codex.ts`
- `src/services/integrations/CodexCliInstaller.ts`
## Crosswalk
| Report item | Report impact | Matching GitHub issues | Matching PRs | Current root cause |
|---|---:|---|---|---|
| Claude executable not found | 466,499 occurrences / 9,039 users | No exact open issue found | No exact PR found | Claude CLI dependency is discovered only when the generator starts; no first-run preflight or one-time remediation state. |
| `uvx` not found | 67,958 occurrences / about 966 users | Partly covered by #2961 / plan #2779 | #2920, #2880, #2940, partly #3039 | Installer has `ensureUv`, but existing installs can still hit runtime `uvx` spawn failure. Runtime Chroma path does not degrade cleanly when uvx is absent. |
| `Bun.randomUUIDv5` not a function | 5,908 / 36 | No exact open issue found | No exact PR found | `src/services/telemetry/backfill.ts` calls a Bun-specific API; replacing it with a small UUIDv5 helper is better than requiring newer Bun. |
| Chroma 30s timeout | 102,186 / 7,061 | #2897, #2961, #3016, #3012 | #2920, #2880/#2940, #2536 | The MCP handshake timeout includes cold `uvx` environment installation; repeated timeout kills prevent cache completion and can leak temp dirs/processes. |
| MCP `-32000 Connection closed` | 210,951 / 2,833 | #2879, #2939, #2954, #2961, #2959, #2950 | #2880, #2940, #2536 | Multiple causes collapse to a generic close: old uv rejects bare `chroma-mcp==...`, Windows shell handling mangles args, and stderr is not surfaced. |
| Chroma backoff throws into sync | 5,810 / 639 | #3016, #2896, #2959 | #2536, partly local singleton tests | `ensureCollectionExists()` can throw before `addDocuments()` reaches its per-batch catch path; write paths should return "not synced yet" instead of throwing user-visible errors. |
| Gemini bad request 400 | 100,784 / 555 | No exact open issue found | No exact PR found | Gemini request shaping/truncation can produce invalid conversation envelopes; 400s are classified but not prevented or bucketed by closed reason. |
| Platform source conflict | 22,078 / 465 | No exact open issue found | No exact PR found | `sdk_sessions.content_session_id` is globally unique, and tests currently require throwing when the same raw session ID appears from two platforms. |
| JSON parse error with Chinese chars | 4,965 / 78 | Partly plan #2782, no exact issue | No exact PR found | `ChromaSync.formatObservationDocs()` raw-parses `facts` and `concepts`; bad legacy rows can kill backfill instead of being quarantined. |
| Observer poison/respawn loop | Not in report top 10, but dominates GitHub | #3037, #3032, #3022, #3007, #2960, #2955, #2935, #2817 | #3028, #2857, #2943, #2927, #2901 | Non-XML/idle/quota prose is treated as invalid output and can trigger respawn loops that wipe context and stop memory generation. |
GitHub-only Codex compatibility blockers to include in the same recovery release:
| Codex blocker | Matching GitHub issues | Matching PRs | Current root cause |
|---|---|---|---|
| Codex refuses to load hooks config | #2972, #2947 | #2953, #2948 | `plugin/hooks/codex-hooks.json` still has a root-level `description` field. Codex 0.140.0-0.142.0 rejects unknown root keys, so all hooks appear enabled but never run. |
| Codex rejects hook output | #2975, #2871 | #2953 | Some Codex hook paths can emit Claude-style `suppressOutput`, which current Codex reports as an unsupported field on PreToolUse/PostToolUse. |
| Codex/Windows spawn contract regressions | #2962, #2941, #2914 | #2945, #2598 | Published bundles and hook commands have had `shell: true` + args and fragile login-shell PATH probes; Codex installs are sensitive to both. |
## Release Scope
This release should be a recovery release, not a feature release. Hold broad feature PRs unless they remove a top recovery blocker.
Release blockers:
1. Setup/dependency preflight and graceful degradation.
2. Chroma launch/lifecycle reliability.
3. Observer output loop fix.
4. Codex hook compatibility: strict hooks schema, no unsupported output fields, and stable spawn/PATH contract.
5. Gemini request-shape fix.
6. Platform session identity fix.
7. Chroma backfill JSON tolerance.
8. Telemetry UUID compatibility.
9. Upgrade/install survival for partial dependency installs.
Explicitly hold from this release unless already required by a blocker:
- New providers or integrations: #3044, #3034, #3000, #2764, #2523, #2514.
- Broad refactors: #2878, #2877, #2632.
- Large feature bundles: #3027, #2829, #2606, #2623.
## PR Disposition
Merge or rebase into the recovery branch:
- #3039 `fix: prevent a broken/partial dependency install from bricking the worker` — clean, directly supports setup/upgrade survival.
- #3033 `fix(windows): strip UTF-8 BOM in all settings.json readers` — relevant to hook-breaking setup failures; rebase/check because merge state is unstable.
- #3018 `Preserve proxy variables during environment sanitization` — relevant to enterprise installs and provider/chroma network failures; rebase/check because merge state is unstable.
- #3028 `fix: ignore unparseable observer output instead of poisoned respawn` — use as canonical observer-loop PR if cleaned; supersede narrower #2857/#2943/#2927/#2901.
- #2920 `fix(chroma): prewarm uvx installs before the MCP connect deadline` — clean, essential for #2897.
- #2880 `fix(chroma): spawn chroma-mcp via --from so uv < 0.5.31 works` — prefer this over #2940 because it handles old uv and avoids bare positional package syntax. Pull any useful #2940 tests into the canonical Chroma PR, then close #2940 as superseded.
- #3009 or #2895 — choose one Windows stale-port recovery implementation, not both. #3009 is scoped to #2996; #2895 has the better cross-platform root-cause framing. Consolidate into one PR with tests.
- #2887 `fix(build): bundle zod into worker-service.cjs` — clean and removes a known install-bricking path.
- #2849 `fix(sqlite): apply busy_timeout to primary SQLite connections` — clean, low-risk data durability improvement.
- #2953 `Fix claude-mem codex-hooks.json for current Codex` — use as the canonical Codex compatibility PR if it rebases cleanly. It should remove the unsupported root `description`, verify Codex output never includes `suppressOutput`, and include generated artifact updates.
- #2945 `fix: install Windows Claude Code hooks without bash` — merge if the spawn/PATH changes cover Codex-distributed hooks too; otherwise pull the shared hook-template fix into the Codex compatibility PR.
Close or mark superseded after consolidation:
- #2536 if the final Chroma lifecycle PR includes singleton teardown and process-tree kill coverage.
- #2857, #2943, #2927, #2901 after #3028 lands with the broader parse/drop behavior and quota tests.
- #2940 after `--from` invocation is adopted and tested across uv versions.
- #2948 after #2953 lands, unless #2953 is abandoned and #2948 becomes the minimal hooks-schema fix.
- #2598 after the final hook-template/spawn-contract PR includes the PATH-probe behavior.
## New Plan Masters
Create these GitHub plan-master issues because the current backlog does not cover the report's biggest missing roots:
### `[plan-15] Startup Dependency Health -- preflight, runtime degradation, and repair`
Children to route:
- New: Claude CLI missing from PATH / `CLAUDE_CODE_PATH`.
- New: runtime `uvx` missing after old install.
- Existing related: #3039, #3035, #2964, #2823, #2831, #3013, #2999.
Fix sequence:
1. Add a side-effect-free dependency health module for Claude CLI, Bun, uv/uvx, plugin hard deps, and provider API key state.
2. Run it from install/repair and from worker startup.
3. Store a bounded setup status so hooks show one actionable hint and continue, instead of failing repeatedly.
4. In Claude provider startup, classify missing CLI as `setup_required` and stop retrying until settings or PATH change.
5. In Chroma startup, classify missing uvx as `vector_search_unavailable`; SQLite capture must continue.
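A minimal sketch of the classification in steps 4 and 5, assuming the probe results are gathered elsewhere and passed in as plain booleans. The function name, the `probe` shape, and the returned fields are hypothetical; only the `setup_required` and `vector_search_unavailable` states come from this plan.

```javascript
// Hypothetical shape: probe results are passed in so the classifier itself
// stays side-effect free. Real probing (PATH lookups, version checks) is
// out of scope for this sketch.
function classifyDependencyHealth(probe) {
  const findings = [];
  if (!probe.claudeCli) {
    findings.push({ dependency: 'claude-cli', state: 'setup_required' });
  }
  if (!probe.uvx) {
    findings.push({ dependency: 'uvx', state: 'vector_search_unavailable' });
  }
  return {
    healthy: findings.length === 0,
    // SQLite capture never depends on either probe in this sketch.
    sqliteCaptureEnabled: true,
    vectorSearchEnabled: Boolean(probe.uvx),
    generatorEnabled: Boolean(probe.claudeCli),
    findings,
  };
}

console.log(classifyDependencyHealth({ claudeCli: false, uvx: true }));
```

Returning data instead of throwing lets install, repair, and worker startup share one module and decide separately how to present the findings.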
### `[plan-16] Chroma Runtime Lifecycle -- launch contract, backoff semantics, and data-dir hygiene`
Children to route:
- #2879, #2897, #2896, #2907, #2939, #2950, #2954, #2959, #2961, #3012, #3016.
Fix sequence:
1. Invoke Chroma through `uvx --from chroma-mcp==<pin> chroma-mcp ...`.
2. Split prewarm timeout from MCP stdio handshake timeout.
3. Capture and log child stderr on connect failure.
4. Make uv/chroma dependency versions deterministic enough to avoid surprise cold rebuilds.
5. Keep exactly one Chroma subprocess tree per worker and reap it on reconnect, backfill close, worker stop, and failed connect.
6. Treat backoff/unavailable as "write not synced yet" from `ChromaSync`, not as a thrown user-flow error.
7. Add Chroma temp/cache cleanup guidance or automated safe cleanup after repeated aborted prewarm attempts.
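Step 3 needs stderr capture that cannot grow without limit while a child process is spamming errors. One possible shape, with a hypothetical `createBoundedLog` helper that keeps only the most recent characters:

```javascript
// Hypothetical helper: keeps the last maxChars characters of everything
// pushed into it and counts how many characters were discarded.
function createBoundedLog(maxChars = 4000) {
  let tail = '';
  let dropped = 0;
  return {
    push(chunk) {
      const combined = tail + String(chunk);
      if (combined.length > maxChars) {
        dropped += combined.length - maxChars;
        tail = combined.slice(-maxChars);
      } else {
        tail = combined;
      }
    },
    snapshot() {
      return { tail, dropped };
    },
  };
}

const log = createBoundedLog(10);
log.push('abcdefgh');
log.push('ijklmnop');
console.log(log.snapshot());
```

With a child created by Node's `child_process.spawn`, chunks could be fed in through `child.stderr.on('data', chunk => log.push(chunk))`. How the MCP stdio transport exposes the child's stderr is not covered here and would need checking against the transport in use.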
### `[plan-17] Provider Request Envelopes -- Gemini/OpenRouter shape, truncation, and closed-error reasons`
Children to route:
- New: Gemini 400 bad request from PostHog report.
- Related provider issues in #2785 only if they are defects, not features.
Fix sequence:
1. Add provider-specific request-envelope builders with tests.
2. For Gemini, enforce a user-first, alternating `contents[]` sequence after truncation.
3. Preserve the current instruction/init message when possible; if truncation must drop it, rebuild a compact instruction wrapper instead of sending an orphaned assistant/model turn.
4. Map upstream 400 bodies to closed categories: `role_sequence`, `context_limit`, `model_unsupported`, `api_key`, `unknown_bad_request`.
5. Emit scrubbed telemetry counters for those closed categories only.
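Steps 2 and 3 can be sketched as a repair pass over the truncated history. This assumes each turn looks like `{ role: 'user' | 'model', parts: [{ text }] }`; the function name and the fallback instruction text are hypothetical.

```javascript
// Hypothetical sketch: merges consecutive same-role turns so roles
// alternate, then prepends a compact instruction turn if truncation left
// a model turn at the front.
function repairGeminiContents(contents, fallbackInstruction) {
  const repaired = [];
  for (const turn of contents) {
    const last = repaired[repaired.length - 1];
    if (last && last.role === turn.role) {
      last.parts = last.parts.concat(turn.parts);
    } else {
      // Copy the turn so the caller's history is not mutated.
      repaired.push({ role: turn.role, parts: turn.parts.slice() });
    }
  }
  if (repaired.length > 0 && repaired[0].role !== 'user') {
    repaired.unshift({ role: 'user', parts: [{ text: fallbackInstruction }] });
  }
  return repaired;
}

const truncated = [
  { role: 'model', parts: [{ text: 'A' }] },
  { role: 'user', parts: [{ text: 'B' }] },
  { role: 'user', parts: [{ text: 'C' }] },
  { role: 'model', parts: [{ text: 'D' }] },
];
console.log(repairGeminiContents(truncated, 'Continue observing.').map(t => t.role));
```

Merging rather than dropping duplicate-role turns keeps the content that survived truncation; whether merged turns should be capped in size is left open.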
### `[plan-18] Platform-Namespaced Session Identity -- one raw session ID can exist in multiple clients`
Children to route:
- New: platform source conflict `existing=claude, received=cursor`.
Fix sequence:
1. Introduce a canonical internal session key: `platform_source + '\0' + content_session_id`.
2. Migrate `sdk_sessions` away from global `content_session_id TEXT UNIQUE` to uniqueness on `(platform_source, content_session_id)`.
3. Migrate `pending_messages` uniqueness and joins to include `session_db_id` or the same composite platform key.
4. Replace the current throw in `createSDKSession()` with get-or-create per platform.
5. Update tests that currently expect a conflict.
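The key shape from step 1 and the get-or-create behavior from step 4 can be illustrated with an in-memory stand-in. The `Map` replaces the SQLite table purely for illustration, and the registry API is hypothetical.

```javascript
// Canonical key from step 1: platform_source + '\0' + content_session_id.
function sessionKey(platformSource, contentSessionId) {
  return platformSource + '\0' + contentSessionId;
}

// Hypothetical in-memory stand-in for sdk_sessions with composite uniqueness.
function createSessionRegistry() {
  const rows = new Map();
  let nextId = 1;
  return {
    getOrCreate(platformSource, contentSessionId) {
      const key = sessionKey(platformSource, contentSessionId);
      let row = rows.get(key);
      if (!row) {
        row = { id: nextId++, platformSource, contentSessionId };
        rows.set(key, row);
      }
      return row;
    },
    size() {
      return rows.size;
    },
  };
}

const registry = createSessionRegistry();
const a = registry.getOrCreate('claude', 'abc');
const b = registry.getOrCreate('cursor', 'abc');
console.log(a.id, b.id, registry.size());
```

The NUL separator cannot be confused with ordinary identifier characters, so two different (platform, session) pairs do not collapse into the same key as long as neither part contains NUL.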
### `[plan-19] Codex Hook Compatibility -- strict schema, output contract, and spawn safety`
Children to route:
- #2972, #2947, #2975, #2871, #2962, #2941, #2914.
- PRs to consolidate or close: #2953, #2948, #2945, #2598, #2692.
Fix sequence:
1. Remove root metadata from `plugin/hooks/codex-hooks.json`; Codex hook config root must be only the keys Codex accepts.
2. Add build-time validation in `scripts/build-hooks.js` that fails if the Codex hooks file contains unsupported root keys.
3. Verify every Codex hook path goes through `codexAdapter.formatOutput()` and never emits Claude-only `suppressOutput`.
4. Keep Codex SessionStart context in `hookSpecificOutput.additionalContext` only.
5. Apply the hook shell-template/spawn contract to generated Codex hooks and the npx/Codex installer path: no `shell: true` + args, no required login-shell PATH probe.
6. Add a clean-room Codex plugin smoke check for Codex 0.140.0+ shape: hooks config parses, SessionStart/UserPromptSubmit/PreToolUse/PostToolUse/Stop all return accepted output shapes.
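A sketch of the build-time check from step 2, assuming `hooks` is the only root key Codex accepts and that every Codex hook command should carry a `CLAUDE_MEM_CODEX_HOOK=1` marker. The function name and error strings are hypothetical; the real check in `scripts/build-hooks.js` may differ.

```javascript
// Assumption: "hooks" is the only accepted root key. Adjust the list if
// Codex documents others.
const ALLOWED_ROOT_KEYS = ['hooks'];
const CODEX_ENV_MARKER = 'CLAUDE_MEM_CODEX_HOOK=1';

// Returns a list of problems; an empty list means the config passes.
function validateCodexHooksConfig(config) {
  const errors = [];
  for (const key of Object.keys(config)) {
    if (!ALLOWED_ROOT_KEYS.includes(key)) {
      errors.push(`unsupported root key: ${key}`);
    }
  }
  for (const [event, groups] of Object.entries(config.hooks || {})) {
    for (const group of groups) {
      for (const hook of group.hooks || []) {
        if (!(hook.command || '').includes(CODEX_ENV_MARKER)) {
          errors.push(`${event}: command missing ${CODEX_ENV_MARKER}`);
        }
      }
    }
  }
  return errors;
}

console.log(validateCodexHooksConfig({
  description: 'legacy metadata',
  hooks: { SessionStart: [{ hooks: [{ command: 'node start.js' }] }] },
}));
```

A build script could exit non-zero whenever the returned list is non-empty, so a regression never reaches the published plugin.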
## Implementation Phases
### Phase 0 -- Branch and freeze
Create `release/recovery-2026-06-24`. Merge only recovery-scoped fixes above. No new providers, UI features, or storage refactors.
Verification:
- `gh pr list` for the release branch contains only blocker PRs.
- PR descriptions list `Closes #...` for every child issue covered.
### Phase 1 -- Setup and install survival
Implement plan-15 plus merge #3039/#3033/#3018/#2887 as applicable.
Required code:
- Shared dependency-health module used by installer, repair, worker startup, and settings/doctor.
- Replace `Bun.randomUUIDv5` in `src/services/telemetry/backfill.ts` with a local deterministic UUIDv5 implementation or small dependency-free helper.
- One-shot user-facing remediation for missing Claude CLI and uvx.
Tests:
- Missing Claude CLI does not respawn or block hooks; it records setup-required state.
- Missing uvx disables vector search but leaves SQLite capture/search alive.
- Telemetry backfill UUID is stable across runs without `Bun.randomUUIDv5`.
- Broken plugin deps do not kill a healthy previous worker.
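One way to satisfy the UUID requirement without `Bun.randomUUIDv5` is a small name-based UUID helper built on Node's `crypto` module. This is a sketch following the RFC 4122 version 5 layout (SHA-1 over namespace bytes plus name); the function name and argument order are assumptions, not the code in `src/services/telemetry/backfill.ts`.

```javascript
const { createHash } = require('crypto');

// RFC 4122 predefined DNS namespace, used here only as an example namespace.
const NAMESPACE_DNS = '6ba7b810-9dad-11d1-80b4-00c04fd430c8';

// Hypothetical helper: deterministic UUIDv5 for (name, namespace).
function uuidv5(name, namespace) {
  const nsBytes = Buffer.from(namespace.replace(/-/g, ''), 'hex');
  if (nsBytes.length !== 16) {
    throw new TypeError('namespace must be a UUID string');
  }
  const hash = createHash('sha1')
    .update(nsBytes)
    .update(Buffer.from(name, 'utf8'))
    .digest();
  const bytes = Buffer.from(hash.subarray(0, 16));
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // set version 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set RFC 4122 variant
  const hex = bytes.toString('hex');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join('-');
}

console.log(uuidv5('python.org', NAMESPACE_DNS));
```

Because the output depends only on the inputs, the "stable across runs" test can assert equality between two calls and against a stored fixture.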
### Phase 1A -- Codex hook compatibility
Implement plan-19 before the recovery release candidate is cut. This is a user-visible compatibility gate even though it is not in the PostHog top-10 report.
Required code:
- Regenerate `plugin/hooks/codex-hooks.json` without the root `description`.
- Add Codex hook-config root-key validation to the build.
- Confirm Codex output formatting strips `suppressOutput` on success, skipped input, worker-unavailable, and error paths.
- Fold any needed spawn/PATH fixes into the shared hook-template path used by Codex.
Tests:
- `plugin/hooks/codex-hooks.json` root keys match the Codex-accepted schema.
- Codex adapter output never includes `suppressOutput`.
- Hook-command skipped-input and worker-unavailable paths do not leak `suppressOutput` for Codex.
- Generated Codex hooks cover SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, and Stop without unsupported fields.
### Phase 2 -- Chroma runtime reliability
Implement plan-16 by consolidating #2920 + #2880 + current singleton/process-tree work.
Required code:
- `buildCommandArgs()` emits `--from chroma-mcp==0.2.6 chroma-mcp`.
- Prewarm timeout is configurable separately from MCP handshake timeout.
- `StdioClientTransport` stderr is drained into bounded logs.
- `addDocuments()` returns `0` when collection creation hits known Chroma-unavailable/backoff states.
- Backfill close/failed connect always reaps subprocess tree.
Tests:
- uv 0.5.29 and latest uv launch with the same args.
- Cold prewarm exceeding 30s does not get killed by MCP handshake timeout.
- Five concurrent `ensureConnected()` calls spawn one process.
- Backoff during prompt sync returns no throw and leaves watermark unchanged.
- Windows direct spawn never routes `>` / `<` through `cmd.exe`.
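The launch contract above can be sketched as a pure argument builder. The signature is hypothetical and the extra flag in the example is a placeholder, not a real `chroma-mcp` option; only the `--from chroma-mcp==0.2.6 chroma-mcp` prefix comes from this plan.

```javascript
const CHROMA_MCP_PIN = '0.2.6';

// Hypothetical signature: returns the argv to pass to `uvx`.
function buildCommandArgs(extraArgs = [], pin = CHROMA_MCP_PIN) {
  return ['--from', `chroma-mcp==${pin}`, 'chroma-mcp', ...extraArgs];
}

// Spawn options for a direct spawn: no shell, so characters such as
// `>` and `<` in arguments are never interpreted by cmd.exe or sh.
function buildSpawnOptions() {
  return { shell: false, stdio: ['pipe', 'pipe', 'pipe'] };
}

console.log(['uvx', ...buildCommandArgs(['--example-flag', 'a>b'])].join(' '));
```

Keeping the builder free of process state makes the "same args on uv 0.5.29 and latest uv" test a plain array comparison.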
### Phase 3 -- Observer loop and quota pause
Land the broad #3028 behavior, then add the missing quota branch from #3037.
Required code:
- Non-XML output is logged and dropped unless it is a structured provider error.
- Idle/prose skip acknowledgements confirm the claimed batch and do not increment respawn debt.
- Claude subscription weekly-limit prose is detected before parser invalid-output handling; generator pauses until reset/backoff instead of respawning.
- Remove or rename `poisoned` telemetry once no behavior depends on it.
Tests:
- Text containing "context window" but not valid XML is dropped and does not trigger a respawn.
- Repeated "No observations to record" never respawns.
- Weekly-limit message pauses generation and does not consume/drop pending work.
- Pending queue behavior differs intentionally: a skip/no-op confirms the claimed batch; a quota pause preserves it.
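The intended decision order can be sketched as a small classifier. The regular expressions are illustrative assumptions, not the strings any provider is known to emit, and the XML check is a cheap shape test rather than a parser.

```javascript
// Assumed patterns, for illustration only.
const QUOTA_PATTERNS = [/weekly limit/i, /usage limit/i];
const SKIP_PATTERNS = [/no observations to record/i];

// Hypothetical classifier. `queue` is 'confirm' or 'preserve' where this
// plan states the behavior, and null where it is decided elsewhere.
function classifyObserverOutput(text) {
  const trimmed = text.trim();
  if (/^<[A-Za-z][\w-]*[\s\S]*>$/.test(trimmed)) {
    return { action: 'parse', queue: null, respawn: false };
  }
  if (QUOTA_PATTERNS.some(pattern => pattern.test(trimmed))) {
    return { action: 'pause', queue: 'preserve', respawn: false };
  }
  if (SKIP_PATTERNS.some(pattern => pattern.test(trimmed))) {
    return { action: 'skip', queue: 'confirm', respawn: false };
  }
  // Anything else is logged and dropped; it never triggers a respawn.
  return { action: 'drop', queue: null, respawn: false };
}

console.log(classifyObserverOutput('The context window is nearly full.'));
```

Checking for XML first means an observation whose body happens to mention a limit is still parsed; quota prose is only considered for output that would otherwise count as invalid.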
### Phase 4 -- Provider and session identity fixes
Implement plan-17 and plan-18.
Required code:
- Gemini `contents[]` builder that repairs alternation after truncation.
- Closed 400 categories with no raw provider body in telemetry.
- SQLite migration for `(platform_source, content_session_id)` uniqueness.
- API/search joins updated to use `session_db_id` or composite identity where needed.
Tests:
- Truncated Gemini history never starts with `model`.
- Odd/even max-message truncation keeps a valid Gemini role sequence.
- Same raw session ID from Claude and Cursor creates two rows, no conflict.
- Existing single-platform DB migrates without losing observations, summaries, or pending messages.
### Phase 5 -- Backfill/data tolerance
Fold the JSON-parse issue into plan-09.
Required code:
- Replace raw `JSON.parse(obs.facts)` / `JSON.parse(obs.concepts)` in `ChromaSync` with tolerant JSON-array parsing.
- Quarantine malformed legacy columns by row id and continue backfill.
- Add closed telemetry/log reason: `malformed_json_column`.
Tests:
- `facts = '开始'` or raw non-JSON string does not crash backfill.
- Valid JSON arrays still produce fact/concept documents.
- One malformed row does not prevent later rows from syncing.
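A minimal sketch of tolerant array parsing with per-row quarantine. The helper names and row shape are hypothetical; only the `malformed_json_column` reason and the `facts` column come from this plan.

```javascript
// Hypothetical helper: never throws; reports whether the column was usable.
function parseJsonArrayTolerant(raw) {
  if (raw === null || raw === undefined || raw === '') {
    return { values: [], malformed: false };
  }
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed)
      ? { values: parsed, malformed: false }
      : { values: [], malformed: true };
  } catch {
    return { values: [], malformed: true };
  }
}

// Builds fact documents and quarantines bad rows by id instead of aborting.
function collectFactDocs(rows) {
  const docs = [];
  const quarantined = [];
  for (const row of rows) {
    const facts = parseJsonArrayTolerant(row.facts);
    if (facts.malformed) {
      quarantined.push({ id: row.id, reason: 'malformed_json_column' });
      continue;
    }
    for (const fact of facts.values) {
      docs.push({ rowId: row.id, text: String(fact) });
    }
  }
  return { docs, quarantined };
}

console.log(collectFactDocs([
  { id: 1, facts: '开始' },
  { id: 2, facts: '["a","b"]' },
]));
```

This sketch skips the whole row when `facts` is malformed; quarantining only the bad column while still syncing the rest of the row is an equally valid reading of the requirement.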
## Release Verification Matrix
| Axis | Required proof |
|---|---|
| Clean install | macOS, Linux, Windows install/repair succeeds; missing Claude CLI gives actionable setup state. |
| Existing broken install | Partial deps, BOM settings, missing uvx, stale worker port all degrade or recover without blocking hooks. |
| Codex | Codex 0.140.0+ parses `codex-hooks.json`; all five Codex hook events return accepted output without `suppressOutput`. |
| Chroma | uv 0.5.29, latest uv, slow cold cache, Windows direct-spawn, process leak regression. |
| Provider | Gemini long histories and odd truncation limits do not generate invalid request bodies. |
| Multi-client | Claude + Cursor with same raw session id do not conflict. |
| Data pipeline | Chroma backfill survives malformed JSON and Chroma backoff. |
| Observer | Idle/prose/quota outputs do not poison-loop. |
| Packaging | `npm run build`, `npm run typecheck:root`, targeted Bun test matrix, clean-room smoke install. |
## Ship Criteria
Ship only when:
- All release blocker tests pass locally and in CI.
- `gh issue list --state open` has all report-related symptoms routed to plan masters or closing PRs.
- Codex compatibility issues #2972 and #2975 are closed by the release PR or explicitly superseded by one merged Codex compatibility PR.
- PR body for the recovery release has `Closes #...` for the covered child issues.
- Post-release dashboard tracks these closed categories: setup_required, chroma_unavailable, chroma_backoff, provider_bad_request_category, observer_invalid_output_dropped, quota_paused, malformed_json_column.
## Post-Release Watch
For 72 hours after release:
- PostHog top-10 report items should drop materially, especially:
- Claude executable not found
- uvx not found
- Chroma timeout / connection closed / backoff
- Gemini 400
- platform source conflict
- JSON parse error
- GitHub Codex intake for hook parse failures and unsupported `suppressOutput` should stop after users upgrade.
- GitHub intake should route new symptoms into plan masters, not create standalone open issues.
- If Chroma errors remain high after launch fixes, prioritize remote-Chroma opt-out/disable flow and exact dependency pins next.

@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"description": "Memory compression system for Claude Code - persist context across sessions",
"author": {
"name": "Alex Newman"

@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "13.8.0",
+"version": "13.8.1",
"description": "Memory compression system for Claude Code - persist context across sessions",
"author": {
"name": "Alex Newman",

@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
-"version": "13.8.0",
+"version": "13.8.1",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long