ensureProjectAllowed only checks a key's optional project scope, so a
team-scoped key could call DELETE /v1/projects/:projectId/memory with any
projectId and get 200 { purged: true } with zero counts — misreporting an
unauthorized or nonexistent purge as success. Verify the project belongs to
the team (getByIdForTeam) before purging and 404 otherwise. The underlying
purge was already team-scoped, so no cross-team data was ever deleted; this
fixes the misleading success response. Addresses greptile P1 on PR #3089.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Server API
REST V1 is mounted under /v1; legacy worker routes remain under /api.
Available beta endpoints:
- GET /healthz
- GET /v1/info
- GET /v1/projects
- POST /v1/projects
- GET /v1/projects/:id
- POST /v1/sessions/start
- POST /v1/sessions/:id/end
- GET /v1/sessions/:id
- POST /v1/events
- POST /v1/events/batch
- GET /v1/events/:id
- POST /v1/memories
- GET /v1/memories/:id
- PATCH /v1/memories/:id
- POST /v1/search
- POST /v1/context
- ALL /v1/mcp (remote MCP recall; see below)
- POST /v1/keys
- GET /v1/connect
- GET /v1/usage
- DELETE /v1/memories/:id
- DELETE /v1/projects/:projectId/memory
- GET /v1/audit?projectId=<id>
When CLAUDE_MEM_AUTH_MODE=api-key, send Authorization: Bearer <key>. Read endpoints require memories:read; write endpoints require memories:write.
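As a minimal sketch of the header described above, the helper below builds a request descriptor for api-key mode. The names authHeaders and describeRequest, the base URL, and the sample key are illustrative assumptions, not part of the server's code.

```typescript
// Hypothetical client-side helpers; not part of the server codebase.
interface RequestDescriptor {
  method: string;
  url: string;
  headers: Record<string, string>;
}

// Builds the Authorization header expected when CLAUDE_MEM_AUTH_MODE=api-key.
function authHeaders(apiKey: string): Record<string, string> {
  if (apiKey.trim() === "") {
    throw new Error("an API key is required when CLAUDE_MEM_AUTH_MODE=api-key");
  }
  return { Authorization: "Bearer " + apiKey };
}

// Joins a base URL and a /v1 path, attaching the auth header.
function describeRequest(
  baseUrl: string,
  method: string,
  path: string,
  apiKey: string
): RequestDescriptor {
  return {
    method: method,
    url: baseUrl.replace(/\/+$/, "") + path,
    headers: authHeaders(apiKey),
  };
}

// Example: a read request, so the key needs memories:read.
const listProjects = describeRequest(
  "https://mem.example.com", // assumed host
  "GET",
  "/v1/projects",
  "cm_example" // placeholder key
);
```

The descriptor can be handed to whatever HTTP client the caller already uses; the server decides from the key's scopes whether the call is allowed.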
Rate limiting, quota, and usage metering
These paid-readiness guards run after auth and are opt-in via env — unset (the default) means no rate limit, no quota, and no metering, so behavior is unchanged.
- CLAUDE_MEM_RATE_LIMIT_PER_MIN: max requests per API key per minute. Over the limit returns 429 with Retry-After (and X-RateLimit-* headers). Fail-open.
- CLAUDE_MEM_MONTHLY_REQUEST_CAP: max requests per team per calendar month (UTC). At the cap, returns 402 quota_exceeded. Fail-open.
- CLAUDE_MEM_MONTHLY_TOKEN_CAP: max provider tokens per team per month. Gates writes only (ingestion drives generation = token spend); reads stay available so a team over budget can still recall. 402 at the cap. Fail-open.
- CLAUDE_MEM_USAGE_METERING=1: record one request usage event per authenticated call (fire-and-forget). Token/observation metering writes to the same usage_events table from the generation worker.
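A client can tell these guards apart by status code. The sketch below is one possible way to do that, assuming the caller passes response headers as a plain object with lowercase keys; the Guard type, the function names, and the 1-second fallback are hypothetical choices, not server behavior.

```typescript
// Hypothetical client-side classification of guard responses.
type Guard =
  | { kind: "ok" }
  | { kind: "rate_limited"; retryAfterMs: number }
  | { kind: "quota_exceeded" }
  | { kind: "other"; status: number };

// Retry-After may be delta-seconds or an HTTP-date (RFC 9110).
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number,
  fallbackMs: number = 1000 // assumed default when the header is missing or unparsable
): number {
  if (value === undefined) return fallbackMs;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const at = Date.parse(trimmed);
  return isNaN(at) ? fallbackMs : Math.max(0, at - nowMs);
}

function classify(
  status: number,
  headers: Record<string, string>, // assumption: keys already lowercased
  nowMs: number = Date.now()
): Guard {
  if (status === 429) {
    return {
      kind: "rate_limited",
      retryAfterMs: parseRetryAfterMs(headers["retry-after"], nowMs),
    };
  }
  if (status === 402) return { kind: "quota_exceeded" };
  if (status >= 200 && status < 300) return { kind: "ok" };
  return { kind: "other", status: status };
}
```

A 429 is worth retrying after the indicated delay, while a 402 will not clear until the monthly window rolls over or the cap is raised, so retrying it in a loop is pointless.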
GET /v1/usage returns the caller team's per-kind totals for the current month:
{ "since": "2026-06-01T00:00:00.000Z", "usage": { "request": 1280, "observation": 44 } }
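For callers that want a typed view of this payload, the following sketch validates the shape shown above. The UsageResponse interface and parseUsage function are illustrative names; the only fields assumed are the two that appear in the sample.

```typescript
// Hypothetical typed wrapper around the GET /v1/usage payload.
interface UsageResponse {
  since: string; // start of the current month, ISO 8601
  usage: Record<string, number>; // per-kind totals, e.g. request, observation
}

function parseUsage(raw: string): UsageResponse {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("usage response must be an object");
  }
  const fields = data as { since?: unknown; usage?: unknown };
  if (typeof fields.since !== "string" || isNaN(Date.parse(fields.since))) {
    throw new Error("invalid 'since' timestamp");
  }
  if (typeof fields.usage !== "object" || fields.usage === null) {
    throw new Error("invalid 'usage' map");
  }
  const source = fields.usage as Record<string, unknown>;
  const totals: Record<string, number> = {};
  for (const kind of Object.keys(source)) {
    const count = source[kind];
    if (typeof count !== "number") {
      throw new Error("usage." + kind + " must be a number");
    }
    totals[kind] = count;
  }
  return { since: fields.since, usage: totals };
}

// The sample payload from above.
const sample = parseUsage(
  '{ "since": "2026-06-01T00:00:00.000Z", "usage": { "request": 1280, "observation": 44 } }'
);
```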
Connecting an MCP client (key issuance + connect)
- POST /v1/keys (write scope) mints a read-only API key for the caller's team and returns the paste-ready connect command. The raw key is shown once. Body: { "expiresInDays"?: number }. Minting requires write scope so a read key can't escalate into more keys. Response:
  {
    "id": "...",
    "apiKey": "cm_...",
    "scopes": ["memories:read"],
    "expiresAt": null,
    "mcpUrl": "https://<host>/v1/mcp",
    "connectCommand": "claude mcp add --transport http claude-mem https://<host>/v1/mcp --header \"Authorization: Bearer cm_...\""
  }
- GET /v1/connect (read scope) returns the same command with a <YOUR_API_KEY> placeholder (a GET never mints). mcpUrl is built from CLAUDE_MEM_PUBLIC_URL (recommended behind a proxy) or the request host.
Cold-start note: minting the team's first key still needs a session-gated path (web dashboard). better-auth's apiKey() plugin exists, but it writes to a different store from the Postgres api_keys table that these routes authenticate against; wiring the better-auth org → Server Beta team mapping is the remaining piece.
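To make the connect command concrete, here is a rough sketch of how mcpUrl and the command string could be assembled. The function names are hypothetical, and the https scheme used when falling back to the request host is an assumption; the server's actual logic may differ.

```typescript
// Hypothetical reconstruction of the connect-command assembly.

// Prefer CLAUDE_MEM_PUBLIC_URL; otherwise fall back to the request host.
function resolveMcpUrl(publicUrl: string | undefined, requestHost: string): string {
  const configured = publicUrl !== undefined ? publicUrl.trim() : "";
  // Assumption: https when only the host is known.
  const base = configured !== "" ? configured : "https://" + requestHost;
  return base.replace(/\/+$/, "") + "/v1/mcp";
}

// GET /v1/connect never mints, so the key defaults to a placeholder.
function connectCommand(mcpUrl: string, apiKey: string = "<YOUR_API_KEY>"): string {
  return (
    "claude mcp add --transport http claude-mem " +
    mcpUrl +
    ' --header "Authorization: Bearer ' +
    apiKey +
    '"'
  );
}

const behindProxy = resolveMcpUrl("https://mem.example.com/", "10.0.0.5:3000");
const placeholderCommand = connectCommand(behindProxy);
```

Setting CLAUDE_MEM_PUBLIC_URL matters behind a proxy because the request host seen by the server is then an internal address that a user could not paste into a client.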
Event generation semantics
POST /v1/events accepts two query flags that control observation generation:
- generate=false: write the event but do not enqueue a generation job.
- wait=true: return the generationJob descriptor in the response, so callers can poll GET /v1/jobs/:id for completion.
Without wait=true, the response includes the new event row and a best-
effort generationJob field. With wait=true, the generationJob field is
always populated (or null only when generation was explicitly disabled).
The actual provider call happens in a separate BullMQ worker process
(claude-mem server worker start); the HTTP path never blocks on a
provider response.
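The two flags and the polling step can be sketched from the client side as follows. This is illustrative only: eventsUrl and jobPollUrl are hypothetical helpers, and the id field on the generationJob descriptor is an assumption, since the descriptor's shape is not specified here.

```typescript
// Hypothetical client helpers for POST /v1/events.
interface EventOptions {
  generate?: boolean; // false: store the event, skip the generation job
  wait?: boolean; // true: always include the generationJob descriptor
}

function eventsUrl(baseUrl: string, opts: EventOptions = {}): string {
  const params: string[] = [];
  if (opts.generate === false) params.push("generate=false");
  if (opts.wait === true) params.push("wait=true");
  const url = baseUrl.replace(/\/+$/, "") + "/v1/events";
  return params.length > 0 ? url + "?" + params.join("&") : url;
}

// Assumed descriptor shape: only an id is needed to build the poll URL.
interface EventResponse {
  generationJob?: { id: string } | null;
}

// Returns the URL to poll, or null when no job was enqueued.
function jobPollUrl(baseUrl: string, response: EventResponse): string | null {
  const job = response.generationJob;
  if (job === undefined || job === null) return null;
  return baseUrl.replace(/\/+$/, "") + "/v1/jobs/" + encodeURIComponent(job.id);
}
```

Because the provider call runs in the worker process, the POST returns quickly either way; wait=true only guarantees that the caller gets a descriptor to poll.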
Remote MCP endpoint
/v1/mcp is a streamable-HTTP MCP server —
the secure, authenticated link a user pastes into Claude Code (or any MCP
client) to recall their cloud memory. It is read-only and authenticated by the
same API key as the REST routes (memories:read); the key's team (and project,
if the key is project-scoped) bound every read.
Connect:
claude mcp add --transport http claude-mem <server-base>/v1/mcp \
--header "Authorization: Bearer cm_..."
Tools:
- search: { projectId, query, limit? } → matching observations (FTS, same path as POST /v1/search).
- context: { projectId, query, limit? } → observations plus a concatenated context string ready for prompt injection (same path as POST /v1/context).
- recent: { projectId, limit? } → the newest observations for a project.
The transport is stateless: one MCP server + transport per request, so it needs no session affinity behind a load balancer. Mutating tools are intentionally absent — a pasted recall link cannot write.
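For readers curious what a tool invocation looks like on the wire, the sketch below builds the JSON-RPC 2.0 tools/call message that MCP defines for calling a tool. An MCP client library normally produces this (along with the protocol handshake) for you; the toolCallRequest helper and its validation rule are illustrative assumptions.

```typescript
// Hypothetical builder for an MCP tools/call message (JSON-RPC 2.0).
type RecallTool = "search" | "context" | "recent";

interface ToolArgs {
  projectId: string;
  query?: string; // required by search and context, unused by recent
  limit?: number;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: RecallTool; arguments: ToolArgs };
}

function toolCallRequest(id: number, name: RecallTool, args: ToolArgs): ToolCallRequest {
  if (name !== "recent" && (args.query === undefined || args.query === "")) {
    throw new Error(name + " requires a query");
  }
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

const recentCall = toolCallRequest(1, "recent", { projectId: "p1", limit: 5 });
```

Each such message is sent as an HTTP POST to /v1/mcp with the same Authorization header as the REST routes.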
Data deletion (forget)
Right-to-erasure. Both require write scope and are scoped to the caller's team.
- DELETE /v1/memories/:id deletes a single observation (its sources cascade). Returns 404 if it doesn't exist for the team.
- DELETE /v1/projects/:projectId/memory purges ALL captured content for a project (observations, agent events, sessions, generation jobs) and keeps the project shell. Returns per-table counts, or 404 if the project doesn't belong to the team.

Both are audited (observation.deleted / project.memory_purged).
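The ownership check before a purge can be illustrated with an in-memory model. This is a simplified sketch, not the server's handler: the Store shape, the table key names in counts, and the 404 error body are assumptions, and getByIdForTeam here is a stand-in for the real lookup of the same name.

```typescript
// In-memory illustration of "verify the project belongs to the team, then purge".
type Table = "observations" | "agent_events" | "sessions" | "generation_jobs";

interface Project { id: string; teamId: string; }
interface Row { table: Table; projectId: string; teamId: string; }
interface Store { projects: Project[]; rows: Row[]; }

type PurgeResponse =
  | { status: 404; body: { error: string } }
  | { status: 200; body: { purged: true; counts: Record<Table, number> } };

// Stand-in lookup: a project is visible only to its own team.
function getByIdForTeam(projects: Project[], projectId: string, teamId: string): Project | null {
  for (const project of projects) {
    if (project.id === projectId && project.teamId === teamId) return project;
  }
  return null;
}

function purgeProjectMemory(store: Store, teamId: string, projectId: string): PurgeResponse {
  // Without this check, an unknown or foreign projectId would report a
  // successful purge with zero counts.
  if (getByIdForTeam(store.projects, projectId, teamId) === null) {
    return { status: 404, body: { error: "not_found" } }; // assumed error body
  }
  const counts: Record<Table, number> = {
    observations: 0, agent_events: 0, sessions: 0, generation_jobs: 0,
  };
  const kept: Row[] = [];
  for (const row of store.rows) {
    if (row.projectId === projectId && row.teamId === teamId) counts[row.table] += 1;
    else kept.push(row);
  }
  store.rows = kept; // the project record itself (the shell) is untouched
  return { status: 200, body: { purged: true, counts: counts } };
}
```

The design point is that a team-scoped delete alone already prevents cross-team data loss, but only the explicit lookup lets the endpoint distinguish "nothing to purge" from "not your project".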