mirror of
https://github.com/thedotmack/claude-mem.git
synced 2026-07-03 12:32:32 +08:00
feat(telemetry): person profiles on lifecycle events to unlock retention/cohort analytics
PostHog cannot compute retention, stickiness, lifecycle, or cohort insights on profile-less events, which are exactly the charts growth reporting needs. Lifecycle events (install_*, uninstall_completed, worker_started; ~1-2/day/install) now build a person profile keyed to the anonymous install UUID, with $set restricted to whitelisted enums. High-volume operational events stay $process_person_profile:false for cost.

Adds plans/2026-06-09-telemetry-metrics-spec.md, mapping every event to the growth/retention/activation/reliability metric it powers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -17,7 +17,9 @@ The standard [`DO_NOT_TRACK`](https://consoledonottrack.com) environment variabl

 ## What is collected

-When enabled, events are anonymous and identified only by a random install UUID (`crypto.randomUUID()`, generated locally on first use). Events are sent with `$process_person_profile: false`, so PostHog never builds a person profile.
+When enabled, events are anonymous and identified only by a random install UUID (`crypto.randomUUID()`, generated locally on first use).
+
+Low-volume lifecycle events (`install_*`, `uninstall_completed`, `worker_started`) build an analytics profile keyed to that random UUID so aggregate retention and cohort statistics are computable. The profile contains nothing beyond the whitelisted fields below (platform, version, IDE/provider choice). It is not, and cannot be, connected to you: there is no name, email, IP, hardware ID, or any other identifier. All high-volume events (`session_compressed`, `search_performed`, `context_injected`, `error_occurred`) are sent with `$process_person_profile: false` and build no profile at all.

 Every event property passes through a strict whitelist scrubber; any key not in this table is silently dropped before sending:

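
The whitelist scrubbing described in the README text above can be sketched as follows. This is a minimal illustration, not the project's `scrubProperties`: the function name `scrubToWhitelist` and the contents of `ALLOWED_KEYS` are hypothetical, and the real whitelist is the table in the README.

```typescript
// Hypothetical whitelist for illustration only; the real list is the README table.
const ALLOWED_KEYS = new Set<string>(['version', 'os', 'ide', 'duration_ms']);

// Copies only whitelisted keys; everything else is silently dropped.
function scrubToWhitelist(props: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  Object.keys(props).forEach((key) => {
    if (ALLOWED_KEYS.has(key)) clean[key] = props[key];
  });
  return clean;
}

// A stray key such as a file path never reaches the payload.
const scrubbedExample = scrubToWhitelist({ os: 'linux', cwd: '/home/someone/project' });
```

Filtering by key name rather than by value means a new property is excluded by default until someone adds it to the whitelist.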
89
plans/2026-06-09-telemetry-metrics-spec.md
Normal file
@@ -0,0 +1,89 @@
# Telemetry Metrics Spec: the story the data must tell

Audience: us, when building the PostHog dashboard and the fundraise narrative.
Premise: 82k GitHub stars, zero analytics history. The dataset starts the day
this ships, so every chart below is designed to be meaningful within 4–8 weeks
of data and to compound from there.

## The narrative arc (what a deck slide needs to say)

1. **Reach**: "X installs/week and growing N% w/w, across 12 IDEs."
2. **Habit**: "Installs come back: D30 retention X%, DAU/MAU X%."
3. **Value loop**: "Memory isn't shelfware: X% of installs reach the aha
   moment, and active installs read memory back X times/day."
4. **Reliability**: "Core pipeline succeeds X% of the time at scale."

Everything below maps an event to one of those four sentences. If a metric
doesn't feed a sentence, it doesn't go on the dashboard.

## Unit of measure: be precise with VCs

The `distinct_id` is an **install** (one machine + one `~/.claude-mem`), not a
human. Quote "active installs", never "users". This is the honest dev-tool
convention (Homebrew, VS Code extensions count the same way) and diligence
will check. Reinstalls keep the same ID (uninstall preserves the data dir), so
returning installs are not double-counted.

Always filter `is_ci = false` on every insight. CI noise inflates everything.

## Event → metric map

### Reach (growth accounting)
| Metric | Definition |
|---|---|
| New installs/week | unique `distinct_id` on `install_completed` where `is_update = false` |
| Upgrade adoption | `install_completed` where `is_update = true`, broken down by `version` |
| Active installs (WAU/MAU) | unique `distinct_id` on `worker_started` (start + daily heartbeat = presence signal) |
| Churn | `uninstall_completed` count; net growth = new − uninstalls |
| Surface mix | `install_completed` breakdown by `ide`, `provider`, `runtime_mode` |

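
The Reach definitions can be checked offline against an exported event list. The sketch below is illustrative only: the `TelemetryEvent` shape and the function names are assumptions, not PostHog's export format, and events with no `is_ci` property are treated as non-CI.

```typescript
// Hypothetical exported-event shape; a real PostHog export carries more fields.
type TelemetryEvent = {
  event: string;
  distinct_id: string; // anonymous install UUID
  properties: { is_ci?: boolean; is_update?: boolean };
};

// Unique installs that fired `eventName`, after dropping CI traffic.
function uniqueInstalls(
  events: TelemetryEvent[],
  eventName: string,
  keep: (e: TelemetryEvent) => boolean = () => true
): number {
  const ids = new Set<string>();
  events.forEach((e) => {
    if (e.event === eventName && e.properties.is_ci !== true && keep(e)) {
      ids.add(e.distinct_id);
    }
  });
  return ids.size;
}

// Growth accounting over one week's worth of events.
function reachSummary(weekEvents: TelemetryEvent[]) {
  const newInstalls = uniqueInstalls(
    weekEvents,
    'install_completed',
    (e) => e.properties.is_update === false
  );
  // Churn is an event count, not a unique count, per the table above.
  const uninstalls = weekEvents.filter(
    (e) => e.event === 'uninstall_completed' && e.properties.is_ci !== true
  ).length;
  return {
    newInstalls,
    activeInstalls: uniqueInstalls(weekEvents, 'worker_started'),
    uninstalls,
    netGrowth: newInstalls - uninstalls,
  };
}
```

Counting `distinct_id` values rather than raw events is what keeps a worker that restarts several times a day from inflating active installs.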
### Habit (retention: the slide that raises the round)
| Metric | Definition |
|---|---|
| D1/D7/D30 retention | PostHog Retention insight: first `install_completed` → returning on `worker_started`. Requires person profiles, which is why lifecycle events set them. |
| Stickiness (DAU/MAU) | PostHog Stickiness insight on `worker_started` |
| Lifecycle | PostHog Lifecycle insight on `worker_started` (new / returning / resurrecting / dormant) |
| Retention by segment | same retention insight broken down by person property `ide` or `provider`; "Cursor installs retain 2×" is a fundable sentence |

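
As a sanity check on what the Retention insight reports, day-N retention can be approximated by hand. The sketch below is a simplification under stated assumptions: the event shape and `dayNRetention` are hypothetical, "day N" means the 24-hour window starting N days after an install's first `install_completed`, and PostHog's own bucketing may differ.

```typescript
// Hypothetical minimal event shape; timestamp is milliseconds since epoch.
type TelemetryEvent = { event: string; distinct_id: string; timestamp: number };

const DAY_MS = 24 * 60 * 60 * 1000;

// Fraction of installs with a worker_started event on day `n` after first install.
function dayNRetention(events: TelemetryEvent[], n: number): number {
  // Earliest install_completed timestamp per install.
  const firstInstall = new Map<string, number>();
  events.forEach((e) => {
    if (e.event !== 'install_completed') return;
    const seen = firstInstall.get(e.distinct_id);
    if (seen === undefined || e.timestamp < seen) {
      firstInstall.set(e.distinct_id, e.timestamp);
    }
  });
  if (firstInstall.size === 0) return 0;

  // Installs whose presence signal falls inside the day-n window.
  const retained = new Set<string>();
  events.forEach((e) => {
    if (e.event !== 'worker_started') return;
    const start = firstInstall.get(e.distinct_id);
    if (start === undefined) return;
    if (Math.floor((e.timestamp - start) / DAY_MS) === n) {
      retained.add(e.distinct_id);
    }
  });
  return retained.size / firstInstall.size;
}
```

The denominator here includes installs younger than N days, so a real report should restrict the cohort to installs old enough to have had the chance to return.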
### Value loop (activation + engagement)
| Metric | Definition |
|---|---|
| Activation funnel | Funnel: `install_completed` → first `session_compressed` → first `context_injected`. The third step is the aha moment: stored memory actually used. |
| Time-to-value | median time from `install_completed` to first `context_injected` |
| Engagement depth | `session_compressed` count per active install per day; `context_injected` per active install per day |
| Read/write ratio | `context_injected` ÷ `session_compressed` (memory being consumed, not hoarded) |
| Feature adoption | `search_performed` breakdown by `endpoint` |

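
The activation funnel and the read/write ratio above can be sketched over an exported event list. Everything here is illustrative: the event shape and function names are assumptions, the funnel is strictly ordered per install, and no conversion window is applied.

```typescript
// Hypothetical minimal event shape; timestamp is milliseconds since epoch.
type TelemetryEvent = { event: string; distinct_id: string; timestamp: number };

const FUNNEL_STEPS = ['install_completed', 'session_compressed', 'context_injected'];

// Number of installs that reached each funnel step, in order.
function funnelCounts(events: TelemetryEvent[]): number[] {
  const sorted = events.slice().sort((a, b) => a.timestamp - b.timestamp);
  const stepsDone = new Map<string, number>(); // install -> steps completed so far
  sorted.forEach((e) => {
    const done = stepsDone.get(e.distinct_id) ?? 0;
    // An event only counts if it is the next expected step for that install.
    if (done < FUNNEL_STEPS.length && e.event === FUNNEL_STEPS[done]) {
      stepsDone.set(e.distinct_id, done + 1);
    }
  });
  const progress = Array.from(stepsDone.values());
  return FUNNEL_STEPS.map((_, i) => progress.filter((done) => done > i).length);
}

// context_injected ÷ session_compressed; 0 when nothing has been written yet.
function readWriteRatio(events: TelemetryEvent[]): number {
  const reads = events.filter((e) => e.event === 'context_injected').length;
  const writes = events.filter((e) => e.event === 'session_compressed').length;
  return writes === 0 ? 0 : reads / writes;
}
```

The share of installs reaching the third step, `funnelCounts(events)[2] / funnelCounts(events)[0]`, is the activation rate the narrative arc refers to as the aha moment.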
### Reliability (diligence armor)
| Metric | Definition |
|---|---|
| Compression success rate | `session_compressed` outcome ok ÷ all, by `version` and `provider` |
| Error rate | `error_occurred` per active install, by `error_category` and `version` |
| Latency health | p50/p95 `duration_ms` on `session_compressed`, `search_performed`, `context_injected` |
| Install success rate | `install_completed` ÷ (`install_completed` + `install_failed`), failures by `error_category` |

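
Two of the reliability formulas are easy to get subtly wrong, so a small reference sketch may help. The function names are hypothetical, and `percentile` uses the nearest-rank method, which is an assumption: PostHog may interpolate and report slightly different values.

```typescript
// install_completed ÷ (install_completed + install_failed); 0 when there is no data.
function installSuccessRate(completed: number, failed: number): number {
  const total = completed + failed;
  return total === 0 ? 0 : completed / total;
}

// Nearest-rank percentile over duration_ms samples, with p in (0, 100].
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Hypothetical duration_ms samples for one event type.
const durations = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
const p50 = percentile(durations, 50); // 50
const p95 = percentile(durations, 95); // 100
```

Note that the install success rate counts failures in the denominator only; `install_started` or similar events play no part in the definition given in the table.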
## Person-profile design (cost control)

Only lifecycle events (`install_*`, `uninstall_completed`, `worker_started`)
carry person profiles (~1–2 events/day/install), so profile-priced ingestion
stays bounded even at 100k installs. High-volume operational events are
profile-less (cheaper tier). Person properties are the whitelisted enums only:
`version`, `os`, `arch`, `runtime`, `locale`, `ide`, `provider`, `runtime_mode`.

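
The bound stated above is simple arithmetic, sketched here for concreteness. The function name and the 30-day month are assumptions, and no pricing is implied; the figures are event counts only.

```typescript
// Profile-bearing events over a period, given installs and per-install daily volume.
function profileEventsPerPeriod(
  installs: number,
  eventsPerInstallPerDay: number,
  days: number = 30
): number {
  return installs * eventsPerInstallPerDay * days;
}

// At 100k installs and 1 to 2 lifecycle events per install per day:
const lowEstimate = profileEventsPerPeriod(100000, 1);  // 3,000,000 per 30 days
const highEstimate = profileEventsPerPeriod(100000, 2); // 6,000,000 per 30 days
```

Because the heartbeat fires once per day per running worker, the per-install volume does not grow with usage, which is why the estimate scales only with the install count.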
## Caveats to state proactively in diligence

- Telemetry is opt-out (`DO_NOT_TRACK` honored, one-command disable); numbers
  undercount by the opt-out rate. That's the credible direction to undercount.
- Data starts <date this ships>; star history is the pre-telemetry proxy.
- One human can be several installs (work + home). Quote installs.

## Dashboard build order (PostHog UI, ~30 min)

1. Trends: weekly unique `worker_started` (active installs) + weekly
   `install_completed` where `is_update=false` (new installs).
2. Retention: `install_completed` → `worker_started`, weekly, breakdown `ide`.
3. Funnel: `install_completed` → `session_compressed` → `context_injected`,
   14-day window.
4. Stickiness + Lifecycle on `worker_started`.
5. Trends: `session_compressed` outcome error ÷ total (reliability), p95
   `duration_ms` (latency).

File diff suppressed because one or more lines are too long
@@ -1374,7 +1374,7 @@ export async function runInstallCommand(options: InstallOptions = {}): Promise<v
   } catch (error: unknown) {
     if (error instanceof InstallAbortError) {
       // error.category.id is OUR taxonomy id (error-taxonomy.ts), never a message.
-      await captureCliEvent('install_failed', { error_category: error.category.id });
+      await captureCliEvent('install_failed', { error_category: error.category.id }, { person: true });
       // Flush whatever warnings accrued before the abort, then print the
       // remediation headline and exit non-zero. ABORT must never reach the
       // "Installation Complete" path.
@@ -1816,7 +1816,7 @@ async function runInstallCommandInner(options: InstallOptions, summary: InstallS
     is_update: alreadyInstalled,
     outcome: failedIDEs.length > 0 ? 'partial' : 'ok',
     duration_ms: Date.now() - installStartedAt,
-  });
+  }, { person: true });
 }

 export async function runRepairCommand(): Promise<void> {
@@ -400,7 +400,7 @@ export async function runUninstallCommand(): Promise<void> {

   // Capture BEFORE the data dir note becomes stale advice: consent and the
   // install ID still live in ~/.claude-mem, which uninstall preserves.
-  await captureCliEvent('uninstall_completed');
+  await captureCliEvent('uninstall_completed', {}, { person: true });

   p.outro(pc.green('claude-mem has been uninstalled.'));
 }
@@ -10,7 +10,7 @@

 import { resolveTelemetryConsent, loadTelemetryConfig, getOrCreateInstallId } from './consent.js';
 import { scrubProperties } from './scrub.js';
-import { getTelemetryApiKey, getTelemetryHost, buildBaseProperties } from './common.js';
+import { getTelemetryApiKey, getTelemetryHost, buildBaseProperties, buildPersonSet } from './common.js';

 const CAPTURE_TIMEOUT_MS = 2000;

@@ -21,7 +21,8 @@ const CAPTURE_TIMEOUT_MS = 2000;
  */
 export async function captureCliEvent(
   event: string,
-  props?: Record<string, unknown>
+  props?: Record<string, unknown>,
+  opts?: { person?: boolean }
 ): Promise<void> {
   try {
     if (!resolveTelemetryConsent(process.env, loadTelemetryConfig())) {
@@ -32,7 +33,13 @@ export async function captureCliEvent(
       ...buildBaseProperties(),
       ...(props ?? {}),
     });
-    properties.$process_person_profile = false;
+    // Lifecycle events (install_* / uninstall) build the anonymous person
+    // profile that powers retention and cohort insights; see telemetry.ts.
+    if (opts?.person) {
+      properties.$set = buildPersonSet(properties);
+    } else {
+      properties.$process_person_profile = false;
+    }

     if (process.env.CLAUDE_MEM_TELEMETRY_DEBUG === '1') {
       process.stderr.write('[telemetry] ' + JSON.stringify({ event, properties }) + '\n');
@@ -25,6 +25,39 @@ export function getTelemetryHost(): string {
   return process.env.CLAUDE_MEM_TELEMETRY_HOST || DEFAULT_TELEMETRY_HOST;
 }

+/**
+ * Whitelisted properties that may also be set as PostHog person properties on
+ * lifecycle events (install_*, worker_started). The "person" is the anonymous
+ * install UUID; these traits make retention/cohort insights sliceable by
+ * platform and product choices. Strict subset of the scrub whitelist.
+ */
+export const PERSON_PROPERTY_KEYS = [
+  'version',
+  'os',
+  'arch',
+  'runtime',
+  'locale',
+  'ide',
+  'provider',
+  'runtime_mode',
+] as const;
+
+/**
+ * Splits already-scrubbed properties into a $set object for person-profile
+ * events. Lifecycle events are low-volume (~1-2/day/install), so the
+ * person-profile ingestion cost is bounded while unlocking PostHog's native
+ * retention, stickiness, lifecycle, and cohort insights.
+ */
+export function buildPersonSet(
+  scrubbed: Record<string, unknown>
+): Record<string, unknown> {
+  const set: Record<string, unknown> = {};
+  for (const key of PERSON_PROPERTY_KEYS) {
+    if (scrubbed[key] !== undefined) set[key] = scrubbed[key];
+  }
+  return set;
+}
+
 export function buildBaseProperties(): Record<string, unknown> {
   return {
     version: packageVersion,
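
To make the effect of `buildPersonSet` concrete, here is a standalone usage sketch. The function and key list are copied from the hunk above so the block runs on its own; the sample property values (`'1.2.3'`, `'darwin'`, `'cursor'`) are hypothetical.

```typescript
const PERSON_PROPERTY_KEYS = [
  'version', 'os', 'arch', 'runtime', 'locale', 'ide', 'provider', 'runtime_mode',
] as const;

// Same logic as the hunk above: keep only the person-property keys that are present.
function buildPersonSet(scrubbed: Record<string, unknown>): Record<string, unknown> {
  const set: Record<string, unknown> = {};
  for (const key of PERSON_PROPERTY_KEYS) {
    if (scrubbed[key] !== undefined) set[key] = scrubbed[key];
  }
  return set;
}

// Hypothetical scrubbed properties for a worker_started heartbeat.
const heartbeatProps: Record<string, unknown> = {
  version: '1.2.3',
  os: 'darwin',
  ide: 'cursor',
  trigger: 'heartbeat',
  duration_ms: 42,
};

// Event-only fields (trigger, duration_ms) stay off the person profile.
const personSet = buildPersonSet(heartbeatProps);
// personSet is { version: '1.2.3', os: 'darwin', ide: 'cursor' }
```

Per-event measurements such as `duration_ms` remain on the event itself, so the profile holds only slowly changing traits that are useful for breakdowns.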
@@ -5,7 +5,7 @@ import {
   getOrCreateInstallId,
 } from './consent.js';
 import { scrubProperties } from './scrub.js';
-import { getTelemetryApiKey, getTelemetryHost, buildBaseProperties } from './common.js';
+import { getTelemetryApiKey, getTelemetryHost, buildBaseProperties, buildPersonSet } from './common.js';

 let client: PostHog | null = null;
 let isShutdown = false;
@@ -51,8 +51,20 @@ function getClient(): PostHog {
  *   4. No API key configured: no-op (telemetry ships dark until the
  *      publishable token lands).
  *   5. posthog.capture(): SDK queues in memory and batches in background.
+ *
+ * Two event classes (opts.person):
+ *   - Lifecycle events (worker_started, install_*) pass person: true. They are
+ *     low-volume and build an anonymous person profile keyed to the random
+ *     install UUID, which is what makes PostHog's retention / stickiness /
+ *     lifecycle / cohort insights work. Person properties are restricted to
+ *     PERSON_PROPERTY_KEYS, the same whitelisted enums as event properties.
+ *   - Everything else (high-volume operational events) stays profile-less.
  */
-export function captureEvent(event: string, props?: Record<string, unknown>): void {
+export function captureEvent(
+  event: string,
+  props?: Record<string, unknown>,
+  opts?: { person?: boolean }
+): void {
   try {
     // Once shutdown has flushed the client, late events (e.g. a request that
     // raced graceful stop) are dropped rather than queued in a new client
@@ -65,9 +77,13 @@ export function captureEvent(event: string, props?: Record<string, unknown>): vo
       ...buildBaseProperties(),
       ...(props ?? {}),
     });
-    // Anonymous events: no person profile processing. Added AFTER scrubbing;
-    // $-prefixed PostHog directives are not user data and bypass the whitelist.
-    properties.$process_person_profile = false;
+    // $-prefixed PostHog directives are not user data and bypass the whitelist;
+    // they are added AFTER scrubbing.
+    if (opts?.person) {
+      properties.$set = buildPersonSet(properties);
+    } else {
+      properties.$process_person_profile = false;
+    }

     if (process.env.CLAUDE_MEM_TELEMETRY_DEBUG === '1') {
       // Direct stderr write (not console.*): debug mode is a human running the
@@ -298,13 +298,13 @@ export class WorkerService implements WorkerRef {
       captureEvent('worker_started', {
         trigger: 'start',
         duration_ms: Date.now() - this.startTime,
-      });
+      }, { person: true });

       // Long-lived workers would otherwise look like a single day of activity.
       // A daily heartbeat makes DAU/WAU/retention computable from distinct_id.
       // unref() so the timer never keeps a stopping process alive.
       this.telemetryHeartbeat = setInterval(() => {
-        captureEvent('worker_started', { trigger: 'heartbeat' });
+        captureEvent('worker_started', { trigger: 'heartbeat' }, { person: true });
       }, 24 * 60 * 60 * 1000);
       this.telemetryHeartbeat.unref?.();
