* feat(telemetry): disclose 19 reliability-signal fields and 2 new events across all surfaces
Updates the whitelist (scrub.ts), scrub tests, public docs
(telemetry.mdx), and CLI disclosure (COLLECTED_FIELDS/EVENT_NAMES) for
the Plan 14 reliability signals: search retrieval quality, compression
trust, worker lifecycle, and hook failure keys, plus the worker_stopped
and hook_failed events. Includes the plan document.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(telemetry): retrieval-quality signals on search_performed
SearchManager.search() fills an optional telemetry envelope
(result_count, search_strategy, chroma_available, fallback_reason)
across all three search paths; handlers stash it on
res.locals.searchTelemetry and the existing finish-middleware spreads it
into the search_performed capture. Zero-result searches report
result_count: 0; Chroma fallback reasons are a closed enum, never the
error message. Response shapes unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
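The closed-enum rule for fallback reasons in the search_performed entry above can be illustrated with a minimal sketch. The enum values and the classifyFallback helper are hypothetical (this message does not list the real ones); the point is only that an arbitrary error is mapped to one of a fixed set of labels and its message text never reaches telemetry.

```typescript
// Hypothetical enum values, chosen for illustration only.
type FallbackReason = "chroma_unavailable" | "chroma_timeout" | "chroma_error";

// Envelope fields named in the commit message; fallback_reason is optional
// because it only applies when the Chroma path was abandoned.
interface SearchTelemetry {
  result_count: number;
  search_strategy: string;
  chroma_available: boolean;
  fallback_reason?: FallbackReason;
}

// Map any thrown value to a closed enum. Only the error's `code` property
// is inspected; the free-text message is never read or forwarded.
function classifyFallback(err: unknown): FallbackReason {
  const code = (err as { code?: unknown } | null)?.code;
  if (code === "ECONNREFUSED") return "chroma_unavailable";
  if (code === "ETIMEDOUT") return "chroma_timeout";
  return "chroma_error";
}

// Hypothetical helper: build the envelope for a search that fell back.
function fallbackEnvelope(resultCount: number, err: unknown): SearchTelemetry {
  return {
    result_count: resultCount,
    search_strategy: "fallback", // hypothetical strategy label
    chroma_available: false,
    fallback_reason: classifyFallback(err),
  };
}
```

Anything unrecognized collapses into a catch-all label, so the set of values that can appear in analytics stays fixed even when new error types show up.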
* feat(telemetry): compression trust signals on session_compressed
fabrication_detected/fabricated_count flow through compressionProps (all
three emit paths); invalid-output respawns emit a respawn-gated
session_compressed with outcome invalid_output and the classifier value;
aborted generators emit outcome aborted with abort_reason normalized to
a closed enum in the .finally where all five abort flows converge (the
.catch path can never observe a non-null abortReason).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(telemetry): worker lifecycle signals — worker_stopped, crash detection, memory metrics
Clean-shutdown sentinel written before telemetry flush and consumed at
startup; worker_started gains previous_shutdown (crash/clean/unknown)
and previous_uptime_seconds derived from the stale PID file; new
worker_stopped event (uptime_seconds, shutdown_reason stop/restart/
signal) emitted before shutdownTelemetry(); the CLI restart path tags
/api/admin/shutdown?reason=restart so restarts are distinguishable;
buildLifecycleProps adds integer process_rss_mb/heap_used_mb.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
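As a rough sketch of the startup classification described in the worker lifecycle entry above: the mapping below (sentinel present means clean, stale PID file without a sentinel means crash, neither means unknown) is an assumption about how the two pieces of evidence combine, and both function names are hypothetical. The memory helper shows one way to get whole-megabyte values from byte counts such as those returned by Node's process.memoryUsage().

```typescript
type PreviousShutdown = "clean" | "crash" | "unknown";

// Evidence gathered at startup. Both fields are inputs here so the logic
// stays a pure function; real code would check the filesystem.
interface StartupEvidence {
  sentinelExists: boolean;     // clean-shutdown sentinel was found
  stalePidFileExists: boolean; // PID file left behind by the previous worker
}

// Assumed mapping: a sentinel proves a clean exit; a leftover PID file
// without one suggests the worker died before it could shut down cleanly.
function classifyPreviousShutdown(e: StartupEvidence): PreviousShutdown {
  if (e.sentinelExists) return "clean";
  if (e.stalePidFileExists) return "crash";
  return "unknown";
}

// Convert a byte count to integer megabytes (rounded to nearest).
function bytesToWholeMb(bytes: number): number {
  return Math.round(bytes / (1024 * 1024));
}
```

Writing the sentinel before the telemetry flush, as the entry describes, means a hang during the flush would still be recorded as a clean shutdown at the next start.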
* feat(telemetry): threshold-gated hook_failed distress signal via CLI transport
recordWorkerUnreachable emits hook_failed exactly when the
consecutive-failure count reaches the fail-loud threshold; the generic
blocking-error branch emits error_mode blocking_error. Both emits are
awaited before the process.exit paths so the 2s-capped CLI POST
survives; hook_type is a closed enum registered at hookCommand entry.
Exit codes unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
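The threshold gating in the hook_failed entry above can be sketched as follows. The threshold value and both function names are hypothetical; the sketch only shows why comparing with equality rather than greater-or-equal yields a single event per failure streak.

```typescript
// Hypothetical threshold; the real fail-loud value is not stated here.
const FAIL_LOUD_THRESHOLD = 3;

// Emit only on the failure that reaches the threshold, not on every
// failure after it, so one outage produces one event.
function shouldEmitHookFailed(
  consecutiveFailures: number,
  threshold: number = FAIL_LOUD_THRESHOLD,
): boolean {
  return consecutiveFailures === threshold;
}

// Simulate n consecutive failures and return the counts at which an
// emit would have fired.
function simulateFailures(n: number): number[] {
  const emittedAt: number[] = [];
  for (let count = 1; count <= n; count++) {
    if (shouldEmitHookFailed(count)) emittedAt.push(count);
  }
  return emittedAt;
}
```

With this shape, a streak that never reaches the threshold emits nothing, and a long streak emits once.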
* chore(build): regenerate plugin artifacts with Plan 14 telemetry signals
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(tests): make PostHog client regression test order-independent via global preload mock
The disableGeoip regression test mocked posthog-node per-file, but
telemetry.ts is imported transitively by many test files in the shared
bun process, so the mock was registered too late and the test failed in
full-suite runs; CI on main has been red since v13.5.4. The mock now
registers in a bunfig [test].preload before any module loads, which
also guarantees test runs can never construct a real PostHog client and
flush fabricated events into production analytics (consent is
default-on and the suite outlives flushInterval). telemetry.ts gains a
test-only state reset so construction is observed deterministically
regardless of suite order.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(telemetry): forward shutdown reason in Windows-managed IPC message
Review follow-up: the wrapper IPC path discarded the restart tag, so an
external Windows wrapper could only ever report shutdown_reason 'stop'.
No wrapper in this repo listens for the message, but the reason now
travels with it for any that does.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>