PR #92 added a per-cycle diagnostics.json that surfaces backend stderr,
optimizer replies, and task responses so a 0.0 night is self-diagnosing.
Those free-text fields can carry credentials (e.g. a codex 401 stderr dump
containing an auth token), so persisting them verbatim was a new on-disk
leak surface.
- Add a shared redact_secrets() in staging.py and route diagnostics.json's
call_error / reflect_raw_head / holdout_detail through it before writing.
- Redact the codex and Claude auth-error log lines too (a secondary sink
when a file log handler is attached); last_call_error stays raw in memory
so _AUTH_MARKERS matching is unaffected.
- Centralize _SECRET_PATTERNS in staging.py (harvest_codex now reuses them)
and extend coverage to AWS / GitHub / Slack / Google / JWT token shapes.
- Tests: secret-shape coverage, private-key blocks, recursive/scalar
passthrough, no over-redaction of plain prose, fail-fast auth-error log
redaction, and an end-to-end check that diagnostics.json has no secret.
Observability-only; the gate and learning algorithm are unchanged.
Co-Authored-By: Claude <noreply@anthropic.com>
Open-source-tool / research-code separation:
- git mv skillopt/sleep/ -> skillopt_sleep/ (top-level, sibling to the research
skillopt/ package). History preserved as renames.
- All imports skillopt.sleep.* -> skillopt_sleep.*.
- Vendor the validation gate into skillopt_sleep/gate.py (a self-contained copy
of skillopt.evaluation.gate). The engine now has ZERO dependency on the
research package — verified: grep finds no `from skillopt.` in skillopt_sleep/,
and consolidate's gate resolves to skillopt_sleep.gate.
- Plugin scripts/commands/skill call `-m skillopt_sleep`.
29 tests pass; `python -m skillopt_sleep` runs standalone.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>