microsoft-SkillOpt

Mirror of https://github.com/microsoft/SkillOpt.git, synced 2026-07-03 14:02:58 +08:00

Files

Yifan Yang 4186e5bb73 docs(sleep): definitive clean results — Sonnet->Haiku 3/3 seeds 0->1.00

Strong-optimizer/weak-target (Sonnet -> Haiku), fully isolated:
  brief-writer, advisor, thorough-analyst all 0.00 -> 1.00 on held-out.
thorough-analyst shows 2-night convergence (0.33 -> 1.00). Codex self-optimized
brief-writer also 0 -> 1.00.

Key finding answering the optimizer/target-split request: the OPTIMIZER MODEL is
decisive — weak Haiku-as-optimizer is flaky (0 or 1.0 across runs), strong
Sonnet-as-optimizer reliably hits 1.0 on every seed. Raw logs under docs/sleep/raw/.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

2026-06-08 14:31:51 +00:00

Name                   Last commit                                                               Date
raw                    docs(sleep): definitive clean results — Sonnet->Haiku 3/3 seeds 0->1.00   2026-06-08 14:31:51 +00:00
experiment_results.md  feat(sleep): nightly offline self-evolution engine + Claude Code plugin   2026-06-08 14:31:51 +00:00
FINAL_REPORT.md        docs(sleep): definitive clean results — Sonnet->Haiku 3/3 seeds 0->1.00   2026-06-08 14:31:51 +00:00
real_api_results.md    docs(sleep): record real Claude+Codex gbrain results; both reach 0->1.00  2026-06-08 14:31:51 +00:00
WAKE_UP_SUMMARY.md     docs(sleep): add wake-up summary of the overnight build                   2026-06-08 14:31:51 +00:00