mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 22:24:36 +08:00
Three live runs exercise the new code paths on both runtimes:
A) Claude Sonnet->Haiku, gate=OFF + rollouts_k=2: brief-writer test 0->1.00,
action 'greedy_improved', val & test both reported (3-way split works).
B) Codex, gate=ON + rollouts_k=2: brief-writer test 0->1.00 in 2 nights.
C) Claude Sonnet->Haiku, thorough-analyst, 3 nights: slow-update fires and
distils a durable cross-night meta-rule (general, not task-specific).
Confirms gate-off greedy path, 3-way val/test split, multi-rollout, and the
gate-independent slow-update all work with real models on Claude AND Codex.
Raw logs under docs/sleep/raw/crosscheck_*.txt.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>