mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
Codex with the directive reflect prompt + 2 nights converges 0.00 -> 1.00
(up from 0.67 single-night); its night-2 edit diagnoses its own residual
failure ("preserve required sections even when keeping the brief short").
Claude (Haiku) reaches 1.00 in one night. Update plugin README + skill to
reference --backend claude|codex (was anthropic) and surface the benchmark.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>