mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
- sweep.py: run many (backend, model, seed, transfer-pair) configs sequentially
and append each result to a JSONL file as it finishes (resumable, interrupt-safe).
- report.py: render the sweep JSONL into a presentable Markdown scorecard with
direct-improvement and cross-model-transfer tables.
- reflect prompt now tells the optimizer its edits are APPENDED (it can't delete
the base skill text), so on a conflict it must write a forceful OVERRIDE rule.
Diagnosed from a real failure: thorough-analyst (needs <=1200 chars) kept having
its edits rejected because the base "be exhaustive" line won; a verified override
("HARD LIMIT ... supersedes") makes Haiku obey (1194/880 chars -> hard=1.0).
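
For reference, the resumable, interrupt-safe JSONL append described for sweep.py
could be sketched roughly as below. This is a minimal illustration, not the
actual sweep.py code: the names config_key, load_done, run_sweep and run_one,
and the {"config", "result"} record layout, are assumptions made for the example.

```python
import json
import os


def config_key(cfg):
    # Hypothetical identity for a config: its canonical JSON encoding.
    return json.dumps(cfg, sort_keys=True)


def load_done(path):
    # Collect the keys of configs that already have a complete record.
    done = set()
    if not os.path.exists(path):
        return done
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                continue  # truncated line left behind by an interrupted run
            done.add(config_key(rec["config"]))
    return done


def run_sweep(configs, run_one, path):
    # Run configs one by one, skipping finished ones; return how many ran.
    done = load_done(path)
    # If the previous run died mid-write, terminate the partial line so the
    # next record starts on a line of its own.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as f:
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b"\n"
        if needs_newline:
            with open(path, "a", encoding="utf-8") as f:
                f.write("\n")
    ran = 0
    for cfg in configs:
        if config_key(cfg) in done:
            continue
        result = run_one(cfg)
        record = {"config": cfg, "result": result}
        # Append and sync after every config so an interrupt loses at most
        # the config that was in flight.
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record, sort_keys=True) + "\n")
            f.flush()
            os.fsync(f.fileno())
        ran += 1
    return ran
```

Re-running run_sweep with the same path skips every config that already has a
complete line in the file, which is what makes an interrupted sweep resumable
under these assumptions.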
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>