mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
Replace the compact baseline->after grid with three grouped per-benchmark tables (SearchQA / LiveMath / SpreadsheetBench), each showing all 3 targets x both modes across every night (N0..N5) plus Δ. This makes the trajectory visible (gains reach a level and hold, rather than being single lucky readings) and presents the full 18-cell evidence in a more solid, readable form. Adds a footnote for LiveMath's 4-night run (train split <50 tasks). Numbers are unchanged; only the presentation is richer.