mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
Exposes scripts/train.py and scripts/eval_only.py as Copilot MCP tools (skillopt_list_configs, skillopt_train, skillopt_eval) via a stdlib-only stdio server, mirroring the existing SkillOpt-Sleep plugin layout.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1.7 KiB
SkillOpt (research skill-optimization engine)
This repo exposes the core SkillOpt training/eval engine via an MCP server
(skillopt). SkillOpt is validation-gated, text-space skill optimization: it
reflects on rollouts, makes bounded edits to a skill, and keeps a change only
if it improves a held-out validation set.
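The validation gate described above can be illustrated with a minimal sketch. This is not SkillOpt's actual implementation: `gated_optimize`, `propose_edit`, `validation_score`, and `max_edit_chars` are hypothetical names, the "bounded edit" is approximated here as a cap on how many characters a proposal may add or remove, and the toy proposer and scorer stand in for the real reflection step and held-out benchmark.

```python
from typing import Callable, Tuple


def gated_optimize(
    skill: str,
    propose_edit: Callable[[str, int], str],
    validation_score: Callable[[str], float],
    rounds: int,
    max_edit_chars: int = 200,  # assumed notion of a "bounded" edit
) -> Tuple[str, float]:
    """Keep a proposed edit only if it strictly improves the held-out score."""
    best, best_score = skill, validation_score(skill)
    for step in range(rounds):
        candidate = propose_edit(best, step)
        # Bounded edit: skip proposals that change the skill length too much.
        if abs(len(candidate) - len(best)) > max_edit_chars:
            continue
        candidate_score = validation_score(candidate)
        # Validation gate: accept only strict improvements.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Toy stand-ins (hypothetical): the proposer appends one hint per round,
# and the scorer rewards only the hints that are actually useful.
HINTS = ["check units", "be verbose", "cite sources"]
USEFUL = {"check units", "cite sources"}


def toy_propose(skill: str, step: int) -> str:
    return skill + "\n- " + HINTS[step % len(HINTS)]


def toy_score(skill: str) -> float:
    return float(sum(hint in skill for hint in USEFUL))


final_skill, final_score = gated_optimize(
    "Base skill", toy_propose, toy_score, rounds=3
)
print(final_skill)
print(final_score)
```

In this toy run the "be verbose" hint is proposed but discarded, because it does not raise the validation score; only the two useful hints survive the gate.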
When the user asks to "optimize a skill", "train on ", "run SkillOpt", "evaluate this skill", or "what configs can I run", use the MCP tools:
- skillopt_list_configs: list the benchmark YAML configs you can pass as config
- skillopt_train: run a reflective skill-optimization loop on a config (long-running; spends API/compute budget)
- skillopt_eval: evaluate a single skill markdown file on a dataset (no training)
Guidance:
- Always run skillopt_list_configs first if you don't already know a valid config path.
- skillopt_train and skillopt_eval are long-running and consume the user's model backend/budget; confirm the config, backend, and model choices with the user before launching, and surface the held-out gate result when the run finishes.
- For one-off YAML overrides use cfg_options (e.g. seed=123 batch_size=40); for any other underlying flag use extra_args.
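For orientation, the sketch below shows roughly what a call to these tools looks like on the wire. MCP tool invocations are JSON-RPC 2.0 requests with method `tools/call`, and the stdio transport sends one JSON message per line. The helper `tool_call_request` is hypothetical, the config value is a placeholder rather than a real path, and passing cfg_options as a single space-separated string is an assumption; the server's tool schema is the authority on argument types. An MCP client such as Copilot normally builds these messages (and performs the initialization handshake) for you.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize one MCP tools/call request as a single-line JSON-RPC message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Step 1: discover valid configs before training.
list_req = tool_call_request(1, "skillopt_list_configs", {})

# Step 2: launch training with a one-off YAML override.
# The config value is a placeholder; the string form of cfg_options is assumed.
train_req = tool_call_request(
    2,
    "skillopt_train",
    {
        "config": "<path returned by skillopt_list_configs>",
        "cfg_options": "seed=123 batch_size=40",
    },
)

print(list_req)
print(train_req)
```

Listing configs first mirrors the guidance above: the training request reuses a path the server itself reported, rather than a guessed one.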
This is distinct from the SkillOpt-Sleep MCP server (skillopt-sleep,
sleep_* tools), which evolves a local coding agent from past sessions rather
than running the research benchmarks.