microsoft-SkillOpt/plugins/copilot/skillopt/copilot-instructions.snippet.md
DB Lee 5dc894715f Add SkillOpt research-engine MCP server plugin for Copilot
Exposes scripts/train.py and scripts/eval_only.py as Copilot MCP tools
(skillopt_list_configs, skillopt_train, skillopt_eval) via a stdlib-only
stdio server, mirroring the existing SkillOpt-Sleep plugin layout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-17 17:24:00 -07:00


SkillOpt (research skill-optimization engine)

This repo exposes the core SkillOpt training/eval engine via an MCP server (skillopt). SkillOpt is validation-gated, text-space skill optimization: it reflects on rollouts, makes bounded edits to a skill, and keeps a change only if it improves the score on a held-out validation set.
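The validation gate described above can be illustrated with a minimal Python sketch. It is not SkillOpt's implementation: `validation_gated_optimize`, `propose_edit`, `score`, and the toy data are hypothetical names introduced here, and reflection on rollouts and edit bounding are abstracted into the `propose_edit` callback.

```python
from typing import Callable, List, Tuple


def validation_gated_optimize(
    skill: str,
    propose_edit: Callable[[str], str],  # hypothetical: reflect + bounded edit
    score: Callable[[str], float],       # hypothetical: held-out validation score
    steps: int,
) -> Tuple[str, float, List[bool]]:
    """Keep a candidate edit only if it beats the current held-out score."""
    best_score = score(skill)
    accepted: List[bool] = []
    for _ in range(steps):
        candidate = propose_edit(skill)
        candidate_score = score(candidate)
        keep = candidate_score > best_score  # the gate: strict improvement only
        accepted.append(keep)
        if keep:
            skill, best_score = candidate, candidate_score
    return skill, best_score, accepted


# Toy stand-ins (assumptions for illustration, not real benchmark data).
EDITS = iter(["check units", "guess randomly", "cite sources"])


def toy_propose(skill: str) -> str:
    # Append the next proposed rule to the skill text.
    return skill + "\n- " + next(EDITS)


def toy_score(skill: str) -> float:
    # Reward helpful rules, penalize the harmful one.
    useful = ("check units", "cite sources")
    harmful = ("guess randomly",)
    return sum(u in skill for u in useful) - sum(h in skill for h in harmful)


final_skill, final_score, accepted = validation_gated_optimize(
    "# Skill", toy_propose, toy_score, steps=3
)
print(accepted)  # [True, False, True]
```

In this toy run the second edit lowers the held-out score, so it is discarded and the third edit is applied to the last accepted version of the skill.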

When the user asks to "optimize a skill", "train on ", "run SkillOpt", "evaluate this skill", or "what configs can I run", use the MCP tools:

  • skillopt_list_configs — list the benchmark YAML configs you can pass as config
  • skillopt_train — run a reflective skill-optimization loop on a config (long-running; spends API/compute budget)
  • skillopt_eval — evaluate a single skill markdown file on a dataset (no training)
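For orientation, here is a hedged sketch of what a call to these tools might look like on the wire. MCP clients invoke tools with a JSON-RPC `tools/call` request, and the stdio transport carries one JSON message per line. The helper name `tool_call_request` and the path `configs/example.yaml` are hypothetical; the only argument name taken from this document is `config`. A real client would also perform the MCP `initialize` handshake before sending these.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP `tools/call` request as a single JSON-RPC line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Discover valid configs first, then launch training on one of them.
list_line = tool_call_request(1, "skillopt_list_configs", {})
# "configs/example.yaml" is a placeholder path, not a real SkillOpt config.
train_line = tool_call_request(
    2, "skillopt_train", {"config": "configs/example.yaml"}
)

print(list_line)
print(train_line)
```

In practice Copilot builds these requests itself; the sketch only shows how the tool name and the `config` argument fit into a request.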

Guidance:

  • Always run skillopt_list_configs first if you don't already know a valid config path.
  • skillopt_train and skillopt_eval are long-running and spend the user's model-backend budget. Confirm the config, backend, and model choices with the user before launching, and report the held-out gate result when the run finishes.
  • For one-off YAML overrides, use cfg_options (e.g. seed=123 batch_size=40); for any other underlying flag, use extra_args.
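To make the `cfg_options` format concrete, the sketch below shows one way a space-separated `key=value` string such as the example above could be turned into typed overrides. This is an assumption about the general shape of such overrides, not SkillOpt's actual parser; `parse_cfg_options` is a hypothetical helper.

```python
import ast


def parse_cfg_options(text: str) -> dict:
    """Parse 'key=value key=value' into a dict, inferring simple literal types."""
    overrides = {}
    for token in text.split():
        key, sep, raw = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        try:
            # Numbers, booleans, and quoted strings become Python literals.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else is kept as a plain string.
            value = raw
        overrides[key] = value
    return overrides


overrides = parse_cfg_options("seed=123 batch_size=40")
print(overrides)  # {'seed': 123, 'batch_size': 40}
```

Values that are not Python literals, such as a bare model name, fall through as strings, so the caller does not have to quote them.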

This is distinct from the SkillOpt-Sleep MCP server (skillopt-sleep, sleep_* tools), which evolves a local coding agent from past sessions rather than running the research benchmarks.