---
hide:
  - navigation
---

# SkillOpt

### Train Agent Skills Like Neural Networks

*Optimize natural-language skill documents through iterative rollout, reflection, and gated validation, with epochs and learning rates, and without touching model weights.*

[Get Started :material-rocket-launch:](guide/installation.md){ .md-button .md-button--primary }
[View on GitHub :material-github:](https://github.com/microsoft/SkillOpt){ .md-button }

---

## How It Works

1. 🎯 **Rollout**: Target executes tasks
2. 🔍 **Reflect**: Optimizer analyzes trajectories
3. 🔗 **Aggregate**: Merge edit patches
4. ✂️ **Select**: Rank & clip edits
5. 📝 **Update**: Apply to skill doc
6. 🚦 **Gate**: Validate & accept

At each epoch boundary: 🔄 **Slow Update** and 🧠 **Meta Skill**.

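One pass through the six stages above could be sketched roughly as follows. This is a minimal illustration, not SkillOpt's actual implementation: `Edit`, `aggregate`, `select_edits`, `apply_edits`, and `train_step` are hypothetical names, a patch is modeled as a line appended to the skill, and `learning_rate` is assumed to cap the number of edits kept per step. The `rollout`, `reflect`, and `evaluate` callables stand in for the target agent, the optimizer, and the gated evaluation. A candidate skill is kept only when `evaluate` scores it strictly higher than the current skill; otherwise the step is rejected and the skill is unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Edit:
    """Hypothetical edit patch: a line to add to the skill, plus a priority."""
    text: str
    score: float


def aggregate(proposals: List[List[Edit]]) -> List[Edit]:
    """Aggregate: merge patches, keeping the best-scored copy of duplicates."""
    best: Dict[str, Edit] = {}
    for batch in proposals:
        for edit in batch:
            if edit.text not in best or edit.score > best[edit.text].score:
                best[edit.text] = edit
    return list(best.values())


def select_edits(edits: List[Edit], learning_rate: int) -> List[Edit]:
    """Select: rank by score, then clip to at most `learning_rate` edits."""
    ranked = sorted(edits, key=lambda e: e.score, reverse=True)
    return ranked[:max(0, learning_rate)]


def apply_edits(skill: str, edits: List[Edit]) -> str:
    """Update: apply patches (assumed here to append one bullet per edit)."""
    return skill + "".join(f"\n- {e.text}" for e in edits)


def train_step(
    skill: str,
    rollout: Callable[[str], list],
    reflect: Callable[[str, list], List[List[Edit]]],
    evaluate: Callable[[str], float],
    learning_rate: int,
) -> Tuple[str, bool]:
    """Run one rollout/reflect/aggregate/select/update/gate cycle."""
    trajectories = rollout(skill)                    # Rollout
    proposals = reflect(skill, trajectories)         # Reflect
    merged = aggregate(proposals)                    # Aggregate
    selected = select_edits(merged, learning_rate)   # Select
    candidate = apply_edits(skill, selected)         # Update
    if evaluate(candidate) > evaluate(skill):        # Gate
        return candidate, True
    return skill, False
```
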

---

## Deep Learning Analogy

SkillOpt brings the familiar deep-learning training paradigm to agentic prompt optimization:

| Deep Learning | SkillOpt |
|---|---|
| Model weights | Skill document (Markdown) |
| Forward pass | Rollout (target executes tasks) |
| Loss / gradient | Reflect (optimizer produces edit patches) |
| Gradient clipping | Edit selection (`learning_rate` = max edits) |
| SGD step | Patch application to skill |
| Validation set | Gated evaluation on selection split |
| LR schedule | `lr_scheduler`: cosine, linear, constant |
| Epochs | Multi-epoch with slow update & meta skill memory |

---

## Supported Benchmarks

| Benchmark | Type | Config |
|---|---|---|
| **DocVQA** | Document QA | `configs/docvqa/` |
| **ALFWorld** | Embodied AI | `configs/alfworld/` |
| **OfficeQA** | Enterprise QA | `configs/officeqa/` |
| **SearchQA** | Open-domain QA | `configs/searchqa/` |
| **LiveMathBench** | Math reasoning | `configs/livemathematicianbench/` |
| **SWEBench** | Software Engineering | `configs/swebench/` |
| + 5 more | Various | See [docs](guide/first-experiment.md) |

---

## Quick Example

```bash
# Install
pip install -e .

# Configure credentials
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-key"

# Train on SearchQA
python scripts/train.py --config configs/searchqa/default.yaml

# Evaluate best skill
python scripts/eval_only.py \
    --config configs/searchqa/default.yaml \
    --skill outputs/best_skill.md
```

---

-   :material-book-open-variant:{ .lg .middle } **Getting Started**

    ---

    Install SkillOpt, configure your API keys, and run your first experiment in 5 minutes.

    [:octicons-arrow-right-24: Installation](guide/installation.md)

-   :material-puzzle:{ .lg .middle } **Add a Benchmark**

    ---

    Extend SkillOpt with your own benchmark in ~100 lines of code.

    [:octicons-arrow-right-24: Extension Guide](guide/new-benchmark.md)

-   :material-cog:{ .lg .middle } **Configuration**

    ---

    Full reference for all hyperparameters with deep learning analogies.

    [:octicons-arrow-right-24: Config Reference](reference/config.md)

-   :material-monitor-dashboard:{ .lg .middle } **WebUI**

    ---

    Configure, launch, and monitor training from your browser.

    [:octicons-arrow-right-24: WebUI Guide](guide/first-experiment.md#webui)