Option A — from PyPI:
+pip install skillopt
+
+# Optional extras:
+pip install "skillopt[alfworld]"  # ALFWorld benchmark
+pip install "skillopt[webui]"     # Gradio monitoring dashboard
+pip install "skillopt[claude]"    # Claude model backend
+
+ Option B — from source (for development):
git clone https://github.com/microsoft/SkillOpt.git
cd SkillOpt
pip install -e .
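Either install route can be sanity-checked from Python. The following is a minimal sketch that uses only the standard library's `importlib.metadata`; it assumes the installed distribution is named `skillopt` (as in the `pip install` line above), and `installed_version` is a hypothetical helper, not part of SkillOpt.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "skillopt" is assumed to be the distribution name used on PyPI.
    version = installed_version("skillopt")
    if version:
        print(f"skillopt {version} is installed")
    else:
        print("skillopt is not installed")
```

An editable install (Option B) also registers distribution metadata, so the same check should work for both options.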
@@ -708,6 +717,9 @@ skillopt/ # the package
slow_update_gate_with_selection | bool | false | — | false = force-inject guidance; true = gate it on the selection split (see §5.4).
longitudinal_pair_policy | str | mixed | — | mixed / changed / unchanged: which comparison pairs to keep.
use_meta_skill | bool | true | Meta-learning | Enable cross-epoch optimizer memory.
+ use_skill_aware_reflection | bool | false | — | EmbodiSkill-style failure routing: SKILL_DEFECT (rule wrong/missing → gated body edit) vs EXECUTION_LAPSE (valid rule not followed → reminder appended to a protected appendix region that step-level edits never modify). Off = baseline-identical; resolved process-wide, works on every benchmark. Not supported with rewrite_from_suggestions / full-rewrite modes.
+ skill_aware_appendix_source | str | both | — | both (success analyst may also re-emphasize rules) / failure_only (paper-faithful S_app: failure side only).
+ skill_aware_consolidate_threshold | int | 0 | — | >0: LLM-compact the appendix once it exceeds N notes (experimental); 0 = off.
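The constraints stated in the three new rows can be sketched as a small validation step. This is an illustration only: `SkillAwareOptions` and `validate` are hypothetical names, not SkillOpt's actual config classes, and treating `rewrite_from_suggestions` as a boolean flag is an assumption (the table gives only its name).

```python
from dataclasses import dataclass
from typing import List

# Allowed values for skill_aware_appendix_source, as listed in the table.
VALID_APPENDIX_SOURCES = {"both", "failure_only"}


@dataclass
class SkillAwareOptions:
    """Hypothetical container mirroring the documented defaults."""
    use_skill_aware_reflection: bool = False
    skill_aware_appendix_source: str = "both"
    skill_aware_consolidate_threshold: int = 0
    # Assumption: modeled as a bool; the table only names this option.
    rewrite_from_suggestions: bool = False


def validate(opts: SkillAwareOptions) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if opts.skill_aware_appendix_source not in VALID_APPENDIX_SOURCES:
        problems.append(
            f"skill_aware_appendix_source must be one of {sorted(VALID_APPENDIX_SOURCES)}"
        )
    if opts.skill_aware_consolidate_threshold < 0:
        problems.append("skill_aware_consolidate_threshold must be >= 0 (0 = off)")
    if opts.use_skill_aware_reflection and opts.rewrite_from_suggestions:
        problems.append(
            "use_skill_aware_reflection is not supported with rewrite_from_suggestions"
        )
    return problems


if __name__ == "__main__":
    opts = SkillAwareOptions(use_skill_aware_reflection=True,
                             skill_aware_appendix_source="failure_only")
    print(validate(opts) or "options look consistent")
```

Returning a list of problems instead of raising on the first one lets a caller report every misconfiguration in a single pass.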