diff --git a/docs/guideline.html b/docs/guideline.html
index 439fc55..1c0d1d3 100644
--- a/docs/guideline.html
+++ b/docs/guideline.html
@@ -380,6 +380,15 @@ skillopt/ # the package

2.2 Install the Package #

+

Option A — from PyPI:

+
pip install skillopt
+
+# Optional extras:
+pip install skillopt[alfworld]   # ALFWorld benchmark
+pip install skillopt[webui]      # Gradio monitoring dashboard
+pip install skillopt[claude]     # Claude model backend
+
+

Option B — from source (for development):

git clone https://github.com/microsoft/SkillOpt.git
 cd SkillOpt
 pip install -e .
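
Reviewer note: after either install option, a quick check that the distribution is visible to the interpreter can save debugging time. The sketch below uses only the standard library; the distribution name `skillopt` is taken from the `pip install` line above, and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "skillopt" is the distribution name used by the pip commands in this section.
    version = installed_version("skillopt")
    print(f"skillopt {version}" if version else "skillopt is not installed")
```

This works the same way for the PyPI install and the editable source install, since both register distribution metadata.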
@@ -708,6 +717,9 @@ skillopt/           # the package
           slow_update_gate_with_selection | bool | false | — | false = force-inject guidance; true = gate it on the selection split (see §5.4).
           longitudinal_pair_policy | str | mixed | — | mixed / changed / unchanged — which comparison pairs to keep.
           use_meta_skill | bool | true | Meta-learning | Enable cross-epoch optimizer memory.
+          use_skill_aware_reflection | bool | false | — | EmbodiSkill-style failure routing: SKILL_DEFECT (rule wrong/missing → gated body edit) vs EXECUTION_LAPSE (valid rule not followed → reminder appended to a protected appendix region that step-level edits never modify). Off = baseline-identical; resolved process-wide, works on every benchmark. Not supported with rewrite_from_suggestions / full-rewrite modes.
+          skill_aware_appendix_source | str | both | — | both (success analyst may also re-emphasize rules) / failure_only (paper-faithful S_app: failure side only).
+          skill_aware_consolidate_threshold | int | 0 | — | >0: LLM-compact the appendix once it exceeds N notes (experimental); 0 = off.
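
Reviewer note: the three new rows describe a routing rule that may be easier to follow as code. The sketch below is an illustration under stated assumptions, not SkillOpt's implementation. Only the three option names, their defaults, and the `SKILL_DEFECT` / `EXECUTION_LAPSE` labels come from the table; `SkillDoc`, `ReflectionConfig`, `route`, `needs_consolidation`, and the `source` argument are hypothetical names introduced here. The gating of body edits and the LLM compaction step are not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class FailureKind(Enum):
    SKILL_DEFECT = "skill_defect"        # the rule is wrong or missing
    EXECUTION_LAPSE = "execution_lapse"  # a valid rule was not followed


@dataclass
class SkillDoc:
    """Hypothetical skill document: an editable body plus a protected appendix."""
    body: List[str] = field(default_factory=list)
    appendix: List[str] = field(default_factory=list)


@dataclass
class ReflectionConfig:
    """Mirrors the three options from the table, with the documented defaults."""
    use_skill_aware_reflection: bool = False
    skill_aware_appendix_source: str = "both"  # "both" or "failure_only"
    skill_aware_consolidate_threshold: int = 0  # 0 = consolidation off


def route(skill: SkillDoc, cfg: ReflectionConfig, kind: FailureKind,
          note: str, source: str = "failure") -> str:
    """Decide where a reflection note goes; returns a label for the action taken."""
    if not cfg.use_skill_aware_reflection:
        return "baseline"  # feature off: the sketch leaves the skill untouched
    if kind is FailureKind.SKILL_DEFECT:
        # Assumption: a body edit is only proposed here; a real system would gate it.
        return "body_edit"
    # EXECUTION_LAPSE: a reminder goes to the appendix, never to the body.
    if source == "success" and cfg.skill_aware_appendix_source == "failure_only":
        return "skipped"  # failure_only ignores reminders from the success side
    skill.appendix.append(note)
    return "appendix"


def needs_consolidation(skill: SkillDoc, cfg: ReflectionConfig) -> bool:
    """True once the appendix holds more notes than a positive threshold."""
    n = cfg.skill_aware_consolidate_threshold
    return n > 0 and len(skill.appendix) > n
```

In this sketch the appendix is the only place `route` writes to, which reflects the table's statement that step-level edits never modify the protected appendix region and that lapses never trigger body edits.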