Update eval-only README example

hwq
2026-05-30 15:28:17 +00:00
parent 933c0a4ab5
commit 42e555d28e

@@ -164,10 +164,10 @@ Key CLI arguments:
Evaluate a trained skill on specific data splits without training:
```bash
-# Evaluate on test set only:
+# Evaluate the packaged GPT-5.5 SearchQA skill on the test split:
python scripts/eval_only.py \
--config configs/searchqa/default.yaml \
---skill outputs/my_run/best_skill.md \
+--skill ckpt/searchqa/gpt5.5_skill.md \
--split valid_unseen \
--split_dir /path/to/searchqa_split \
--azure_openai_endpoint https://your-resource.openai.azure.com/
@@ -175,12 +175,15 @@ python scripts/eval_only.py \
# Evaluate on all splits (train + val + test):
python scripts/eval_only.py \
--config configs/searchqa/default.yaml \
---skill outputs/my_run/best_skill.md \
+--skill ckpt/searchqa/gpt5.5_skill.md \
--split all \
--split_dir /path/to/searchqa_split \
--azure_openai_endpoint https://your-resource.openai.azure.com/
```
+To evaluate a skill produced by a training run, replace `--skill` with that
+run's best-skill path, for example `outputs/my_run/best_skill.md`.
| Split | Description |
|---|---|
| `valid_unseen` | Test set |
@@ -260,4 +263,3 @@ python -m skillopt_webui.app --share
url={https://arxiv.org/abs/2605.23904}
}
```