Updates the SkillOpt-Sleep plugin on top of the current main.

User-facing and engine improvements since the initial drop:

* Command renamed /sleep -> /skillopt-sleep across Claude Code + Codex shells; refreshed plugin READMEs and install scripts.
* Built-in scheduling (skillopt_sleep/scheduler.py + __main__): schedule / unschedule the nightly cycle without external cron wiring.
* Backend robustness: bounded retry with backoff (no more silent empty-string on transient 429/timeout), content-filter-safe rollout prompt, an output-contract guardrail that rejects edits violating the task's required format, and a per-sample cache key so repeated dream rollouts are independent samples (fixes degenerate single-sample reflection).
* consolidate / rollout / replay: parallel multi-rollout dreaming, gate-mode controls, TaskRecord.system framing field.

Scope: this commit ships only the plugin engine + shells. Research/benchmark harnesses and their data are intentionally not included; the public package has no dependency on them (the one research-evaluator import is now guarded).

Marked as an early preview in the README; we'll keep iterating. 99/99 unit tests pass.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
| description | argument-hint | allowed-tools |
|---|---|---|
| Run or manage the SkillOpt-Sleep self-evolution cycle (review past sessions, replay tasks offline, consolidate validated memory + skills; can also schedule nightly runs) | [run \| dry-run \| status \| adopt \| harvest \| schedule \| unschedule] (default: status) | Bash, Read |
/skillopt-sleep — SkillOpt-Sleep nightly self-evolution
You are driving SkillOpt-Sleep: a tool that lets this user's Claude agent
improve offline by reviewing past sessions, replaying recurring tasks, and
consolidating what it learns into validated memory (CLAUDE.md) and skills
(SKILL.md). It is gated like SkillOpt: a change is kept only if it improves a
held-out replay score, and nothing live is modified until the user adopts it.
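The gating rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the function name `gate`, the `min_gain` margin, and the example scores are all hypothetical.

```python
def gate(baseline: float, candidate: float, min_gain: float = 0.0) -> str:
    """Accept a candidate only if it beats the baseline on held-out replay.

    Hypothetical helper: the name and the min_gain margin are assumptions
    made for illustration; the source only states the "must improve" rule.
    """
    return "accept" if candidate - baseline > min_gain else "reject"


# Illustrative scores only (not measured results).
print(gate(0.62, 0.71))  # accept
print(gate(0.62, 0.62))  # reject: a tie is not an improvement
```

A strict inequality is used so that a candidate scoring the same as the baseline is rejected, which matches "kept only if it improves".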
Requested action: $ARGUMENTS
(If $ARGUMENTS is empty, treat it as status.)
How to run it
The engine is the skillopt_sleep Python package in this repo. Use the
plugin's bundled runner so the right interpreter and repo are on the path:
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" <action> --project "$(pwd)" --scope invoked
<action> is one of:
| action | what it does |
|---|---|
| status | show how many nights have run + the latest staged proposal (READ-ONLY) |
| dry-run | harvest → mine → replay → report, but stage nothing (safe preview) |
| run | full cycle: also stage a reviewed proposal (still does NOT touch live files) |
| adopt | apply the latest staged proposal to live CLAUDE.md / SKILL.md (backs up first) |
| harvest | debug: print the recurring tasks mined from recent sessions |
| schedule | install a nightly cron entry for this project (--hour --minute, off-:00 by default) |
| unschedule | remove the nightly cron entry (--all to remove every managed entry) |
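For orientation, a nightly cron entry of the kind the schedule action installs might look like the output of the sketch below. The standard crontab field order (minute, hour, day of month, month, day of week, command) is certain. Everything else is an assumption: the helper name `cron_line`, the 03:17 default, the placeholder paths, and the exact command the scheduler writes are not shown in the source.

```python
import shlex
from typing import List


def cron_line(command: List[str], hour: int = 3, minute: int = 17) -> str:
    """Format one crontab line that runs `command` once a day.

    Hypothetical helper; the 03:17 default is only an example of an
    "off-:00" time, not the scheduler's real default.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Field order: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {shlex.join(command)}"


# Placeholder paths; a real entry would use the plugin's own locations.
print(cron_line(["/opt/plugin/scripts/sleep.sh", "run", "--project", "/home/me/repo"]))
# 17 3 * * * /opt/plugin/scripts/sleep.sh run --project /home/me/repo
```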
Default backend is mock (deterministic, no API spend). To use real budget for
genuine improvement, add --backend claude or --backend codex. To steer what
the optimizer writes, add --preferences "<your house rules>".
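The invocation rules so far (status as the default action, mock as the default backend, optional preferences) can be sketched as a small argument builder. The function name `build_command` and the example paths are hypothetical. Whether every action accepts --backend and --preferences is also an assumption, since the source lists the flags without saying which actions take them.

```python
import shlex
from typing import List, Optional

ACTIONS = {"run", "dry-run", "status", "adopt", "harvest", "schedule", "unschedule"}
BACKENDS = {"mock", "claude", "codex"}


def build_command(plugin_root: str, project: str, action: str = "",
                  backend: str = "mock",
                  preferences: Optional[str] = None) -> List[str]:
    """Build the argv list for the bundled runner (hypothetical helper)."""
    action = action.strip() or "status"  # empty $ARGUMENTS means status
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    cmd = [f"{plugin_root}/scripts/sleep.sh", action,
           "--project", project, "--scope", "invoked"]
    if backend != "mock":  # mock is the default, so the flag is omitted
        cmd += ["--backend", backend]
    if preferences:
        cmd += ["--preferences", preferences]
    return cmd


# Placeholder paths and preferences, for illustration only.
cmd = build_command("/opt/plugin", "/home/me/repo", "dry-run",
                    backend="claude", preferences="prefer short commit messages")
print(shlex.join(cmd))
```

Passing the result as a list to `subprocess.run` avoids shell quoting problems with a free-text preferences string.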
Steps to follow
- Run the requested action via the bundled runner above. Capture stdout.
- For run / dry-run: after it completes, Read the generated report.md in the staging dir it prints, and show the user:
  - held-out score: baseline → candidate (the proof it helped)
  - the gate decision (accept/reject) and the exact edits it proposes
  - where the proposal is staged
- For run that produced an accepted proposal: tell the user the diff is staged and that nothing live changed yet. Offer to run /skillopt-sleep adopt.
- For adopt: confirm which live files were updated and that backups were written under the staging dir's backup/.
- Never edit CLAUDE.md or SKILL.md yourself; only the adopt action does that, with a backup. Respect the review gate.
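To make the "backs up first" guarantee of the adopt action concrete, here is a minimal sketch of a backup-then-write step. It only illustrates the pattern and is not the engine's implementation: the helper name `backup_then_write` and the flat layout inside backup/ are assumptions, and the demo runs entirely in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def backup_then_write(live_file: Path, new_text: str, staging_dir: Path) -> Path:
    """Copy the live file into <staging_dir>/backup/, then overwrite it.

    Hypothetical helper; the real adopt action may lay out backups differently.
    """
    backup_dir = staging_dir / "backup"
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup_path = backup_dir / live_file.name
    if live_file.exists():
        shutil.copy2(live_file, backup_path)  # keeps timestamps where possible
    live_file.write_text(new_text, encoding="utf-8")
    return backup_path


# Demo against throwaway files, never a real CLAUDE.md.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    live = root / "CLAUDE.md"
    live.write_text("old rules\n", encoding="utf-8")
    saved = backup_then_write(live, "new rules\n", root / "staging")
    print(saved.read_text(encoding="utf-8"), end="")  # old rules
    print(live.read_text(encoding="utf-8"), end="")   # new rules
```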
Safety reminders
- Harvest is read-only over ~/.claude. Replay in mock mode runs no shell side effects.
- The cycle stages proposals; the user is in control of adoption.
- If the user asks to schedule this nightly, point them at ${CLAUDE_PLUGIN_ROOT}/scripts/install-cron.sh (prints a crontab line; does not install anything without confirmation).