mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
feat(plugins): ship SkillOpt-Sleep for Claude Code, Codex, and Copilot
Restructure into plugins/{claude-code,codex,copilot}/ — one engine, three thin
shells, all calling the shared plugins/run-sleep.sh -> python -m skillopt_sleep.
- claude-code/: existing plugin moved here; runner delegates to the shared
launcher (fixes repo-root resolution after the move).
- codex/: ~/.codex/prompts/sleep.md custom prompt + ~/.agents/skills SKILL.md +
install.sh + AGENTS.md hint — Codex's documented, stable extension surfaces.
- copilot/: a stdlib-only MCP server (mcp_server.py) exposing sleep_* tools,
plus mcp-config.example.json and a copilot-instructions snippet. Verified end
to end (initialize -> tools/list -> tools/call returns real engine output).
- plugins/README.md overview table; main README News + a dedicated SkillOpt-Sleep
section; pyproject lists skillopt_sleep as a first-class package.
Decoupling emphasized throughout: open-source tool (skillopt_sleep/) with zero
dependency on the research package. 29 tests pass; all three shells resolve.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
74
plugins/README.md
Normal file
@@ -0,0 +1,74 @@
# SkillOpt-Sleep: plugins for Claude Code, Codex, and Copilot

One engine, three thin shells. **SkillOpt-Sleep** gives a local coding agent a
nightly **sleep cycle**: it reviews your past sessions offline, replays your
recurring tasks on your own API budget, and consolidates what it learns into
**validated** long-term memory and skills, behind a held-out gate and staged for
your review. Your agent gets better the more you use it, with no model-weight
training.

It synthesizes three ideas: **SkillOpt** (validation-gated bounded text
optimization, the research in this repo), **Claude Dreams** (offline memory
consolidation; input never mutated; review-then-adopt), and the **agent sleep**
literature (short-term experience → long-term competence).

> **This is an open-source tool, decoupled from the research code.** The engine
> lives in the top-level [`skillopt_sleep/`](../skillopt_sleep) package and has
> **zero dependency** on the paper's `skillopt/` experiment package (the
> validation gate is vendored). You can ship/use it without the research stack.

## The three integrations

| Platform | Folder | Mechanism | Status |
|---|---|---|---|
| **Claude Code** | [`claude-code/`](claude-code) | `.claude-plugin` + `/sleep` command + skill + hooks | full, installable |
| **Codex** | [`codex/`](codex) | `~/.codex/prompts/sleep.md` + `~/.agents/skills` + `AGENTS.md` | full |
| **Copilot** | [`copilot/`](copilot) | MCP server (`sleep_*` tools) + `copilot-instructions` | full (MCP) |

All three call the **same** [`plugins/run-sleep.sh`](run-sleep.sh) →
`python -m skillopt_sleep`, so behaviour is identical everywhere. Per-platform
setup is in each folder's README.

## Quick start (Claude Code)

```bash
git clone <repo-url> && cd SkillOpt-Sleep
# Claude Code:
/plugin marketplace add ./plugins/claude-code
/plugin install skillopt-sleep@skillopt-sleep
/sleep status
```

Codex: `bash plugins/codex/install.sh`.

Copilot: register `plugins/copilot/mcp_server.py` as an MCP server.
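
For orientation, the exchange an MCP client has with such a server can be sketched as three JSON-RPC 2.0 requests. This is a minimal sketch, not the server's code: `initialize`, `tools/list`, and `tools/call` are standard MCP methods, but the tool name `sleep_status`, the client name, and the protocol version string are assumptions for illustration (the source only says the tools are named `sleep_*`).

```python
import json

def rpc(method, params, request_id):
    """Build one JSON-RPC 2.0 request as a single line of JSON."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": method, "params": params})

# Hypothetical values: tool name, client info, and protocol version are assumptions.
handshake = [
    rpc("initialize", {"protocolVersion": "2024-11-05",
                       "capabilities": {},
                       "clientInfo": {"name": "example-client", "version": "0.0.0"}}, 1),
    rpc("tools/list", {}, 2),
    rpc("tools/call", {"name": "sleep_status", "arguments": {}}, 3),
]

for line in handshake:
    print(line)
```

A stdio MCP client would write each line to the server's stdin and read one JSON response per request from its stdout.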

## What one "night" does

```
harvest ~/.claude (or session) transcripts → mine recurring tasks → replay offline
  → consolidate (reflect → bounded edit → GATE on real held-out tasks)
  → stage proposal → (you) adopt
```

Nothing live changes until you adopt; every adopt backs up first.
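
The gate step in the pipeline above can be sketched as follows. This is a minimal illustration, not the engine's code: `gate`, `score`, and the toy skill strings are hypothetical names, and the rule shown (keep a candidate only if it strictly beats the current skill on held-out tasks) is an assumption based on how the gate is described.

```python
from statistics import mean

def gate(current, candidate, holdout_tasks, score):
    """Return the candidate only if it strictly beats the current skill on held-out tasks."""
    base = mean(score(current, t) for t in holdout_tasks)
    cand = mean(score(candidate, t) for t in holdout_tasks)
    return (candidate, base, cand) if cand > base else (current, base, cand)

# Toy scorer (assumption): a task passes when the skill text mentions its required section.
def score(skill, task):
    return 1.0 if task in skill else 0.0

holdout = ["risks", "confidence"]
kept, before, after = gate("write a brief",
                           "write a brief with risks and confidence",
                           holdout, score)
print(before, "->", after)
```

Because the comparison is strict, a candidate that merely ties the baseline is rejected, which keeps the skill text from drifting without evidence.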

## Controls (work on all platforms)

`--gate on|off` · `--rollouts-k K` (multi-rollout contrastive reflection) ·
`--budget-tokens` / `--budget-minutes` · `--preferences "..."` · separate
optimizer/target models (`--optimizer-model` / `--target-model`) · slow-update
long-term memory. Full guide:
[`../docs/sleep/CONTROLLABLE_DREAMING.md`](../docs/sleep/CONTROLLABLE_DREAMING.md).

## Does it actually work?

Validated on the public
[gbrain-evals](https://github.com/garrytan/gbrain-evals) `skillopt-v1` benchmark
with **real models on both Claude and Codex**: deficient skills go **0.00 →
1.00** on held-out sets (all 4 seeds incl. a real tool-use loop), cross-model
transfer is positive, and the gate blocks regressions. Full results:
[`../docs/sleep/FINAL_REPORT.md`](../docs/sleep/FINAL_REPORT.md).

Deterministic proof (no API key):

```bash
python -m skillopt_sleep.experiments.run_experiment --persona researcher --assert-improves
```
26
plugins/claude-code/.claude-plugin/marketplace.json
Normal file
@@ -0,0 +1,26 @@
{
  "$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
  "name": "skillopt-sleep",
  "description": "SkillOpt-Sleep: give your local Claude agent a nightly sleep cycle that reviews past sessions and consolidates validated memory + skills.",
  "owner": {
    "name": "Yifan Yang",
    "email": "yifanyang@microsoft.com"
  },
  "plugins": [
    {
      "name": "skillopt-sleep",
      "description": "Nightly offline self-evolution: harvest your past Claude Code sessions, replay recurring tasks on your own API budget, and consolidate what the agent learns into validated CLAUDE.md memory and SKILL.md skills, behind a held-out gate and staged for your review. It gets better the more you use it. Synthesizes SkillOpt (validation-gated skill optimization), Claude Dreams (offline memory consolidation), and agent sleep/consolidation.",
      "author": {
        "name": "Yifan Yang"
      },
      "category": "productivity",
      "source": {
        "source": "git-subdir",
        "url": "https://github.com/microsoft/SkillOpt.git",
        "path": "plugins/claude-code",
        "ref": "main"
      },
      "homepage": "https://github.com/microsoft/SkillOpt"
    }
  ]
}
22
plugins/claude-code/.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,22 @@
{
  "name": "skillopt-sleep",
  "description": "Give your local Claude agent a nightly 'sleep cycle': it reviews your past sessions offline, replays recurring tasks on your own API budget, and consolidates what it learns into validated memory (CLAUDE.md) and skills (SKILL.md). It gets better the more you use it. Synthesizes SkillOpt (validation-gated skill optimization), Claude Dreams (offline memory consolidation), and agent sleep/consolidation.",
  "version": "0.1.0",
  "author": {
    "name": "Yifan Yang",
    "email": "yifanyang@microsoft.com"
  },
  "homepage": "https://github.com/microsoft/SkillOpt",
  "repository": "https://github.com/microsoft/SkillOpt",
  "license": "MIT",
  "keywords": [
    "skillopt",
    "self-improvement",
    "memory-consolidation",
    "dreams",
    "sleep",
    "skills",
    "continual-learning",
    "offline-optimization"
  ]
}
138
plugins/claude-code/README.md
Normal file
@@ -0,0 +1,138 @@
# SkillOpt-Sleep (Claude Code plugin)

> Give your local Claude agent a **sleep cycle**. Every night it reviews your
> past sessions offline, replays your recurring tasks on your own API budget,
> and consolidates what it learns into **validated** memory (`CLAUDE.md`) and
> skills (`SKILL.md`). Your agent gets better the more you use it, with no
> model-weight training.

SkillOpt-Sleep is the **deployment-time** companion to
[SkillOpt](https://github.com/microsoft/SkillOpt). SkillOpt trains a skill
offline on a benchmark; SkillOpt-Sleep applies the same discipline to *your own
daily usage*: bounded text edits, accepted only through a held-out validation
gate, with rejected edits kept as negative feedback.

It synthesizes three ideas:

| Idea | Contribution |
|---|---|
| **SkillOpt** | skill/memory = trainable text; bounded add/delete/replace edits; **held-out gate** keeps only changes that help. |
| **Claude Dreams** | offline consolidation over past sessions; input never mutated; output **reviewed then adopted**. |
| **Agent sleep** | periodic offline replay turns short-term episodes into long-term skill. |
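
To make "bounded add/delete/replace edits" concrete, here is a minimal sketch that treats a skill as a list of rule lines and applies a capped number of edits. The edit tuple format, the `max_edits` budget, and the function name are assumptions for illustration, not the engine's actual representation.

```python
def apply_edits(lines, edits, max_edits=3):
    """Apply at most max_edits add/delete/replace operations to a list of rule lines."""
    if len(edits) > max_edits:
        raise ValueError("edit budget exceeded")
    out = list(lines)  # never mutate the input
    for op, target, text in edits:
        if op == "add":
            out.append(text)
        elif op == "delete":
            out.remove(target)
        elif op == "replace":
            out[out.index(target)] = text
        else:
            raise ValueError(f"unknown op: {op}")
    return out

skill = ["Write a short brief.", "Use bullet points."]
edits = [("add", None, "Always include a risks section."),
         ("replace", "Write a short brief.", "Write a one-page brief.")]
print(apply_edits(skill, edits))
```

Keeping the input list untouched mirrors the "input never mutated" contract: the edited copy is only a proposal until it passes the gate and is adopted.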

## What it does (one "night")

```
harvest ~/.claude transcripts → mine recurring tasks → replay offline
  → consolidate (reflect → bounded edit → GATE) → stage proposal → (you) adopt
```

Nothing live is modified until **you** run `/sleep adopt` (the Dreams "review,
then adopt or discard" contract). Every adopt backs up the prior file first.
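
The backup-then-adopt contract can be sketched in a few lines. This is an illustration under assumptions, not the plugin's implementation: the `adopt` helper and the demo paths are hypothetical, and only the file names `CLAUDE.md`, `proposed_CLAUDE.md`, and the `backup/` directory come from this documentation.

```python
import shutil
import tempfile
from pathlib import Path

def adopt(staged: Path, live: Path, backup_dir: Path) -> Path:
    """Copy the live file into backup_dir, then overwrite it with the staged proposal."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / live.name
    if live.exists():
        shutil.copy2(live, backup)   # back up first
    shutil.copy2(staged, live)       # only then touch the live file
    return backup

# Demo in a throwaway directory; the layout here is illustrative only.
root = Path(tempfile.mkdtemp())
live = root / "CLAUDE.md"
staged = root / "proposed_CLAUDE.md"
live.write_text("old memory\n")
staged.write_text("new memory\n")
backup = adopt(staged, live, root / "backup")
print(live.read_text(), backup.read_text())
```

Ordering matters: the backup is written before the live file is replaced, so an interrupted adopt still leaves a recoverable copy.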

## Install

**Requirements:** Python ≥ 3.10, and the `claude` CLI (and/or `codex` CLI) on PATH.

```bash
# 1) get the code (the plugin ships inside the SkillOpt repo)
git clone https://github.com/microsoft/SkillOpt.git
cd SkillOpt

# 2) add the plugin to Claude Code as a local marketplace
#    (the /plugin and /sleep lines are typed inside Claude Code, not the shell)
/plugin marketplace add ./plugins/claude-code
/plugin install skillopt-sleep@skillopt-sleep

# 3) verify
/sleep status
```

The plugin's bundled runner (`scripts/sleep.sh`) auto-selects a Python ≥ 3.10
interpreter and calls the `skillopt_sleep` engine in the repo. No `pip install`
is required for the default `mock` backend or for the `claude`/`codex` backends;
the latter shell out to the CLIs you already have.

## Quick start

```bash
# from inside any project you use with Claude Code:
/sleep dry-run     # safe preview: what it would learn, no changes staged
/sleep run         # full cycle: stages a reviewed proposal (still no live edits)
/sleep status      # see history + the latest staged proposal
/sleep adopt       # apply the staged proposal to CLAUDE.md / SKILL.md (with backup)
```

Or call the engine directly (Python ≥ 3.10):

```bash
python -m skillopt_sleep run --project "$(pwd)" --scope invoked --backend mock
python -m skillopt_sleep run --project "$(pwd)" --backend claude   # real lift via Claude
python -m skillopt_sleep run --project "$(pwd)" --backend codex    # real lift via Codex
```

Default backend is **`mock`** (deterministic, no API spend), so you can try the
plumbing for free. Switch to `--backend claude` or `--backend codex` for genuine
improvement on your own budget.

## Does it actually improve? (real models, public benchmark)

SkillOpt-Sleep is validated against [gbrain-evals](https://github.com/garrytan/gbrain-evals)'
public `skillopt-v1` suite, the same benchmark gbrain scores its own skill
optimizer against. We take a deliberately **deficient** skill and run one or two
sleep nights; held-out scoring is done by a local rule judge (no judge API, so
the optimizer cannot grade its own homework).

| Backend | Seed | Held-out before → after | Nights |
|---|---|---|---|
| **Claude (Haiku 4.5)** | brief-writer | **0.00 → 1.00** | 1 |
| **Codex** | brief-writer | **0.00 → 1.00** | 2 |

Both took a brief-writer with no risks section and no confidence level and,
within 1–2 nights, proposed gated edits that lifted the held-out score to 1.00.
The edits went into the protected `LEARNED` block; nothing else was touched. The
Codex 2-night trace even shows the optimizer **diagnosing its own residual
failure** and adding a meta-rule to fix it. Full writeup + reproduction:
[`docs/sleep/real_api_results.md`](../../docs/sleep/real_api_results.md).
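
A local rule judge of the kind described above can be as simple as a list of regular expressions. The sketch below is an assumption-laden illustration: the two rules are modeled on the brief-writer deficiencies named in the text (no risks section, no confidence level), but the rule names, patterns, and scoring are hypothetical and not taken from gbrain-evals or the engine.

```python
import re

# Hypothetical rules for a brief-writer task: (name, regex the output must match).
RULES = [
    ("has_risks_section", re.compile(r"^#+\s*risks\b", re.IGNORECASE | re.MULTILINE)),
    ("states_confidence", re.compile(r"\bconfidence:\s*(low|medium|high)\b", re.IGNORECASE)),
]

def judge(output: str) -> float:
    """Score an output as the fraction of rules it satisfies (no model call involved)."""
    passed = sum(1 for _, pattern in RULES if pattern.search(output))
    return passed / len(RULES)

deficient = "# Brief\nShip the feature next week."
improved = "# Brief\nShip next week.\n## Risks\nVendor delay.\nConfidence: medium"
print(judge(deficient), judge(improved))
```

Because the judge is deterministic and local, the optimizer cannot influence the score except by changing what the skill actually produces.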

Reproduce:

```bash
git clone https://github.com/garrytan/gbrain-evals /tmp/gbrain-evals
python -m skillopt_sleep.experiments.run_gbrain --backend claude --model haiku \
  --seeds brief-writer --data-root /tmp/gbrain-evals/eval/data/skillopt-v1 \
  --nights 1 --limit-replay 3 --limit-holdout 3
python -m skillopt_sleep.experiments.run_gbrain --backend codex \
  --seeds brief-writer --data-root /tmp/gbrain-evals/eval/data/skillopt-v1 \
  --nights 1 --limit-replay 3 --limit-holdout 3
```

## Deterministic proof (no API, no keys)

```bash
python -m skillopt_sleep.experiments.run_experiment --persona researcher --assert-improves
python -m skillopt_sleep.experiments.run_experiment --persona programmer --assert-improves
```

Each prints the held-out score rising from baseline toward 1.0 as the gate
accepts the general rules your tasks need, and confirms the gate **rejects** an
injected harmful edit. Recorded output: [`docs/sleep/experiment_results.md`](../../docs/sleep/experiment_results.md).

## Schedule it nightly

```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/install-cron.sh" "$(pwd)"   # prints a crontab line; installs nothing
```
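
The crontab entry that `install-cron.sh` prints can be approximated in Python as follows. This mirrors the script's defaults as shown in this commit (03:17 local time, `mock` backend, log under the project's `.skillopt-sleep/` directory); `cron_line` is a hypothetical helper and the example paths are placeholders.

```python
import shlex

def cron_line(runner, project, backend="mock", minute=17, hour=3):
    """Build a daily crontab entry that runs the sleep cycle and appends output to a log."""
    cmd = " ".join([
        shlex.quote(runner), "run",
        "--project", shlex.quote(project),
        "--scope", "invoked",
        "--backend", shlex.quote(backend),
    ])
    log = shlex.quote(f"{project}/.skillopt-sleep/cron.log")
    return f"{minute} {hour} * * * {cmd} >> {log} 2>&1"

print(cron_line("/opt/plugin/scripts/sleep.sh", "/home/me/proj"))
```

Quoting with `shlex.quote` keeps project paths that contain spaces intact when cron hands the line to the shell.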

## Safety

- **Read-only** harvest of `~/.claude`. `mock` replay has no side effects.
- Proposals are **staged**, never auto-applied (unless you opt in with `--auto-adopt`).
- Every adopt writes a backup under the staging dir's `backup/`.
- Per-night **token/task budget caps**; secrets redacted from prompts.
- `fresh` replay (Phase 3) runs only in throwaway git worktrees.

## Status

Phase 1 (engine + deterministic experiment + plugin surface) is complete.
Phase 3 adds the real-API miner/judge and `fresh` worktree replay. See
[`docs/superpowers/specs/2026-06-07-skillopt-sleep-claude-code-plugin-design.md`](../../docs/superpowers/specs/2026-06-07-skillopt-sleep-claude-code-plugin-design.md).
63
plugins/claude-code/commands/sleep.md
Normal file
@@ -0,0 +1,63 @@
---
description: Run or manage the SkillOpt-Sleep self-evolution cycle (review past sessions, replay tasks offline, consolidate validated memory + skills)
argument-hint: "[run | dry-run | status | adopt | harvest] (default: status)"
allowed-tools: Bash, Read
---

# /sleep: SkillOpt-Sleep nightly self-evolution

You are driving **SkillOpt-Sleep**: a tool that lets this user's Claude agent
improve offline by reviewing past sessions, replaying recurring tasks, and
consolidating what it learns into **validated** memory (`CLAUDE.md`) and skills
(`SKILL.md`). It is gated like SkillOpt: a change is kept only if it improves a
held-out replay score, and nothing live is modified until the user adopts it.

## Requested action: $ARGUMENTS

(If `$ARGUMENTS` is empty, treat it as `status`.)

## How to run it

The engine is the `skillopt_sleep` Python package in this repo. Use the
**plugin's bundled runner** so the right interpreter and repo are on the path:

```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" <action> --project "$(pwd)" --scope invoked
```

`<action>` is one of:

| action    | what it does |
|-----------|--------------|
| `status`  | show how many nights have run + the latest staged proposal (READ-ONLY) |
| `dry-run` | harvest → mine → replay → report, but **stage nothing** (safe preview) |
| `run`     | full cycle: also **stage** a reviewed proposal (still does NOT touch live files) |
| `adopt`   | apply the latest staged proposal to live `CLAUDE.md` / `SKILL.md` (backs up first) |
| `harvest` | debug: print the recurring tasks mined from recent sessions |

Default backend is `mock` (deterministic, no API spend). To spend real API
budget for genuine improvement, add `--backend claude` or `--backend codex`.

## Steps to follow

1. **Run the requested action** via the bundled runner above. Capture stdout.
2. **For `run` / `dry-run`:** after it completes, `Read` the generated
   `report.md` in the staging dir it prints, and show the user:
   - held-out score: baseline → candidate (the proof it helped)
   - the gate decision (accept/reject) and the exact edits it proposes
   - where the proposal is staged
3. **For `run` that produced an accepted proposal:** tell the user the diff is
   staged and that **nothing live changed yet**. Offer to run `/sleep adopt`.
4. **For `adopt`:** confirm which live files were updated and that backups were
   written under the staging dir's `backup/`.
5. **Never** edit `CLAUDE.md` or `SKILL.md` yourself. Only the `adopt` action
   does that, with a backup. Respect the review gate.

## Safety reminders

- Harvest is **read-only** over `~/.claude`. Replay in `mock` mode runs no
  shell side effects.
- The cycle stages proposals; the user is in control of adoption.
- If the user asks to schedule this nightly, point them at
  `${CLAUDE_PLUGIN_ROOT}/scripts/install-cron.sh` (prints a crontab line;
  installs nothing itself).
16
plugins/claude-code/hooks/hooks.json
Normal file
@@ -0,0 +1,16 @@
{
  "hooks": {
    "SessionEnd": [
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/on-session-end.sh\"",
            "async": true
          }
        ]
      }
    ]
  }
}
18
plugins/claude-code/hooks/on-session-end.sh
Executable file
@@ -0,0 +1,18 @@
#!/usr/bin/env bash
# SkillOpt-Sleep SessionEnd hook (async, best-effort, NON-BLOCKING).
#
# This does NOT run the optimizer. It only appends a tiny marker so the next
# nightly cycle knows there is fresh activity to harvest, and (optionally)
# nudges the user once that a sleep cycle is available. It must never fail the
# session or spend API budget.
set -uo pipefail

PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT:-$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)}"
STATE_DIR="${HOME}/.skillopt-sleep"
mkdir -p "$STATE_DIR" 2>/dev/null || exit 0

# Record that a session just ended (cheap; used for "is there new data?").
printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${PWD}" \
  >> "$STATE_DIR/session-end.log" 2>/dev/null || true

exit 0
29
plugins/claude-code/scripts/install-cron.sh
Executable file
@@ -0,0 +1,29 @@
#!/usr/bin/env bash
# Print (does NOT install) a crontab line that runs SkillOpt-Sleep nightly.
# The user copies the line into `crontab -e` if they want it.
set -euo pipefail

PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT:-$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)}"
RUNNER="$PLUGIN_ROOT/scripts/sleep.sh"
PROJECT="${1:-$(pwd)}"
BACKEND="${2:-mock}"

# 3:17am local, deliberately off the :00 mark so many users don't all hit the
# API at once (and we leave room for jitter).
MIN=17
HOUR=3

cat <<EOF
# ── SkillOpt-Sleep nightly cycle ────────────────────────────────────────────
# Review past sessions, replay tasks, stage validated memory/skill updates.
# Runs at ${HOUR}:$(printf '%02d' $MIN) local every day. Output goes to the project's
# .skillopt-sleep/ dir; nothing live is changed until you run '/sleep adopt'
# (unless you pass --auto-adopt below).
#
# Copy the next line into 'crontab -e':
${MIN} ${HOUR} * * * "${RUNNER}" run --project "${PROJECT}" --scope invoked --backend ${BACKEND} >> "${PROJECT}/.skillopt-sleep/cron.log" 2>&1
#
# For fully-autonomous adoption (power users), append: --auto-adopt
# To spend real API budget for genuine lift, change --backend above to claude or codex.
# ────────────────────────────────────────────────────────────────────────────
EOF
11
plugins/claude-code/scripts/sleep.sh
Executable file
@@ -0,0 +1,11 @@
#!/usr/bin/env bash
# Claude Code plugin runner: thin wrapper over the shared runner so all three
# platform plugins share one engine launcher. The shared runner lives at
# <repo>/plugins/run-sleep.sh and handles repo-root + interpreter resolution.
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"   # <repo>/plugins/claude-code/scripts
SHARED="$(cd "$HERE/../.." && pwd)/run-sleep.sh"       # <repo>/plugins/run-sleep.sh
if [ ! -f "$SHARED" ] && [ -n "${CLAUDE_PLUGIN_ROOT:-}" ]; then
  SHARED="$(cd "$CLAUDE_PLUGIN_ROOT/.." && pwd)/run-sleep.sh"
fi
exec bash "$SHARED" "$@"
79
plugins/claude-code/skills/skillopt-sleep/SKILL.md
Normal file
@@ -0,0 +1,79 @@
---
name: skillopt-sleep
description: "Use when the user wants their Claude agent to self-improve from past usage, asks about a nightly/offline 'sleep' or 'dream' cycle, memory/skill consolidation, or says things like '让 agent 越用越好用', 'review my past sessions', 'learn my preferences', 'consolidate what you learned', 'run the sleep cycle', or wants to schedule offline self-optimization. Drives the skillopt_sleep engine: harvest past sessions → mine recurring tasks → replay offline → consolidate validated CLAUDE.md/SKILL.md behind a held-out gate."
---

# SkillOpt-Sleep: offline self-evolution for a local Claude agent

SkillOpt-Sleep gives the user's agent a **sleep cycle**. While the user is
offline (e.g. nightly), it reviews their real past Claude Code sessions,
re-runs recurring tasks on their own API budget, and consolidates what it
learns into **memory** (`CLAUDE.md`) and **skills** (`SKILL.md`), but only
keeps changes that pass a held-out validation gate, and only after the user
adopts them. The agent gets measurably better at *this* user's recurring work,
with no model-weight training. It is the deployment-time analogue of training:
short-term experience → long-term competence.

It synthesizes three ideas:
- **SkillOpt**: the skill/memory doc is trainable text; bounded add/delete/replace
  edits; accepted only through a held-out gate; rejected edits become negative feedback.
- **Claude Dreams**: offline consolidation that reads past sessions and rebuilds
  memory (dedup/merge/resolve); the input is never mutated; output is reviewed then adopted.
- **Agent sleep**: periodic offline replay turns episodes into durable skill.

## When to use this skill

Trigger when the user wants any of:
- "make my agent learn from how I use it" / "越用越好用" / "remember my preferences across sessions"
- a nightly/scheduled or on-demand **offline self-improvement / dream / sleep** run
- to **review past sessions/trajectories** and distill recurring tasks
- to **consolidate** feedback into `CLAUDE.md` or a managed skill
- to **schedule** the cycle (cron) or **adopt** a staged proposal

## The cycle (six stages)

1. **Harvest**: read `~/.claude/projects/*/<session>.jsonl` + `~/.claude/history.jsonl` (READ-ONLY) → session digests.
2. **Mine**: digests → `TaskRecord`s (recurring intents + outcome labels + checkable refs where possible).
3. **Replay**: re-run tasks offline under the *current* skill+memory → (hard, soft) scores.
4. **Consolidate**: reflect on failures → propose bounded edits → **gate** on a held-out slice; accept only if it strictly improves.
5. **Stage**: write `proposed_CLAUDE.md`, `proposed_SKILL.md`, a diff, and `report.md` into `<project>/.skillopt-sleep/staging/<date>/`. **Nothing live changes.**
6. **Adopt**: explicit (or opt-in auto): copy staged files over live ones, backing up first.
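
The Stage step above can be sketched as a small helper that writes proposal files under the staging path. This is an illustration only: the `stage` function is hypothetical, the ISO date format for `<date>` is an assumption, and the file contents are placeholders.

```python
import tempfile
from datetime import date
from pathlib import Path

def stage(project: Path, files: dict, day: date) -> Path:
    """Write proposal files under <project>/.skillopt-sleep/staging/<date>/ and return that dir."""
    staging = project / ".skillopt-sleep" / "staging" / day.isoformat()  # date format assumed
    staging.mkdir(parents=True, exist_ok=True)
    for name, text in files.items():
        (staging / name).write_text(text)
    return staging

project = Path(tempfile.mkdtemp())  # throwaway project directory for the demo
out = stage(project,
            {"proposed_CLAUDE.md": "placeholder memory\n",
             "report.md": "placeholder report\n"},
            date(2026, 7, 3))
print(sorted(p.name for p in out.iterdir()))
```

Staging writes only inside `.skillopt-sleep/`, so the live `CLAUDE.md` and `SKILL.md` stay untouched until an explicit adopt.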

## How to drive it

Prefer the `/sleep` command. Under the hood it calls the bundled runner:

```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" status                       # what's happened
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" dry-run --project "$(pwd)"   # safe preview
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" run --project "$(pwd)"       # full cycle, stages a proposal
"${CLAUDE_PLUGIN_ROOT}/scripts/sleep.sh" adopt --project "$(pwd)"     # apply staged proposal (with backup)
```

- Default backend is `mock` (deterministic, **no API spend**), good for trying the plumbing.
- Add `--backend claude` or `--backend codex` to spend the user's real budget for genuine improvement.
- Scope defaults to the invoked project; `--scope all` harvests every project.

## Hard rules

- **Never** hand-edit the user's `CLAUDE.md` / `SKILL.md` as part of this skill.
  Only the `adopt` action changes live files, and it backs them up first.
- Harvest is read-only. `mock` replay has no side effects.
- Always show the user the **held-out baseline → candidate** score and the
  exact proposed edits before suggesting adoption. Evidence before adoption.
- If asked whether it really helps, run
  `python -m skillopt_sleep.experiments.run_experiment --persona researcher --json`,
  a deterministic demo that proves held-out lift and that the gate blocks
  harmful edits.

## Validate / demo

```bash
# deterministic proof (no API): held-out score rises, gate blocks regressions
python -m skillopt_sleep.experiments.run_experiment --persona researcher --assert-improves
python -m skillopt_sleep.experiments.run_experiment --persona programmer --assert-improves
```

See `docs/sleep/experiment_results.md` for recorded output and
`docs/superpowers/specs/2026-06-07-skillopt-sleep-claude-code-plugin-design.md`
for the full design.
59
plugins/codex/README.md
Normal file
@@ -0,0 +1,59 @@
# SkillOpt-Sleep: Codex integration

Give your **Codex** agent a nightly **sleep cycle**: it reviews past sessions
offline, replays your recurring tasks on your own Codex budget, and consolidates
what it learns into validated memory + skills behind a held-out gate. Same engine
as the Claude Code plugin (`skillopt_sleep`), wrapped for Codex.

> **Verified on Codex:** on the public
> [gbrain-evals](https://github.com/garrytan/gbrain-evals) `skillopt-v1`
> benchmark, a deliberately deficient skill goes **0.00 → 1.00** on a held-out
> set with the Codex backend (incl. the tool-use seed via a real tool loop).
> See [`../../docs/sleep/FINAL_REPORT.md`](../../docs/sleep/FINAL_REPORT.md).

## What Codex supports (and what we use)

Codex (`@openai/codex`) extends via **`AGENTS.md`** instructions, **skills** at
`~/.agents/skills/<name>/SKILL.md`, and **custom prompts** at
`~/.codex/prompts/<name>.md` (invoked as `/<name>`). This integration ships all
three, plus a shared runner.

## Install

```bash
git clone <repo-url> SkillOpt-Sleep
cd SkillOpt-Sleep
bash plugins/codex/install.sh          # installs the /sleep prompt + skill
export SKILLOPT_SLEEP_REPO="$(pwd)"    # so the runner is found from anywhere
```

Requires Python ≥ 3.10 and the `codex` CLI on PATH.

## Use

```text
/sleep status      # what's happened
/sleep dry-run     # safe preview, stages nothing
/sleep run         # full cycle, stages a reviewed proposal (no live edits)
/sleep adopt       # apply the staged proposal (with backup)
```

Or call the engine directly:

```bash
python -m skillopt_sleep run --project "$(pwd)" --backend codex
```

Default backend is `mock` (no API spend). `--backend codex` uses your Codex
budget for real improvement. All the controllable knobs (`--gate on|off`,
`--rollouts-k`, `--budget-tokens`, `--preferences`, optimizer/target split) work
identically; see [`../../docs/sleep/CONTROLLABLE_DREAMING.md`](../../docs/sleep/CONTROLLABLE_DREAMING.md).

## Notes / status

- Codex's `exec` runs shell, so the real-tool-loop replay (e.g. the
  `tool_called: search` benchmark seed) works natively.
- Codex's standalone *plugin-package manifest* format is not yet a stable public
  spec; this integration uses the documented `AGENTS.md` + skills + prompts
  mechanisms, which are stable. If/when a `codex plugin` package format ships,
  we'll add a one-file manifest.
36
plugins/codex/install.sh
Executable file
@@ -0,0 +1,36 @@
#!/usr/bin/env bash
# Install the SkillOpt-Sleep Codex integration into the user's ~/.codex and
# ~/.agents directories. Idempotent; prints what it does.
set -euo pipefail

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
AGENTS_SKILLS="${HOME}/.agents/skills"

echo "[install] repo: $REPO_ROOT"

# 1) custom /sleep prompt
mkdir -p "$CODEX_HOME/prompts"
cp "$REPO_ROOT/plugins/codex/prompts/sleep.md" "$CODEX_HOME/prompts/sleep.md"
echo "[install] /sleep prompt -> $CODEX_HOME/prompts/sleep.md"

# 2) user-level skill
mkdir -p "$AGENTS_SKILLS/skillopt-sleep"
cp "$REPO_ROOT/plugins/codex/skills/skillopt-sleep/SKILL.md" "$AGENTS_SKILLS/skillopt-sleep/SKILL.md"
echo "[install] skill -> $AGENTS_SKILLS/skillopt-sleep/SKILL.md"

# 3) tell the user how to record the repo location so the runner is found from anywhere
echo "[install] add to your shell profile:"
echo " export SKILLOPT_SLEEP_REPO=\"$REPO_ROOT\""

# 4) optional: print an AGENTS.md hint (the user adds it by hand if they want it)
cat <<EOF

[install] Optional: add this to ~/.codex/AGENTS.md so Codex always knows the tool:

## SkillOpt-Sleep
An offline self-improvement cycle is available. To run it:
\`bash "$REPO_ROOT/plugins/run-sleep.sh" status\`. Use \`/sleep\` for the guided flow.

Done. Try: /sleep status
EOF
21
plugins/codex/prompts/sleep.md
Normal file
@@ -0,0 +1,21 @@
# /sleep: SkillOpt-Sleep for Codex
#
# Custom prompt: copy this file to ~/.codex/prompts/sleep.md and invoke with
# `/sleep` in the Codex CLI. ($ARGUMENTS is the text after /sleep.)

Run the SkillOpt-Sleep offline self-evolution cycle. Action: $ARGUMENTS
(empty → "status").

Use the bundled runner via shell:

    bash "${SKILLOPT_SLEEP_REPO:?set SKILLOPT_SLEEP_REPO to the repo root}/plugins/run-sleep.sh" $ARGUMENTS --project "$(pwd)"

Then:
- For `run`/`dry-run`: read the staged `report.md` and show the held-out
  baseline → candidate score and the proposed edits. `run` only stages a
  proposal; nothing live changes until `adopt`.
- For `adopt`: confirm which files were updated and that a backup was written.
- Never edit the user's AGENTS.md / skills yourself; only `adopt` does that.

Default backend is `mock` (no API spend). Add `--backend codex` for real
improvement on the user's Codex budget.
49
plugins/codex/skills/skillopt-sleep/SKILL.md
Normal file
@@ -0,0 +1,49 @@
---
name: skillopt-sleep
description: Nightly offline self-evolution for a Codex agent. Reviews past sessions, replays recurring tasks, and consolidates validated memory + skills behind a held-out gate. Use when the user wants Codex to learn from past usage, run a "sleep"/"dream" cycle, or schedule offline self-optimization.
---

# SkillOpt-Sleep (Codex skill)

This skill drives the `skillopt_sleep` engine, an offline "sleep cycle" that
makes a Codex agent better at the user's recurring work without retraining.

## When to use

Trigger when the user wants to: review past sessions, learn their preferences,
consolidate feedback into long-term memory/skills, run a nightly/offline
self-improvement cycle, or adopt a staged proposal.

## How to run it

Invoke the bundled runner via shell (Codex `exec` has shell access). The runner
finds the engine and a Python ≥ 3.10 automatically:

```bash
# point at the repo if it isn't auto-detected from CWD:
export SKILLOPT_SLEEP_REPO=/path/to/SkillOpt-Sleep
bash "$SKILLOPT_SLEEP_REPO/plugins/run-sleep.sh" <action> --project "$(pwd)"
```

`<action>` ∈ `status | dry-run | run | adopt | harvest`. Use `--backend codex`
for real improvement on the user's own Codex budget (default `mock` = no spend).

## Steps

1. Run the requested action; capture stdout.
2. For `run`/`dry-run`: read the `report.md` whose path it prints and show the
   user the held-out baseline → candidate score and the exact proposed edits.
3. `run` only **stages** a proposal under `<project>/.skillopt-sleep/staging/`;
   nothing live changes until `adopt`. Offer `/sleep adopt`.
4. Never hand-edit the user's `AGENTS.md` / skills yourself; only `adopt` does,
   and it backs up first.
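
The stage-then-adopt contract in steps 3 and 4 can be pictured with a small sketch. It illustrates the pattern only and is not the engine's actual code: the function names and the `.bak` backup suffix are assumptions made for this example, and only the staging directory comes from the text above.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Staging location named in step 3, relative to the project directory.
STAGING_SUBDIR = Path(".skillopt-sleep") / "staging"


def stage(project: Path, name: str, new_text: str) -> Path:
    """Write a proposed file into staging; the live file is left untouched."""
    staged = project / STAGING_SUBDIR / name
    staged.parent.mkdir(parents=True, exist_ok=True)
    staged.write_text(new_text, encoding="utf-8")
    return staged


def adopt(project: Path, name: str) -> Path | None:
    """Copy the staged file over the live one, backing up the live file first.

    Returns the backup path, or None when there was no live file to back up.
    The ".bak" suffix is a hypothetical naming choice for this sketch.
    """
    staged = project / STAGING_SUBDIR / name
    live = project / name
    backup = None
    if live.exists():
        backup = live.with_name(live.name + ".bak")
        shutil.copy2(live, backup)
    shutil.copy2(staged, live)
    return backup
```

The point of the split is that `stage` can run unattended, while `adopt` is the only step that touches live files and always leaves a way back.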

## Validate

```bash
python -m skillopt_sleep.experiments.run_gbrain --backend codex \
  --seeds brief-writer --data-root /path/to/gbrain-evals/eval/data/skillopt-v1 \
  --nights 2 --limit-replay 3 --limit-holdout 3
```
A deficient skill goes 0.00 → 1.00 on a held-out set; the optimizer's edits are
gated on real-task performance.
67
plugins/copilot/README.md
Normal file
@@ -0,0 +1,67 @@
# SkillOpt-Sleep: GitHub Copilot integration

Give **Copilot** (CLI or VS Code) a nightly **sleep cycle** via a tiny **MCP
server** that exposes the `skillopt_sleep` engine as tools. MCP is GitHub's
supported way to extend Copilot, so this works across Copilot CLI, VS Code, and
other MCP clients with the same server.

## What's here

| File | Purpose |
|---|---|
| `mcp_server.py` | stdlib-only MCP (stdio) server exposing `sleep_*` tools |
| `mcp-config.example.json` | template MCP server config |
| `copilot-instructions.snippet.md` | paste into `.github/copilot-instructions.md` |

## Install

Requires Python ≥ 3.10. No third-party packages: the server is pure stdlib.

1. **Register the MCP server.** Add the server to your Copilot MCP config
   (Copilot CLI: `~/.copilot/mcp-config.json`; VS Code: your MCP settings).
   Use `mcp-config.example.json` as a template and set `SKILLOPT_SLEEP_REPO` to
   this repo's path:

   ```json
   {
     "mcpServers": {
       "skillopt-sleep": {
         "command": "python3",
         "args": ["/abs/path/SkillOpt-Sleep/plugins/copilot/mcp_server.py"],
         "env": { "SKILLOPT_SLEEP_REPO": "/abs/path/SkillOpt-Sleep" }
       }
     }
   }
   ```

2. **(Optional) Tell Copilot about it.** Append
   `copilot-instructions.snippet.md` to your repo's
   `.github/copilot-instructions.md` so Copilot reaches for the tools when the
   user asks to "run the sleep cycle".

## Use

Ask Copilot things like *"run the sleep cycle"*, *"what did the last sleep
propose?"*, *"adopt the staged sleep proposal"*. Copilot calls the MCP tools:
`sleep_status`, `sleep_dry_run`, `sleep_run`, `sleep_adopt`, `sleep_harvest`.

Each tool takes optional `project`, `backend` (`mock`/`claude`/`codex`), and
`scope` (`invoked`/`all`) arguments. Default backend is `mock` (no API spend).

## Verify the server directly (no Copilot needed)

```bash
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  | SKILLOPT_SLEEP_REPO="$(pwd)" python3 plugins/copilot/mcp_server.py
```
Run this from the repo root; you should see the server info and the five `sleep_*` tools.
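
The same check can be scripted from Python, which also makes it easy to add a `tools/call`. The sketch below is an illustration rather than a shipped client: the helper names are hypothetical, and it assumes the server speaks one JSON object per line over stdio, as the shell pipeline above does.

```python
from __future__ import annotations

import json
import os
import subprocess
import sys


def build_requests(tool: str = "sleep_status") -> list[dict]:
    """initialize, tools/list, then one tools/call with no arguments."""
    return [
        {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
        {"jsonrpc": "2.0", "id": 3, "method": "tools/call",
         "params": {"name": tool, "arguments": {}}},
    ]


def encode(requests: list[dict]) -> str:
    """One compact JSON object per line, matching the line-based stdio framing."""
    return "".join(json.dumps(r) + "\n" for r in requests)


def decode(stdout: str) -> dict:
    """Map response id -> response object, skipping blank lines."""
    responses = {}
    for line in stdout.splitlines():
        if line.strip():
            msg = json.loads(line)
            responses[msg.get("id")] = msg
    return responses


def query_server(server_path: str, repo_root: str) -> dict:
    """Launch the server, send the three requests, and return parsed responses."""
    proc = subprocess.run(
        [sys.executable, server_path],
        input=encode(build_requests()),
        capture_output=True, text=True, timeout=600,
        env={**os.environ, "SKILLOPT_SLEEP_REPO": repo_root},
    )
    return decode(proc.stdout)
```

Calling `query_server("plugins/copilot/mcp_server.py", os.getcwd())` from the repo root should return responses keyed by `1`, `2` and `3`; the third one runs the engine's `status` action, so it needs the `skillopt_sleep` package to be importable from the repo root.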

## Notes / status

- MCP is the stable, official Copilot extension surface, so this is the most
  portable of the three integrations (one server → CLI + IDE).
- The engine and all its controls (gate on/off, multi-rollout, budget,
  preferences, optimizer/target split) are identical across platforms; see
  [`../../docs/sleep/CONTROLLABLE_DREAMING.md`](../../docs/sleep/CONTROLLABLE_DREAMING.md).
25
plugins/copilot/copilot-instructions.snippet.md
Normal file
@@ -0,0 +1,25 @@
<!--
  Copy this block into your repo's .github/copilot-instructions.md so Copilot
  knows the SkillOpt-Sleep tools exist. (Copilot reads copilot-instructions.md
  automatically as ambient guidance.)
-->

## SkillOpt-Sleep (offline self-evolution)

This project has SkillOpt-Sleep available via an MCP server (`skillopt-sleep`).
It gives the agent a nightly "sleep cycle": it reviews past sessions, replays
recurring tasks offline, and consolidates validated memory + skills behind a
held-out gate.

When the user asks to "run the sleep cycle", "review my past sessions", "learn
my preferences", or "make the agent improve from past usage", use the MCP tools:

- `sleep_status`: what has happened so far + the latest staged proposal
- `sleep_dry_run`: safe preview, stages nothing
- `sleep_run`: full cycle, stages a proposal for review (nothing live changes)
- `sleep_adopt`: apply the staged proposal (backs up first)
- `sleep_harvest`: list mined recurring tasks

Always show the user the held-out baseline → candidate score and the proposed
edits before suggesting `sleep_adopt`. Never hand-edit the user's memory/skill
files; only `sleep_adopt` does that, with a backup.
11
plugins/copilot/mcp-config.example.json
Normal file
@@ -0,0 +1,11 @@
{
  "mcpServers": {
    "skillopt-sleep": {
      "command": "python3",
      "args": ["plugins/copilot/mcp_server.py"],
      "env": {
        "SKILLOPT_SLEEP_REPO": "${workspaceFolder}"
      }
    }
  }
}
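
The example config uses a path relative to the workspace and a `${workspaceFolder}` placeholder, which presumably only works in clients that launch from the repo root and expand that variable. For other clients, a config with absolute paths (the form the Copilot README shows) can be generated. This is a hedged sketch; `make_mcp_config` is a hypothetical helper, not part of the commit.

```python
from __future__ import annotations

import json
from pathlib import Path


def make_mcp_config(repo_root: str, python: str = "python3") -> dict:
    """Build the mcpServers entry with absolute paths for the given checkout."""
    root = Path(repo_root).expanduser().resolve()
    return {
        "mcpServers": {
            "skillopt-sleep": {
                "command": python,
                "args": [str(root / "plugins" / "copilot" / "mcp_server.py")],
                "env": {"SKILLOPT_SLEEP_REPO": str(root)},
            }
        }
    }


def render(config: dict) -> str:
    """Pretty-print the config in the same two-space style as the example file."""
    return json.dumps(config, indent=2) + "\n"
```

Printing `render(make_mcp_config("/abs/path/SkillOpt-Sleep"))` gives a block that can be merged into an existing MCP config file by hand.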
128
plugins/copilot/mcp_server.py
Executable file
@@ -0,0 +1,128 @@
#!/usr/bin/env python3
"""SkillOpt-Sleep: minimal MCP server (stdio, stdlib-only).

Exposes the sleep engine as MCP tools so any MCP-capable client (GitHub Copilot
CLI / VS Code, Claude Desktop, etc.) can drive it. No third-party deps: speaks
JSON-RPC 2.0 over stdio with just the handful of MCP methods clients need.

Tools exposed:
  - sleep_status  : how many nights have run + the latest staged proposal
  - sleep_dry_run : harvest+mine+replay, report only (no staging)
  - sleep_run     : full cycle, stages a proposal (nothing live changes)
  - sleep_adopt   : apply the latest staged proposal (with backup)
  - sleep_harvest : debug aid, lists mined recurring tasks

Each tool shells out to `python -m skillopt_sleep <action> ...` and returns its
stdout. Configure your client to launch:  python plugins/copilot/mcp_server.py
"""
from __future__ import annotations

import json
import os
import subprocess
import sys

REPO_ROOT = os.environ.get("SKILLOPT_SLEEP_REPO") or os.path.abspath(
    os.path.join(os.path.dirname(__file__), "..", "..")
)
PROTOCOL_VERSION = "2024-11-05"

TOOLS = [
    {"name": "sleep_status", "action": "status",
     "description": "Show how many SkillOpt-Sleep nights have run and the latest staged proposal."},
    {"name": "sleep_dry_run", "action": "dry-run",
     "description": "Preview a sleep cycle (harvest+mine+replay) without staging anything."},
    {"name": "sleep_run", "action": "run",
     "description": "Run a full sleep cycle; stages a proposal for review. Nothing live changes until adopt."},
    {"name": "sleep_adopt", "action": "adopt",
     "description": "Apply the latest staged proposal to CLAUDE.md/SKILL.md (backs up first)."},
    {"name": "sleep_harvest", "action": "harvest",
     "description": "Debug: list the recurring tasks mined from recent sessions."},
]
_BY_NAME = {t["name"]: t for t in TOOLS}

_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "project": {"type": "string", "description": "Project dir to evolve (default: cwd)."},
        "backend": {"type": "string", "enum": ["mock", "claude", "codex"],
                    "description": "mock = no API spend (default); claude/codex = real."},
        "scope": {"type": "string", "enum": ["invoked", "all"]},
    },
    "additionalProperties": False,
}


def _run_engine(action: str, args: dict) -> str:
    py = sys.executable or "python3"
    cmd = [py, "-m", "skillopt_sleep", action]  # argv list, so no shell quoting issues
    if args.get("project"):
        cmd += ["--project", str(args["project"])]
    if args.get("backend"):
        cmd += ["--backend", str(args["backend"])]
    if args.get("scope"):
        cmd += ["--scope", str(args["scope"])]
    try:
        proc = subprocess.run(cmd, cwd=REPO_ROOT, capture_output=True, text=True, timeout=3600)
    except Exception as e:  # noqa: BLE001
        return f"[error] failed to run engine: {e}"
    out = (proc.stdout or "").strip()
    err = (proc.stderr or "").strip()
    return out + (("\n[stderr]\n" + err) if err else "")


def _result(id_, result):
    return {"jsonrpc": "2.0", "id": id_, "result": result}


def _error(id_, code, message):
    return {"jsonrpc": "2.0", "id": id_, "error": {"code": code, "message": message}}


def handle(req: dict):
    method = req.get("method")
    id_ = req.get("id")
    if method == "initialize":
        return _result(id_, {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "skillopt-sleep", "version": "0.1.0"},
        })
    if "id" not in req or method in ("notifications/initialized", "initialized"):
        return None  # notifications (messages without an id) never get a response
    if method == "tools/list":
        return _result(id_, {"tools": [
            {"name": t["name"], "description": t["description"], "inputSchema": _TOOL_SCHEMA}
            for t in TOOLS
        ]})
    if method == "tools/call":
        params = req.get("params") or {}
        name = params.get("name")
        tool = _BY_NAME.get(name)
        if not tool:
            return _error(id_, -32602, f"unknown tool: {name}")
        text = _run_engine(tool["action"], params.get("arguments") or {})
        return _result(id_, {"content": [{"type": "text", "text": text}]})
    if method == "ping":
        return _result(id_, {})
    return _error(id_, -32601, f"method not found: {method}")


def main() -> int:
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        try:
            req = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        resp = handle(req) if isinstance(req, dict) else None
        if resp is not None:
            sys.stdout.write(json.dumps(resp) + "\n")
            sys.stdout.flush()
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
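
The read loop in `main()` is easier to exercise when the streams are injectable. The following is a minimal sketch of the same newline-delimited JSON-RPC pattern with a stub handler; `serve` and `stub_handler` are hypothetical names used for illustration, and the stub stands in for `handle()` without calling the engine.

```python
from __future__ import annotations

import io
import json


def serve(stdin, stdout, handler) -> None:
    """Read one JSON message per line and write one response per request."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            req = json.loads(line)
        except ValueError:
            continue  # not JSON: skip the line
        if not isinstance(req, dict):
            continue  # batches and scalars are not handled in this sketch
        resp = handler(req)
        if resp is not None:
            stdout.write(json.dumps(resp) + "\n")
            stdout.flush()


def stub_handler(req: dict):
    """Hypothetical handler: answers `ping`, stays silent on notifications."""
    if "id" not in req:
        return None  # a notification must not receive a response
    if req.get("method") == "ping":
        return {"jsonrpc": "2.0", "id": req["id"], "result": {}}
    return {"jsonrpc": "2.0", "id": req["id"],
            "error": {"code": -32601, "message": f"method not found: {req.get('method')}"}}
```

Feeding `serve` an `io.StringIO` of requests and collecting output in another `io.StringIO` gives a quick in-process check that requests get exactly one reply and notifications get none.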
46
plugins/run-sleep.sh
Executable file
@@ -0,0 +1,46 @@
#!/usr/bin/env bash
# SkillOpt-Sleep shared runner, used by all platform plugins (Claude Code,
# Codex, Copilot). Resolves the repo root (which contains the skillopt_sleep
# package), picks a Python >= 3.10, and execs the engine CLI.
#
# Usage: run-sleep.sh <run|dry-run|status|adopt|harvest|...> [args...]
set -euo pipefail

# This script lives at <repo>/plugins/run-sleep.sh, so the repo root (which
# holds skillopt_sleep/) is one level up. CLAUDE_PLUGIN_ROOT (if set by Claude
# Code) points at the plugin dir; the engine is then two levels above it.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -d "$SCRIPT_DIR/../skillopt_sleep" ]; then
  REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
elif [ -n "${CLAUDE_PLUGIN_ROOT:-}" ] && [ -d "$CLAUDE_PLUGIN_ROOT/../../skillopt_sleep" ]; then
  REPO_ROOT="$(cd "$CLAUDE_PLUGIN_ROOT/../.." && pwd)"
elif [ -n "${SKILLOPT_SLEEP_REPO:-}" ] && [ -d "$SKILLOPT_SLEEP_REPO/skillopt_sleep" ]; then
  REPO_ROOT="$SKILLOPT_SLEEP_REPO"
else
  # last resort: search upward from CWD
  d="$PWD"
  while [ "$d" != "/" ]; do
    [ -d "$d/skillopt_sleep" ] && { REPO_ROOT="$d"; break; }
    d="$(dirname "$d")"
  done
fi
if [ -z "${REPO_ROOT:-}" ]; then
  echo "[sleep] ERROR: could not locate the skillopt_sleep package. Set SKILLOPT_SLEEP_REPO to the repo root." >&2
  exit 1
fi

PY=""
for cand in python3.12 python3.11 python3.10 python3; do
  if command -v "$cand" >/dev/null 2>&1; then
    ver="$("$cand" -c 'import sys; print("%d%02d" % sys.version_info[:2])' 2>/dev/null || echo 0)"  # 3.10 -> 310, 3.9 -> 309
    if [ "${ver:-0}" -ge 310 ]; then PY="$cand"; break; fi
  fi
done
if [ -z "$PY" ]; then
  echo "[sleep] ERROR: need Python >= 3.10 (found none)." >&2
  exit 1
fi

if [ "$#" -eq 0 ]; then set -- status; fi
cd "$REPO_ROOT"
exec "$PY" -m skillopt_sleep "$@"
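
The repo-root lookup in the runner is an ordered fallback chain: script location, then `CLAUDE_PLUGIN_ROOT`, then `SKILLOPT_SLEEP_REPO`, then an upward search from the current directory. The sketch below restates that order in Python so it can be unit-tested. It is an approximation under stated assumptions: `resolve_repo_root` is a hypothetical helper, and it uses `Path.parent` where the shell script uses `..`, which can differ when symlinks are involved.

```python
from __future__ import annotations

from pathlib import Path

PACKAGE = "skillopt_sleep"  # directory that marks the repo root


def resolve_repo_root(script_dir: Path, env: dict, cwd: Path) -> Path | None:
    """Return the first candidate directory that contains the engine package."""
    candidates = [script_dir.parent]  # <repo>/plugins -> <repo>
    if env.get("CLAUDE_PLUGIN_ROOT"):
        # plugin dir sits two levels below the repo root
        candidates.append(Path(env["CLAUDE_PLUGIN_ROOT"]).parent.parent)
    if env.get("SKILLOPT_SLEEP_REPO"):
        candidates.append(Path(env["SKILLOPT_SLEEP_REPO"]))
    # last resort: walk upward from cwd, stopping before the filesystem root
    candidates.extend(p for p in (cwd, *cwd.parents) if p != p.parent)
    for cand in candidates:
        if (cand / PACKAGE).is_dir():
            return cand.resolve()
    return None
```

Because the script-relative check comes first, an explicit `SKILLOPT_SLEEP_REPO` only takes effect when the runner is not sitting inside a checkout that already contains `skillopt_sleep/`.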