mirror of
https://github.com/github/spec-kit.git
synced 2026-07-03 12:28:06 +08:00
* feat(workflows): add continue_on_error step field
Adds an optional `continue_on_error: bool` field on every step.
When set to `true` and the step fails, the engine records the
result (`exit_code`, `stderr` on `steps.<id>.output` plus `status`
as a sibling key on `steps.<id>`) and continues to the next sibling
step instead of halting the run. Downstream `if`, `switch`, or
`gate` steps can then branch on
`{{ steps.<id>.output.exit_code }}` to route the recovery path.
Engine details
--------------
`WorkflowEngine._execute_steps` now consults the step config when a
step returns `StepStatus.FAILED`:
- Gate aborts (`output.aborted`) always halt the run — operator
decisions take precedence over the flag.
- Otherwise, if `continue_on_error` is the literal `True`, log a
`step_continue_on_error` event and proceed to the next sibling.
The runtime check uses identity comparison (`is True`) rather
than truthiness, so truthy non-bool values like the string
`"true"` cannot silently change run semantics even if a caller
bypasses `validate_workflow()`.
- Otherwise, behave as before: log `step_failed`, set
`RunStatus.FAILED`, and return.
Validation
----------
`_validate_steps` rejects non-bool values for `continue_on_error`.
Coerced strings like `"true"` are not accepted so authoring
mistakes surface at validation time rather than silently changing
run semantics.
Tests
-----
`TestContinueOnError` in `tests/test_workflows.py` (8 tests):
- `test_undeclared_failure_halts_run` — default halt behaviour.
- `test_declared_and_fired_continues_run` — flag + fail → continue.
- `test_declared_but_step_succeeded_is_noop` — flag + success → no-op.
- `test_if_branch_routes_around_failure` — end-to-end recovery.
- `test_gate_abort_still_halts_with_continue_on_error` — abort
always halts.
- `test_validation_rejects_non_bool_continue_on_error` — `"true"`
rejected at validation.
- `test_validation_accepts_bool_continue_on_error` — `true`/`false`
pass cleanly.
- `test_engine_ignores_truthy_non_bool_continue_on_error` —
defense-in-depth: engine ignores string `"true"` even when
validation is bypassed.
Rebased onto current upstream/main (post #2664 merge); the new
`TestContinueOnError` class sits immediately after upstream's
`TestContextRunId` so the two feature suites coexist cleanly.
Closes #2591.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): restore runtime context section, clarify gate prompt
Two Copilot findings on d0b9e00:
1. The `### Runtime Context` documentation for `{{ context.* }}` was
lost during the rebase onto current main (the squash dropped the
anchor where #2664 had added it). Restored under `## Expressions`
so users can find `context.run_id` semantics and examples.
2. The continue_on_error example gate had message "Retry or skip?"
but used the default `options: [approve, reject]` with `on_reject:
skip`, which implied an automatic retry path that gates do not
provide. Reworded the message to match the actual approve/reject
semantics and added an explicit note that retry requires either
custom gate options + downstream branching or a wrapper loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): clarify continue_on_error scope — returned FAILED only
Copilot finding on d0b9e00:
The README's "Error Handling" intro implied `continue_on_error` covers
"any other runtime error raised during step execution", but the engine
only consults the flag when a step returns `StepResult(status=FAILED, ...)`.
Exceptions raised out of `step_impl.execute()` propagate to
`WorkflowEngine.execute()`, where the catch-all logs `workflow_failed`
and re-raises — the step result is never recorded, and the flag is
never consulted.
Audited the whole PR diff for the same overclaim:
1. workflows/README.md — main fix. Reworded the Error Handling intro to
"any step that returns StepResult(status=FAILED, ...)" and promoted
the parenthetical structural-validation note into the Notes block.
Added a new "Scope: returned failures only" note that names the
exception path explicitly and tells step authors how to bring the
flag into scope for exceptional code (catch internally and return
FAILED with the failure encoded in `output`).
2. tests/test_workflows.py — section comment used "when an executable
step fails", same ambiguity. Tightened to "when a step returns
StepResult(status=FAILED, ...)" and added a sentence calling out
that unhandled exceptions are out of scope.
3. src/specify_cli/workflows/engine.py — already correct ("any step
that returns FAILED" in the validator comment; "lets the pipeline
route around the failure" in the execute path). No change.
Engine semantics and test bodies are unchanged. Docs-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): clarify on_reject:skip semantics — engine returns COMPLETED, not auto-skip
Copilot finding on b8982a7:
The README example's gate message said "reject to skip the rest of this
branch", and the explanatory paragraph claimed [approve, reject] map
to "continue" vs "skip the rest of this branch". The engine does not
implement automatic branch-skipping. `on_reject: skip` returns
`StepStatus.COMPLETED` (gate/__init__.py:65-66); the next sibling step
runs unconditionally unless the author wires a downstream `if` reading
`{{ steps.<gate-id>.output.choice }}`.
Two fixes:
1. Restructured the YAML example so it actually demonstrates the
manual-branching pattern: added a `recover` if-step after the gate
that conditions on `steps.review.output.choice == 'approve'`. Now
the example shows the real workflow author's responsibility instead
of implying the engine does it.
2. Replaced the trailing paragraph with three precise notes:
- both gate options return COMPLETED; `on_reject: skip` controls
abort behaviour only, not sibling-skipping
- all three `on_reject` values enumerated with their actual engine
semantics (FAILED+aborted / COMPLETED / PAUSED)
- the original retry-loop guidance retained as the third bullet
Updated the gate message in the example to match — "reject to leave the
failure recorded and move on" instead of "reject to skip the rest of
this branch".
Audited the whole PR diff for the same overclaim: no other instance.
Engine semantics, validation, and test bodies are unchanged. Docs-only.
161/161 tests/test_workflows.py pass locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): clarify gate's role — surfaces, doesn't programmatically branch
Audit follow-up to 393ac6b — three sites repeated the same minor
overclaim about gates being one of the "branch on it" step types
alongside `if` and `switch`:
1. workflows/README.md (the "downstream `if`, `switch`, or `gate`
steps can branch on it" sentence introducing the example)
2. engine.py:236 (validator inline comment)
3. engine.py:657 (execute-path inline comment)
A `gate` step does not have a `condition` or `expression` field — it
only evaluates expressions for `message` and `show_file` (gate/__init__.py:29,36).
Programmatic branching happens in `if`/`switch`; a gate surfaces the
value to a human operator via message interpolation, and the operator's
choice is recorded in `output.choice` for a *subsequent* `if`/`switch`
to route on.
Reworded all three sites consistently: "a downstream `if` or `switch`
can branch on it (or a `gate` can surface it to the operator via
message interpolation)". The README example already demonstrates this
distinction — the gate carries `{{ }}` template variables in its
message and the `recover` if-step downstream is what actually branches
on the choice.
Engine semantics, validation, and test bodies are unchanged. Docs-only
on the README; comment-only on engine.py.
161/161 tests/test_workflows.py pass locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): use qualified StepStatus.* instead of bare FAILED/COMPLETED/PAUSED
Three Copilot inline comments on workflows/README.md lines 226, 282, 288
flagged that ``StepResult(status=FAILED, ...)`` is not valid Python —
``StepResult.status`` is a ``StepStatus`` enum value, so the
documented form should be ``StepStatus.FAILED``.
Audited the whole PR diff for the same shorthand. The bare unqualified
form appears in three files added/modified by this PR:
1. workflows/README.md (6 sites) — three ``StepResult(status=FAILED, ...)``
parentheticals, plus the on_reject Notes bullet listing the three
step statuses (``FAILED``, ``COMPLETED``, ``PAUSED``).
2. tests/test_workflows.py (4 sites) — section header for
TestContinueOnError, two test-method docstrings, one inline comment
about a gate's TTY-fallback behaviour.
3. src/specify_cli/workflows/engine.py (1 site) — the validator inline
comment added in d0b9e00 said "returns FAILED" where the engine
code itself uses ``StepStatus.FAILED``.
All 11 sites normalised to the qualified ``StepStatus.<name>`` form so
the docs / test docstrings / inline comments match what readers will
actually find in the engine code and the tests. Engine semantics,
validation, and test bodies are unchanged.
161/161 tests/test_workflows.py pass locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
444 lines
12 KiB
Markdown
# Workflows

Workflows are multi-step, resumable automation pipelines defined in YAML. They orchestrate Spec Kit commands across integrations, evaluate control flow, and pause at human review gates, enabling end-to-end Spec-Driven Development cycles without manual step-by-step invocation.

## How It Works

A workflow definition declares a sequence of steps. The engine executes them in order, dispatching commands to AI integrations, running shell commands, evaluating conditions for branching, and pausing at gates for human review. State is persisted after each step, so workflows can be resumed after interruption.

```yaml
steps:
  - id: specify
    command: speckit.specify
    input:
      args: "{{ inputs.spec }}"

  - id: review
    type: gate
    message: "Review the spec before planning."
    options: [approve, reject]
    on_reject: abort

  - id: plan
    command: speckit.plan
```

For detailed architecture and internals, see [ARCHITECTURE.md](ARCHITECTURE.md).

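
The execute-in-order, persist-after-each-step behaviour described above can be pictured with a minimal Python sketch. It is not the engine's code: `run_steps`, the `state.json` file name, and the `{"next": ...}` state layout are hypothetical, chosen only to show why saving progress after each step makes a run resumable.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical shape: (step id, action that returns True when the run should pause).
Step = tuple[str, Callable[[], bool]]


def run_steps(steps: list[Step], state_file: Path) -> str:
    """Run steps in order, saving progress after each one so a later call can resume."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {"next": 0}
    for index in range(state["next"], len(steps)):
        step_id, action = steps[index]
        paused = action()
        # Persist after every step: a resumed run starts at the step after this one.
        state["next"] = index + 1
        state_file.write_text(json.dumps(state))
        if paused:
            return f"paused after {step_id}"
    return "completed"


log: list[str] = []
steps: list[Step] = [
    ("specify", lambda: log.append("specify") or False),
    ("review", lambda: log.append("review") or True),  # stands in for a gate pause
    ("plan", lambda: log.append("plan") or False),
]

with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "state.json"
    first = run_steps(steps, state_path)   # stops at the gate
    second = run_steps(steps, state_path)  # picks up at "plan"
```

The second call never re-runs `specify` or `review`, because the saved index already points past them.
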
## Quick Start

```bash
# Search available workflows
specify workflow search

# Install the built-in SDD workflow
specify workflow add speckit

# Or run directly from a local YAML file
specify workflow run ./workflow.yml --input spec="Build a user authentication system with OAuth support"

# Run an installed workflow with inputs
specify workflow run speckit --input spec="Build a user authentication system with OAuth support"

# Check run status
specify workflow status

# Resume after a gate pause
specify workflow resume <run_id>

# Get detailed workflow info
specify workflow info speckit

# Remove a workflow
specify workflow remove speckit
```

## Running Workflows

### From an Installed Workflow

```bash
specify workflow add speckit
specify workflow run speckit --input spec="Build a user authentication system with OAuth support"
```

### From a Local YAML File

```bash
specify workflow run ./my-workflow.yml --input spec="Build a user authentication system with OAuth support"
```

### Multiple Inputs

```bash
specify workflow run speckit \
  --input spec="Build a user authentication system with OAuth support" \
  --input scope="backend-only"
```

## Step Types

Workflows support 10 built-in step types:

### Command Steps (default)

Invoke an installed Spec Kit command by name via the integration CLI:

```yaml
- id: specify
  command: speckit.specify
  input:
    args: "{{ inputs.spec }}"
  integration: claude                 # Optional: override workflow default
  model: "claude-sonnet-4-20250514"   # Optional: override model
```

### Prompt Steps

Send an arbitrary inline prompt to an integration CLI (no command file needed):

```yaml
- id: security-review
  type: prompt
  prompt: "Review {{ inputs.file }} for security vulnerabilities"
  integration: claude
```

### Shell Steps

Run a shell command and capture output:

```yaml
- id: run-tests
  type: shell
  run: "cd {{ inputs.project_dir }} && npm test"
```

### Gate Steps

Pause for human review. The workflow resumes when `specify workflow resume` is called:

```yaml
- id: review-spec
  type: gate
  message: "Review the generated spec before planning."
  options: [approve, edit, reject]
  on_reject: abort
```

### If/Then/Else Steps

Conditional branching based on an expression:

```yaml
- id: check-scope
  type: if
  condition: "{{ inputs.scope == 'full' }}"
  then:
    - id: full-plan
      command: speckit.plan
  else:
    - id: quick-plan
      command: speckit.plan
      options:
        quick: true
```

### Switch Steps

Multi-branch dispatch on an expression value:

```yaml
- id: route
  type: switch
  expression: "{{ steps.review.output.choice }}"
  cases:
    approve:
      - id: plan
        command: speckit.plan
    reject:
      - id: log
        type: shell
        run: "echo 'Rejected'"
  default:
    - id: fallback
      type: gate
      message: "Unexpected choice"
```

### While Loop Steps

Repeat steps while a condition is truthy:

```yaml
- id: retry
  type: while
  condition: "{{ steps.run-tests.output.exit_code != 0 }}"
  max_iterations: 5
  steps:
    - id: fix
      command: speckit.implement
```

### Do-While Loop Steps

Execute steps at least once, then repeat while the condition holds:

```yaml
- id: refine
  type: do-while
  condition: "{{ steps.review.output.choice == 'edit' }}"
  max_iterations: 3
  steps:
    - id: revise
      command: speckit.specify
```

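
The difference between the two loop types above is only where the condition is checked. The Python sketch below illustrates that under one assumption not stated in this README: that `max_iterations` caps the total number of body executions. `run_while` and `run_do_while` are hypothetical helpers, not engine functions.

```python
from typing import Callable


def run_while(condition: Callable[[], bool], body: Callable[[], None], max_iterations: int) -> int:
    """Check the condition before each pass, so the body may run zero times."""
    iterations = 0
    while iterations < max_iterations and condition():
        body()
        iterations += 1
    return iterations


def run_do_while(condition: Callable[[], bool], body: Callable[[], None], max_iterations: int) -> int:
    """Run the body first, then check the condition, so the body runs at least once."""
    iterations = 0
    while iterations < max_iterations:
        body()
        iterations += 1
        if not condition():
            break
    return iterations


calls: list[str] = []

# Condition is false from the start: `while` skips the body, `do-while` runs it once.
while_runs = run_while(lambda: False, lambda: calls.append("while"), max_iterations=5)
do_while_runs = run_do_while(lambda: False, lambda: calls.append("do-while"), max_iterations=3)

# Condition never becomes false: the iteration cap ends the loop.
capped_runs = run_while(lambda: True, lambda: None, max_iterations=5)
```
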
### Fan-Out Steps

Dispatch a step template for each item in a collection (sequential):

```yaml
- id: parallel-impl
  type: fan-out
  items: "{{ steps.tasks.output.task_list }}"
  max_concurrency: 3
  step:
    id: impl
    command: speckit.implement
```

### Fan-In Steps

Aggregate results from fan-out steps:

```yaml
- id: collect
  type: fan-in
  wait_for: [parallel-impl]
  output: {}
```

## Error Handling

By default, any step that returns `StepResult(status=StepStatus.FAILED, ...)`
at runtime halts the entire run. The most common cause is a `shell` or
`command` step exiting non-zero. Set `continue_on_error: true` on
a step to record its result and continue to the next sibling step
instead. When the failure was a non-zero exit, the exit code
remains available on `steps.<id>.output.exit_code` so a downstream
`if` or `switch` can branch on it (or a `gate` can surface it to
the operator via `{{ }}` interpolation in `message`):

```yaml
- id: heavy-thing
  type: command
  integration: claude
  command: speckit.heavy-thing
  continue_on_error: true

- id: check-result
  type: if
  condition: "{{ steps.heavy-thing.output.exit_code != 0 }}"
  then:
    - id: review
      type: gate
      message: "Step failed (exit {{ steps.heavy-thing.output.exit_code }}). Approve to run the recovery path, or reject to leave the failure recorded and move on."
      on_reject: skip
    - id: recover
      type: if
      condition: "{{ steps.review.output.choice == 'approve' }}"
      then:
        - id: rerun
          command: speckit.recovery
  else:
    - id: next-thing
      command: speckit.next-thing
```

A few things worth knowing about that example:

- Both gate options (`approve`, `reject`) return `StepStatus.COMPLETED`;
  `on_reject: skip` controls only whether the engine aborts on reject
  (with `skip`, it does not). It does **not** auto-skip subsequent
  sibling steps in the `then:` list. Downstream branching is the
  workflow author's responsibility: read
  `{{ steps.<gate-id>.output.choice }}` in a follow-up `if`, `switch`,
  or expression, as the `recover` step above does.
- `on_reject` has three values: `abort` (the default: reject → `StepStatus.FAILED`
  with `output.aborted = True`, halts the run), `skip` (reject →
  `StepStatus.COMPLETED`, author handles branching as shown), and `retry`
  (reject → `StepStatus.PAUSED` so the next `specify workflow resume` re-runs
  the gate).
- Gates do not automatically re-run the failed step. To express a
  retry path, either define custom gate options and branch on the
  choice downstream, or wrap the failing step in your own loop.

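
The three `on_reject` outcomes listed above can be restated as a small lookup. This Python sketch is illustrative only: `gate_result` is a hypothetical function, the local `StepStatus` enum is a stand-in for the engine's, and treating every non-`reject` choice as `StepStatus.COMPLETED` is an assumption.

```python
from enum import Enum


class StepStatus(Enum):
    """Stand-in for the engine's enum; the string values are illustrative."""
    COMPLETED = "completed"
    FAILED = "failed"
    PAUSED = "paused"


def gate_result(choice: str, on_reject: str = "abort") -> tuple[StepStatus, dict]:
    """Map an operator's choice and the gate's on_reject setting to a status and output."""
    output: dict = {"choice": choice}
    if choice != "reject":
        # Assumption: approve (and any custom option) completes the gate.
        return StepStatus.COMPLETED, output
    if on_reject == "abort":
        output["aborted"] = True  # marks the failure as an operator abort
        return StepStatus.FAILED, output
    if on_reject == "skip":
        return StepStatus.COMPLETED, output  # the next sibling still runs
    if on_reject == "retry":
        return StepStatus.PAUSED, output  # the next resume re-runs the gate
    raise ValueError(f"unknown on_reject value: {on_reject!r}")


skip_status, skip_output = gate_result("reject", on_reject="skip")
abort_status, abort_output = gate_result("reject")
retry_status, _ = gate_result("reject", on_reject="retry")
```

With `skip`, the only trace of the rejection is `output["choice"]`, which is why a follow-up `if` or `switch` has to read it.
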
**Notes:**

- The field must be a literal boolean (`true` / `false`); coerced
  strings like `"true"` are rejected at validation time.
- **Scope: returned failures only.** The flag applies to step results
  with `status=StepStatus.FAILED`. Unhandled exceptions raised out of a step's
  `execute()` method are caught one level up by `WorkflowEngine.execute()`,
  logged as `workflow_failed`, and abort the run regardless of
  `continue_on_error`. If a step author wants the flag to cover an
  exceptional path, the step must catch the exception internally and
  return `StepResult(status=StepStatus.FAILED, ...)` with the failure encoded in
  `output` (e.g. `exit_code`, `stderr`, or a custom field).
- Gate aborts (`on_reject: abort` chosen by the operator) always halt
  the run; `continue_on_error` does not override them. The flag is
  for transient/expected step failures, not for overriding deliberate
  operator decisions.
- Structural validation runs up-front: `specify workflow run` rejects
  invalid workflow definitions before the run is created, so
  validation failures never reach this code path.
- When the flag is omitted, the engine behaves exactly as it did before
  this feature was added.

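
The rules in the notes above fit in a few lines of Python. The sketch below is not the engine's implementation: `StepStatus` and `StepResult` are local stand-ins, and `validate_continue_on_error`, `should_halt`, and `run_guarded` are hypothetical helpers. The `exit_code` of `1` used for a caught exception is also an arbitrary choice.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class StepStatus(Enum):
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class StepResult:
    status: StepStatus
    output: dict = field(default_factory=dict)


def validate_continue_on_error(step_config: dict) -> list[str]:
    """Accept only literal booleans; strings such as "true" produce an error."""
    value = step_config.get("continue_on_error", False)
    if not isinstance(value, bool):
        return [f"continue_on_error must be a boolean, got {type(value).__name__}"]
    return []


def should_halt(step_config: dict, result: StepResult) -> bool:
    """Decide whether a step result stops the run."""
    if result.status is not StepStatus.FAILED:
        return False
    if result.output.get("aborted"):
        return True  # operator aborts always halt, whatever the flag says
    # Identity check: only the literal True continues, never a truthy string.
    return step_config.get("continue_on_error") is not True


def run_guarded(action: Callable[[], None]) -> StepResult:
    """Catch exceptions inside the step so they become a returned failure."""
    try:
        action()
    except Exception as exc:
        return StepResult(StepStatus.FAILED, {"exit_code": 1, "stderr": str(exc)})
    return StepResult(StepStatus.COMPLETED, {"exit_code": 0})


failed = StepResult(StepStatus.FAILED, {"exit_code": 2, "stderr": "boom"})
aborted = StepResult(StepStatus.FAILED, {"aborted": True})
caught = run_guarded(lambda: 1 / 0)
```

Because `run_guarded` converts the exception into a returned `StepStatus.FAILED` result, `should_halt({"continue_on_error": True}, caught)` is `False`. The same exception left unhandled would never reach this check.
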
## Expressions

Workflow definitions use `{{ expression }}` syntax for dynamic values:

```yaml
# Access inputs
args: "{{ inputs.spec }}"

# Access previous step outputs
args: "{{ steps.specify.output.file }}"

# Comparisons
condition: "{{ steps.run-tests.output.exit_code != 0 }}"

# Filters
message: "{{ status | default('pending') }}"
```

Supported filters: `default`, `join`, `contains`, `map`.

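
One way to picture how a dotted reference such as `steps.run-tests.output.exit_code` resolves is a walk through nested dictionaries. The sketch below is a hypothetical illustration, not the engine's expression evaluator: `resolve` and the `scope` layout are assumptions, and its `default` argument only loosely mirrors the `default` filter.

```python
def resolve(path: str, scope: dict, default: object = None) -> object:
    """Follow a dotted path through nested dicts, returning `default` if any key is missing."""
    current: object = scope
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical scope mirroring the namespaces used in the examples above.
scope = {
    "inputs": {"spec": "Build a user authentication system"},
    "steps": {"run-tests": {"output": {"exit_code": 2}}},
}

exit_code = resolve("steps.run-tests.output.exit_code", scope)
tests_failed = exit_code != 0
status = resolve("status", scope, default="pending")
```

Splitting on `.` alone keeps hyphenated step ids such as `run-tests` intact as a single key.
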
### Runtime Context

`{{ context.* }}` exposes engine-managed runtime metadata for the
current run:

| Variable | Description |
|----------|-------------|
| `context.run_id` | The current workflow run id (the same value Spec Kit prints as `Run ID:` at the end of `workflow run`). Auto-generated runs are 8-character hex from `uuid4`; operator-supplied ids may be any alphanumeric string with hyphens or underscores. Empty string outside a run context. |

```yaml
# Stamp telemetry events with the run id for cross-system join.
- id: emit-event
  type: shell
  run: 'echo "{\"run_id\":\"{{ context.run_id }}\",\"event\":\"started\"}" >> events.jsonl'

# Per-run scratch directory.
- id: prep-scratch
  type: shell
  run: 'mkdir -p /tmp/run-{{ context.run_id }}'

# Pass run id into a command for artifact metadata.
- id: tag-artifact
  command: speckit.specify
  input:
    args: "{{ context.run_id }}"
```

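
The two run-id shapes described in the table can be sketched in Python. `new_run_id` and `is_valid_run_id` are hypothetical names, and taking the first eight hex characters of a `uuid4` is an assumption about how the 8-character id is derived.

```python
import re
import uuid

# Letters, digits, hyphens, and underscores; at least one character.
RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def new_run_id() -> str:
    """Generate an 8-character hex id from a random UUID (assumed derivation)."""
    return uuid.uuid4().hex[:8]


def is_valid_run_id(run_id: str) -> bool:
    """Check an operator-supplied id against the allowed character set."""
    return RUN_ID_PATTERN.fullmatch(run_id) is not None


generated = new_run_id()
```

An id that passes this check contains no spaces, slashes, or other shell metacharacters, which matters when it is embedded in a path such as `/tmp/run-{{ context.run_id }}`.
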
## Input Types

Workflow inputs are type-checked and coerced from CLI string values:

```yaml
inputs:
  spec:
    type: string
    required: true
    prompt: "Describe what you want to build"
  task_count:
    type: number
    default: 5
  dry_run:
    type: boolean
    default: false
  scope:
    type: string
    default: "full"
    enum: ["full", "backend-only", "frontend-only"]
```

| Type | Accepts | Example |
|------|---------|---------|
| `string` | Any string | `"user-auth"` |
| `number` | Numeric strings → int/float | `"42"` → `42` |
| `boolean` | `true`/`1`/`yes` → `True`, `false`/`0`/`no` → `False` | `"true"` → `True` |

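
The coercion rules in the table can be approximated in a few lines of Python. This is a sketch under assumptions, not the CLI's code: `coerce_input` is a hypothetical helper, and matching booleans case-insensitively and raising `ValueError` on bad values are choices made for the example.

```python
from typing import Union


def coerce_input(raw: str, type_name: str) -> Union[str, int, float, bool]:
    """Convert a CLI string to the declared input type, following the table above."""
    if type_name == "string":
        return raw
    if type_name == "number":
        try:
            return int(raw)
        except ValueError:
            return float(raw)  # raises ValueError if the string is not numeric
    if type_name == "boolean":
        lowered = raw.strip().lower()  # assumption: case-insensitive matching
        if lowered in {"true", "1", "yes"}:
            return True
        if lowered in {"false", "0", "no"}:
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    raise ValueError(f"unknown input type: {type_name!r}")


task_count = coerce_input("42", "number")
ratio = coerce_input("3.5", "number")
dry_run = coerce_input("yes", "boolean")
```

Trying `int` before `float` keeps whole numbers as integers, matching the `"42"` → `42` row.
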
## State and Resume

Every workflow run persists state to `.specify/workflows/runs/<run_id>/`:

```bash
# List all runs with status
specify workflow status

# Check a specific run
specify workflow status <run_id>

# Resume a paused run (after approving a gate)
specify workflow resume <run_id>

# Resume a failed run (retries from the failed step)
specify workflow resume <run_id>
```

Run states: `created` → `running` → `completed` | `paused` | `failed` | `aborted`

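
The run-state line above, together with the two resume cases, can be read as a small transition table. The Python sketch below is an interpretation rather than the engine's model: the local `RunStatus` enum is a stand-in, and treating `completed` and `aborted` as terminal is an assumption.

```python
from enum import Enum


class RunStatus(Enum):
    CREATED = "created"
    RUNNING = "running"
    COMPLETED = "completed"
    PAUSED = "paused"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed next states for each state.
TRANSITIONS: dict[RunStatus, set[RunStatus]] = {
    RunStatus.CREATED: {RunStatus.RUNNING},
    RunStatus.RUNNING: {
        RunStatus.COMPLETED,
        RunStatus.PAUSED,
        RunStatus.FAILED,
        RunStatus.ABORTED,
    },
    RunStatus.PAUSED: {RunStatus.RUNNING},  # `workflow resume` after a gate
    RunStatus.FAILED: {RunStatus.RUNNING},  # `workflow resume` retries the failed step
    RunStatus.COMPLETED: set(),             # assumed terminal
    RunStatus.ABORTED: set(),               # assumed terminal
}


def can_transition(current: RunStatus, target: RunStatus) -> bool:
    """Return True when moving from `current` to `target` is allowed by the table."""
    return target in TRANSITIONS[current]


def is_resumable(status: RunStatus) -> bool:
    """A run is resumable when it can go back to RUNNING."""
    return status is not RunStatus.CREATED and RunStatus.RUNNING in TRANSITIONS[status]
```
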
## Catalog Management

Workflows are discovered through catalogs. By default, Spec Kit uses the official and community catalogs:

> [!NOTE]
> Community workflows are independently created and maintained by their respective authors. GitHub and the Spec Kit maintainers may review pull requests that add entries to the community catalog for formatting and structure, but they do **not review, audit, endorse, or support the workflow definitions themselves**. Review workflow source before installation and use at your own discretion.

```bash
# List active catalogs
specify workflow catalog list

# Add a custom catalog
specify workflow catalog add https://example.com/catalog.json --name my-org

# Remove a catalog
specify workflow catalog remove <index>
```

## Creating a Workflow

1. Create a `workflow.yml` following the schema above
2. Test locally with `specify workflow run ./workflow.yml --input key=value`
3. Verify with `specify workflow info ./workflow.yml`
4. See [PUBLISHING.md](PUBLISHING.md) to submit to the catalog

## Environment Variables

| Variable | Description |
|----------|-------------|
| `SPECKIT_WORKFLOW_CATALOG_URL` | Override the catalog URL (replaces all defaults) |

## Configuration Files

| File | Scope | Description |
|------|-------|-------------|
| `.specify/workflow-catalogs.yml` | Project | Custom catalog stack for this project |
| `~/.specify/workflow-catalogs.yml` | User | Custom catalog stack for all projects |

## Repository Layout

```
workflows/
├── ARCHITECTURE.md         # Internal architecture documentation
├── PUBLISHING.md           # Guide for submitting workflows to the catalog
├── README.md               # This file
├── catalog.json            # Official workflow catalog
├── catalog.community.json  # Community workflow catalog
└── speckit/                # Built-in SDD cycle workflow
    └── workflow.yml
```