Files
github-spec-kit/workflows
Copilot c52ccd7dc7 Add workflow step catalog — community-installable step types (#2394)
* Initial plan

* Add workflow step catalog: StepRegistry, StepCatalog, CLI commands, and tests

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/2885e646-477d-4df8-b9a3-06d8cb29e748

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Potential fix for pull request finding 'An assert statement has a side-effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Address PR review: path traversal, cache robustness, collision check, failed-to-load display

- Add resolve()+relative_to() path traversal guards in workflow_step_add and
  workflow_step_remove to prevent directory escape via step_id
- Harden _is_url_cache_valid in both StepCatalog and WorkflowCatalog to
  coerce fetched_at to float and catch TypeError/ValueError
- Check STEP_REGISTRY and StepRegistry before installing to prevent
  collisions with built-in step types or already-installed steps
- Show 'Custom (installed, failed to load)' section in workflow step list
  for steps in the registry that failed to load into STEP_REGISTRY

* Fix StepRegistry shape validation and StepCatalog empty-YAML handling

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Polish: rename _default to default_registry, strengthen unreadable-file test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address PR review: atomic install, hostname validation, cache resilience, no dynamic imports in list/info

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Fix shutil.move with existing step_dir: remove before move to avoid subdirectory nesting

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Call load_custom_steps at execution time; enforce hostname in _safe_fetch and _validate_url

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/73865880-fb25-4061-a43e-4e4b4d1c4de6

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Wrap YAML parsing in try/except; atomic step install via os.rename() under same fs

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/ff915bc5-ec7e-4e6a-b505-35f5795250df

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Validate YAML root is a dict in _load_catalog_config and workflow_step_add; fix WorkflowCatalog hostname validation

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Fix load_custom_steps() package imports and add reserved step ID validation

* Move _re/_sys imports out of loop and _RESERVED_STEP_IDS to module level

* Address review: collision-resistant module names, extra_files support, remove orphan dir

* Harden extra_files: warn on non-dict, resolve symlinks in path traversal check

* Switch _safe_fetch and StepCatalog._fetch_single_catalog to use open_url for auth consistency

* Harden step_id validation against path-segment tricks; raise on StepRegistry.save() OSError

* Clean up sys.modules on broken step packages; handle StepValidationError in step add/remove

* Address review thread: int-coerce priorities, sys.modules cleanup, _require_specify_project, registry-first remove

* fix: normalize workflow step catalog metadata fallbacks

* fix: address latest workflow step and catalog review findings

* Handle non-string extra_files keys in workflow step add

* Harden StepRegistry symlink reads and extra_files path/URL validation

* Harden custom step loader and step remove against symlinks and OSError

* Fix StepCatalog.search() to coerce non-string fields before joining

* Fix WorkflowCatalog YAML parsing error handling and isinstance checks

* Harden step registry save and custom step/catalog ID handling

* Harden cache validation and staging OSError handling

* Address review: reorder symlink guard and split mixed test

- Move symlink-parent check before is_dir() in load_custom_steps() so
  we never stat an external target through a symlink
- Split test_get_merged_steps_normalizes_list_ids_to_strings into two
  focused tests: one for list-id normalization, one for get_step_info
  return values

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review: symlink-before-stat in loader, restore registry on rmtree failure

- load_custom_steps(): check is_symlink() before is_dir() on step
  directories so symlinked entries are skipped without statting external
  targets
- workflow_step_remove: restore the registry entry when shutil.rmtree()
  fails so filesystem and registry state stay consistent and a future
  'step add' isn't blocked

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden step_id validation and file-write error handling

- _validate_step_id_or_exit: reject whitespace-only/padded IDs,
  Windows-invalid characters (<>:"|?*), control characters, trailing
  dots/spaces, and Windows reserved device names (con, nul, etc.)
- Wrap step.yml/__init__.py staging writes in OSError handler
- Wrap extra_files disk writes (mkdir + write_bytes) in OSError handler
  that names the failing relative path
- Registry rollback on rmtree failure: restore verbatim metadata and
  emit a warning if the restore itself fails

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review: cache symlink guard, verbatim registry rollback, Windows test fix

- StepCatalog: add _is_cache_path_safe() guard that checks for symlinks
  in .specify/workflows/steps/.cache path; skip cache reads and writes
  when any component is symlinked to prevent writes outside project root
- Registry rollback: write metadata directly to registry.data['steps']
  and call save() instead of using add() which overwrites timestamps
- temp_dir fixture: use ignore_errors=True on Windows to avoid flaky
  teardown from locked file handles (WinError 32)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify exec_module call by removing redundant nested try/except

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix empty YAML tolerance in WorkflowCatalog.add_catalog, scope ignore_errors to Windows

- WorkflowCatalog.add_catalog(): treat None from yaml.safe_load() (empty
  file) as an empty mapping instead of raising 'corrupted'
- temp_dir fixture: limit ignore_errors to sys.platform == 'win32' so
  real cleanup issues surface on non-Windows platforms

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Chain exceptions in _load_catalog_config for both catalog classes

Add 'from exc' to preserve root cause in tracebacks while keeping
clean user-facing messages.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Make default catalog tests hermetic by isolating HOME

Monkeypatch Path.home() to project_dir and clear catalog env vars so
tests don't break on machines with a real ~/.specify/step-catalogs.yml
or ~/.specify/workflow-catalogs.yml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix falsy ID handling in _get_merged_steps for list-based catalogs

Check for None explicitly instead of using 'or' which drops valid
falsy IDs like 0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Compare reserved step IDs case-insensitively for filesystem safety

On case-insensitive filesystems (Windows, common macOS), variants like
STEP-REGISTRY.JSON would collide with the actual registry file.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add explanatory comments to intentional empty except blocks

Document why cache-read failures are silently ignored in both
WorkflowCatalog and StepCatalog _fetch_single_catalog methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-16 18:03:45 -05:00
..

Workflows

Workflows are multi-step, resumable automation pipelines defined in YAML. They orchestrate Spec Kit commands across integrations, evaluate control flow, and pause at human review gates — enabling end-to-end Spec-Driven Development cycles without manual step-by-step invocation.

How It Works

A workflow definition declares a sequence of steps. The engine executes them in order, dispatching commands to AI integrations, running shell commands, evaluating conditions for branching, and pausing at gates for human review. State is persisted after each step, so workflows can be resumed after interruption.

steps:
  - id: specify
    command: speckit.specify
    input:
      args: "{{ inputs.spec }}"

  - id: review
    type: gate
    message: "Review the spec before planning."
    options: [approve, reject]
    on_reject: abort

  - id: plan
    command: speckit.plan

For detailed architecture and internals, see ARCHITECTURE.md.

Quick Start

# Search available workflows
specify workflow search

# Install the built-in SDD workflow
specify workflow add speckit

# Or run directly from a local YAML file
specify workflow run ./workflow.yml --input spec="Build a user authentication system with OAuth support"

# Run an installed workflow with inputs
specify workflow run speckit --input spec="Build a user authentication system with OAuth support"

# Check run status
specify workflow status

# Resume after a gate pause
specify workflow resume <run_id>

# Get detailed workflow info
specify workflow info speckit

# Remove a workflow
specify workflow remove speckit

Running Workflows

From an Installed Workflow

specify workflow add speckit
specify workflow run speckit --input spec="Build a user authentication system with OAuth support"

From a Local YAML File

specify workflow run ./my-workflow.yml --input spec="Build a user authentication system with OAuth support"

Multiple Inputs

specify workflow run speckit \
  --input spec="Build a user authentication system with OAuth support" \
  --input scope="backend-only"

Step Types

Workflows support 10 built-in step types:

Command Steps (default)

Invoke an installed Spec Kit command by name via the integration CLI:

- id: specify
  command: speckit.specify
  input:
    args: "{{ inputs.spec }}"
  integration: claude        # Optional: override workflow default
  model: "claude-sonnet-4-20250514"   # Optional: override model

Prompt Steps

Send an arbitrary inline prompt to an integration CLI (no command file needed):

- id: security-review
  type: prompt
  prompt: "Review {{ inputs.file }} for security vulnerabilities"
  integration: claude

Shell Steps

Run a shell command and capture output:

- id: run-tests
  type: shell
  run: "cd {{ inputs.project_dir }} && npm test"

Gate Steps

Pause for human review. The workflow resumes when specify workflow resume is called:

- id: review-spec
  type: gate
  message: "Review the generated spec before planning."
  options: [approve, edit, reject]
  on_reject: abort

If/Then/Else Steps

Conditional branching based on an expression:

- id: check-scope
  type: if
  condition: "{{ inputs.scope == 'full' }}"
  then:
    - id: full-plan
      command: speckit.plan
  else:
    - id: quick-plan
      command: speckit.plan
      options:
        quick: true

Switch Steps

Multi-branch dispatch on an expression value:

- id: route
  type: switch
  expression: "{{ steps.review.output.choice }}"
  cases:
    approve:
      - id: plan
        command: speckit.plan
    reject:
      - id: log
        type: shell
        run: "echo 'Rejected'"
  default:
    - id: fallback
      type: gate
      message: "Unexpected choice"

While Loop Steps

Repeat steps while a condition is truthy:

- id: retry
  type: while
  condition: "{{ steps.run-tests.output.exit_code != 0 }}"
  max_iterations: 5
  steps:
    - id: fix
      command: speckit.implement

Do-While Loop Steps

Execute steps at least once, then repeat while condition holds:

- id: refine
  type: do-while
  condition: "{{ steps.review.output.choice == 'edit' }}"
  max_iterations: 3
  steps:
    - id: revise
      command: speckit.specify

Fan-Out Steps

Dispatch a step template for each item in a collection (sequential):

- id: parallel-impl
  type: fan-out
  items: "{{ steps.tasks.output.task_list }}"
  max_concurrency: 3
  step:
    id: impl
    command: speckit.implement

Fan-In Steps

Aggregate results from fan-out steps:

- id: collect
  type: fan-in
  wait_for: [parallel-impl]
  output: {}

Error Handling

By default, any step that returns StepResult(status=StepStatus.FAILED, ...) at runtime halts the entire run — most commonly a shell or command step exiting non-zero. Set continue_on_error: true on a step to record its result and continue to the next sibling step instead. When the failure was a non-zero exit, the exit code remains available on steps.<id>.output.exit_code so a downstream if or switch can branch on it (or a gate can surface it to the operator via {{ }} interpolation in message):

- id: heavy-thing
  type: command
  integration: claude
  command: speckit.heavy-thing
  continue_on_error: true

- id: check-result
  type: if
  condition: "{{ steps.heavy-thing.output.exit_code != 0 }}"
  then:
    - id: review
      type: gate
      message: "Step failed (exit {{ steps.heavy-thing.output.exit_code }}). Approve to run the recovery path, or reject to leave the failure recorded and move on."
      on_reject: skip
    - id: recover
      type: if
      condition: "{{ steps.review.output.choice == 'approve' }}"
      then:
        - id: rerun
          command: speckit.recovery
  else:
    - id: next-thing
      command: speckit.next-thing

A few things worth knowing about that example:

  • Both gate options (approve, reject) return StepStatus.COMPLETED; on_reject: skip controls only whether the engine aborts on reject (it doesn't, with skip) — it does not auto-skip subsequent sibling steps in the then: list. Downstream branching is the workflow author's responsibility: read {{ steps.<gate-id>.output.choice }} in a follow-up if, switch, or expression, as the recover step above does.
  • on_reject has three values: abort (default — reject → StepStatus.FAILED with output.aborted = True, halts the run), skip (reject → StepStatus.COMPLETED, author handles branching as shown), and retry (reject → StepStatus.PAUSED so the next specify workflow resume re-runs the gate).
  • Gates do not automatically re-run the failed step. To express a retry path, either define custom gate options and branch on the choice downstream, or wrap the failing step in your own loop.

Notes:

  • The field must be a literal boolean (true / false); coerced strings like "true" are rejected at validation time.
  • Scope: returned failures only. The flag applies to step results with status=StepStatus.FAILED. Unhandled exceptions raised out of a step's execute() method are caught one level up by WorkflowEngine.execute(), logged as workflow_failed, and abort the run regardless of continue_on_error. If a step author wants the flag to cover an exceptional path, the step must catch the exception internally and return StepResult(status=StepStatus.FAILED, ...) with the failure encoded in output (e.g. exit_code, stderr, or a custom field).
  • Gate aborts (on_reject: abort chosen by the operator) always halt the run — continue_on_error does not override them. The flag is for transient/expected step failures, not for overriding deliberate operator decisions.
  • Structural validation runs up-front: specify workflow run rejects invalid workflow definitions before the run is created, so validation failures never reach this code path.
  • When the flag is omitted, behaviour is byte-equivalent to before this feature.

Expressions

Workflow definitions use {{ expression }} syntax for dynamic values:

# Access inputs
args: "{{ inputs.spec }}"

# Access previous step outputs
args: "{{ steps.specify.output.file }}"

# Comparisons
condition: "{{ steps.run-tests.output.exit_code != 0 }}"

# Filters
message: "{{ status | default('pending') }}"

Supported filters: default, join, contains, map.

Runtime Context

{{ context.* }} exposes engine-managed runtime metadata for the current run:

Variable Description
context.run_id The current workflow run id (the same value Spec Kit prints as Run ID: at the end of workflow run). Auto-generated runs are 8-character hex from uuid4; operator-supplied ids may be any alphanumeric string with hyphens or underscores. Empty string outside a run context.
# Stamp telemetry events with the run id for cross-system join.
- id: emit-event
  type: shell
  run: 'echo "{\"run_id\":\"{{ context.run_id }}\",\"event\":\"started\"}" >> events.jsonl'

# Per-run scratch directory.
- id: prep-scratch
  type: shell
  run: 'mkdir -p /tmp/run-{{ context.run_id }}'

# Pass run id into a command for artifact metadata.
- id: tag-artifact
  command: speckit.specify
  input:
    args: "{{ context.run_id }}"

Input Types

Workflow inputs are type-checked and coerced from CLI string values:

inputs:
  spec:
    type: string
    required: true
    prompt: "Describe what you want to build"
  task_count:
    type: number
    default: 5
  dry_run:
    type: boolean
    default: false
  scope:
    type: string
    default: "full"
    enum: ["full", "backend-only", "frontend-only"]
Type Accepts Example
string Any string "user-auth"
number Numeric strings → int/float "42"42
boolean true/1/yesTrue, false/0/noFalse "true"True

State and Resume

Every workflow run persists state to .specify/workflows/runs/<run_id>/:

# List all runs with status
specify workflow status

# Check a specific run
specify workflow status <run_id>

# Resume a paused run (after approving a gate)
specify workflow resume <run_id>

# Resume a failed run (retries from the failed step)
specify workflow resume <run_id>

Run states: createdrunningcompleted | paused | failed | aborted

Catalog Management

Workflows are discovered through catalogs. By default, Spec Kit uses the official and community catalogs:

Note

Community workflows are independently created and maintained by their respective authors. GitHub and the Spec Kit maintainers may review pull requests that add entries to the community catalog for formatting and structure, but they do not review, audit, endorse, or support the workflow definitions themselves. Review workflow source before installation and use at your own discretion.

# List active catalogs
specify workflow catalog list

# Add a custom catalog
specify workflow catalog add https://example.com/catalog.json --name my-org

# Remove a catalog
specify workflow catalog remove <index>

Creating a Workflow

  1. Create a workflow.yml following the schema above
  2. Test locally with specify workflow run ./workflow.yml --input key=value
  3. Verify with specify workflow info ./workflow.yml
  4. See PUBLISHING.md to submit to the catalog

Environment Variables

Variable Description
SPECKIT_WORKFLOW_CATALOG_URL Override the catalog URL (replaces all defaults)

Configuration Files

File Scope Description
.specify/workflow-catalogs.yml Project Custom catalog stack for this project
~/.specify/workflow-catalogs.yml User Custom catalog stack for all projects

Repository Layout

workflows/
├── ARCHITECTURE.md                         # Internal architecture documentation
├── PUBLISHING.md                           # Guide for submitting workflows to the catalog
├── README.md                               # This file
├── catalog.json                            # Official workflow catalog
├── catalog.community.json                  # Community workflow catalog
└── speckit/                                # Built-in SDD cycle workflow
    └── workflow.yml