mirror of https://github.com/github/spec-kit.git synced 2026-07-03 12:28:06 +08:00

Files

Copilot 00bff788c9 Add init workflow step to bootstrap projects like specify init (#2838 )

* Initial plan

* Add init workflow step to bootstrap projects like `specify init`

* Address review: simplify stderr capture and extract VALID_SCRIPT_TYPES

* Address review: fail fast on non-empty dir, stdout fallback, README force fix

* Populate exit_code/stdout/stderr in non-empty-dir fast-fail

* fix: address three unresolved review comments in InitStep

- Use `with os.scandir(...)` context manager so the iterator is always
  closed even when `any()` short-circuits, preventing file-descriptor
  leaks in long-running workflow runs.
- Guard `os.chdir(prev_cwd)` in the `finally` block with a try/except
  so an `OSError` (e.g. directory deleted) doesn't bypass returning
  the captured `StepResult`.
- Reject non-string `script` values in `validate()` with a clear error
  message, rather than silently passing them through to become
  `--script True` at runtime.

* Potential fix for pull request finding 'Empty except'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* fix: remove no_git and branch_numbering options removed upstream

The --no-git and --branch-numbering flags were removed from `specify init`
on main. Update InitStep to drop these unsupported config fields and fix
tests accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address review — integration defaults, integration_options, engine-owned dirs

- Apply DEFAULT_INIT_INTEGRATION fallback when neither step config nor
  workflow context provides an integration, so output.integration always
  reflects the actual integration used.
- Add integration_options config field to support --integration-options
  passthrough (required for generic integration and --skills mode).
- Exclude .specify/ from the non-empty directory fast-fail check so that
  here: true works when the engine has already created its run-state
  directory before steps execute.
- Note: mix_stderr=False is not needed — Click 8.2+ captures stderr
  separately by default and the existing try/except handles access.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: implicitly add --force when only engine-owned dirs exist

When the workflow engine creates .specify/workflows/runs/ before steps
execute, the directory is technically non-empty. Previously, specify init
would prompt for confirmation (hanging in unattended mode) unless the
user explicitly set force: true. Now the step detects that only
engine-owned directories (.specify/) are present and implicitly adds
--force so init proceeds without user interaction.

Also fixes the test to exercise the implicit-force path rather than
passing force: True explicitly (which bypassed the check entirely).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: derive VALID_SCRIPT_TYPES from shared constant, fail fast on OSError, include all resolved fields in output

- Derive VALID_SCRIPT_TYPES from SCRIPT_TYPE_CHOICES in _agent_config
  so the valid set cannot drift from the specify init CLI.
- Fail fast with a clear error when os.scandir() raises OSError (e.g.
  permission denied) instead of silently treating the directory as empty.
- Include preset, force, and ignore_agent_tools in all output dicts
  (both fast-fail and normal paths) for consistent interpolation and
  debugging downstream.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: populate stderr from stdout on older Click, fix force comment wording

- When Click does not expose result.stderr (older versions where stderr
  is mixed into stdout), use stdout as stderr on non-zero exit so
  workflows can consistently read steps.<id>.output.stderr for errors.
- Update README inline comment for force: wording to say 'when target
  directory already exists' rather than 'non-empty directory', matching
  the actual specify init behavior for the project: form.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: build argv flags before early returns, use any() for dir scan

- Move argv flag-building (--integration, --script, --preset,
  --ignore-agent-tools) before the non-empty-dir and OSError early
  returns so output['argv'] always reflects the complete command.
- --force is appended after the check since it may be set implicitly.
- Replace list comprehension with any() generator expression to
  short-circuit without allocating a full list of DirEntry objects.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: only treat .specify as engine-owned when it is a real directory

A file or symlink named .specify should not be excluded from the
non-empty check. Use entry.is_dir(follow_symlinks=False) to ensure
only an actual directory is considered engine-owned content.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: guard implicit force for engine dirs only, fix integration fallback order

- Only set implicit --force when engine-owned directories (.specify/)
  are actually present. A completely empty directory no longer gets
  --force added unnecessarily.
- Fix integration resolution precedence: resolve step config expression
  first, then fall back to workflow default (also resolved), then to
  DEFAULT_INIT_INTEGRATION. Previously, a step expression resolving to
  falsy would bypass the workflow default entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-06-17 11:46:51 -05:00

10 KiB

Raw Blame History

Workflow System Architecture

This document describes the internal architecture of the workflow engine — how definitions are parsed, steps are dispatched, state is persisted, and catalogs are resolved.

For usage instructions, see README.md.

Execution Model

When specify workflow run is invoked, the engine loads a YAML definition, resolves inputs, and dispatches steps sequentially through the step registry:

flowchart TD
    A["specify workflow run my-workflow"] --> B["WorkflowEngine.load_workflow()"]
    B --> C["WorkflowDefinition.from_yaml()"]
    C --> D["_resolve_inputs()"]
    D --> E["validate_workflow()"]
    E --> F["RunState.create()"]
    F --> G["_execute_steps()"]
    G --> H{Step type?}
    H -- command --> I["CommandStep.execute()"]
    H -- shell --> J["ShellStep.execute()"]
    H -- gate --> K["GateStep.execute()"]
    H -- "if" --> L["IfThenStep.execute()"]
    H -- switch --> M["SwitchStep.execute()"]
    H -- "while/do-while" --> N["Loop steps"]
    H -- "fan-out/fan-in" --> O["Fan-out/fan-in"]

    I --> P{Result status?}
    J --> P
    K --> P
    L --> P
    M --> P
    N --> P
    O --> P
    P -- COMPLETED --> Q{Has next_steps?}
    P -- PAUSED --> R["Save state → exit"]
    P -- FAILED --> S["Log error → exit"]
    Q -- Yes --> G
    Q -- No --> T{More steps?}
    T -- Yes --> G
    T -- No --> U["Status = COMPLETED"]

    style R fill:#ff9800,color:#fff
    style S fill:#f44336,color:#fff
    style U fill:#4caf50,color:#fff

Sequential Execution

Steps execute sequentially. Each step receives a StepContext containing resolved inputs, accumulated step results, and workflow-level defaults. After execution, the step's output is stored in context.steps[step_id] and made available to subsequent steps via expressions like {{ steps.specify.output.file }}.

Nested Steps (Control Flow)

Steps like if, switch, while, and do-while return next_steps — inline step definitions that the engine executes recursively via _execute_steps(). Nested steps share the same StepContext and RunState, so their outputs are visible to later top-level steps.

State Persistence and Resume

The engine saves RunState to disk after each step, enabling resume from the exact point of interruption:

flowchart LR
    A["CREATED"] --> B["RUNNING"]
    B --> C["COMPLETED"]
    B --> D["PAUSED"]
    B --> E["FAILED"]
    B --> F["ABORTED"]
    D -- "resume()" --> B
    E -- "resume()" --> B

When a gate step pauses execution, the engine persists current_step_index and all accumulated step_results. On specify workflow resume <run_id>, the engine restores the context and continues from the paused step.

Note: Resume tracking is at the top-level step index only. If a nested step (inside if/switch/while) pauses, resume re-runs the parent control-flow step and its nested body. A nested step-path stack for exact resume is a planned enhancement.

Step Types

The engine ships with 11 built-in step types, each in its own subpackage under src/specify_cli/workflows/steps/:

Type Key	Class	Purpose	Returns `next_steps`?
`command`	`CommandStep`	Invoke an installed Spec Kit command via integration CLI	No
`prompt`	`PromptStep`	Send an arbitrary inline prompt to integration CLI	No
`shell`	`ShellStep`	Run a shell command, capture output	No
`init`	`InitStep`	Bootstrap a project (equivalent to `specify init`)	No
`gate`	`GateStep`	Interactive human review/approval	No (pauses in CI)
`if`	`IfThenStep`	Conditional branching (then/else)	Yes
`switch`	`SwitchStep`	Multi-branch dispatch on expression	Yes
`while`	`WhileStep`	Loop while condition is truthy	Yes (if true)
`do-while`	`DoWhileStep`	Loop, always runs body at least once	Yes (always)
`fan-out`	`FanOutStep`	Dispatch per item over a collection	No (engine expands)
`fan-in`	`FanInStep`	Aggregate results from fan-out	No

Step Registry

All step types register into STEP_REGISTRY via _register_builtin_steps() in src/specify_cli/workflows/__init__.py. The registry maps type_key strings to step instances:

STEP_REGISTRY: dict[str, StepBase]  # e.g., {"command": CommandStep(), "gate": GateStep(), ...}

Registration is explicit — each step class is imported and instantiated. New step types follow the same pattern: subclass StepBase, set type_key, implement execute() and optionally validate().

Expression Engine

Workflow definitions use Jinja2-like {{ expression }} syntax for dynamic values. The expression engine in src/specify_cli/workflows/expressions.py supports:

Feature	Syntax	Example
Variable access	`{{ inputs.name }}`	Dot-path traversal into context
Step outputs	`{{ steps.plan.output.file }}`	Access previous step results
Comparisons	`==`, `!=`, `>`, `<`, `>=`, `<=`	`{{ count > 5 }}`
Boolean logic	`and`, `or`, `not`	`{{ items and status == 'ok' }}`
Membership	`in`, `not in`	`{{ 'error' not in status }}`
Literals	strings, numbers, booleans, lists	`{{ true }}`, `{{ [1, 2] }}`
Filter: `default`	`{{ val \| default('fallback') }}`	Fallback for None/empty
Filter: `join`	`{{ list \| join(', ') }}`	Join list elements
Filter: `contains`	`{{ text \| contains('sub') }}`	Substring/membership check
Filter: `map`	`{{ list \| map('attr') }}`	Extract attribute from each item

Single expressions ({{ expr }} only) return typed values. Mixed templates ("text {{ expr }} more") return interpolated strings.

Namespace

The expression evaluator builds a namespace from the StepContext:

Key	Source	Available when
`inputs`	Resolved workflow inputs	Always
`steps`	Accumulated step results	After first step
`item`	Current iteration item	Inside fan-out
`fan_in`	Aggregated results	Inside fan-in

Input Resolution

When a workflow is executed, _resolve_inputs() validates and coerces provided values against the inputs: schema:

Declared Type	Coercion	Example
`string`	None (pass-through)	`"my-feature"`
`number`	`float()` → `int()` if whole	`"42"` → `42`
`boolean`	`"true"/"1"/"yes"` → `True`	`"false"` → `False`
`enum`	Validates against allowed values	`["full", "backend-only"]`

Missing required inputs raise ValueError. Inputs with default values use the default when not provided.

Catalog System

flowchart TD
    A["specify workflow search"] --> B["WorkflowCatalog.get_active_catalogs()"]
    B --> C{SPECKIT_WORKFLOW_CATALOG_URL set?}
    C -- Yes --> D["Single custom catalog"]
    C -- No --> E{.specify/workflow-catalogs.yml exists?}
    E -- Yes --> F["Project-level catalog stack"]
    E -- No --> G{"~/.specify/workflow-catalogs.yml exists?"}
    G -- Yes --> H["User-level catalog stack"]
    G -- No --> I["Built-in defaults"]
    I --> J["default (install allowed)"]
    I --> K["community (discovery only)"]

    style D fill:#ff9800,color:#fff
    style F fill:#2196f3,color:#fff
    style H fill:#2196f3,color:#fff
    style J fill:#4caf50,color:#fff
    style K fill:#9e9e9e,color:#fff

Catalogs are fetched with a 1-hour cache (per-URL, SHA256-hashed cache files in .specify/workflows/.cache/). Each catalog entry has a priority (for merge ordering) and install_allowed flag.

When specify workflow add <id> installs from catalog, it downloads the workflow YAML from the catalog entry's url field into .specify/workflows/<id>/workflow.yml.

State and Configuration Locations

Component	Location	Format	Purpose
Workflow definitions	`.specify/workflows/{id}/workflow.yml`	YAML	Installed workflow definitions
Workflow registry	`.specify/workflows/workflow-registry.json`	JSON	Installed workflows metadata
Run state	`.specify/workflows/runs/{run_id}/state.json`	JSON	Persisted execution state
Run inputs	`.specify/workflows/runs/{run_id}/inputs.json`	JSON	Resolved input values
Run log	`.specify/workflows/runs/{run_id}/log.jsonl`	JSONL	Append-only event log
Catalog cache	`.specify/workflows/.cache/*.json`	JSON	Cached catalog entries (1hr TTL)
Project catalogs	`.specify/workflow-catalogs.yml`	YAML	Project-level catalog sources
User catalogs	`~/.specify/workflow-catalogs.yml`	YAML	User-level catalog sources

Module Structure

src/specify_cli/
├── workflows/
│   ├── __init__.py          # STEP_REGISTRY + _register_builtin_steps()
│   ├── base.py              # StepBase, StepContext, StepResult, StepStatus, RunStatus
│   ├── catalog.py           # WorkflowCatalog, WorkflowCatalogEntry, WorkflowRegistry
│   ├── engine.py            # WorkflowDefinition, WorkflowEngine, RunState, validate_workflow()
│   ├── expressions.py       # evaluate_expression(), evaluate_condition(), filters
│   └── steps/
│       ├── command/         # Dispatch command to AI integration
│       ├── shell/           # Run shell command
│       ├── init/            # Bootstrap a project (specify init)
│       ├── gate/            # Human review checkpoint
│       ├── if_then/         # Conditional branching
│       ├── prompt/          # Arbitrary inline prompts
│       ├── switch/          # Multi-branch dispatch
│       ├── while_loop/      # While loop
│       ├── do_while/        # Do-while loop
│       ├── fan_out/         # Sequential per-item dispatch
│       └── fan_in/          # Result aggregation
└── __init__.py              # CLI commands: specify workflow run/resume/status/
                             #   list/add/remove/search/info,
                             #   specify workflow catalog list/add/remove

10 KiB Raw Blame History