mirror of
https://github.com/github/spec-kit.git
synced 2026-07-03 12:28:06 +08:00
# Workflow System Architecture

This document describes the internal architecture of the workflow engine: how definitions are parsed, steps are dispatched, state is persisted, and catalogs are resolved.

For usage instructions, see [README.md](README.md).
## Execution Model

When `specify workflow run` is invoked, the engine loads a YAML definition, resolves inputs, and dispatches steps sequentially through the step registry:

```mermaid
flowchart TD
    A["specify workflow run my-workflow"] --> B["WorkflowEngine.load_workflow()"]
    B --> C["WorkflowDefinition.from_yaml()"]
    C --> D["_resolve_inputs()"]
    D --> E["validate_workflow()"]
    E --> F["RunState.create()"]
    F --> G["_execute_steps()"]
    G --> H{Step type?}
    H -- command --> I["CommandStep.execute()"]
    H -- shell --> J["ShellStep.execute()"]
    H -- gate --> K["GateStep.execute()"]
    H -- "if" --> L["IfThenStep.execute()"]
    H -- switch --> M["SwitchStep.execute()"]
    H -- "while/do-while" --> N["Loop steps"]
    H -- "fan-out/fan-in" --> O["Fan-out/fan-in"]

    I --> P{Result status?}
    J --> P
    K --> P
    L --> P
    M --> P
    N --> P
    O --> P
    P -- COMPLETED --> Q{Has next_steps?}
    P -- PAUSED --> R["Save state → exit"]
    P -- FAILED --> S["Log error → exit"]
    Q -- Yes --> G
    Q -- No --> T{More steps?}
    T -- Yes --> G
    T -- No --> U["Status = COMPLETED"]

    style R fill:#ff9800,color:#fff
    style S fill:#f44336,color:#fff
    style U fill:#4caf50,color:#fff
```
### Sequential Execution

Steps execute sequentially. Each step receives a `StepContext` containing resolved inputs, accumulated step results, and workflow-level defaults. After execution, the step's output is stored in `context.steps[step_id]` and made available to subsequent steps via expressions like `{{ steps.specify.output.file }}`.
### Nested Steps (Control Flow)

Steps like `if`, `switch`, `while`, and `do-while` return `next_steps`: inline step definitions that the engine executes recursively via `_execute_steps()`. Nested steps share the same `StepContext` and `RunState`, so their outputs are visible to later top-level steps.
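The sequential and nested execution described above can be sketched roughly as follows. This is a minimal illustration, not the engine's code: `StepResult`, `StepContext`, the `HANDLERS` table, the `echo` step type, and `execute_steps()` are simplified, hypothetical stand-ins for the real classes and for `_execute_steps()`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class StepResult:
    """Hypothetical stand-in for the engine's step result."""
    output: dict[str, Any] = field(default_factory=dict)
    next_steps: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class StepContext:
    """Hypothetical stand-in: resolved inputs plus accumulated step results."""
    inputs: dict[str, Any]
    steps: dict[str, dict[str, Any]] = field(default_factory=dict)


def run_echo(config: dict[str, Any], ctx: StepContext) -> StepResult:
    # Leaf step (illustrative only): produces output, no nested steps.
    return StepResult(output={"text": config["text"]})


def run_if(config: dict[str, Any], ctx: StepContext) -> StepResult:
    # Control-flow step: hands the chosen branch back as next_steps.
    branch = config["then"] if config["condition"] else config.get("else", [])
    return StepResult(next_steps=branch)


HANDLERS: dict[str, Callable[[dict[str, Any], StepContext], StepResult]] = {
    "echo": run_echo,
    "if": run_if,
}


def execute_steps(steps: list[dict[str, Any]], ctx: StepContext) -> None:
    for step in steps:
        result = HANDLERS[step["type"]](step, ctx)
        # Store the output so later steps can reference it by step ID.
        ctx.steps[step["id"]] = {"output": result.output}
        if result.next_steps:
            # Nested steps run recursively against the same shared context.
            execute_steps(result.next_steps, ctx)


workflow = [
    {"id": "check", "type": "if", "condition": True,
     "then": [{"id": "inner", "type": "echo", "text": "nested"}]},
    {"id": "after", "type": "echo", "text": "top-level"},
]
ctx = StepContext(inputs={})
execute_steps(workflow, ctx)
```

Because the nested `inner` step writes into the same `ctx.steps` mapping, the later top-level step `after` could reference its output just like that of any other step.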
### State Persistence and Resume

The engine saves `RunState` to disk after each step, so an interrupted run can resume from the top-level step at which it stopped:

```mermaid
flowchart LR
    A["CREATED"] --> B["RUNNING"]
    B --> C["COMPLETED"]
    B --> D["PAUSED"]
    B --> E["FAILED"]
    B --> F["ABORTED"]
    D -- "resume()" --> B
    E -- "resume()" --> B
```
When a `gate` step pauses execution, the engine persists `current_step_index` and all accumulated `step_results`. On `specify workflow resume <run_id>`, the engine restores the context and continues from the paused step.

> **Note:** Resume tracking is at the top-level step index only. If a
> nested step (inside `if`/`switch`/`while`) pauses, resume re-runs
> the parent control-flow step and its nested body. A nested step-path
> stack for exact resume is a planned enhancement.
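A rough sketch of the persist-and-resume cycle is shown below. The field names `current_step_index` and `step_results` and the `runs/{run_id}/state.json` layout come from this document; the dataclass shape, the `save()`/`load()` signatures, and the step IDs are assumptions made for illustration, and the real `RunState` carries more data.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class RunState:
    """Hypothetical, simplified state record (not the real RunState)."""
    run_id: str
    status: str = "CREATED"
    current_step_index: int = 0
    step_results: dict[str, Any] = field(default_factory=dict)

    def save(self, runs_dir: Path) -> Path:
        # Assumed layout: <runs_dir>/<run_id>/state.json
        path = runs_dir / self.run_id / "state.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")
        return path

    @classmethod
    def load(cls, runs_dir: Path, run_id: str) -> "RunState":
        raw = (runs_dir / run_id / "state.json").read_text(encoding="utf-8")
        return cls(**json.loads(raw))


# Simulate a run that completes one step, then pauses at a gate (index 1).
runs_dir = Path(tempfile.mkdtemp())
step_ids = ["specify", "review-gate", "plan"]  # illustrative step IDs

state = RunState(run_id="demo-run", status="RUNNING")
state.step_results["specify"] = {"output": {"file": "spec.md"}}
state.current_step_index = 1
state.status = "PAUSED"
state.save(runs_dir)

# A later process restores the state and continues from the paused step.
restored = RunState.load(runs_dir, "demo-run")
remaining = step_ids[restored.current_step_index:]
```

Since only the top-level index is stored in this sketch, it shares the limitation from the note above: a pause inside a nested body would resume at the parent step.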
## Step Types

The engine ships with 10 built-in step types, each in its own subpackage under `src/specify_cli/workflows/steps/`:

| Type Key | Class | Purpose | Returns `next_steps`? |
|----------|-------|---------|-----------------------|
| `command` | `CommandStep` | Invoke an installed Spec Kit command via integration CLI | No |
| `prompt` | `PromptStep` | Send an arbitrary inline prompt to integration CLI | No |
| `shell` | `ShellStep` | Run a shell command, capture output | No |
| `gate` | `GateStep` | Interactive human review/approval | No (pauses in CI) |
| `if` | `IfThenStep` | Conditional branching (then/else) | Yes |
| `switch` | `SwitchStep` | Multi-branch dispatch on expression | Yes |
| `while` | `WhileStep` | Loop while condition is truthy | Yes (if true) |
| `do-while` | `DoWhileStep` | Loop, always runs body at least once | Yes (always) |
| `fan-out` | `FanOutStep` | Dispatch per item over a collection | No (engine expands) |
| `fan-in` | `FanInStep` | Aggregate results from fan-out | No |
## Step Registry

All step types register into `STEP_REGISTRY` via `_register_builtin_steps()` in `src/specify_cli/workflows/__init__.py`. The registry maps `type_key` strings to step instances:

```python
STEP_REGISTRY: dict[str, StepBase]  # e.g., {"command": CommandStep(), "gate": GateStep(), ...}
```
Registration is explicit — each step class is imported and instantiated. New step types follow the same pattern: subclass `StepBase`, set `type_key`, implement `execute()` and optionally `validate()`.
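As an illustration of that pattern, the sketch below defines a toy step type and registers it. Everything here is hypothetical: the `StepBase` shown is a stand-in with an assumed method signature (the real base class lives in `base.py` and returns a `StepResult`), and `EchoStep` and `register_step()` do not exist in the codebase.

```python
from typing import Any


class StepBase:
    """Hypothetical stand-in for the real StepBase; signatures are assumed."""
    type_key: str = ""

    def execute(self, config: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
        raise NotImplementedError

    def validate(self, config: dict[str, Any]) -> list[str]:
        return []  # default: no validation errors


class EchoStep(StepBase):
    """Illustrative step type that echoes a configured message."""
    type_key = "echo"

    def execute(self, config: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
        return {"status": "completed", "output": {"message": config["message"]}}

    def validate(self, config: dict[str, Any]) -> list[str]:
        return [] if "message" in config else ["echo step requires 'message'"]


# The registry maps type_key strings to step instances.
STEP_REGISTRY: dict[str, StepBase] = {}


def register_step(step: StepBase) -> None:
    # Hypothetical helper: refuse to silently replace an existing type.
    if step.type_key in STEP_REGISTRY:
        raise KeyError(f"step type already registered: {step.type_key}")
    STEP_REGISTRY[step.type_key] = step


register_step(EchoStep())

# Dispatch by type key, the way an engine would look up a step.
step = STEP_REGISTRY["echo"]
errors = step.validate({})
result = step.execute({"message": "hi"}, context={})
```

Keeping one instance per type key means steps should hold no per-run state; anything run-specific would travel through the context argument instead.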
## Expression Engine

Workflow definitions use Jinja2-like `{{ expression }}` syntax for dynamic values. The expression engine in `src/specify_cli/workflows/expressions.py` supports:

| Feature | Syntax | Example or description |
|---------|--------|------------------------|
| Variable access | `{{ inputs.name }}` | Dot-path traversal into context |
| Step outputs | `{{ steps.plan.output.file }}` | Access previous step results |
| Comparisons | `==`, `!=`, `>`, `<`, `>=`, `<=` | `{{ count > 5 }}` |
| Boolean logic | `and`, `or`, `not` | `{{ items and status == 'ok' }}` |
| Membership | `in`, `not in` | `{{ 'error' not in status }}` |
| Literals | strings, numbers, booleans, lists | `{{ true }}`, `{{ [1, 2] }}` |
| Filter: `default` | `{{ val \| default('fallback') }}` | Fallback for None/empty |
| Filter: `join` | `{{ list \| join(', ') }}` | Join list elements |
| Filter: `contains` | `{{ text \| contains('sub') }}` | Substring/membership check |
| Filter: `map` | `{{ list \| map('attr') }}` | Extract attribute from each item |

**Single expressions** (`{{ expr }}` only) return typed values. **Mixed templates** (`"text {{ expr }} more"`) return interpolated strings.
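The distinction between single expressions and mixed templates can be sketched as below. This is a deliberately reduced illustration that handles only dot-path lookups (no operators or filters); `render()` and `lookup()` are hypothetical names, not functions from `expressions.py`.

```python
import re
from typing import Any

# Matches one {{ ... }} placeholder; the body may not contain braces.
_EXPR = re.compile(r"\{\{\s*([^{}]*?)\s*\}\}")


def lookup(path: str, namespace: dict[str, Any]) -> Any:
    """Resolve a dot-path such as 'inputs.count' against the namespace."""
    value: Any = namespace
    for key in path.split("."):
        value = value[key]
    return value


def render(template: str, namespace: dict[str, Any]) -> Any:
    whole = _EXPR.fullmatch(template.strip())
    if whole:
        # Single expression: return the value with its original type.
        return lookup(whole.group(1), namespace)
    # Mixed template: substitute every placeholder as text.
    return _EXPR.sub(lambda m: str(lookup(m.group(1), namespace)), template)


namespace = {"inputs": {"count": 3, "name": "auth"}}

typed = render("{{ inputs.count }}", namespace)
mixed = render("feature-{{ inputs.name }}-v{{ inputs.count }}", namespace)
```

Here `typed` stays an `int`, so it can feed a numeric comparison in a later condition, while `mixed` is always a string.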
### Namespace

The expression evaluator builds a namespace from the `StepContext`:

| Key | Source | Available when |
|-----|--------|----------------|
| `inputs` | Resolved workflow inputs | Always |
| `steps` | Accumulated step results | After first step |
| `item` | Current iteration item | Inside fan-out |
| `fan_in` | Aggregated results | Inside fan-in |
## Input Resolution

When a workflow is executed, `_resolve_inputs()` validates and coerces provided values against the `inputs:` schema:

| Declared Type | Coercion | Example |
|---------------|----------|---------|
| `string` | None (pass-through) | `"my-feature"` |
| `number` | `float()` → `int()` if whole | `"42"` → `42` |
| `boolean` | `"true"/"1"/"yes"` → `True` | `"false"` → `False` |
| `enum` | Validates against allowed values | `["full", "backend-only"]` |

Missing required inputs raise `ValueError`. Inputs with `default` values use the default when not provided.
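The coercion rules in the table might be implemented along these lines. This is a sketch under assumptions: the schema keys `required` and `enum`, the treatment of every other string as `False` for booleans, and the function names are illustrative and may differ from `_resolve_inputs()`.

```python
from typing import Any


def coerce_input(name: str, spec: dict[str, Any], raw: Any) -> Any:
    kind = spec.get("type", "string")
    if kind == "number":
        number = float(raw)
        # Whole numbers collapse to int, e.g. "42" -> 42.
        return int(number) if number.is_integer() else number
    if kind == "boolean":
        if isinstance(raw, bool):
            return raw
        # Assumption: anything outside this set counts as False.
        return str(raw).strip().lower() in {"true", "1", "yes"}
    if kind == "enum":
        if raw not in spec["enum"]:  # "enum" key name is an assumption
            raise ValueError(f"Input {name!r} must be one of {spec['enum']}")
        return raw
    return raw  # string: pass-through


def resolve_inputs(schema: dict[str, dict[str, Any]],
                   provided: dict[str, Any]) -> dict[str, Any]:
    resolved: dict[str, Any] = {}
    for name, spec in schema.items():
        if name in provided:
            resolved[name] = coerce_input(name, spec, provided[name])
        elif "default" in spec:
            resolved[name] = spec["default"]
        elif spec.get("required", False):
            raise ValueError(f"Missing required input: {name}")
    return resolved


schema = {
    "feature": {"type": "string", "required": True},
    "count": {"type": "number", "default": 1},
    "dry_run": {"type": "boolean", "default": False},
    "scope": {"type": "enum", "enum": ["full", "backend-only"], "default": "full"},
}
resolved = resolve_inputs(
    schema, {"feature": "my-feature", "count": "42", "dry_run": "yes"}
)
```

With the values above, `count` arrives as the string `"42"` and leaves as the integer `42`, and `scope` falls back to its default because it was not provided.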
## Catalog System

```mermaid
flowchart TD
    A["specify workflow search"] --> B["WorkflowCatalog.get_active_catalogs()"]
    B --> C{SPECKIT_WORKFLOW_CATALOG_URL set?}
    C -- Yes --> D["Single custom catalog"]
    C -- No --> E{".specify/workflow-catalogs.yml exists?"}
    E -- Yes --> F["Project-level catalog stack"]
    E -- No --> G{"~/.specify/workflow-catalogs.yml exists?"}
    G -- Yes --> H["User-level catalog stack"]
    G -- No --> I["Built-in defaults"]
    I --> J["default (install allowed)"]
    I --> K["community (discovery only)"]

    style D fill:#ff9800,color:#fff
    style F fill:#2196f3,color:#fff
    style H fill:#2196f3,color:#fff
    style J fill:#4caf50,color:#fff
    style K fill:#9e9e9e,color:#fff
```
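The precedence shown in the diagram can be expressed as a short chain of checks. The sketch below only reports which source would win; `active_catalog_source()` and its return labels are hypothetical, and the real `get_active_catalogs()` also parses the selected configuration.

```python
import tempfile
from pathlib import Path


def active_catalog_source(project_root: Path, home: Path, env: dict[str, str]) -> str:
    """Return which catalog source wins, following the diagram's order."""
    if env.get("SPECKIT_WORKFLOW_CATALOG_URL"):
        return "env"  # single custom catalog
    if (project_root / ".specify" / "workflow-catalogs.yml").exists():
        return "project"  # project-level catalog stack
    if (home / ".specify" / "workflow-catalogs.yml").exists():
        return "user"  # user-level catalog stack
    return "builtin"  # built-in defaults


# Exercise each branch using throwaway directories.
root = Path(tempfile.mkdtemp())
project, home = root / "project", root / "home"
(project / ".specify").mkdir(parents=True)
(home / ".specify").mkdir(parents=True)

no_config = active_catalog_source(project, home, env={})

(home / ".specify" / "workflow-catalogs.yml").write_text("# placeholder\n")
user_only = active_catalog_source(project, home, env={})

(project / ".specify" / "workflow-catalogs.yml").write_text("# placeholder\n")
with_project = active_catalog_source(project, home, env={})

with_env = active_catalog_source(
    project, home, env={"SPECKIT_WORKFLOW_CATALOG_URL": "https://example.com/catalog.json"}
)
```

Passing `env` and `home` as arguments rather than reading `os.environ` and `Path.home()` directly is a choice made for this sketch so each branch can be exercised in isolation.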
Catalogs are fetched with a 1-hour cache (per-URL, SHA256-hashed cache files in `.specify/workflows/.cache/`). Each catalog entry has a `priority` (used for merge ordering; a lower number wins) and an `install_allowed` flag.

When `specify workflow add <id>` installs from a catalog, it downloads the workflow YAML from the catalog entry's `url` field into `.specify/workflows/<id>/workflow.yml`.
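To make the catalog cache concrete, the sketch below derives a per-URL cache file name from a SHA256 digest and applies a one-hour freshness check. The exact file naming (full hex digest plus `.json`), the use of file modification time for the TTL, and the function names are assumptions for illustration; the URL is a placeholder.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path
from typing import Any, Optional

CACHE_TTL_SECONDS = 3600  # 1-hour cache


def cache_path(cache_dir: Path, url: str) -> Path:
    # One cache file per catalog URL, named by the URL's SHA256 digest.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.json"


def write_cache(cache_dir: Path, url: str, payload: dict[str, Any]) -> None:
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_path(cache_dir, url).write_text(json.dumps(payload), encoding="utf-8")


def read_cache(cache_dir: Path, url: str,
               now: Optional[float] = None) -> Optional[dict[str, Any]]:
    path = cache_path(cache_dir, url)
    if not path.exists():
        return None
    current = time.time() if now is None else now
    if current - path.stat().st_mtime > CACHE_TTL_SECONDS:
        return None  # stale: caller should refetch
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None  # corrupted cache is treated as a miss


cache_dir = Path(tempfile.mkdtemp()) / ".cache"
url = "https://example.com/catalog.json"  # placeholder URL

write_cache(cache_dir, url, {"workflows": {}})
fresh = read_cache(cache_dir, url)
stale = read_cache(cache_dir, url, now=time.time() + 2 * CACHE_TTL_SECONDS)
```

Hashing the URL keeps file names filesystem-safe and gives each catalog its own cache entry, so changing one catalog URL does not invalidate the others.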
## State and Configuration Locations

| Component | Location | Format | Purpose |
|-----------|----------|--------|---------|
| Workflow definitions | `.specify/workflows/{id}/workflow.yml` | YAML | Installed workflow definitions |
| Workflow registry | `.specify/workflows/workflow-registry.json` | JSON | Installed workflows metadata |
| Run state | `.specify/workflows/runs/{run_id}/state.json` | JSON | Persisted execution state |
| Run inputs | `.specify/workflows/runs/{run_id}/inputs.json` | JSON | Resolved input values |
| Run log | `.specify/workflows/runs/{run_id}/log.jsonl` | JSONL | Append-only event log |
| Catalog cache | `.specify/workflows/.cache/*.json` | JSON | Cached catalog entries (1hr TTL) |
| Project catalogs | `.specify/workflow-catalogs.yml` | YAML | Project-level catalog sources |
| User catalogs | `~/.specify/workflow-catalogs.yml` | YAML | User-level catalog sources |
## Module Structure

```
src/specify_cli/
├── workflows/
│   ├── __init__.py          # STEP_REGISTRY + _register_builtin_steps()
│   ├── base.py              # StepBase, StepContext, StepResult, StepStatus, RunStatus
│   ├── catalog.py           # WorkflowCatalog, WorkflowCatalogEntry, WorkflowRegistry
│   ├── engine.py            # WorkflowDefinition, WorkflowEngine, RunState, validate_workflow()
│   ├── expressions.py       # evaluate_expression(), evaluate_condition(), filters
│   └── steps/
│       ├── command/         # Dispatch command to AI integration
│       ├── shell/           # Run shell command
│       ├── gate/            # Human review checkpoint
│       ├── if_then/         # Conditional branching
│       ├── prompt/          # Arbitrary inline prompts
│       ├── switch/          # Multi-branch dispatch
│       ├── while_loop/      # While loop
│       ├── do_while/        # Do-while loop
│       ├── fan_out/         # Sequential per-item dispatch
│       └── fan_in/          # Result aggregation
└── __init__.py              # CLI commands: specify workflow run/resume/status/
                             #   list/add/remove/search/info,
                             #   specify workflow catalog list/add/remove
```