Files
github-spec-kit/src/specify_cli/workflows/engine.py
Copilot a00e679918 Add workflow engine with catalog system (#2158)
* Initial plan

* Add workflow engine with step registry, expression engine, catalog system, and CLI commands

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Add comprehensive tests for workflow engine (94 tests)

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address review feedback: do-while condition preservation and URL scheme validation

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address review feedback, add CLI dispatch, interactive gates, and docs

Review comments (7/7):
- Add explanatory comment to empty except block
- Implement workflow catalog download with cleanup on failure
- Add input type coercion for number/boolean/enum
- Fix example workflow to remove non-existent output references
- Fix while_loop and if_then condition defaults (string 'false' → bool False)
- Fix resume step index tracking with step_offset parameter

CLI dispatch:
- Add build_exec_args() and dispatch_command() to IntegrationBase
- Override for Claude (skills: /speckit-specify), Gemini (-m flag),
  Codex (codex exec), Copilot (--agent speckit.specify)
- CommandStep invokes installed commands by name via integration CLI
- Add PromptStep for arbitrary inline prompts (10th step type)
- Stream CLI output live to terminal (no silent blocking)
- Remove timeout when streaming (user can Ctrl+C)
- Ctrl+C saves state as PAUSED for clean resume

Interactive gates:
- Gate steps prompt [1] approve [2] reject in TTY
- Fall back to PAUSED in non-interactive environments
- Resume re-executes the gate for interactive prompting

Documentation:
- workflows/README.md — user guide
- workflows/ARCHITECTURE.md — internals with Mermaid diagrams
- workflows/PUBLISHING.md — catalog submission guide

Tests: 94 → 122 workflow tests, 1362 total (all passing)

* Fix ruff lint errors: unused imports, f-string placeholders, undefined name

* Address second review: registry-backed validation, shell failures, loop/fan-out execution, URL validation

- VALID_STEP_TYPES now queries STEP_REGISTRY dynamically
- Shell step returns FAILED on non-zero exit code
- Persist workflow YAML in run directory for reliable resume
- Resume loads from run copy, falls back to installed workflow
- Engine iterates while/do-while loops up to max_iterations
- Engine expands fan-out per item with context.item
- HTTPS URL validation for catalog workflow installs (HTTP allowed for localhost)
- Fix catalog merge priority docstring (lower number wins)
- Fix dispatch_command docstring (no build_exec_args_for_command)
- Gate on_reject=retry pauses for re-prompt on resume
- Update docs to 10 step types, add prompt step to tables and README

* Potential fix for pull request finding 'Empty except'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Address third review: fan-out IDs, catalog guards, shell coercion, docs

- Fan-out generates unique per-item step IDs and collects results
- Catalog merge skips non-dict workflow entries (malformed data guard)
- Shell step coerces run_cmd to str after expression evaluation
- urlopen timeout=30 for catalog workflow installs
- yaml.dump with sort_keys=False, allow_unicode=True for catalog configs
- Document streaming timeout as intentionally unbounded (user Ctrl+C)
- Document --allow-all-tools as required for non-interactive + future enhancement
- Update test docstring and PUBLISHING.md to 10 step types with prompt

* Validate final URL after redirects in catalog fetch

urlopen follows redirects, so validate the response URL against the
same HTTPS/localhost rules to prevent redirect-based downgrade attacks.

* Address fourth review: filter arg eval, tags normalization, install redirect check

- Filter arguments now evaluated via _evaluate_simple_expression() so
  default(42) returns int not string
- Tags normalized: non-list/non-string values handled gracefully
- Install URL redirect validation (same as catalog fetch)
- Remove unused 'skipped' variable in catalog config parsing
- Author 'github' → 'GitHub' in example workflow
- Document nested step resume limitation (re-runs parent step)

* Add explanatory comment to empty except ValueError block

* Address fifth review: expression parsing, fan-out output, URL install, gate options

- Move string literal parsing before operator detection in expressions
  so quoted strings with operators (e.g. 'a in b') are not mis-parsed
- Fan-out: remove max_concurrency from persisted output, fix docstring
  to reflect sequential execution
- workflow add: support URL sources with HTTPS/redirect validation,
  validate workflow ID is non-empty before writing files
- Deduplicate local install logic via _validate_and_install_local()
- Remove 'edit' gate option from speckit workflow (not implemented)

* Add comments to empty except ValueError blocks in URL install

* Address sixth review: operator precedence, fan_in cleanup, registry resilience, docs

- Fix or/and operator precedence (or parsed first = lower precedence)
- Restore context.fan_in after fan-in step completes
- Catch JSONDecodeError in registry load for corrupted files
- Replace print() with on_step_start callback (library-safe)
- Gate validation warns when on_reject set but no reject option
- Shell step: document shell=True security tradeoff
- README: sdd-pipeline → speckit, parallel → sequential for fan-out
- ARCHITECTURE.md: parallel → fan-out/fan-in in diagram

* Address seventh review: string literal before pipe, type annotations, validate on install

- Move string literal check above pipe filter parsing so 'a | b' works
- Fix type annotations: input_values list[str] | None, run_id str | None
- Run validate_workflow() before installing from local path/URL
- Remove duplicate string literal check from expression parser

* Address eighth review: fan-out namespaced IDs, early return, catalog validation

- Fan-out per-item step IDs use _fanout_{step_id}_{base}_{idx} namespace
  to avoid collisions with user-defined step IDs
- Early return after fan-out loop when state is paused/failed/aborted
- Catalog installs parse + validate downloaded YAML before registering,
  using definition metadata instead of catalog entry for registry

* Address ninth review: populate catalog, fix indentation, priority, README

- Add speckit workflow entry to catalog.json so it's discoverable
- Fix shell step output dict indentation
- Catalog add_catalog priority derived from max existing + 1
- README Quick Start clarified with install + local file examples

* Address tenth review: max_iterations validation, catalog config guard, version alignment

- Validate max_iterations is int >= 1 in while and do-while steps
- Guard add_catalog against corrupted config (non-dict/non-list)
- Align speckit_version requirement to >=0.6.1 (current package version)
- Fan-out template validation uses separate seen_ids set to avoid
  false duplication errors with user-defined step IDs

* Address eleventh review: command step fails without CLI, ID mismatch warning, state persistence

- Command step returns FAILED when CLI not installed (was silent COMPLETED)
- Catalog install warns on workflow ID vs catalog key mismatch
- Engine persists state.save() before returning on unknown step type
- Update tests to expect FAILED for command steps without CLI
- Integration tests use shell steps for CLI-independent execution

* Address twelfth review: type annotations, version examples, streaming docs, requires

- Fix workflow_search type annotations (str | None)
- PUBLISHING.md: speckit_version >=0.15.0 → >=0.6.1
- Document that exit_code is captured and referenceable by later steps
- Mark requires as declared-but-not-enforced (planned enhancement)
- Note full stdout/stderr capture as planned enhancement

* Enforce catalog key matches workflow ID (fail instead of warn)

* Bundle speckit workflow: auto-install during specify init

- Add workflows/speckit to pyproject.toml force-include for wheel builds
- Add _locate_bundled_workflow() helper (mirrors _locate_bundled_extension)
- Auto-install speckit workflow during specify init (after git extension)
- Update all integration file inventory tests to expect workflow files

* Address fourteenth review: prompt fails without CLI, resolved step data, fan-out normalization

- PromptStep returns FAILED when CLI not installed (was silent COMPLETED)
- Engine step_data prefers resolved values from step output
- Fan-out normalizes output.results=[] for empty item lists
- subprocess.run inherits stdout/stderr (no explicit sys.stdout)
- Registry tests use issubset for extensibility

* Address fifteenth review: fan_in docstring, gate defaults, validation guards, reserved prefix

- FanInStep docstring: aggregate-only, no blocking semantics
- FanInStep: guard output_config as dict, handle None
- Gate validate: use same default options as execute
- Validate inputs is dict and steps is list before iterating
- Reserve _fanout_ prefix in step ID validation
- PUBLISHING.md: remove unenforced checklist items, add _fanout_ note

* Address sixteenth review: docs regex, fan_in try/finally, hyphenated dot-path keys

- PUBLISHING.md: update ID regex docs to match implementation (single-char OK)
- FanInStep: wrap expression evaluation in try/finally for context.fan_in
- Expression dot-path: allow hyphens in keys before list index (e.g. run-tests[0])

* Make speckit workflow integration-agnostic, document Copilot CLI requirement

- Workflow integration selectable via input (default: claude)
- Each command step uses {{ inputs.integration }} instead of hardcoded copilot
- Copilot docstring documents CLI requirement for workflow dispatch
- Added install_url for Copilot CLI docs

* Address seventeenth review: project checks, catalog robustness

- Add .specify/ project check to workflow run/resume/status/search/info
- remove_catalog validates config shape (dict + list) before indexing
- _fetch_single_catalog validates response is a dict
- _get_merged_workflows raises when all catalogs fail to fetch
- add_catalog guards against non-dict catalog entries in config

* Address eighteenth review: condition coercion, gate abort result, while default, cache guard, resume log

- evaluate_condition treats plain 'false'/'true' strings as booleans
- Gate abort returns StepResult(FAILED) instead of raising exception
  so step output is persisted in state for inspection
- while_loop max_iterations optional (default 10), validation aligned
- Catalog cache fallback catches invalid JSON gracefully
- resume() appends workflow_finished log entry like execute()

* Address nineteenth review: allow-all-tools opt-in, empty catalogs, abort dead code, while docstring

- --allow-all-tools controlled by SPECKIT_ALLOW_ALL_TOOLS env var (default: 1)
  Set to 0 to disable automatic tool approval for Copilot CLI
- Empty catalogs list falls back to built-in defaults (not an error)
- Remove unreachable WorkflowAbortError catches from execute/resume
  (gate abort now returns StepResult(FAILED) instead of raising)
- while_loop docstring updated: max_iterations is optional (default 10)

* Address twentieth review: gate abort maps to ABORTED status, do-while max_iterations optional

- Engine detects output.aborted from gate step and sets RunStatus.ABORTED
  (was unreachable — gate abort returned FAILED but status was always FAILED)
- do-while max_iterations now optional (default 10), aligned with while_loop
- do-while docstring and validation updated accordingly

* Coerce default_options to dict, align bundled workflow ID regex with validator

* Gate validates string options, prompt uses resolved integration, loop normalizes max_iterations

* Use parentId:childId convention for nested step IDs

- Fan-out per-item IDs use parentId:templateId:index (e.g. parallel:impl:0)
- Reserve ':' in user step IDs (validation rejects)
- Replaces _fanout_ prefix with cleaner namespacing
- Expressions like {{ steps.parallel:impl:0.output.file }} work naturally

* Validate workflow version is semantic versioning (X.Y.Z)

* Schema version validation, strict semver, load_workflow docstring, preserve max_concurrency

- Validate schema_version is '1.0' (reject unknown future schemas)
- Strict semver regex: ^\d+\.\d+\.\d+$ (rejects 1.0.0beta etc.)
- load_workflow docstring: 'parsed' not 'validated'
- Keep max_concurrency in fan-out output (was dropped)
- do_while docstring: engine re-evaluates step_config condition
- ARCHITECTURE.md: document nested resume limitation

* Path traversal prevention, loop step ID namespacing

- RunState validates run_id is alphanumeric+hyphens (no path separators)
- workflow_add validates catalog source doesn't escape workflows_dir
- Loop iterations namespace nested step IDs as parentId:childId:iteration
  so multiple iterations don't overwrite each other in context/state

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
2026-04-14 10:11:56 -05:00

779 lines
29 KiB
Python

"""Workflow engine — loads, validates, and executes workflow YAML definitions.
The engine is the orchestrator that:
- Parses workflow YAML definitions
- Validates step configurations and requirements
- Executes steps sequentially, dispatching to the correct step type
- Manages state persistence for resume capability
- Handles control flow (branching, loops, fan-out/fan-in)
"""
from __future__ import annotations
import json
import re
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
import yaml
from .base import RunStatus, StepContext, StepResult, StepStatus
# -- Workflow Definition --------------------------------------------------
class WorkflowDefinition:
"""Parsed and validated workflow YAML definition."""
def __init__(self, data: dict[str, Any], source_path: Path | None = None) -> None:
self.data = data
self.source_path = source_path
workflow = data.get("workflow", {})
self.id: str = workflow.get("id", "")
self.name: str = workflow.get("name", "")
self.version: str = workflow.get("version", "0.0.0")
self.author: str = workflow.get("author", "")
self.description: str = workflow.get("description", "")
self.schema_version: str = data.get("schema_version", "1.0")
# Defaults
self.default_integration: str | None = workflow.get("integration")
self.default_model: str | None = workflow.get("model")
self.default_options: dict[str, Any] = workflow.get("options") or {}
if not isinstance(self.default_options, dict):
self.default_options = {}
# Requirements (declared but not yet enforced at runtime;
# enforcement is a planned enhancement)
self.requires: dict[str, Any] = data.get("requires", {})
# Inputs
self.inputs: dict[str, Any] = data.get("inputs", {})
# Steps
self.steps: list[dict[str, Any]] = data.get("steps", [])
@classmethod
def from_yaml(cls, path: Path) -> WorkflowDefinition:
"""Load a workflow definition from a YAML file."""
with open(path, encoding="utf-8") as f:
data = yaml.safe_load(f)
if not isinstance(data, dict):
msg = f"Workflow YAML must be a mapping, got {type(data).__name__}."
raise ValueError(msg)
return cls(data, source_path=path)
@classmethod
def from_string(cls, content: str) -> WorkflowDefinition:
"""Load a workflow definition from a YAML string."""
data = yaml.safe_load(content)
if not isinstance(data, dict):
msg = f"Workflow YAML must be a mapping, got {type(data).__name__}."
raise ValueError(msg)
return cls(data)
# -- Workflow Validation --------------------------------------------------
# ID format: lowercase alphanumeric with hyphens
_ID_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]*[a-z0-9]$|^[a-z0-9]$")
# Valid step types (matching STEP_REGISTRY keys)
def _get_valid_step_types() -> set[str]:
"""Return valid step types from the registry, with a built-in fallback."""
from . import STEP_REGISTRY
if STEP_REGISTRY:
return set(STEP_REGISTRY.keys())
return {
"command", "shell", "prompt", "gate", "if",
"switch", "while", "do-while", "fan-out", "fan-in",
}
def validate_workflow(definition: WorkflowDefinition) -> list[str]:
"""Validate a workflow definition and return a list of error messages.
An empty list means the workflow is valid.
"""
errors: list[str] = []
# -- Schema version ---------------------------------------------------
if definition.schema_version not in ("1.0", "1"):
errors.append(
f"Unsupported schema_version {definition.schema_version!r}. "
f"Expected '1.0'."
)
# -- Top-level fields -------------------------------------------------
if not definition.id:
errors.append("Workflow is missing 'workflow.id'.")
elif not _ID_PATTERN.match(definition.id):
errors.append(
f"Workflow ID {definition.id!r} must be lowercase alphanumeric "
f"with hyphens."
)
if not definition.name:
errors.append("Workflow is missing 'workflow.name'.")
if not definition.version:
errors.append("Workflow is missing 'workflow.version'.")
elif not re.match(r"^\d+\.\d+\.\d+$", definition.version):
errors.append(
f"Workflow version {definition.version!r} is not valid "
f"semantic versioning (expected X.Y.Z)."
)
# -- Inputs -----------------------------------------------------------
if not isinstance(definition.inputs, dict):
errors.append("'inputs' must be a mapping (or omitted).")
else:
for input_name, input_def in definition.inputs.items():
if not isinstance(input_def, dict):
errors.append(f"Input {input_name!r} must be a mapping.")
continue
input_type = input_def.get("type")
if input_type and input_type not in ("string", "number", "boolean"):
errors.append(
f"Input {input_name!r} has invalid type {input_type!r}. "
f"Must be 'string', 'number', or 'boolean'."
)
# -- Steps ------------------------------------------------------------
if not isinstance(definition.steps, list):
errors.append("'steps' must be a list.")
return errors
if not definition.steps:
errors.append("Workflow has no steps defined.")
seen_ids: set[str] = set()
_validate_steps(definition.steps, seen_ids, errors)
return errors
def _validate_steps(
steps: list[dict[str, Any]],
seen_ids: set[str],
errors: list[str],
) -> None:
"""Recursively validate a list of steps."""
from . import STEP_REGISTRY
for step_config in steps:
if not isinstance(step_config, dict):
errors.append(f"Step must be a mapping, got {type(step_config).__name__}.")
continue
step_id = step_config.get("id")
if not step_id:
errors.append("Step is missing 'id' field.")
continue
if ":" in step_id:
errors.append(
f"Step ID {step_id!r} contains ':' which is reserved "
f"for engine-generated nested IDs (parentId:childId)."
)
if step_id in seen_ids:
errors.append(f"Duplicate step ID {step_id!r}.")
seen_ids.add(step_id)
# Determine step type
step_type = step_config.get("type", "command")
if step_type not in _get_valid_step_types():
errors.append(
f"Step {step_id!r} has invalid type {step_type!r}."
)
continue
# Delegate to step-specific validation
step_impl = STEP_REGISTRY.get(step_type)
if step_impl:
step_errors = step_impl.validate(step_config)
errors.extend(step_errors)
# Recursively validate nested steps
for nested_key in ("then", "else", "steps"):
nested = step_config.get(nested_key)
if isinstance(nested, list):
_validate_steps(nested, seen_ids, errors)
# Validate switch cases
cases = step_config.get("cases")
if isinstance(cases, dict):
for _case_key, case_steps in cases.items():
if isinstance(case_steps, list):
_validate_steps(case_steps, seen_ids, errors)
# Validate switch default
default = step_config.get("default")
if isinstance(default, list):
_validate_steps(default, seen_ids, errors)
# Validate fan-out nested step (template — not added to seen_ids
# since the engine generates parentId:templateId:index at runtime)
fan_step = step_config.get("step")
if isinstance(fan_step, dict):
fan_errors: list[str] = []
_validate_steps([fan_step], set(), fan_errors)
errors.extend(fan_errors)
# -- Run State Persistence ------------------------------------------------
class RunState:
"""Manages workflow run state for persistence and resume."""
def __init__(
self,
run_id: str | None = None,
workflow_id: str = "",
project_root: Path | None = None,
) -> None:
self.run_id = run_id or str(uuid.uuid4())[:8]
if not re.match(r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$', self.run_id):
msg = f"Invalid run_id {self.run_id!r}: must be alphanumeric with hyphens/underscores only."
raise ValueError(msg)
self.workflow_id = workflow_id
self.project_root = project_root or Path(".")
self.status = RunStatus.CREATED
self.current_step_index = 0
self.current_step_id: str | None = None
self.step_results: dict[str, dict[str, Any]] = {}
self.inputs: dict[str, Any] = {}
self.created_at = datetime.now(timezone.utc).isoformat()
self.updated_at = self.created_at
self.log_entries: list[dict[str, Any]] = []
@property
def runs_dir(self) -> Path:
return self.project_root / ".specify" / "workflows" / "runs" / self.run_id
def save(self) -> None:
"""Persist current state to disk."""
self.updated_at = datetime.now(timezone.utc).isoformat()
runs_dir = self.runs_dir
runs_dir.mkdir(parents=True, exist_ok=True)
state_data = {
"run_id": self.run_id,
"workflow_id": self.workflow_id,
"status": self.status.value,
"current_step_index": self.current_step_index,
"current_step_id": self.current_step_id,
"step_results": self.step_results,
"created_at": self.created_at,
"updated_at": self.updated_at,
}
with open(runs_dir / "state.json", "w", encoding="utf-8") as f:
json.dump(state_data, f, indent=2)
inputs_data = {"inputs": self.inputs}
with open(runs_dir / "inputs.json", "w", encoding="utf-8") as f:
json.dump(inputs_data, f, indent=2)
@classmethod
def load(cls, run_id: str, project_root: Path) -> RunState:
"""Load a run state from disk."""
runs_dir = project_root / ".specify" / "workflows" / "runs" / run_id
state_path = runs_dir / "state.json"
if not state_path.exists():
msg = f"Run state not found: {state_path}"
raise FileNotFoundError(msg)
with open(state_path, encoding="utf-8") as f:
state_data = json.load(f)
state = cls(
run_id=state_data["run_id"],
workflow_id=state_data["workflow_id"],
project_root=project_root,
)
state.status = RunStatus(state_data["status"])
state.current_step_index = state_data.get("current_step_index", 0)
state.current_step_id = state_data.get("current_step_id")
state.step_results = state_data.get("step_results", {})
state.created_at = state_data.get("created_at", "")
state.updated_at = state_data.get("updated_at", "")
inputs_path = runs_dir / "inputs.json"
if inputs_path.exists():
with open(inputs_path, encoding="utf-8") as f:
inputs_data = json.load(f)
state.inputs = inputs_data.get("inputs", {})
return state
def append_log(self, entry: dict[str, Any]) -> None:
"""Append a log entry to the run log."""
entry["timestamp"] = datetime.now(timezone.utc).isoformat()
self.log_entries.append(entry)
runs_dir = self.runs_dir
runs_dir.mkdir(parents=True, exist_ok=True)
with open(runs_dir / "log.jsonl", "a", encoding="utf-8") as f:
f.write(json.dumps(entry) + "\n")
# -- Workflow Engine ------------------------------------------------------
class WorkflowEngine:
"""Orchestrator that loads, validates, and executes workflow definitions."""
def __init__(self, project_root: Path | None = None) -> None:
self.project_root = project_root or Path(".")
self.on_step_start: Any = None # Callable[[str, str], None] | None
def load_workflow(self, source: str | Path) -> WorkflowDefinition:
"""Load a workflow from an installed ID or a local YAML path.
Parameters
----------
source:
Either a workflow ID (looked up in the installed workflows
directory) or a path to a YAML file.
Returns
-------
A parsed ``WorkflowDefinition`` (not yet validated; call
``validate_workflow()`` or ``engine.validate()`` separately).
Raises
------
FileNotFoundError:
If the workflow file cannot be found.
ValueError:
If the workflow YAML is invalid.
"""
path = Path(source)
# Try as a direct file path first
if path.suffix in (".yml", ".yaml") and path.exists():
return WorkflowDefinition.from_yaml(path)
# Try as an installed workflow ID
installed_path = (
self.project_root
/ ".specify"
/ "workflows"
/ str(source)
/ "workflow.yml"
)
if installed_path.exists():
return WorkflowDefinition.from_yaml(installed_path)
msg = f"Workflow not found: {source}"
raise FileNotFoundError(msg)
def validate(self, definition: WorkflowDefinition) -> list[str]:
"""Validate a workflow definition."""
return validate_workflow(definition)
def execute(
self,
definition: WorkflowDefinition,
inputs: dict[str, Any] | None = None,
run_id: str | None = None,
) -> RunState:
"""Execute a workflow definition.
Parameters
----------
definition:
The validated workflow definition.
inputs:
User-provided input values.
run_id:
Optional run ID (auto-generated if not provided).
Returns
-------
The final ``RunState`` after execution completes (or pauses).
"""
from . import STEP_REGISTRY
state = RunState(
run_id=run_id,
workflow_id=definition.id,
project_root=self.project_root,
)
# Persist a copy of the workflow definition so resume can
# reload it even if the original source is no longer available
# (e.g. a local YAML path that was moved or deleted).
run_dir = self.project_root / ".specify" / "workflows" / "runs" / state.run_id
run_dir.mkdir(parents=True, exist_ok=True)
workflow_copy = run_dir / "workflow.yml"
import yaml
with open(workflow_copy, "w", encoding="utf-8") as f:
yaml.safe_dump(definition.data, f, sort_keys=False)
# Resolve inputs
resolved_inputs = self._resolve_inputs(definition, inputs or {})
state.inputs = resolved_inputs
state.status = RunStatus.RUNNING
state.save()
context = StepContext(
inputs=resolved_inputs,
default_integration=definition.default_integration,
default_model=definition.default_model,
default_options=definition.default_options,
project_root=str(self.project_root),
run_id=state.run_id,
)
# Execute steps
try:
self._execute_steps(definition.steps, context, state, STEP_REGISTRY)
except KeyboardInterrupt:
state.status = RunStatus.PAUSED
state.append_log({"event": "workflow_interrupted"})
state.save()
return state
except Exception as exc:
state.status = RunStatus.FAILED
state.append_log({"event": "workflow_failed", "error": str(exc)})
state.save()
raise
if state.status == RunStatus.RUNNING:
state.status = RunStatus.COMPLETED
state.append_log({"event": "workflow_finished", "status": state.status.value})
state.save()
return state
def resume(self, run_id: str) -> RunState:
"""Resume a paused or failed workflow run."""
state = RunState.load(run_id, self.project_root)
if state.status not in (RunStatus.PAUSED, RunStatus.FAILED):
msg = f"Cannot resume run {run_id!r} with status {state.status.value!r}."
raise ValueError(msg)
# Load the workflow definition — try the persisted copy in the
# run directory first so resume works even if the original
# source (e.g. a local YAML path) is no longer available.
run_dir = self.project_root / ".specify" / "workflows" / "runs" / run_id
run_copy = run_dir / "workflow.yml"
if run_copy.exists():
definition = WorkflowDefinition.from_yaml(run_copy)
else:
definition = self.load_workflow(state.workflow_id)
# Restore context
context = StepContext(
inputs=state.inputs,
steps=state.step_results,
default_integration=definition.default_integration,
default_model=definition.default_model,
default_options=definition.default_options,
project_root=str(self.project_root),
run_id=state.run_id,
)
from . import STEP_REGISTRY
state.status = RunStatus.RUNNING
state.save()
# Resume from the current step — re-execute it so gates
# can prompt interactively again.
remaining_steps = definition.steps[state.current_step_index :]
step_offset = state.current_step_index
try:
self._execute_steps(
remaining_steps, context, state, STEP_REGISTRY,
step_offset=step_offset,
)
except KeyboardInterrupt:
state.status = RunStatus.PAUSED
state.append_log({"event": "workflow_interrupted"})
state.save()
return state
except Exception as exc:
state.status = RunStatus.FAILED
state.append_log({"event": "resume_failed", "error": str(exc)})
state.save()
raise
if state.status == RunStatus.RUNNING:
state.status = RunStatus.COMPLETED
state.append_log({"event": "workflow_finished", "status": state.status.value})
state.save()
return state
def _execute_steps(
self,
steps: list[dict[str, Any]],
context: StepContext,
state: RunState,
registry: dict[str, Any],
*,
step_offset: int = 0,
) -> None:
"""Execute a list of steps sequentially."""
for i, step_config in enumerate(steps):
step_id = step_config.get("id", f"step-{i}")
step_type = step_config.get("type", "command")
state.current_step_id = step_id
if step_offset >= 0:
state.current_step_index = step_offset + i
state.save()
state.append_log(
{"event": "step_started", "step_id": step_id, "type": step_type}
)
# Log progress — use the engine's on_step_start callback if set,
# otherwise stay silent (library-safe default).
label = step_config.get("command", "") or step_type
if self.on_step_start is not None:
self.on_step_start(step_id, label)
step_impl = registry.get(step_type)
if not step_impl:
state.status = RunStatus.FAILED
state.append_log(
{
"event": "step_failed",
"step_id": step_id,
"error": f"Unknown step type: {step_type!r}",
}
)
state.save()
return
result: StepResult = step_impl.execute(step_config, context)
# Record step results — prefer resolved values from step output
step_data = {
"integration": result.output.get("integration")
or step_config.get("integration")
or context.default_integration,
"model": result.output.get("model")
or step_config.get("model")
or context.default_model,
"options": result.output.get("options")
or step_config.get("options", {}),
"input": result.output.get("input")
or step_config.get("input", {}),
"output": result.output,
"status": result.status.value,
}
context.steps[step_id] = step_data
state.step_results[step_id] = step_data
state.append_log(
{
"event": "step_completed",
"step_id": step_id,
"status": result.status.value,
}
)
# Handle gate pauses
if result.status == StepStatus.PAUSED:
state.status = RunStatus.PAUSED
state.save()
return
# Handle failures
if result.status == StepStatus.FAILED:
# Gate abort (output.aborted) maps to ABORTED status
if result.output.get("aborted"):
state.status = RunStatus.ABORTED
state.append_log(
{
"event": "workflow_aborted",
"step_id": step_id,
}
)
else:
state.status = RunStatus.FAILED
state.append_log(
{
"event": "step_failed",
"step_id": step_id,
"error": result.error,
}
)
state.save()
return
# Execute nested steps (from control flow)
# NOTE: Nested steps run with step_offset=-1 so they don't
# update current_step_index. If a nested step pauses,
# resume will re-run the parent step and its nested body.
# A step-path stack for exact nested resume is a future
# enhancement.
if result.next_steps:
self._execute_steps(
result.next_steps, context, state, registry,
step_offset=-1,
)
if state.status in (
RunStatus.PAUSED,
RunStatus.FAILED,
RunStatus.ABORTED,
):
return
# Loop iteration: while/do-while re-evaluate after body
if step_type in ("while", "do-while"):
from .expressions import evaluate_condition
max_iters = step_config.get("max_iterations")
if not isinstance(max_iters, int) or max_iters < 1:
max_iters = 10
condition = step_config.get("condition", False)
for _loop_iter in range(max_iters - 1):
if not evaluate_condition(condition, context):
break
# Namespace nested step IDs per iteration
iter_steps = []
for ns in result.next_steps:
ns_copy = dict(ns)
if "id" in ns_copy:
ns_copy["id"] = f"{step_id}:{ns_copy['id']}:{_loop_iter + 1}"
iter_steps.append(ns_copy)
self._execute_steps(
iter_steps, context, state, registry,
step_offset=-1,
)
if state.status in (
RunStatus.PAUSED,
RunStatus.FAILED,
RunStatus.ABORTED,
):
return
# Fan-out: execute nested step template per item with unique IDs
if step_type == "fan-out":
items = result.output.get("items", [])
template = result.output.get("step_template", {})
if template and items:
fan_out_results = []
for item_idx, item_val in enumerate(result.output["items"]):
context.item = item_val
# Per-item ID: parentId:templateId:index
item_step = dict(template)
base_id = item_step.get("id", "item")
item_step["id"] = f"{step_id}:{base_id}:{item_idx}"
self._execute_steps(
[item_step], context, state, registry,
step_offset=-1,
)
# Collect per-item result for fan-in
item_result = context.steps.get(item_step["id"], {})
fan_out_results.append(item_result.get("output", {}))
if state.status in (
RunStatus.PAUSED,
RunStatus.FAILED,
RunStatus.ABORTED,
):
break
context.item = None
# Preserve original output and add collected results
fan_out_output = dict(result.output)
fan_out_output["results"] = fan_out_results
context.steps[step_id]["output"] = fan_out_output
state.step_results[step_id]["output"] = fan_out_output
if state.status in (
RunStatus.PAUSED,
RunStatus.FAILED,
RunStatus.ABORTED,
):
return
else:
# Empty items or no template — normalize output
result.output["results"] = []
context.steps[step_id]["output"] = result.output
state.step_results[step_id]["output"] = result.output
def _resolve_inputs(
self,
definition: WorkflowDefinition,
provided: dict[str, Any],
) -> dict[str, Any]:
"""Resolve workflow inputs against definitions and provided values."""
resolved: dict[str, Any] = {}
for name, input_def in definition.inputs.items():
if not isinstance(input_def, dict):
continue
if name in provided:
resolved[name] = self._coerce_input(
name, provided[name], input_def
)
elif "default" in input_def:
resolved[name] = input_def["default"]
elif input_def.get("required", False):
msg = f"Required input {name!r} not provided."
raise ValueError(msg)
return resolved
@staticmethod
def _coerce_input(
name: str, value: Any, input_def: dict[str, Any]
) -> Any:
"""Coerce a provided input value to the declared type."""
input_type = input_def.get("type", "string")
enum_values = input_def.get("enum")
if input_type == "number":
try:
value = float(value)
if value == int(value):
value = int(value)
except (ValueError, TypeError):
msg = f"Input {name!r} expected a number, got {value!r}."
raise ValueError(msg) from None
elif input_type == "boolean":
if isinstance(value, str):
if value.lower() in ("true", "1", "yes"):
value = True
elif value.lower() in ("false", "0", "no"):
value = False
else:
msg = f"Input {name!r} expected a boolean, got {value!r}."
raise ValueError(msg)
if enum_values is not None and value not in enum_values:
msg = (
f"Input {name!r} value {value!r} not in allowed "
f"values: {enum_values}."
)
raise ValueError(msg)
return value
def list_runs(self) -> list[dict[str, Any]]:
"""List all workflow runs in the project."""
runs_dir = self.project_root / ".specify" / "workflows" / "runs"
if not runs_dir.exists():
return []
runs: list[dict[str, Any]] = []
for run_dir in sorted(runs_dir.iterdir()):
if not run_dir.is_dir():
continue
state_path = run_dir / "state.json"
if state_path.exists():
with open(state_path, encoding="utf-8") as f:
state_data = json.load(f)
runs.append(state_data)
return runs
class WorkflowAbortError(Exception):
"""Raised when a workflow is aborted (e.g., gate rejection)."""