mirror of
https://github.com/github/spec-kit.git
synced 2026-07-05 21:49:47 +08:00
* feat: surface gate detail in the workflow run/resume --json payload

  A paused run was indistinguishable from any other pause in the machine-readable outcome, and the gate's prompt/options/choice never left the human-facing stream. Record each step's type in the run state's step results (one engine line) and, when the run sits at a gate, add a gate block (step_id/message/options/choice) to the payload so orchestrators can drive review gates without parsing stdout.

  Reference implementation for the proposal in #2964.

  Addresses #2964

* fix(workflow): only surface gate detail in --json when the run is paused

  Address review (#2965): _gate_outcome() emitted a gate block whenever current_step_id pointed at a gate step. Since RunState.current_step_id is never cleared on completion, a completed/failed run whose last step was a gate leaked stale gate detail in run/resume/status --json. Guard on status == paused.

  Also assert CLI success in the _run_json test helper before JSON-parsing, and add direct coverage for the suppression guard.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(workflows): surface gate block on aborted runs; stabilize message

  Address Copilot review:

  - `_gate_outcome` now also surfaces the gate block when a run is `aborted` by a gate rejection (`on_reject: abort`), not only when `paused`. Abort is the only path that sets ABORTED and it leaves current_step_id on the gate, so an orchestrator can read the recorded `choice` for the stop.
  - Coerce `message` to a string (it may be a non-string YAML literal that GateStep only coerces for interpolation) so the JSON schema stays stable.
  - Tests: add a CLI-level aborted-path test, a message-coercion test, and extend the suppression test to allow `aborted`; share the run helper via `_invoke_json` to avoid duplicating the invoke boilerplate.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(workflows): assert clean exit in gate-abort JSON test

  Address Copilot review: the gate-abort test parsed stdout without first asserting the CLI exited cleanly, so an invoke failure would surface as an opaque JSON decode error. Route it through `_run_json` (which asserts exit_code == 0 before parsing) and drop the now-redundant `_invoke_json` helper: a gate abort emits the payload and returns, so the run exits 0.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: use result.output in run-helper assert; document step_data shape

  Address Copilot review:

  - `_run_json` asserted with `result.stdout` in the message, but under `--json` step output is redirected off stdout, so the useful diagnostics live on `result.output`. Switch the assertion message to `result.output` (the JSON parse still reads stdout), matching the other CLI tests.
  - `StepContext.steps` documented a 5-key entry shape; the engine now also persists `type` and `status`. Update the docstring to the canonical 7-key shape so step authors/debuggers see the real record.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(workflows): align gate-abort JSON test with aborted→exit-1

  After rebasing onto main, a gate abort now emits the --json payload and then exits non-zero (`_run_outcome_exit_code` maps aborted → 1, from the merged exit-code work). Give `_run_json` an `expected_exit` parameter (default 0) so the abort case asserts exit 1 while the paused/completed cases stay at 0, keeping a single shared helper rather than duplicating the invoke boilerplate.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): backward-compat gate detection + normalize gate options

  Address Copilot review:

  - A run paused by an older version has no persisted step `type`, so `_gate_outcome` would never surface its gate block on resume. Add `_is_gate_step`: prefer the `type` field, but when it is absent fall back to the gate's unique output signature (`on_reject`, written only by GateStep). A record with a different known `type` is still not a gate.
  - Normalize `options` to a list of strings (mirroring the `message` coercion) so an unvalidated workflow with non-string options can't destabilize the JSON schema.
  - Tests: options coercion, type-less gate detection, and a type-less non-gate negative case.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): normalize non-list gate options to a stable list[str]

  Address Copilot review: the prior options normalization only mapped a `list`, returning the raw value for any other shape (scalar/tuple), which contradicted the "stable list[str]" intent. Extract `_normalize_gate_options`: None stays None; list/tuple maps each element through str; any other scalar becomes a single-element list (a bare string is one option, never iterated character-by-character). The emitted schema is now always list[str] | None. Extend the options test to cover list, tuple, bare string, numeric scalar, and None.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): normalize gate choice to str; portable plain-gate test

  Address Copilot review:

  - `_gate_outcome` normalized `message` and `options` but passed `choice` through as-is; an unvalidated gate can record a non-string `choice`, which contradicts the stable-schema rationale. Coerce `choice` to `str | None` (None still means "no decision yet"), consistent with the other two fields. Adds a focused choice-coercion test.
  - The plain (no-gate) test workflow used `run: "true"`, which fails under cmd.exe on Windows (ShellStep uses shell=True). Use the cross-platform `run: "exit 0"` (matching the exit-code suite's workflows).

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
134 lines
3.7 KiB
Python
"""Base classes for workflow step types.

Provides:
- ``StepBase``: abstract base every step type must implement.
- ``StepContext``: execution context passed to each step.
- ``StepResult``: return value from step execution.
"""

from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class StepStatus(str, Enum):
    """Status of a step execution."""

    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"
    PAUSED = "paused"


class RunStatus(str, Enum):
    """Status of a workflow run."""

    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    ABORTED = "aborted"


@dataclass
class StepContext:
    """Execution context passed to each step.

    Contains everything the step needs to resolve expressions, dispatch
    commands, and record results.
    """

    #: Resolved workflow inputs (from user prompts / defaults).
    inputs: dict[str, Any] = field(default_factory=dict)

    #: Accumulated step results keyed by step ID. Each entry is the dict the
    #: engine persists per step:
    #: ``{"type": ..., "integration": ..., "model": ..., "options": ...,
    #: "input": ..., "output": ..., "status": ...}``.
    steps: dict[str, dict[str, Any]] = field(default_factory=dict)

    #: Current fan-out item (set only inside fan-out iterations).
    item: Any = None

    #: Fan-in aggregated results (set only for fan-in steps).
    fan_in: dict[str, Any] = field(default_factory=dict)

    #: Workflow-level default integration key.
    default_integration: str | None = None

    #: Workflow-level default model.
    default_model: str | None = None

    #: Workflow-level default options.
    default_options: dict[str, Any] = field(default_factory=dict)

    #: Project root path.
    project_root: str | None = None

    #: Current run ID.
    run_id: str | None = None


@dataclass
class StepResult:
    """Return value from a step execution."""

    #: Step status.
    status: StepStatus = StepStatus.COMPLETED

    #: Output data (stored as ``steps.<id>.output``).
    output: dict[str, Any] = field(default_factory=dict)

    #: Nested steps to execute (for control-flow steps like if/then).
    next_steps: list[dict[str, Any]] = field(default_factory=list)

    #: Error message if step failed.
    error: str | None = None


class StepBase(ABC):
    """Abstract base class for workflow step types.

    Every step type (built-in or extension-provided) implements this
    interface and registers in ``STEP_REGISTRY``.
    """

    #: Matches the ``type:`` value in workflow YAML.
    type_key: str = ""

    @abstractmethod
    def execute(self, config: dict[str, Any], context: StepContext) -> StepResult:
        """Execute the step with the given config and context.

        Parameters
        ----------
        config:
            The step configuration from workflow YAML.
        context:
            The execution context with inputs, accumulated step results, etc.

        Returns
        -------
        StepResult with status, output data, and optional nested steps.
        """

    def validate(self, config: dict[str, Any]) -> list[str]:
        """Validate step configuration and return a list of error messages.

        An empty list means the configuration is valid.
        """
        errors: list[str] = []
        if "id" not in config:
            errors.append("Step is missing required 'id' field.")
        return errors

    def can_resume(self, state: dict[str, Any]) -> bool:
        """Return whether this step can be resumed from the given state."""
        return True
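As a rough illustration of the interface above, a step type might be implemented along the lines of the sketch below. `EchoStep`, its `echo` type key, and the `message` config field are hypothetical names invented for this example. The compact `StepStatus`, `StepContext`, `StepResult`, and `StepBase` definitions are trimmed stand-ins included only so the snippet runs on its own; in the real package a step would import them from this module instead.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


# Trimmed stand-ins for the classes defined in this module (assumption:
# only the fields this sketch touches are reproduced here).
class StepStatus(str, Enum):
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class StepContext:
    inputs: dict[str, Any] = field(default_factory=dict)
    steps: dict[str, dict[str, Any]] = field(default_factory=dict)


@dataclass
class StepResult:
    status: StepStatus = StepStatus.COMPLETED
    output: dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


class StepBase(ABC):
    type_key: str = ""

    @abstractmethod
    def execute(self, config: dict[str, Any], context: StepContext) -> StepResult:
        ...

    def validate(self, config: dict[str, Any]) -> list[str]:
        errors: list[str] = []
        if "id" not in config:
            errors.append("Step is missing required 'id' field.")
        return errors


class EchoStep(StepBase):
    """Hypothetical step that copies a configured message into its output."""

    type_key = "echo"  # hypothetical ``type:`` value, not a built-in step

    def validate(self, config: dict[str, Any]) -> list[str]:
        # Reuse the base check for 'id', then add a step-specific check.
        errors = super().validate(config)
        if "message" not in config:
            errors.append("Echo step is missing required 'message' field.")
        return errors

    def execute(self, config: dict[str, Any], context: StepContext) -> StepResult:
        # Read a resolved workflow input, falling back to a default value.
        name = context.inputs.get("name", "world")
        return StepResult(output={"text": f"{config['message']}, {name}"})


step = EchoStep()
config = {"id": "greet", "message": "Hello"}
problems = step.validate(config)
result = step.execute(config, StepContext(inputs={"name": "spec-kit"}))
print(problems, result.status.value, result.output)
```

The subclass extends `validate` rather than replacing it, so the base check for a missing `id` still applies. How a step is added to `STEP_REGISTRY` is not shown in this file, so registration is left out of the sketch.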