mirror of
https://github.com/github/spec-kit.git
synced 2026-07-03 12:28:06 +08:00
v0.11.6
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
affbf5ead5 |
feat(workflows): add from_json expression filter (#2961)
* feat(workflows): add from_json expression filter Step outputs captured as strings could never become typed values in templates - the filter set was default/join/map/contains only, so e.g. a fan-out items: could never consume a step's JSON stdout. Add an arg-less from_json pipe filter with parse-or-raise semantics: invalid JSON or non-string input raises a clear ValueError rather than passing through silently. Fixes #2960 * fix(expressions): make from_json strict — reject any arguments Address review (#2961): from_json('x') and from_json() previously fell through to a silent passthrough of the unparsed value. Reject any parenthesized form with a clear error so mis-wired templates fail loudly. Rename test to ...parses_object (JSON under test is an object) and add coverage for the strict no-arguments behavior. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(workflows): document the from_json expression filter Address Copilot review: the user-facing filter references omitted the newly added `from_json` filter. Add it to the ARCHITECTURE.md filter table (with the `{{ steps.emit.output.stdout | from_json }}` example) and to the filter enumerations in workflows/README.md and docs/reference/workflows.md so the docs match the evaluator's capabilities. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): make from_json strictness reject trailing tokens; fix docstring Address Copilot review: - Strictness only rejected parenthesized forms, so typos like `| from_json)` or `| from_json extra` still fell through to the unknown-filter path and silently returned the unparsed value. Match on the leading filter token and require the whole filter to be exactly `from_json`, so every mis-wired form raises. Extend the rejection test to cover the trailing-token cases. - The module docstring claimed "no imports", which is misleading now that the module imports `json`. Reword to state the actual sandbox guarantee: templates cannot do file I/O, import modules, or run arbitrary code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com> |
||
|
|
00bff788c9 |
Add init workflow step to bootstrap projects like specify init (#2838)
* Initial plan * Add init workflow step to bootstrap projects like `specify init` * Address review: simplify stderr capture and extract VALID_SCRIPT_TYPES * Address review: fail fast on non-empty dir, stdout fallback, README force fix * Populate exit_code/stdout/stderr in non-empty-dir fast-fail * fix: address three unresolved review comments in InitStep - Use `with os.scandir(...)` context manager so the iterator is always closed even when `any()` short-circuits, preventing file-descriptor leaks in long-running workflow runs. - Guard `os.chdir(prev_cwd)` in the `finally` block with a try/except so an `OSError` (e.g. directory deleted) doesn't bypass returning the captured `StepResult`. - Reject non-string `script` values in `validate()` with a clear error message, rather than silently passing them through to become `--script True` at runtime. * Potential fix for pull request finding 'Empty except' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix: remove no_git and branch_numbering options removed upstream The --no-git and --branch-numbering flags were removed from `specify init` on main. Update InitStep to drop these unsupported config fields and fix tests accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review — integration defaults, integration_options, engine-owned dirs - Apply DEFAULT_INIT_INTEGRATION fallback when neither step config nor workflow context provides an integration, so output.integration always reflects the actual integration used. - Add integration_options config field to support --integration-options passthrough (required for generic integration and --skills mode). - Exclude .specify/ from the non-empty directory fast-fail check so that here: true works when the engine has already created its run-state directory before steps execute. - Note: mix_stderr=False is not needed — Click 8.2+ captures stderr separately by default and the existing try/except handles access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: implicitly add --force when only engine-owned dirs exist When the workflow engine creates .specify/workflows/runs/ before steps execute, the directory is technically non-empty. Previously, specify init would prompt for confirmation (hanging in unattended mode) unless the user explicitly set force: true. Now the step detects that only engine-owned directories (.specify/) are present and implicitly adds --force so init proceeds without user interaction. Also fixes the test to exercise the implicit-force path rather than passing force: True explicitly (which bypassed the check entirely). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: derive VALID_SCRIPT_TYPES from shared constant, fail fast on OSError, include all resolved fields in output - Derive VALID_SCRIPT_TYPES from SCRIPT_TYPE_CHOICES in _agent_config so the valid set cannot drift from the specify init CLI. - Fail fast with a clear error when os.scandir() raises OSError (e.g. permission denied) instead of silently treating the directory as empty. - Include preset, force, and ignore_agent_tools in all output dicts (both fast-fail and normal paths) for consistent interpolation and debugging downstream. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: populate stderr from stdout on older Click, fix force comment wording - When Click does not expose result.stderr (older versions where stderr is mixed into stdout), use stdout as stderr on non-zero exit so workflows can consistently read steps.<id>.output.stderr for errors. - Update README inline comment for force: wording to say 'when target directory already exists' rather than 'non-empty directory', matching the actual specify init behavior for the project: form. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: build argv flags before early returns, use any() for dir scan - Move argv flag-building (--integration, --script, --preset, --ignore-agent-tools) before the non-empty-dir and OSError early returns so output['argv'] always reflects the complete command. - --force is appended after the check since it may be set implicitly. - Replace list comprehension with any() generator expression to short-circuit without allocating a full list of DirEntry objects. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: only treat .specify as engine-owned when it is a real directory A file or symlink named .specify should not be excluded from the non-empty check. Use entry.is_dir(follow_symlinks=False) to ensure only an actual directory is considered engine-owned content. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: guard implicit force for engine dirs only, fix integration fallback order - Only set implicit --force when engine-owned directories (.specify/) are actually present. A completely empty directory no longer gets --force added unnecessarily. - Fix integration resolution precedence: resolve step config expression first, then fall back to workflow default (also resolved), then to DEFAULT_INIT_INTEGRATION. Previously, a step expression resolving to falsy would bypass the workflow default entirely. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> |
||
|
|
c52ccd7dc7 |
Add workflow step catalog — community-installable step types (#2394)
* Initial plan * Add workflow step catalog: StepRegistry, StepCatalog, CLI commands, and tests Agent-Logs-Url: https://github.com/github/spec-kit/sessions/2885e646-477d-4df8-b9a3-06d8cb29e748 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Potential fix for pull request finding 'An assert statement has a side-effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Address PR review: path traversal, cache robustness, collision check, failed-to-load display - Add resolve()+relative_to() path traversal guards in workflow_step_add and workflow_step_remove to prevent directory escape via step_id - Harden _is_url_cache_valid in both StepCatalog and WorkflowCatalog to coerce fetched_at to float and catch TypeError/ValueError - Check STEP_REGISTRY and StepRegistry before installing to prevent collisions with built-in step types or already-installed steps - Show 'Custom (installed, failed to load)' section in workflow step list for steps in the registry that failed to load into STEP_REGISTRY * Fix StepRegistry shape validation and StepCatalog empty-YAML handling Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Polish: rename _default to default_registry, strengthen unreadable-file test Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address PR review: atomic install, hostname validation, cache resilience, no dynamic imports in list/info Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix shutil.move with existing step_dir: remove before move to avoid subdirectory nesting Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Call load_custom_steps at execution time; enforce hostname in _safe_fetch and _validate_url Agent-Logs-Url: https://github.com/github/spec-kit/sessions/73865880-fb25-4061-a43e-4e4b4d1c4de6 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Wrap YAML parsing in try/except; atomic step install via os.rename() under same fs Agent-Logs-Url: https://github.com/github/spec-kit/sessions/ff915bc5-ec7e-4e6a-b505-35f5795250df Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Validate YAML root is a dict in _load_catalog_config and workflow_step_add; fix WorkflowCatalog hostname validation Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix load_custom_steps() package imports and add reserved step ID validation * Move _re/_sys imports out of loop and _RESERVED_STEP_IDS to module level * Address review: collision-resistant module names, extra_files support, remove orphan dir * Harden extra_files: warn on non-dict, resolve symlinks in path traversal check * Switch _safe_fetch and StepCatalog._fetch_single_catalog to use open_url for auth consistency * Harden step_id validation against path-segment tricks; raise on StepRegistry.save() OSError * Clean up sys.modules on broken step packages; handle StepValidationError in step add/remove * Address review thread: int-coerce priorities, sys.modules cleanup, _require_specify_project, registry-first remove * fix: normalize workflow step catalog metadata fallbacks * fix: address latest workflow step and catalog review findings * Handle non-string extra_files keys in workflow step add * Harden StepRegistry symlink reads and extra_files path/URL validation * Harden custom step loader and step remove against symlinks and OSError * Fix StepCatalog.search() to coerce non-string fields before joining * Fix WorkflowCatalog YAML parsing error handling and isinstance checks * Harden step registry save and custom step/catalog ID handling * Harden cache validation and staging OSError handling * Address review: reorder symlink guard and split mixed test - Move symlink-parent check before is_dir() in load_custom_steps() so we never stat an external target through a symlink - Split test_get_merged_steps_normalizes_list_ids_to_strings into two focused tests: one for list-id normalization, one for get_step_info return values Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review: symlink-before-stat in loader, restore registry on rmtree failure - load_custom_steps(): check is_symlink() before is_dir() on step directories so symlinked entries are skipped without statting external targets - workflow_step_remove: restore the registry entry when shutil.rmtree() fails so filesystem and registry state stay consistent and a future 'step add' isn't blocked Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Harden step_id validation and file-write error handling - _validate_step_id_or_exit: reject whitespace-only/padded IDs, Windows-invalid characters (<>:"|?*), control characters, trailing dots/spaces, and Windows reserved device names (con, nul, etc.) - Wrap step.yml/__init__.py staging writes in OSError handler - Wrap extra_files disk writes (mkdir + write_bytes) in OSError handler that names the failing relative path - Registry rollback on rmtree failure: restore verbatim metadata and emit a warning if the restore itself fails Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review: cache symlink guard, verbatim registry rollback, Windows test fix - StepCatalog: add _is_cache_path_safe() guard that checks for symlinks in .specify/workflows/steps/.cache path; skip cache reads and writes when any component is symlinked to prevent writes outside project root - Registry rollback: write metadata directly to registry.data['steps'] and call save() instead of using add() which overwrites timestamps - temp_dir fixture: use ignore_errors=True on Windows to avoid flaky teardown from locked file handles (WinError 32) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Simplify exec_module call by removing redundant nested try/except Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix empty YAML tolerance in WorkflowCatalog.add_catalog, scope ignore_errors to Windows - WorkflowCatalog.add_catalog(): treat None from yaml.safe_load() (empty file) as an empty mapping instead of raising 'corrupted' - temp_dir fixture: limit ignore_errors to sys.platform == 'win32' so real cleanup issues surface on non-Windows platforms Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Chain exceptions in _load_catalog_config for both catalog classes Add 'from exc' to preserve root cause in tracebacks while keeping clean user-facing messages. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make default catalog tests hermetic by isolating HOME Monkeypatch Path.home() to project_dir and clear catalog env vars so tests don't break on machines with a real ~/.specify/step-catalogs.yml or ~/.specify/workflow-catalogs.yml. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix falsy ID handling in _get_merged_steps for list-based catalogs Check for None explicitly instead of using 'or' which drops valid falsy IDs like 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Compare reserved step IDs case-insensitively for filesystem safety On case-insensitive filesystems (Windows, common macOS), variants like STEP-REGISTRY.JSON would collide with the actual registry file. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add explanatory comments to intentional empty except blocks Document why cache-read failures are silently ignored in both WorkflowCatalog and StepCatalog _fetch_single_catalog methods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> |
||
|
|
7bab0568c5 |
feat(workflows): add continue_on_error step field for non-halting failures (#2663)
* feat(workflows): add continue_on_error step field
Adds an optional `continue_on_error: bool` field on every step.
When set to `true` and the step fails, the engine records the
result (`exit_code`, `stderr` on `steps.<id>.output` plus `status`
as a sibling key on `steps.<id>`) and continues to the next sibling
step instead of halting the run. Downstream `if`, `switch`, or
`gate` steps can then branch on
`{{ steps.<id>.output.exit_code }}` to route the recovery path.
Engine details
--------------
`WorkflowEngine._execute_steps` now consults the step config when a
step returns `StepStatus.FAILED`:
- Gate aborts (`output.aborted`) always halt the run — operator
decisions take precedence over the flag.
- Otherwise, if `continue_on_error` is the literal `True`, log a
`step_continue_on_error` event and proceed to the next sibling.
The runtime check uses identity comparison (`is True`) rather
than truthiness, so truthy non-bool values like the string
`"true"` cannot silently change run semantics even if a caller
bypasses `validate_workflow()`.
- Otherwise, behave as before: log `step_failed`, set
`RunStatus.FAILED`, and return.
Validation
----------
`_validate_steps` rejects non-bool values for `continue_on_error`.
Coerced strings like `"true"` are not accepted so authoring
mistakes surface at validation time rather than silently changing
run semantics.
Tests
-----
`TestContinueOnError` in `tests/test_workflows.py` (8 tests):
- `test_undeclared_failure_halts_run` — default halt behaviour.
- `test_declared_and_fired_continues_run` — flag + fail → continue.
- `test_declared_but_step_succeeded_is_noop` — flag + success → no-op.
- `test_if_branch_routes_around_failure` — end-to-end recovery.
- `test_gate_abort_still_halts_with_continue_on_error` — abort
always halts.
- `test_validation_rejects_non_bool_continue_on_error` — `"true"`
rejected at validation.
- `test_validation_accepts_bool_continue_on_error` — `true`/`false`
pass cleanly.
- `test_engine_ignores_truthy_non_bool_continue_on_error` —
defense-in-depth: engine ignores string `"true"` even when
validation is bypassed.
Rebased onto current upstream/main (post #2664 merge); the new
`TestContinueOnError` class sits immediately after upstream's
`TestContextRunId` so the two feature suites coexist cleanly.
Closes #2591.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(workflows): restore runtime context section, clarify gate prompt
Two Copilot findings on
|
||
|
|
c6afe4cde1 |
feat(workflows): expose {{ context.run_id }} template variable (#2664)
* feat(workflows): expose `{{ context.run_id }}` template variable
Closes #2590.
Surfaces the engine-assigned run id (the same 8-character hex
string Spec Kit prints as `Run ID:` at the end of
`workflow run`) as a workflow template variable so YAML
authors can reference it from shell `run:`, command
`input.args:`, switch `expression:`, and any other field that
already evaluates `{{ ... }}` templates.
### Why
The run id is the natural join key between a Spec Kit workflow
run and downstream artifacts, telemetry, or per-run scratch
state. Today the operator sees it in stdout but workflows
themselves cannot reference it — there was no way to stamp a
log line, name a scratch directory, or tag an artifact with
the same id Spec Kit assigned.
The three motivating use cases from the issue:
1. Telemetry / observability — stamp logs and events with the
run id so external systems can join workflow runs to
downstream artifacts.
2. Per-run scratch / isolation — interactive operator commands
that need their own state directory under
`/tmp/run-<id>/`.
3. Run-id in artifact metadata — stable join key from artifact
back to the producing run.
### Implementation
`StepContext.run_id` is already populated by `WorkflowEngine`
in both `execute()` and `resume()`. The only gap was the
template namespace builder.
`_build_namespace` (in `workflows/expressions.py`) now adds a
`context` key alongside the existing `inputs`, `steps`,
`item`, and `fan_in` namespaces:
```python
ns["context"] = {"run_id": run_id}
```
The value is always present (even outside a run) and falls
back to an empty string when no run is active. Workflows
referencing `{{ context.run_id }}` therefore never error — a
hard requirement from the issue's acceptance criteria for
dry-run, validation, and ad-hoc evaluator usage.
### Default behaviour preserved
Workflows that do not reference `{{ context.run_id }}` are
byte-equivalent to before this change. The `context`
namespace is added unconditionally to keep template
resolution branch-free, but its presence has no observable
effect when nothing references it.
### Tests
`TestExpressions` (unit-level) gains three tests:
- `test_context_run_id_resolves` — direct lookup against a
`StepContext(run_id=...)`.
- `test_context_run_id_defaults_to_empty_when_unset` —
graceful default outside a run context.
- `test_context_run_id_string_interpolation` — mixed
template (e.g. `"RUN_ID={{ context.run_id }}"`).
`TestContextRunId` (end-to-end) covers the three step types
the acceptance criteria called out:
- `test_shell_run_resolves_run_id` — `run:` field
substitution, verified via captured stdout.
- `test_command_input_args_resolves_run_id` — `input.args:`
resolution, captured in step output even when CLI dispatch
is unavailable (the artifact-metadata use case).
- `test_switch_expression_matches_on_run_id` — switch
matches against the resolved value, proving the run id is a
first-class value in the expression engine, not just an
interpolation token.
- `test_workflow_without_context_reference_unchanged` —
locks the byte-equivalent default required by the issue.
### Docs
`workflows/README.md` gains a "Runtime Context" subsection
under "Expressions" documenting the new namespace and the
three canonical use patterns (telemetry, per-run scratch,
artifact metadata).
* test(workflows): drop inline double-quotes in run_id shell tests
`test_shell_run_resolves_run_id` and
`test_switch_expression_matches_on_run_id` used
`run: 'echo "RUN_ID={{ context.run_id }}"'` with inner double-quotes
around the echo argument. Bash/sh strips those quotes before invoking
echo, but cmd.exe (used on Windows when `shell=True`) treats them
as literal characters and emits `"RUN_ID=abc12345"` — failing the
exact-match assertion. Linux passed; all three Windows-latest matrix
entries failed with `assert '"RUN_ID=abc12345"' == 'RUN_ID=abc12345'`.
Resolve by dropping the inner double-quotes (the value has no spaces
or shell metacharacters) and wrapping the YAML scalar in plain
double-quotes the same way other shell-step tests in this file do
(e.g. `run: "echo b-saw-..."`). Behaviour-equivalent on POSIX,
portable to cmd.exe.
|
||
|
|
409ec59704 |
fix(workflow): support integration: auto to follow project's initialized AI (#2421)
* fix(workflow): support integration: auto to follow project's initialized AI Closes #2406 (squashed) * fix(workflow): combine JSONDecodeError and UnicodeDecodeError handling Address Copilot feedback: UnicodeDecodeError can be raised by both read_text() and json.loads(), so combining the handlers ensures both cases produce a consistent, clear error message. * fix(workflows): honor integration_state schema guard and modern state in 'integration: auto' Three Copilot follow-ups on PR #2421: 1. engine.py:799 — `_load_project_integration` was bypassing the same schema guard `_read_integration_json` enforces. It now reads the schema field directly, returns None on a future schema (so the workflow falls back to the literal 'auto' default rather than guessing), and routes through `normalize_integration_state` / `default_integration_key` so modern installs that record `default_integration` / `installed_integrations` (without the legacy top-level `integration` field) resolve correctly. 2. test_workflows.py — added two regression cases: - `integration: auto` resolves a modern normalized state file - `integration: auto` falls back when the state file declares a newer `integration_state_schema` than this CLI supports 3. test_cli.py — added a CLI-level regression for the `UnicodeDecodeError` branch in `_read_integration_json` to match the existing malformed-JSON coverage. * refactor(integration): extract shared try_read_integration_json helper Address Copilot review on PR #2421: Both `_read_integration_json` (CLI) and `_load_project_integration` (workflow engine) were parsing `.specify/integration.json` independently, duplicating the schema guard and risking drift between the two readers. Extract the parse + schema validation into a single low-level helper `try_read_integration_json` in `integration_state.py` that returns either the normalized state or a structured `IntegrationReadError`. Both callers now delegate to this helper: - CLI keeps its loud-fail UX: each error kind ("decode", "os", "not_object", "schema_too_new") is translated into the existing console message + typer.Exit(1). - Engine keeps its silent fallback: any error simply returns None so `integration: auto` falls back to the workflow's literal default. This eliminates the divergence Copilot flagged without changing observable behavior for either caller. * fix(integration): distinguish missing file from non-regular path Address Copilot review on PR #2421: `try_read_integration_json` was collapsing two distinct cases into a single `(None, None)` return: 1. `.specify/integration.json` truly missing — silent fallback is correct. 2. Path exists but is a directory, socket, or other non-regular file — this is a misconfiguration the CLI should surface loudly. Split the check: `exists()` falsey returns `(None, None)`; existing-but- not-a-regular-file returns `(None, IntegrationReadError(kind="os", ...))` so the CLI's loud-fail path produces an actionable error while the engine still treats it as a fallback to the workflow's literal default. * docs(workflow): clarify version pin, advisory integrations list, enum exemption - workflow.yml: fix comment that said 0.8.3 was first release with auto resolution; the pin is >=0.8.5 so the comment now matches the pin. - workflow.yml: clarify that requires.integrations.any is an advisory, non-exhaustive compatibility hint, not a closed set. - engine.py: clarify that the auto-sentinel exemption only skips enum membership; declared type is still enforced through _coerce_input. * fix(workflow): resolve auto sentinel for provided values; report stat errors Two Copilot findings fixed: 1. _resolve_inputs only resolved the ``integration: auto`` sentinel when it came from the input default. A caller explicitly providing ``{"integration": "auto"}`` (which the workflow prompt advertises as a valid value) bypassed _resolve_default and the literal "auto" leaked to dispatch. Provided values now go through the same resolution path as defaults, and the enum-membership exemption applies in both cases. Regression test added. 2. try_read_integration_json used Path.exists() / Path.is_file() as a pre-check. Both return False on some OSErrors (e.g. permission errors during stat), which silently treated an unreadable-but-present file as missing — the engine fell back without warning and the CLI failed to surface the loud error. The pre-check is gone: read_text() is attempted directly, FileNotFoundError means missing (silent fallback), IsADirectoryError and other OSErrors become loud IntegrationReadError. * fix(workflow): enforce declared type for string inputs, reject bool-as-number Two Copilot findings fixed: 1. _coerce_input previously coerced/validated only ``number`` and ``boolean`` types, so ``type: string`` silently accepted any Python value (numbers, lists, dicts). A YAML authoring mistake like ``type: string`` + ``default: 5`` slipped through. Strings are now required to actually be strings; non-strings raise ValueError, which surfaces as an ``invalid default`` error from validate_workflow. 2. ``type: number`` accepted ``default: true`` because ``bool`` is a subclass of ``int`` (``float(True) == 1.0``). Bools are now rejected explicitly in the number path so the YAML mistake fails fast. The boolean path is also tightened to reject non-bool / non-string values for symmetry. Comment on the auto-sentinel enum exemption updated to reflect the stronger guarantee. Regression tests added for both rejections. * fix(cli): drop unused normalize_integration_state import to satisfy ruff CI's `uvx ruff check src/` flagged this as F401: the symbol was imported under a private alias but never referenced. Tests stay green after removal. |
||
|
|
02a1d610df |
docs: add workflows reference, reorganize into docs/reference/, and add --version flag (#2244)
* docs: add workflows reference, reorganize into docs/reference/, and add --version flag - Move integrations.md, extensions.md, presets.md into docs/reference/ - New docs/reference/workflows.md: command reference for all workflow commands, built-in SDD Cycle workflow with Mermaid diagram, step types, expressions, input types, state/resume, and FAQ - Rename workflow input feature_name to spec with prompt 'Describe what you want to build' to match speckit.specify command terminology - Add --version / -V flag to root specify command with tests - Update docs/toc.yml, README.md links, and docs/upgrade.md cross-reference to use reference/ paths - Add workflow command to README CLI reference table * docs: update speckit_version requirement to >=0.7.2 in workflow example |
||
|
|
a00e679918 |
Add workflow engine with catalog system (#2158)
* Initial plan * Add workflow engine with step registry, expression engine, catalog system, and CLI commands Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Add comprehensive tests for workflow engine (94 tests) Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address review feedback: do-while condition preservation and URL scheme validation Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address review feedback, add CLI dispatch, interactive gates, and docs Review comments (7/7): - Add explanatory comment to empty except block - Implement workflow catalog download with cleanup on failure - Add input type coercion for number/boolean/enum - Fix example workflow to remove non-existent output references - Fix while_loop and if_then condition defaults (string 'false' → bool False) - Fix resume step index tracking with step_offset parameter CLI dispatch: - Add build_exec_args() and dispatch_command() to IntegrationBase - Override for Claude (skills: /speckit-specify), Gemini (-m flag), Codex (codex exec), Copilot (--agent speckit.specify) - CommandStep invokes installed commands by name via integration CLI - Add PromptStep for arbitrary inline prompts (10th step type) - Stream CLI output live to terminal (no silent blocking) - Remove timeout when streaming (user can Ctrl+C) - Ctrl+C saves state as PAUSED for clean resume Interactive gates: - Gate steps prompt [1] approve [2] reject in TTY - Fall back to PAUSED in non-interactive environments - Resume re-executes the gate for interactive prompting Documentation: - workflows/README.md — user guide - workflows/ARCHITECTURE.md — internals with Mermaid diagrams - workflows/PUBLISHING.md — catalog submission guide Tests: 94 → 122 workflow tests, 1362 total (all passing) * Fix ruff lint errors: unused imports, f-string placeholders, undefined name * Address second review: registry-backed validation, shell failures, loop/fan-out execution, URL validation - VALID_STEP_TYPES now queries STEP_REGISTRY dynamically - Shell step returns FAILED on non-zero exit code - Persist workflow YAML in run directory for reliable resume - Resume loads from run copy, falls back to installed workflow - Engine iterates while/do-while loops up to max_iterations - Engine expands fan-out per item with context.item - HTTPS URL validation for catalog workflow installs (HTTP allowed for localhost) - Fix catalog merge priority docstring (lower number wins) - Fix dispatch_command docstring (no build_exec_args_for_command) - Gate on_reject=retry pauses for re-prompt on resume - Update docs to 10 step types, add prompt step to tables and README * Potential fix for pull request finding 'Empty except' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Address third review: fan-out IDs, catalog guards, shell coercion, docs - Fan-out generates unique per-item step IDs and collects results - Catalog merge skips non-dict workflow entries (malformed data guard) - Shell step coerces run_cmd to str after expression evaluation - urlopen timeout=30 for catalog workflow installs - yaml.dump with sort_keys=False, allow_unicode=True for catalog configs - Document streaming timeout as intentionally unbounded (user Ctrl+C) - Document --allow-all-tools as required for non-interactive + future enhancement - Update test docstring and PUBLISHING.md to 10 step types with prompt * Validate final URL after redirects in catalog fetch urlopen follows redirects, so validate the response URL against the same HTTPS/localhost rules to prevent redirect-based downgrade attacks. * Address fourth review: filter arg eval, tags normalization, install redirect check - Filter arguments now evaluated via _evaluate_simple_expression() so default(42) returns int not string - Tags normalized: non-list/non-string values handled gracefully - Install URL redirect validation (same as catalog fetch) - Remove unused 'skipped' variable in catalog config parsing - Author 'github' → 'GitHub' in example workflow - Document nested step resume limitation (re-runs parent step) * Add explanatory comment to empty except ValueError block * Address fifth review: expression parsing, fan-out output, URL install, gate options - Move string literal parsing before operator detection in expressions so quoted strings with operators (e.g. 'a in b') are not mis-parsed - Fan-out: remove max_concurrency from persisted output, fix docstring to reflect sequential execution - workflow add: support URL sources with HTTPS/redirect validation, validate workflow ID is non-empty before writing files - Deduplicate local install logic via _validate_and_install_local() - Remove 'edit' gate option from speckit workflow (not implemented) * Add comments to empty except ValueError blocks in URL install * Address sixth review: operator precedence, fan_in cleanup, registry resilience, docs - Fix or/and operator precedence (or parsed first = lower precedence) - Restore context.fan_in after fan-in step completes - Catch JSONDecodeError in registry load for corrupted files - Replace print() with on_step_start callback (library-safe) - Gate validation warns when on_reject set but no reject option - Shell step: document shell=True security tradeoff - README: sdd-pipeline → speckit, parallel → sequential for fan-out - ARCHITECTURE.md: parallel → fan-out/fan-in in diagram * Address seventh review: string literal before pipe, type annotations, validate on install - Move string literal check above pipe filter parsing so 'a | b' works - Fix type annotations: input_values list[str] | None, run_id str | None - Run validate_workflow() before installing from local path/URL - Remove duplicate string literal check from expression parser * Address eighth review: fan-out namespaced IDs, early return, catalog validation - Fan-out per-item step IDs use _fanout_{step_id}_{base}_{idx} namespace to avoid collisions with user-defined step IDs - Early return after fan-out loop when state is paused/failed/aborted - Catalog installs parse + validate downloaded YAML before registering, using definition metadata instead of catalog entry for registry * Address ninth review: populate catalog, fix indentation, priority, README - Add speckit workflow entry to catalog.json so it's discoverable - Fix shell step output dict indentation - Catalog add_catalog priority derived from max existing + 1 - README Quick Start clarified with install + local file examples * Address tenth review: max_iterations validation, catalog config guard, version alignment - Validate max_iterations is int >= 1 in while and do-while steps - Guard add_catalog against corrupted config (non-dict/non-list) - Align speckit_version requirement to >=0.6.1 (current package version) - Fan-out template validation uses separate seen_ids set to avoid false duplication errors with user-defined step IDs * Address eleventh review: command step fails without CLI, ID mismatch warning, state persistence - Command step returns FAILED when CLI not installed (was silent COMPLETED) - Catalog install warns on workflow ID vs catalog key mismatch - Engine persists state.save() before returning on unknown step type - Update tests to expect FAILED for command steps without CLI - Integration tests use shell steps for CLI-independent execution * Address twelfth review: type annotations, version examples, streaming docs, requires - Fix workflow_search type annotations (str | None) - PUBLISHING.md: speckit_version >=0.15.0 → >=0.6.1 - Document that exit_code is captured and referenceable by later steps - Mark requires as declared-but-not-enforced (planned enhancement) - Note full stdout/stderr capture as planned enhancement * Enforce catalog key matches workflow ID (fail instead of warn) * Bundle speckit workflow: auto-install during specify init - Add workflows/speckit to pyproject.toml force-include for wheel builds - Add _locate_bundled_workflow() helper (mirrors _locate_bundled_extension) - Auto-install speckit workflow during specify init (after git extension) - Update all integration file inventory tests to expect workflow files * Address fourteenth review: prompt fails without CLI, resolved step data, fan-out normalization - PromptStep returns FAILED when CLI not installed (was silent COMPLETED) - Engine step_data prefers resolved values from step output - Fan-out normalizes output.results=[] for empty item lists - subprocess.run inherits stdout/stderr (no explicit sys.stdout) - Registry tests use issubset for extensibility * Address fifteenth review: fan_in docstring, gate defaults, validation guards, reserved prefix - FanInStep docstring: aggregate-only, no blocking semantics - FanInStep: guard output_config as dict, handle None - Gate validate: use same default options as execute - Validate inputs is dict and steps is list before iterating - Reserve _fanout_ prefix in step ID validation - PUBLISHING.md: remove unenforced checklist items, add _fanout_ note * Address sixteenth review: docs regex, fan_in try/finally, hyphenated dot-path keys - PUBLISHING.md: update ID regex docs to match implementation (single-char OK) - FanInStep: wrap expression evaluation in try/finally for context.fan_in - Expression dot-path: allow hyphens in keys before list index (e.g. run-tests[0]) * Make speckit workflow integration-agnostic, document Copilot CLI requirement - Workflow integration selectable via input (default: claude) - Each command step uses {{ inputs.integration }} instead of hardcoded copilot - Copilot docstring documents CLI requirement for workflow dispatch - Added install_url for Copilot CLI docs * Address seventeenth review: project checks, catalog robustness - Add .specify/ project check to workflow run/resume/status/search/info - remove_catalog validates config shape (dict + list) before indexing - _fetch_single_catalog validates response is a dict - _get_merged_workflows raises when all catalogs fail to fetch - add_catalog guards against non-dict catalog entries in config * Address eighteenth review: condition coercion, gate abort result, while default, cache guard, resume log - evaluate_condition treats plain 'false'/'true' strings as booleans - Gate abort returns StepResult(FAILED) instead of raising exception so step output is persisted in state for inspection - while_loop max_iterations optional (default 10), validation aligned - Catalog cache fallback catches invalid JSON gracefully - resume() appends workflow_finished log entry like execute() * Address nineteenth review: allow-all-tools opt-in, empty catalogs, abort dead code, while docstring - --allow-all-tools controlled by SPECKIT_ALLOW_ALL_TOOLS env var (default: 1) Set to 0 to disable automatic tool approval for Copilot CLI - Empty catalogs list falls back to built-in defaults (not an error) - Remove unreachable WorkflowAbortError catches from execute/resume (gate abort now returns StepResult(FAILED) instead of raising) - while_loop docstring updated: max_iterations is optional (default 10) * Address twentieth review: gate abort maps to ABORTED status, do-while max_iterations optional - Engine detects output.aborted from gate step and sets RunStatus.ABORTED (was unreachable — gate abort returned FAILED but status was always FAILED) - do-while max_iterations now optional (default 10), aligned with while_loop - do-while docstring and validation updated accordingly * Coerce default_options to dict, align bundled workflow ID regex with validator * Gate validates string options, prompt uses resolved integration, loop normalizes max_iterations * Use parentId:childId convention for nested step IDs - Fan-out per-item IDs use parentId:templateId:index (e.g. parallel:impl:0) - Reserve ':' in user step IDs (validation rejects) - Replaces _fanout_ prefix with cleaner namespacing - Expressions like {{ steps.parallel:impl:0.output.file }} work naturally * Validate workflow version is semantic versioning (X.Y.Z) * Schema version validation, strict semver, load_workflow docstring, preserve max_concurrency - Validate schema_version is '1.0' (reject unknown future schemas) - Strict semver regex: ^\d+\.\d+\.\d+$ (rejects 1.0.0beta etc.) - load_workflow docstring: 'parsed' not 'validated' - Keep max_concurrency in fan-out output (was dropped) - do_while docstring: engine re-evaluates step_config condition - ARCHITECTURE.md: document nested resume limitation * Path traversal prevention, loop step ID namespacing - RunState validates run_id is alphanumeric+hyphens (no path separators) - workflow_add validates catalog source doesn't escape workflows_dir - Loop iterations namespace nested step IDs as parentId:childId:iteration so multiple iterations don't overwrite each other in context/state --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> |