Commit Graph

10 Commits

Author SHA1 Message Date
Zied Jlassi
f846d6526c fix(workflows): validate requires keys and reject phantom permissions gate (#3079)
* fix(workflows): validate requires keys and reject phantom permissions gate

A workflow's `requires` block was parsed but its keys were never
validated, so a typo or an unsupported key was silently ignored. Most
importantly, authors could write `requires.permissions.shell: true`
expecting a runtime capability gate — but no such gate exists: a `shell`
step always runs with the user's privileges. The declaration gave a
false sense of sandboxing.

`validate_workflow` now accepts only the recognised keys
(`speckit_version`, `integrations`, `tools`, `mcp`) and rejects anything
else, with an explicit error for `requires.permissions` pointing authors
to `gate` steps for approval. Docs and the model comment are updated to
state that `requires` is advisory, not a security boundary.

- Reject non-mapping `requires`, unknown keys, and `requires.permissions`
- Clarify workflows reference + PUBLISHING.md shell-step guidance
- Tests for valid keys, non-mapping, unknown key, and permissions

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Assisted-by: AI

* fix(workflows): address review feedback on requires validation

Follow-up to the review on #3079:

- Guard `requires` validation on `is not None` instead of truthiness so a
  falsy non-mapping value (e.g. `requires: []` or `requires: ''`) is
  reported as an error instead of being silently skipped; `requires:`
  (YAML null) is still treated as an omitted block. Add a regression test.
- Reword the workflows security note so `requires.permissions` is shown
  as rejected/unsupported rather than as a valid example of `requires`.
- Standardize on US spelling (`_RECOGNIZED_REQUIRES_KEYS`, "recognized")
  to match the surrounding code and ease searching.
- Tighten the permissions-rejection test to assert on specific message
  markers (`requires.permissions` and the `gate` guidance) so it fails if
  the validation path or wording drifts.

Assisted-by: AI
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>

* fix(workflows): scope requires validation to workflow keys (drop tools/mcp)

tools and mcp belong to the bundle manifest requires schema (bundler/models/manifest.py, resolved in bundler/services/resolver.py), not the workflow requires validated here. Drop them from _RECOGNIZED_REQUIRES_KEYS and revert the PUBLISHING.md claim that this PR had introduced, so workflow requires only recognizes speckit_version and integrations.

This keeps the existing docs accurate and resolves the inline doc-consistency review comments.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* refactor(workflows): type WorkflowDefinition.requires as Any pre-validation

self.requires holds the raw parsed value, which before validate_workflow()
runs may be a non-mapping (None for a bare 'requires:', a list for
'requires: []', etc.). Annotating it dict[str, Any] was misleading for
editors/type-checkers; use Any and document that validate_workflow() enforces
the mapping shape.

Addresses Copilot review feedback on engine.py.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* fix(workflows): reject YAML-null requires: as a non-mapping

Address Copilot review: validate requires the same way as inputs. A
bare requires: parses as YAML null and was previously treated as an
omitted block, which is inconsistent with inputs and lets a stray
requires: line be silently ignored.

Drop the is-not-None guard and check isinstance(..., dict) directly: an
omitted block still defaults to {} (valid), but a present-but-non-mapping
value -- YAML null, [] or '' -- is now an authoring error that surfaces.

Tests: add YAML-null rejection + an omitted-is-still-valid guard test.
Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

---------

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>
2026-06-24 14:49:43 -05:00
Pascal THUET
8e76ff3d5c harden: reject shell=True in run_command (#3132)
run_command() forwarded shell= straight to subprocess.run, so a caller
passing shell=True would invoke a shell. Reject shell=True with ValueError
(keeping the parameter for signature compatibility) and drop shell= from
both subprocess.run calls.

Enable ruff S602/S604/S605 to flag any future shell=True reintroduction,
annotate the one intentional workflow shell sink with # noqa: S602, and
document the shell-step execution risk in workflows/PUBLISHING.md.
2026-06-24 13:05:21 -05:00
Huy Do
affbf5ead5 feat(workflows): add from_json expression filter (#2961)
* feat(workflows): add from_json expression filter

Step outputs captured as strings could never become typed values in
templates - the filter set was default/join/map/contains only, so e.g.
a fan-out items: could never consume a step's JSON stdout. Add an
arg-less from_json pipe filter with parse-or-raise semantics: invalid
JSON or non-string input raises a clear ValueError rather than passing
through silently.

Fixes #2960

* fix(expressions): make from_json strict — reject any arguments

Address review (#2961): from_json('x') and from_json() previously fell through to a silent passthrough of the unparsed value. Reject any parenthesized form with a clear error so mis-wired templates fail loudly. Rename test to ...parses_object (JSON under test is an object) and add coverage for the strict no-arguments behavior.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(workflows): document the from_json expression filter

Address Copilot review: the user-facing filter references omitted the
newly added `from_json` filter. Add it to the ARCHITECTURE.md filter table
(with the `{{ steps.emit.output.stdout | from_json }}` example) and to the
filter enumerations in workflows/README.md and docs/reference/workflows.md
so the docs match the evaluator's capabilities.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): make from_json strictness reject trailing tokens; fix docstring

Address Copilot review:
- Strictness only rejected parenthesized forms, so typos like
  `| from_json)` or `| from_json extra` still fell through to the
  unknown-filter path and silently returned the unparsed value. Match on
  the leading filter token and require the whole filter to be exactly
  `from_json`, so every mis-wired form raises. Extend the rejection test to
  cover the trailing-token cases.
- The module docstring claimed "no imports", which is misleading now that
  the module imports `json`. Reword to state the actual sandbox guarantee:
  templates cannot do file I/O, import modules, or run arbitrary code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-17 13:43:26 -05:00
Copilot
00bff788c9 Add init workflow step to bootstrap projects like specify init (#2838)
* Initial plan

* Add init workflow step to bootstrap projects like `specify init`

* Address review: simplify stderr capture and extract VALID_SCRIPT_TYPES

* Address review: fail fast on non-empty dir, stdout fallback, README force fix

* Populate exit_code/stdout/stderr in non-empty-dir fast-fail

* fix: address three unresolved review comments in InitStep

- Use `with os.scandir(...)` context manager so the iterator is always
  closed even when `any()` short-circuits, preventing file-descriptor
  leaks in long-running workflow runs.
- Guard `os.chdir(prev_cwd)` in the `finally` block with a try/except
  so an `OSError` (e.g. directory deleted) doesn't bypass returning
  the captured `StepResult`.
- Reject non-string `script` values in `validate()` with a clear error
  message, rather than silently passing them through to become
  `--script True` at runtime.

* Potential fix for pull request finding 'Empty except'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* fix: remove no_git and branch_numbering options removed upstream

The --no-git and --branch-numbering flags were removed from `specify init`
on main. Update InitStep to drop these unsupported config fields and fix
tests accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address review — integration defaults, integration_options, engine-owned dirs

- Apply DEFAULT_INIT_INTEGRATION fallback when neither step config nor
  workflow context provides an integration, so output.integration always
  reflects the actual integration used.
- Add integration_options config field to support --integration-options
  passthrough (required for generic integration and --skills mode).
- Exclude .specify/ from the non-empty directory fast-fail check so that
  here: true works when the engine has already created its run-state
  directory before steps execute.
- Note: mix_stderr=False is not needed — Click 8.2+ captures stderr
  separately by default and the existing try/except handles access.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: implicitly add --force when only engine-owned dirs exist

When the workflow engine creates .specify/workflows/runs/ before steps
execute, the directory is technically non-empty. Previously, specify init
would prompt for confirmation (hanging in unattended mode) unless the
user explicitly set force: true. Now the step detects that only
engine-owned directories (.specify/) are present and implicitly adds
--force so init proceeds without user interaction.

Also fixes the test to exercise the implicit-force path rather than
passing force: True explicitly (which bypassed the check entirely).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: derive VALID_SCRIPT_TYPES from shared constant, fail fast on OSError, include all resolved fields in output

- Derive VALID_SCRIPT_TYPES from SCRIPT_TYPE_CHOICES in _agent_config
  so the valid set cannot drift from the specify init CLI.
- Fail fast with a clear error when os.scandir() raises OSError (e.g.
  permission denied) instead of silently treating the directory as empty.
- Include preset, force, and ignore_agent_tools in all output dicts
  (both fast-fail and normal paths) for consistent interpolation and
  debugging downstream.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: populate stderr from stdout on older Click, fix force comment wording

- When Click does not expose result.stderr (older versions where stderr
  is mixed into stdout), use stdout as stderr on non-zero exit so
  workflows can consistently read steps.<id>.output.stderr for errors.
- Update README inline comment for force: wording to say 'when target
  directory already exists' rather than 'non-empty directory', matching
  the actual specify init behavior for the project: form.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: build argv flags before early returns, use any() for dir scan

- Move argv flag-building (--integration, --script, --preset,
  --ignore-agent-tools) before the non-empty-dir and OSError early
  returns so output['argv'] always reflects the complete command.
- --force is appended after the check since it may be set implicitly.
- Replace list comprehension with any() generator expression to
  short-circuit without allocating a full list of DirEntry objects.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: only treat .specify as engine-owned when it is a real directory

A file or symlink named .specify should not be excluded from the
non-empty check. Use entry.is_dir(follow_symlinks=False) to ensure
only an actual directory is considered engine-owned content.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: guard implicit force for engine dirs only, fix integration fallback order

- Only set implicit --force when engine-owned directories (.specify/)
  are actually present. A completely empty directory no longer gets
  --force added unnecessarily.
- Fix integration resolution precedence: resolve step config expression
  first, then fall back to workflow default (also resolved), then to
  DEFAULT_INIT_INTEGRATION. Previously, a step expression resolving to
  falsy would bypass the workflow default entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-17 11:46:51 -05:00
Copilot
c52ccd7dc7 Add workflow step catalog — community-installable step types (#2394)
* Initial plan

* Add workflow step catalog: StepRegistry, StepCatalog, CLI commands, and tests

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/2885e646-477d-4df8-b9a3-06d8cb29e748

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Potential fix for pull request finding 'An assert statement has a side-effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Address PR review: path traversal, cache robustness, collision check, failed-to-load display

- Add resolve()+relative_to() path traversal guards in workflow_step_add and
  workflow_step_remove to prevent directory escape via step_id
- Harden _is_url_cache_valid in both StepCatalog and WorkflowCatalog to
  coerce fetched_at to float and catch TypeError/ValueError
- Check STEP_REGISTRY and StepRegistry before installing to prevent
  collisions with built-in step types or already-installed steps
- Show 'Custom (installed, failed to load)' section in workflow step list
  for steps in the registry that failed to load into STEP_REGISTRY

* Fix StepRegistry shape validation and StepCatalog empty-YAML handling

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Polish: rename _default to default_registry, strengthen unreadable-file test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address PR review: atomic install, hostname validation, cache resilience, no dynamic imports in list/info

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Fix shutil.move with existing step_dir: remove before move to avoid subdirectory nesting

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Call load_custom_steps at execution time; enforce hostname in _safe_fetch and _validate_url

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/73865880-fb25-4061-a43e-4e4b4d1c4de6

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Wrap YAML parsing in try/except; atomic step install via os.rename() under same fs

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/ff915bc5-ec7e-4e6a-b505-35f5795250df

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Validate YAML root is a dict in _load_catalog_config and workflow_step_add; fix WorkflowCatalog hostname validation

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Fix load_custom_steps() package imports and add reserved step ID validation

* Move _re/_sys imports out of loop and _RESERVED_STEP_IDS to module level

* Address review: collision-resistant module names, extra_files support, remove orphan dir

* Harden extra_files: warn on non-dict, resolve symlinks in path traversal check

* Switch _safe_fetch and StepCatalog._fetch_single_catalog to use open_url for auth consistency

* Harden step_id validation against path-segment tricks; raise on StepRegistry.save() OSError

* Clean up sys.modules on broken step packages; handle StepValidationError in step add/remove

* Address review thread: int-coerce priorities, sys.modules cleanup, _require_specify_project, registry-first remove

* fix: normalize workflow step catalog metadata fallbacks

* fix: address latest workflow step and catalog review findings

* Handle non-string extra_files keys in workflow step add

* Harden StepRegistry symlink reads and extra_files path/URL validation

* Harden custom step loader and step remove against symlinks and OSError

* Fix StepCatalog.search() to coerce non-string fields before joining

* Fix WorkflowCatalog YAML parsing error handling and isinstance checks

* Harden step registry save and custom step/catalog ID handling

* Harden cache validation and staging OSError handling

* Address review: reorder symlink guard and split mixed test

- Move symlink-parent check before is_dir() in load_custom_steps() so
  we never stat an external target through a symlink
- Split test_get_merged_steps_normalizes_list_ids_to_strings into two
  focused tests: one for list-id normalization, one for get_step_info
  return values

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review: symlink-before-stat in loader, restore registry on rmtree failure

- load_custom_steps(): check is_symlink() before is_dir() on step
  directories so symlinked entries are skipped without statting external
  targets
- workflow_step_remove: restore the registry entry when shutil.rmtree()
  fails so filesystem and registry state stay consistent and a future
  'step add' isn't blocked

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden step_id validation and file-write error handling

- _validate_step_id_or_exit: reject whitespace-only/padded IDs,
  Windows-invalid characters (<>:"|?*), control characters, trailing
  dots/spaces, and Windows reserved device names (con, nul, etc.)
- Wrap step.yml/__init__.py staging writes in OSError handler
- Wrap extra_files disk writes (mkdir + write_bytes) in OSError handler
  that names the failing relative path
- Registry rollback on rmtree failure: restore verbatim metadata and
  emit a warning if the restore itself fails

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review: cache symlink guard, verbatim registry rollback, Windows test fix

- StepCatalog: add _is_cache_path_safe() guard that checks for symlinks
  in .specify/workflows/steps/.cache path; skip cache reads and writes
  when any component is symlinked to prevent writes outside project root
- Registry rollback: write metadata directly to registry.data['steps']
  and call save() instead of using add() which overwrites timestamps
- temp_dir fixture: use ignore_errors=True on Windows to avoid flaky
  teardown from locked file handles (WinError 32)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify exec_module call by removing redundant nested try/except

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix empty YAML tolerance in WorkflowCatalog.add_catalog, scope ignore_errors to Windows

- WorkflowCatalog.add_catalog(): treat None from yaml.safe_load() (empty
  file) as an empty mapping instead of raising 'corrupted'
- temp_dir fixture: limit ignore_errors to sys.platform == 'win32' so
  real cleanup issues surface on non-Windows platforms

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Chain exceptions in _load_catalog_config for both catalog classes

Add 'from exc' to preserve root cause in tracebacks while keeping
clean user-facing messages.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Make default catalog tests hermetic by isolating HOME

Monkeypatch Path.home() to project_dir and clear catalog env vars so
tests don't break on machines with a real ~/.specify/step-catalogs.yml
or ~/.specify/workflow-catalogs.yml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix falsy ID handling in _get_merged_steps for list-based catalogs

Check for None explicitly instead of using 'or' which drops valid
falsy IDs like 0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Compare reserved step IDs case-insensitively for filesystem safety

On case-insensitive filesystems (Windows, common macOS), variants like
STEP-REGISTRY.JSON would collide with the actual registry file.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add explanatory comments to intentional empty except blocks

Document why cache-read failures are silently ignored in both
WorkflowCatalog and StepCatalog _fetch_single_catalog methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-16 18:03:45 -05:00
Huy Do
7bab0568c5 feat(workflows): add continue_on_error step field for non-halting failures (#2663)
* feat(workflows): add continue_on_error step field

Adds an optional `continue_on_error: bool` field on every step.
When set to `true` and the step fails, the engine records the
result (`exit_code`, `stderr` on `steps.<id>.output` plus `status`
as a sibling key on `steps.<id>`) and continues to the next sibling
step instead of halting the run. Downstream `if`, `switch`, or
`gate` steps can then branch on
`{{ steps.<id>.output.exit_code }}` to route the recovery path.

Engine details
--------------
`WorkflowEngine._execute_steps` now consults the step config when a
step returns `StepStatus.FAILED`:

- Gate aborts (`output.aborted`) always halt the run — operator
  decisions take precedence over the flag.
- Otherwise, if `continue_on_error` is the literal `True`, log a
  `step_continue_on_error` event and proceed to the next sibling.
  The runtime check uses identity comparison (`is True`) rather
  than truthiness, so truthy non-bool values like the string
  `"true"` cannot silently change run semantics even if a caller
  bypasses `validate_workflow()`.
- Otherwise, behave as before: log `step_failed`, set
  `RunStatus.FAILED`, and return.

Validation
----------
`_validate_steps` rejects non-bool values for `continue_on_error`.
Coerced strings like `"true"` are not accepted so authoring
mistakes surface at validation time rather than silently changing
run semantics.

Tests
-----
`TestContinueOnError` in `tests/test_workflows.py` (8 tests):
- `test_undeclared_failure_halts_run` — default halt behaviour.
- `test_declared_and_fired_continues_run` — flag + fail → continue.
- `test_declared_but_step_succeeded_is_noop` — flag + success → no-op.
- `test_if_branch_routes_around_failure` — end-to-end recovery.
- `test_gate_abort_still_halts_with_continue_on_error` — abort
  always halts.
- `test_validation_rejects_non_bool_continue_on_error` — `"true"`
  rejected at validation.
- `test_validation_accepts_bool_continue_on_error` — `true`/`false`
  pass cleanly.
- `test_engine_ignores_truthy_non_bool_continue_on_error` —
  defense-in-depth: engine ignores string `"true"` even when
  validation is bypassed.

Rebased onto current upstream/main (post #2664 merge); the new
`TestContinueOnError` class sits immediately after upstream's
`TestContextRunId` so the two feature suites coexist cleanly.

Closes #2591.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(workflows): restore runtime context section, clarify gate prompt

Two Copilot findings on d0b9e00:

1. The `### Runtime Context` documentation for `{{ context.* }}` was
   lost during the rebase onto current main (the squash dropped the
   anchor where #2664 had added it). Restored under `## Expressions`
   so users can find `context.run_id` semantics and examples.

2. The continue_on_error example gate had message "Retry or skip?"
   but used the default `options: [approve, reject]` with `on_reject:
   skip`, which implied an automatic retry path that gates do not
   provide. Reworded the message to match the actual approve/reject
   semantics and added an explicit note that retry requires either
   custom gate options + downstream branching or a wrapper loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(workflows): clarify continue_on_error scope — returned FAILED only

Copilot finding on d0b9e00:

The README's "Error Handling" intro implied `continue_on_error` covers
"any other runtime error raised during step execution", but the engine
only consults the flag when a step returns `StepResult(status=FAILED, ...)`.
Exceptions raised out of `step_impl.execute()` propagate to
`WorkflowEngine.execute()`, where the catch-all logs `workflow_failed`
and re-raises — the step result is never recorded, and the flag is
never consulted.

Audited the whole PR diff for the same overclaim:

1. workflows/README.md — main fix. Reworded the Error Handling intro to
   "any step that returns StepResult(status=FAILED, ...)" and promoted
   the parenthetical structural-validation note into the Notes block.
   Added a new "Scope: returned failures only" note that names the
   exception path explicitly and tells step authors how to bring the
   flag into scope for exceptional code (catch internally and return
   FAILED with the failure encoded in `output`).

2. tests/test_workflows.py — section comment used "when an executable
   step fails", same ambiguity. Tightened to "when a step returns
   StepResult(status=FAILED, ...)" and added a sentence calling out
   that unhandled exceptions are out of scope.

3. src/specify_cli/workflows/engine.py — already correct ("any step
   that returns FAILED" in the validator comment; "lets the pipeline
   route around the failure" in the execute path). No change.

Engine semantics and test bodies are unchanged. Docs-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(workflows): clarify on_reject:skip semantics — engine returns COMPLETED, not auto-skip

Copilot finding on b8982a7:

The README example's gate message said "reject to skip the rest of this
branch", and the explanatory paragraph claimed [approve, reject] map
to "continue" vs "skip the rest of this branch". The engine does not
implement automatic branch-skipping. `on_reject: skip` returns
`StepStatus.COMPLETED` (gate/__init__.py:65-66); the next sibling step
runs unconditionally unless the author wires a downstream `if` reading
`{{ steps.<gate-id>.output.choice }}`.

Two fixes:

1. Restructured the YAML example so it actually demonstrates the
   manual-branching pattern: added a `recover` if-step after the gate
   that conditions on `steps.review.output.choice == 'approve'`. Now
   the example shows the real workflow author's responsibility instead
   of implying the engine does it.

2. Replaced the trailing paragraph with three precise notes:
   - both gate options return COMPLETED; `on_reject: skip` controls
     abort behaviour only, not sibling-skipping
   - all three `on_reject` values enumerated with their actual engine
     semantics (FAILED+aborted / COMPLETED / PAUSED)
   - the original retry-loop guidance retained as the third bullet

Updated the gate message in the example to match — "reject to leave the
failure recorded and move on" instead of "reject to skip the rest of
this branch".

Audited the whole PR diff for the same overclaim: no other instance.
Engine semantics, validation, and test bodies are unchanged. Docs-only.

161/161 tests/test_workflows.py pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(workflows): clarify gate's role — surfaces, doesn't programmatically branch

Audit follow-up to 393ac6b — three sites repeated the same minor
overclaim about gates being one of the "branch on it" step types
alongside `if` and `switch`:

1. workflows/README.md (the "downstream `if`, `switch`, or `gate`
   steps can branch on it" sentence introducing the example)
2. engine.py:236 (validator inline comment)
3. engine.py:657 (execute-path inline comment)

A `gate` step does not have a `condition` or `expression` field — it
only evaluates expressions for `message` and `show_file` (gate/__init__.py:29,36).
Programmatic branching happens in `if`/`switch`; a gate surfaces the
value to a human operator via message interpolation, and the operator's
choice is recorded in `output.choice` for a *subsequent* `if`/`switch`
to route on.

Reworded all three sites consistently: "a downstream `if` or `switch`
can branch on it (or a `gate` can surface it to the operator via
message interpolation)". The README example already demonstrates this
distinction — the gate carries `{{ }}` template variables in its
message and the `recover` if-step downstream is what actually branches
on the choice.

Engine semantics, validation, and test bodies are unchanged. Docs-only
on the README; comment-only on engine.py.

161/161 tests/test_workflows.py pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(workflows): use qualified StepStatus.* instead of bare FAILED/COMPLETED/PAUSED

Three Copilot inline comments on workflows/README.md lines 226, 282, 288
flagged that ``StepResult(status=FAILED, ...)`` is not valid Python —
``StepResult.status`` is a ``StepStatus`` enum value, so the
documented form should be ``StepStatus.FAILED``.

Audited the whole PR diff for the same shorthand. The bare unqualified
form appears in three files added/modified by this PR:

1. workflows/README.md (6 sites) — three ``StepResult(status=FAILED, ...)``
   parentheticals, plus the on_reject Notes bullet listing the three
   step statuses (``FAILED``, ``COMPLETED``, ``PAUSED``).

2. tests/test_workflows.py (4 sites) — section header for
   TestContinueOnError, two test-method docstrings, one inline comment
   about a gate's TTY-fallback behaviour.

3. src/specify_cli/workflows/engine.py (1 site) — the validator inline
   comment added in d0b9e00 said "returns FAILED" where the engine
   code itself uses ``StepStatus.FAILED``.

All 11 sites normalised to the qualified ``StepStatus.<name>`` form so
the docs / test docstrings / inline comments match what readers will
actually find in the engine code and the tests. Engine semantics,
validation, and test bodies are unchanged.

161/161 tests/test_workflows.py pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 10:10:07 -05:00
Huy Do
c6afe4cde1 feat(workflows): expose {{ context.run_id }} template variable (#2664)
* feat(workflows): expose `{{ context.run_id }}` template variable

Closes #2590.

Surfaces the engine-assigned run id (the same 8-character hex
string Spec Kit prints as `Run ID:` at the end of
`workflow run`) as a workflow template variable so YAML
authors can reference it from shell `run:`, command
`input.args:`, switch `expression:`, and any other field that
already evaluates `{{ ... }}` templates.

### Why

The run id is the natural join key between a Spec Kit workflow
run and downstream artifacts, telemetry, or per-run scratch
state. Today the operator sees it in stdout but workflows
themselves cannot reference it — there was no way to stamp a
log line, name a scratch directory, or tag an artifact with
the same id Spec Kit assigned.

The three motivating use cases from the issue:

1. Telemetry / observability — stamp logs and events with the
   run id so external systems can join workflow runs to
   downstream artifacts.
2. Per-run scratch / isolation — interactive operator commands
   that need their own state directory under
   `/tmp/run-<id>/`.
3. Run-id in artifact metadata — stable join key from artifact
   back to the producing run.

### Implementation

`StepContext.run_id` is already populated by `WorkflowEngine`
in both `execute()` and `resume()`. The only gap was the
template namespace builder.

`_build_namespace` (in `workflows/expressions.py`) now adds a
`context` key alongside the existing `inputs`, `steps`,
`item`, and `fan_in` namespaces:

```python
ns["context"] = {"run_id": run_id}
```

The value is always present (even outside a run) and falls
back to an empty string when no run is active. Workflows
referencing `{{ context.run_id }}` therefore never error — a
hard requirement from the issue's acceptance criteria for
dry-run, validation, and ad-hoc evaluator usage.

### Default behaviour preserved

Workflows that do not reference `{{ context.run_id }}` are
byte-equivalent to before this change. The `context`
namespace is added unconditionally to keep template
resolution branch-free, but its presence has no observable
effect when nothing references it.

### Tests

`TestExpressions` (unit-level) gains three tests:

- `test_context_run_id_resolves` — direct lookup against a
  `StepContext(run_id=...)`.
- `test_context_run_id_defaults_to_empty_when_unset` —
  graceful default outside a run context.
- `test_context_run_id_string_interpolation` — mixed
  template (e.g. `"RUN_ID={{ context.run_id }}"`).

`TestContextRunId` (end-to-end) covers the three step types
the acceptance criteria called out:

- `test_shell_run_resolves_run_id` — `run:` field
  substitution, verified via captured stdout.
- `test_command_input_args_resolves_run_id` — `input.args:`
  resolution, captured in step output even when CLI dispatch
  is unavailable (the artifact-metadata use case).
- `test_switch_expression_matches_on_run_id` — switch
  matches against the resolved value, proving the run id is a
  first-class value in the expression engine, not just an
  interpolation token.
- `test_workflow_without_context_reference_unchanged` —
  locks the byte-equivalent default required by the issue.

### Docs

`workflows/README.md` gains a "Runtime Context" subsection
under "Expressions" documenting the new namespace and the
three canonical use patterns (telemetry, per-run scratch,
artifact metadata).

* test(workflows): drop inline double-quotes in run_id shell tests

`test_shell_run_resolves_run_id` and
`test_switch_expression_matches_on_run_id` used
`run: 'echo "RUN_ID={{ context.run_id }}"'` with inner double-quotes
around the echo argument. Bash/sh strips those quotes before invoking
echo, but cmd.exe (used on Windows when `shell=True`) treats them
as literal characters and emits `"RUN_ID=abc12345"` — failing the
exact-match assertion. Linux passed; all three Windows-latest matrix
entries failed with `assert '"RUN_ID=abc12345"' == 'RUN_ID=abc12345'`.

Resolve by dropping the inner double-quotes (the value has no spaces
or shell metacharacters) and wrapping the YAML scalar in plain
double-quotes the same way other shell-step tests in this file do
(e.g. `run: "echo b-saw-..."`). Behaviour-equivalent on POSIX,
portable to cmd.exe.
2026-05-27 13:00:58 -05:00
Quratulain-bilal
409ec59704 fix(workflow): support integration: auto to follow project's initialized AI (#2421)
* fix(workflow): support integration: auto to follow project's initialized AI

Closes #2406

(squashed)

* fix(workflow): combine JSONDecodeError and UnicodeDecodeError handling

Address Copilot feedback: UnicodeDecodeError can be raised by both
read_text() and json.loads(), so combining the handlers ensures both
cases produce a consistent, clear error message.

* fix(workflows): honor integration_state schema guard and modern state in 'integration: auto'

Three Copilot follow-ups on PR #2421:

1. engine.py:799 — `_load_project_integration` was bypassing the same
   schema guard `_read_integration_json` enforces. It now reads the
   schema field directly, returns None on a future schema (so the
   workflow falls back to the literal 'auto' default rather than
   guessing), and routes through `normalize_integration_state` /
   `default_integration_key` so modern installs that record
   `default_integration` / `installed_integrations` (without the
   legacy top-level `integration` field) resolve correctly.

2. test_workflows.py — added two regression cases:
   - `integration: auto` resolves a modern normalized state file
   - `integration: auto` falls back when the state file declares a
     newer `integration_state_schema` than this CLI supports

3. test_cli.py — added a CLI-level regression for the `UnicodeDecodeError`
   branch in `_read_integration_json` to match the existing
   malformed-JSON coverage.

* refactor(integration): extract shared try_read_integration_json helper

Address Copilot review on PR #2421:

Both `_read_integration_json` (CLI) and `_load_project_integration`
(workflow engine) were parsing `.specify/integration.json` independently,
duplicating the schema guard and risking drift between the two readers.

Extract the parse + schema validation into a single low-level helper
`try_read_integration_json` in `integration_state.py` that returns either
the normalized state or a structured `IntegrationReadError`. Both callers
now delegate to this helper:

- CLI keeps its loud-fail UX: each error kind ("decode", "os",
  "not_object", "schema_too_new") is translated into the existing console
  message + typer.Exit(1).
- Engine keeps its silent fallback: any error simply returns None so
  `integration: auto` falls back to the workflow's literal default.

This eliminates the divergence Copilot flagged without changing observable
behavior for either caller.

* fix(integration): distinguish missing file from non-regular path

Address Copilot review on PR #2421:

`try_read_integration_json` was collapsing two distinct cases into a
single `(None, None)` return:

1. `.specify/integration.json` truly missing — silent fallback is correct.
2. Path exists but is a directory, socket, or other non-regular file —
   this is a misconfiguration the CLI should surface loudly.

Split the check: `exists()` falsey returns `(None, None)`; existing-but-
not-a-regular-file returns `(None, IntegrationReadError(kind="os", ...))`
so the CLI's loud-fail path produces an actionable error while the
engine still treats it as a fallback to the workflow's literal default.

* docs(workflow): clarify version pin, advisory integrations list, enum exemption

- workflow.yml: fix comment that said 0.8.3 was first release with auto
  resolution; the pin is >=0.8.5 so the comment now matches the pin.
- workflow.yml: clarify that requires.integrations.any is an advisory,
  non-exhaustive compatibility hint, not a closed set.
- engine.py: clarify that the auto-sentinel exemption only skips enum
  membership; declared type is still enforced through _coerce_input.

* fix(workflow): resolve auto sentinel for provided values; report stat errors

Two Copilot findings fixed:

1. _resolve_inputs only resolved the ``integration: auto`` sentinel when it
   came from the input default. A caller explicitly providing
   ``{"integration": "auto"}`` (which the workflow prompt advertises as a
   valid value) bypassed _resolve_default and the literal "auto" leaked
   to dispatch. Provided values now go through the same resolution path
   as defaults, and the enum-membership exemption applies in both cases.
   Regression test added.

2. try_read_integration_json used Path.exists() / Path.is_file() as a
   pre-check. Both return False on some OSErrors (e.g. permission errors
   during stat), which silently treated an unreadable-but-present file
   as missing — the engine fell back without warning and the CLI failed
   to surface the loud error. The pre-check is gone: read_text() is
   attempted directly, FileNotFoundError means missing (silent fallback),
   IsADirectoryError and other OSErrors become loud IntegrationReadError.

* fix(workflow): enforce declared type for string inputs, reject bool-as-number

Two Copilot findings fixed:

1. _coerce_input previously coerced/validated only ``number`` and
   ``boolean`` types, so ``type: string`` silently accepted any Python
   value (numbers, lists, dicts). A YAML authoring mistake like
   ``type: string`` + ``default: 5`` slipped through. Strings are now
   required to actually be strings; non-strings raise ValueError, which
   surfaces as an ``invalid default`` error from validate_workflow.

2. ``type: number`` accepted ``default: true`` because ``bool`` is a
   subclass of ``int`` (``float(True) == 1.0``). Bools are now rejected
   explicitly in the number path so the YAML mistake fails fast. The
   boolean path is also tightened to reject non-bool / non-string
   values for symmetry.

Comment on the auto-sentinel enum exemption updated to reflect the
stronger guarantee. Regression tests added for both rejections.

* fix(cli): drop unused normalize_integration_state import to satisfy ruff

CI's `uvx ruff check src/` flagged this as F401: the symbol was imported
under a private alias but never referenced. Tests stay green after
removal.
2026-05-15 16:03:33 -05:00
Manfred Riem
02a1d610df docs: add workflows reference, reorganize into docs/reference/, and add --version flag (#2244)
* docs: add workflows reference, reorganize into docs/reference/, and add --version flag

- Move integrations.md, extensions.md, presets.md into docs/reference/
- New docs/reference/workflows.md: command reference for all workflow
  commands, built-in SDD Cycle workflow with Mermaid diagram, step types,
  expressions, input types, state/resume, and FAQ
- Rename workflow input feature_name to spec with prompt 'Describe what
  you want to build' to match speckit.specify command terminology
- Add --version / -V flag to root specify command with tests
- Update docs/toc.yml, README.md links, and docs/upgrade.md cross-reference
  to use reference/ paths
- Add workflow command to README CLI reference table

* docs: update speckit_version requirement to >=0.7.2 in workflow example
2026-04-16 13:34:08 -05:00
Copilot
a00e679918 Add workflow engine with catalog system (#2158)
* Initial plan

* Add workflow engine with step registry, expression engine, catalog system, and CLI commands

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Add comprehensive tests for workflow engine (94 tests)

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address review feedback: do-while condition preservation and URL scheme validation

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Address review feedback, add CLI dispatch, interactive gates, and docs

Review comments (7/7):
- Add explanatory comment to empty except block
- Implement workflow catalog download with cleanup on failure
- Add input type coercion for number/boolean/enum
- Fix example workflow to remove non-existent output references
- Fix while_loop and if_then condition defaults (string 'false' → bool False)
- Fix resume step index tracking with step_offset parameter

CLI dispatch:
- Add build_exec_args() and dispatch_command() to IntegrationBase
- Override for Claude (skills: /speckit-specify), Gemini (-m flag),
  Codex (codex exec), Copilot (--agent speckit.specify)
- CommandStep invokes installed commands by name via integration CLI
- Add PromptStep for arbitrary inline prompts (10th step type)
- Stream CLI output live to terminal (no silent blocking)
- Remove timeout when streaming (user can Ctrl+C)
- Ctrl+C saves state as PAUSED for clean resume

Interactive gates:
- Gate steps prompt [1] approve [2] reject in TTY
- Fall back to PAUSED in non-interactive environments
- Resume re-executes the gate for interactive prompting

Documentation:
- workflows/README.md — user guide
- workflows/ARCHITECTURE.md — internals with Mermaid diagrams
- workflows/PUBLISHING.md — catalog submission guide

Tests: 94 → 122 workflow tests, 1362 total (all passing)

* Fix ruff lint errors: unused imports, f-string placeholders, undefined name

* Address second review: registry-backed validation, shell failures, loop/fan-out execution, URL validation

- VALID_STEP_TYPES now queries STEP_REGISTRY dynamically
- Shell step returns FAILED on non-zero exit code
- Persist workflow YAML in run directory for reliable resume
- Resume loads from run copy, falls back to installed workflow
- Engine iterates while/do-while loops up to max_iterations
- Engine expands fan-out per item with context.item
- HTTPS URL validation for catalog workflow installs (HTTP allowed for localhost)
- Fix catalog merge priority docstring (lower number wins)
- Fix dispatch_command docstring (no build_exec_args_for_command)
- Gate on_reject=retry pauses for re-prompt on resume
- Update docs to 10 step types, add prompt step to tables and README

* Potential fix for pull request finding 'Empty except'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* Address third review: fan-out IDs, catalog guards, shell coercion, docs

- Fan-out generates unique per-item step IDs and collects results
- Catalog merge skips non-dict workflow entries (malformed data guard)
- Shell step coerces run_cmd to str after expression evaluation
- urlopen timeout=30 for catalog workflow installs
- yaml.dump with sort_keys=False, allow_unicode=True for catalog configs
- Document streaming timeout as intentionally unbounded (user Ctrl+C)
- Document --allow-all-tools as required for non-interactive + future enhancement
- Update test docstring and PUBLISHING.md to 10 step types with prompt

* Validate final URL after redirects in catalog fetch

urlopen follows redirects, so validate the response URL against the
same HTTPS/localhost rules to prevent redirect-based downgrade attacks.

* Address fourth review: filter arg eval, tags normalization, install redirect check

- Filter arguments now evaluated via _evaluate_simple_expression() so
  default(42) returns int not string
- Tags normalized: non-list/non-string values handled gracefully
- Install URL redirect validation (same as catalog fetch)
- Remove unused 'skipped' variable in catalog config parsing
- Author 'github' → 'GitHub' in example workflow
- Document nested step resume limitation (re-runs parent step)

* Add explanatory comment to empty except ValueError block

* Address fifth review: expression parsing, fan-out output, URL install, gate options

- Move string literal parsing before operator detection in expressions
  so quoted strings with operators (e.g. 'a in b') are not mis-parsed
- Fan-out: remove max_concurrency from persisted output, fix docstring
  to reflect sequential execution
- workflow add: support URL sources with HTTPS/redirect validation,
  validate workflow ID is non-empty before writing files
- Deduplicate local install logic via _validate_and_install_local()
- Remove 'edit' gate option from speckit workflow (not implemented)

* Add comments to empty except ValueError blocks in URL install

* Address sixth review: operator precedence, fan_in cleanup, registry resilience, docs

- Fix or/and operator precedence (or parsed first = lower precedence)
- Restore context.fan_in after fan-in step completes
- Catch JSONDecodeError in registry load for corrupted files
- Replace print() with on_step_start callback (library-safe)
- Gate validation warns when on_reject set but no reject option
- Shell step: document shell=True security tradeoff
- README: sdd-pipeline → speckit, parallel → sequential for fan-out
- ARCHITECTURE.md: parallel → fan-out/fan-in in diagram

* Address seventh review: string literal before pipe, type annotations, validate on install

- Move string literal check above pipe filter parsing so 'a | b' works
- Fix type annotations: input_values list[str] | None, run_id str | None
- Run validate_workflow() before installing from local path/URL
- Remove duplicate string literal check from expression parser

* Address eighth review: fan-out namespaced IDs, early return, catalog validation

- Fan-out per-item step IDs use _fanout_{step_id}_{base}_{idx} namespace
  to avoid collisions with user-defined step IDs
- Early return after fan-out loop when state is paused/failed/aborted
- Catalog installs parse + validate downloaded YAML before registering,
  using definition metadata instead of catalog entry for registry

* Address ninth review: populate catalog, fix indentation, priority, README

- Add speckit workflow entry to catalog.json so it's discoverable
- Fix shell step output dict indentation
- Catalog add_catalog priority derived from max existing + 1
- README Quick Start clarified with install + local file examples

* Address tenth review: max_iterations validation, catalog config guard, version alignment

- Validate max_iterations is int >= 1 in while and do-while steps
- Guard add_catalog against corrupted config (non-dict/non-list)
- Align speckit_version requirement to >=0.6.1 (current package version)
- Fan-out template validation uses separate seen_ids set to avoid
  false duplication errors with user-defined step IDs

* Address eleventh review: command step fails without CLI, ID mismatch warning, state persistence

- Command step returns FAILED when CLI not installed (was silent COMPLETED)
- Catalog install warns on workflow ID vs catalog key mismatch
- Engine persists state.save() before returning on unknown step type
- Update tests to expect FAILED for command steps without CLI
- Integration tests use shell steps for CLI-independent execution

* Address twelfth review: type annotations, version examples, streaming docs, requires

- Fix workflow_search type annotations (str | None)
- PUBLISHING.md: speckit_version >=0.15.0 → >=0.6.1
- Document that exit_code is captured and referenceable by later steps
- Mark requires as declared-but-not-enforced (planned enhancement)
- Note full stdout/stderr capture as planned enhancement

* Enforce catalog key matches workflow ID (fail instead of warn)

* Bundle speckit workflow: auto-install during specify init

- Add workflows/speckit to pyproject.toml force-include for wheel builds
- Add _locate_bundled_workflow() helper (mirrors _locate_bundled_extension)
- Auto-install speckit workflow during specify init (after git extension)
- Update all integration file inventory tests to expect workflow files

* Address fourteenth review: prompt fails without CLI, resolved step data, fan-out normalization

- PromptStep returns FAILED when CLI not installed (was silent COMPLETED)
- Engine step_data prefers resolved values from step output
- Fan-out normalizes output.results=[] for empty item lists
- subprocess.run inherits stdout/stderr (no explicit sys.stdout)
- Registry tests use issubset for extensibility

* Address fifteenth review: fan_in docstring, gate defaults, validation guards, reserved prefix

- FanInStep docstring: aggregate-only, no blocking semantics
- FanInStep: guard output_config as dict, handle None
- Gate validate: use same default options as execute
- Validate inputs is dict and steps is list before iterating
- Reserve _fanout_ prefix in step ID validation
- PUBLISHING.md: remove unenforced checklist items, add _fanout_ note

* Address sixteenth review: docs regex, fan_in try/finally, hyphenated dot-path keys

- PUBLISHING.md: update ID regex docs to match implementation (single-char OK)
- FanInStep: wrap expression evaluation in try/finally for context.fan_in
- Expression dot-path: allow hyphens in keys before list index (e.g. run-tests[0])

* Make speckit workflow integration-agnostic, document Copilot CLI requirement

- Workflow integration selectable via input (default: claude)
- Each command step uses {{ inputs.integration }} instead of hardcoded copilot
- Copilot docstring documents CLI requirement for workflow dispatch
- Added install_url for Copilot CLI docs

* Address seventeenth review: project checks, catalog robustness

- Add .specify/ project check to workflow run/resume/status/search/info
- remove_catalog validates config shape (dict + list) before indexing
- _fetch_single_catalog validates response is a dict
- _get_merged_workflows raises when all catalogs fail to fetch
- add_catalog guards against non-dict catalog entries in config

* Address eighteenth review: condition coercion, gate abort result, while default, cache guard, resume log

- evaluate_condition treats plain 'false'/'true' strings as booleans
- Gate abort returns StepResult(FAILED) instead of raising exception
  so step output is persisted in state for inspection
- while_loop max_iterations optional (default 10), validation aligned
- Catalog cache fallback catches invalid JSON gracefully
- resume() appends workflow_finished log entry like execute()

* Address nineteenth review: allow-all-tools opt-in, empty catalogs, abort dead code, while docstring

- --allow-all-tools controlled by SPECKIT_ALLOW_ALL_TOOLS env var (default: 1)
  Set to 0 to disable automatic tool approval for Copilot CLI
- Empty catalogs list falls back to built-in defaults (not an error)
- Remove unreachable WorkflowAbortError catches from execute/resume
  (gate abort now returns StepResult(FAILED) instead of raising)
- while_loop docstring updated: max_iterations is optional (default 10)

* Address twentieth review: gate abort maps to ABORTED status, do-while max_iterations optional

- Engine detects output.aborted from gate step and sets RunStatus.ABORTED
  (was unreachable — gate abort returned FAILED but status was always FAILED)
- do-while max_iterations now optional (default 10), aligned with while_loop
- do-while docstring and validation updated accordingly

* Coerce default_options to dict, align bundled workflow ID regex with validator

* Gate validates string options, prompt uses resolved integration, loop normalizes max_iterations

* Use parentId:childId convention for nested step IDs

- Fan-out per-item IDs use parentId:templateId:index (e.g. parallel:impl:0)
- Reserve ':' in user step IDs (validation rejects)
- Replaces _fanout_ prefix with cleaner namespacing
- Expressions like {{ steps.parallel:impl:0.output.file }} work naturally

* Validate workflow version is semantic versioning (X.Y.Z)

* Schema version validation, strict semver, load_workflow docstring, preserve max_concurrency

- Validate schema_version is '1.0' (reject unknown future schemas)
- Strict semver regex: ^\d+\.\d+\.\d+$ (rejects 1.0.0beta etc.)
- load_workflow docstring: 'parsed' not 'validated'
- Keep max_concurrency in fan-out output (was dropped)
- do_while docstring: engine re-evaluates step_config condition
- ARCHITECTURE.md: document nested resume limitation

* Path traversal prevention, loop step ID namespacing

- RunState validates run_id is alphanumeric+hyphens (no path separators)
- workflow_add validates catalog source doesn't escape workflows_dir
- Loop iterations namespace nested step IDs as parentId:childId:iteration
  so multiple iterations don't overwrite each other in context/state

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
2026-04-14 10:11:56 -05:00