github-spec-kit

mirror of https://github.com/github/spec-kit.git synced 2026-07-03 20:36:23 +08:00

Author	SHA1	Message	Date
lselvar	f512b8b0d1	fix: resolve GitHub release asset API URL for private repo preset and workflow downloads (#2855 ) * fix: resolve GitHub release asset API URL for private repo preset and workflow downloads - Add shared `resolve_github_release_asset_api_url` utility to `_github_http.py` for reuse across preset and workflow download paths - Apply the same private-repo fix from PR #2792 (extensions) to: - `PresetCatalog.download_pack` — ZIP downloads via catalog `download_url` - `preset add --from <url>` — ZIP downloads from a direct URL - `workflow add <url>` — workflow YAML downloads from a direct URL - `workflow add <id>` (catalog) — workflow YAML downloads via catalog `url` - For browser release URLs (`github.com/…/releases/download/…`), the asset is resolved via the GitHub REST API and downloaded with `Accept: application/octet-stream` - Direct REST API asset URLs (`api.github.com/…/releases/assets/<id>`) are downloaded directly with `Accept: application/octet-stream` - Auth is preserved end-to-end through the existing `open_url` infrastructure - Update `test_download_pack_sends_auth_header` and add `test_download_pack_accepts_direct_github_rest_asset_url` to cover both paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: URL-encode tag in release API URL to handle special characters Encode the tag as a path segment (using quote with safe='') when building the releases/tags/<tag> API URL. This prevents malformed URLs when tags contain reserved characters like '/' or '#'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: add CLI-level tests for preset add --from GitHub release URL resolution Adds regression tests covering: - resolve_github_release_asset_api_url unit tests (passthrough, resolution, network error, URL encoding of special chars in tags) - CLI-level 'preset add --from <github-release-url>' end-to-end flow - CLI-level 'preset add --from <api-asset-url>' direct passthrough Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: deduplicate release URL resolution; fix test issues - ExtensionCatalog._resolve_github_release_asset_api_url now delegates to the shared helper in _github_http.py (also gains URL-encoding fix) - Remove unused 'io' import from test_github_http.py - Remove duplicate 'provides' dict keys accidentally added to test_presets.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: align resolver timeout with download timeout; add workflow CLI tests - Pass timeout=30 to resolve_github_release_asset_api_url in both workflow add paths so worst-case latency matches the download timeout - Add CLI-level regression tests for 'workflow add <url>' covering browser URL resolution and direct API asset URL passthrough Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: remove unused urllib.request import; add catalog workflow test - Remove unused 'import urllib.request' in preset add --from path - Add CLI test for catalog-based 'workflow add <id>' with GitHub release URL resolution Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * style: remove unused MagicMock imports from tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-05 10:41:40 -05:00
Huy Do	005c80a9c7	fix(workflows): render gate show_file contents in the interactive prompt (#2810 ) * fix(workflows): render gate show_file contents in the interactive prompt The gate step read and recorded `show_file` but never displayed its contents at the interactive prompt, so the operator approved/rejected without seeing the referenced file. Render the file inside the prompt when stdin is a TTY, with a graceful notice for missing/unreadable files. Non-interactive PAUSED behaviour, exit codes, resume semantics, and no-`show_file` output are unchanged. Closes #2809. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): keep gate _prompt signature stable and harden show_file reads The gate prompt rendered show_file by passing it as a third positional argument to _prompt. A test that stubs _prompt with a two-argument lambda (test_gate_abort_still_halts_with_continue_on_error) then failed once the branch caught up to main, because the call site passed three arguments to the two-argument stub. Compose the show_file material into the displayed message in execute() and keep _prompt to its (message, options) contract. Display data no longer widens the interactive seam, so stubbing _prompt stays stable and future review material can be added without breaking callers. _prompt now renders a multi-line message inside the gate box. Also catch ValueError in _read_show_file so a path the OS rejects outright (e.g. an embedded NUL byte) degrades to a notice instead of crashing the prompt, matching the helper's stated contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): coerce gate prompt message to str before rendering The multi-line render loop split the message on newlines, which assumes a str. A non-string message (e.g. a YAML numeric literal) previously rendered fine through the old f-string and would now raise on .split. Coerce with str() to preserve that tolerance, and add a regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): make gate stdin handling robust; tidy compose_prompt typing Address review feedback on the gate tests and helper: - Swap the gate module's sys.stdin for a fixed-isatty stub (shared _StubStdin / _force_gate_stdin helpers) instead of setattr on sys.stdin.isatty, which is not assignable under some pytest capture modes. This also forces the non-interactive tests to a non-TTY so they cannot block on input() when run in a real terminal. - The non-interactive show_file test now hard-fails if _read_show_file is called, proving the file is not read on the PAUSED path. - _compose_prompt accepts a non-string message (e.g. a YAML numeric literal) and always returns str via str(message), keeping its annotation and docstring accurate; the redundant coercion in _prompt is removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): strip control chars from gate show_file; default tests non-TTY Address review feedback: - _read_show_file strips C0 control characters (except tab) from each line, so a show_file containing ANSI escape sequences (e.g. \x1b[2J) cannot clear the screen or spoof the prompt/options when rendered to a terminal. - Add an autouse fixture on TestGateStep that defaults every gate test to a non-TTY stdin, so no test can drop into the interactive prompt and block on input() when the suite runs under a real TTY. Interactive tests opt back in via _force_gate_stdin(tty=True); the now-redundant explicit non-TTY calls were removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): localize gate stdin patch to the gate module's sys _force_gate_stdin rebinds the gate module's `sys` name to a stand-in whose stdin has a fixed isatty() and which delegates every other attribute to the real sys, instead of mutating the process-wide sys.stdin. This keeps the patch local to the gate module and leaves real stdin untouched. The gate abort test, which used the same process-wide swap, now shares the helper, so the pattern exists in exactly one place. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): sanitize the displayed gate show_file path, not just content Control characters were stripped from show_file contents but the path was still printed verbatim as the header (`f"{show_file}:"`) and echoed in the read-error notice, so a show_file path containing ANSI escapes could still inject terminal sequences. Centralize stripping in `_sanitize_for_display` and apply it to every show_file-derived string that reaches the terminal — the displayed path, each file line, and the error notice — while still opening the file with the original path. Add a test for path sanitization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(workflows): inline control-char stripping, drop the helper Reuse the existing _CONTROL_CHARS regex directly at the three display sites instead of wrapping it in a one-line helper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): also strip LF and C1 controls from gate show_file display The control-char class skipped LF (so an embedded newline in a show_file path could break the boxed layout) and the C1 range (so \x9b CSI and other 8-bit controls survived). Widen the class to [\x00-\x08\x0a-\x1f\x7f-\x9f] (still keeping tab). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 08:04:52 -05:00
Samir Abed	34ce66139e	feat: add support for rovodev (#2539 ) * feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev	2026-06-04 11:34:05 -05:00
Huy Do	141119efea	feat(workflows): add JSON output for workflow run resume and status (#2814 ) * feat(workflows): add --json output to workflow run, resume, and status Adds an opt-in `--json` flag to `workflow run`, `workflow resume`, and `workflow status` that emits a single machine-readable object (run_id, workflow_id, status, current step; status also reports per-step states and a runs list) for automation and external orchestrators. JSON is written via a small `_emit_workflow_json` helper using plain stdout, so Rich markup, highlighting, and line-wrapping can never alter the emitted object. Default human-readable output and exit codes are unchanged when `--json` is omitted. Reference docs updated. Closes #2811. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): keep --json stdout clean while steps write output Suppressing the banner and the step-start callback was not enough to guarantee a single parseable JSON object on stdout: individual steps still write there while the engine runs. The gate step prints its prompt, and the prompt step runs a CLI subprocess that inherits the process's stdout file descriptor — either can corrupt the JSON stream for interactive runs or integration-backed workflows. Wrap engine.execute()/engine.resume() in a file-descriptor-level redirect (dup2) when --json is set, so both Python-level writes and inherited-fd subprocess output go to stderr while stdout carries only the emitted JSON. Step progress stays visible on stderr. status does not run the engine, so it is unaffected. Tests cover both pollution channels (a Python print and a real subprocess) via fd-level capture, and the inactive no-op path. Docs note the stdout/stderr split. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(workflows): fix stray escape sequence in --json redirect comments The redirect helper's docstring and its test comment wrote ``print``\s, which renders as "print\s" rather than "prints". Replace with plain "prints". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-04 11:11:39 -05:00
One-TheOnly	8e5643d4ff	fix(cursor-agent): enable headless CLI dispatch end-to-end (-p --trust --approve-mcps --force + Windows .cmd shim resolution) (#2631 ) * fix(cursor-agent): enable CLI dispatch via ``-p --trust`` headless mode Restores the ability for ``specify workflow run`` to dispatch the cursor-agent CLI, complementing the existing in-IDE skill flow. Without this fix, ``specify workflow run speckit --input integration=cursor-agent ...`` fails with a misleading ``CLI not found or not installed`` error even when the CLI is installed (since cursor-agent had ``requires_cli=False`` and an unset ``build_exec_args``). The cursor-agent CLI (>= 2026.05.16) supports headless execution via ``-p`` (print mode with full tool access including write/shell) and ``--trust`` (bypass Workspace Trust prompt). Without ``--trust`` the CLI exits non-zero in non-TTY contexts (verified locally). Changes to ``src/specify_cli/integrations/cursor_agent/__init__.py``: * ``config.requires_cli``: ``False`` -> ``True`` * ``config.install_url``: ``None`` -> Cursor CLI docs URL * Override ``build_exec_args()`` to emit ``[cursor-agent, -p, --trust, <prompt>, ...]`` with optional ``--model`` and ``--output-format json`` flags, mirroring the shape used by ``claude``/``codex``/``gemini``. Tests: * 34 existing cursor-agent tests still pass. * 6 new tests in ``TestCursorAgentCliDispatch`` pin ``requires_cli``, ``install_url``, and the exact argv shape (default, text-output, with-model, and the hyphenated skill invocation form ``/speckit-<name>``). * Full repo: 1085 / 1085 passed, no regressions. Fixes #2629 Co-authored-by: Cursor <cursoragent@cursor.com> * fix(integrations): resolve ``.cmd``/``.bat`` shims before subprocess.run On Windows, ``shutil.which`` honors ``PATHEXT`` and locates wrappers like ``cursor-agent.cmd`` and ``codex.cmd``, but Python's ``subprocess.run`` calls ``CreateProcess`` which does not consult ``PATHEXT`` and therefore fails with ``WinError 2`` on a bare argv like ``[cursor-agent, ...]``. Resolve ``exec_args[0]`` via ``shutil.which`` in ``IntegrationBase.dispatch_command`` so ``.cmd``/``.bat`` shims work transparently. On POSIX this is a no-op for absolute paths and a harmless lookup otherwise. Verified locally on Windows 10 + cursor-agent 2026.05.16: without this fix, ``specify workflow run speckit --input integration=cursor-agent`` fails with ``FileNotFoundError`` even after the cursor-agent integration starts producing valid exec args (per the prior commit on this branch). Tests: * New: 2 cursor-agent tests pin the shim-resolution + passthrough behavior (``test_dispatch_command_resolves_cmd_shim_for_subprocess`` and ``test_dispatch_command_passthrough_when_shutil_which_finds_nothing``). * Updated: ``tests/test_workflows.py::TestCommandStep::test_dispatch_with_mock_cli`` was mocking ``shutil.which`` only at the ``command`` step level and not at the ``base`` level, which made it environment-sensitive (fails locally when the real ``claude`` CLI is on PATH). Added the matching base-level patch and updated the argv-assertion to reflect the resolved path. ``test_dispatch_failure_returns_failed_status`` gets the same patch for consistency. * Full repo: 2867 passed, 0 regression from this PR. The 12 remaining pre-existing failures are unrelated Windows ``symlink`` privilege failures (``WinError 1314``) on a non-admin Windows runner. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cursor-agent): inject --approve-mcps --force for headless MCP/tool access The previous commit (`1c55988`) wired up ``-p --trust`` so the CLI launches in headless mode without the Workspace Trust prompt, but that alone is not enough to let ``specify workflow run`` drive a real speckit feature end-to-end with cursor-agent on Windows. Two more flags are required: * ``--approve-mcps``: without it, every MCP server configured in ``.cursor/mcp.json`` stays ``not loaded (needs approval)``, and any tool call against them is silently dropped. We hit this immediately trying to read a DingTalk PRD from a remote MCP server during the ``/speckit-specify`` step. * ``--force``: without it, the agent halts on the first tool-call approval prompt (the tool call gets rejected and the workflow exits non-zero with a misleading message). With ``--force`` cursor-agent matches the implicit "trusted environment" semantics that ``claude -p`` and ``codex --exec`` already have by default -- which is the right semantics for an unattended ``specify workflow run`` invocation. Verified end-to-end on Windows 10 + cursor-agent 2026.05.16-0338208: * ``cursor-agent -p --trust --approve-mcps --force --output-format text`` + a ``/speckit-specify`` prompt that included a DingTalk URL produced a full spec.md (31.5 KB) plus checklists/requirements.md in ~10.7 min, reading the source PRD through the ``dingtalk-doc`` remote MCP server, deciding the ``specs/`` subpath itself, and updating ``.specify/feature.json`` and ``specs/menu-dictionary.md`` along the way -- no human-in-the-loop, no source PRD ever touched the filesystem. * Without ``--approve-mcps`` the same prompt errors with the tool call rejected message; without ``--force`` the agent stops at the first non-MCP tool call. Tests: * ``test_build_exec_args_`` updated to pin the new four-flag prefix. New ``test_build_exec_args_contains_mandatory_headless_flags`` asserts the four flags are always present together. * ``test_dispatch_command_resolves_cmd_shim_for_subprocess`` updated to match the new argv layout. * All 43 cursor-agent tests pass; no other tests touched. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(cursor-agent): express dispatch support via build_exec_args() instead of requires_cli Co-authored-by: Cursor <cursoragent@cursor.com> * test(cursor-agent): use urlparse hostname check and cover dispatch without requires_cli Co-authored-by: Cursor <cursoragent@cursor.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: 刘一 <liuyi@oureman.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-04 09:48:33 -05:00
Quratulain-bilal	bb2b49d0ae	fix(workflows): validate run_id in RunState.load before touching the … (#2813 ) * fix(workflows): validate run_id in RunState.load before touching the filesystem ``RunState.load(run_id, project_root)`` interpolates ``run_id`` directly into ``project_root / ".specify" / "workflows" / "runs" / run_id`` and then calls ``state_path.exists()`` and ``json.load`` on the result. The run_id is reachable from user input via ``specify workflow resume <run_id>`` (CLI argument) and via ``SPECKIT_WORKFLOW_RUN_ID`` (env var override on the engine's run path), so a value like ``../escape`` turns ``runs_dir`` into ``.specify/workflows/escape/`` and: * ``state_path.exists()`` becomes a file-existence oracle for any path the process can read. * if a ``state.json`` exists at the traversed location (planted by a malicious dependency, a misconfigured shared workspace, or an older spec-kit version that happened to write there), ``json.load`` parses it and the workflow resumes under the attacker-chosen ``workflow_id`` / step state. * a subsequent ``state.save()`` then writes back to the traversed location, persisting the corruption. ``RunState.__init__`` already validates ``run_id`` against ``r'^[a-zA-Z0-9][a-zA-Z0-9_-]$'`` — but that check runs on ``state_data["run_id"]`` after* ``load`` has already done the file lookup, which is too late to prevent the disclosure. This change extracts the pattern into a class-level constant ``_RUN_ID_PATTERN`` and a single ``_validate_run_id`` classmethod so ``__init__`` and ``load`` cannot drift, then calls the validator at the top of ``load`` before any path is built. Mirrors the precedent in ``src/specify_cli/agents.py::_ensure_within_directory`` (used at line 437 of that file) which guards extension-install paths against the same threat model. Regression tests parametrize 9 traversal vectors (``../escape``, ``..``, ``../../etc/passwd``, ``foo/bar``, ``foo\bar``, ``.hidden``, ``-flag``, ``foo\x00bar``, empty) and plant a malicious ``state.json`` outside ``runs/`` so a missing guard would surface as a successful load rather than the ambiguous ``FileNotFoundError``. A second test asserts ``__init__`` and ``load`` reject the same representative malformed ID, so future changes to one path can't silently drift from the other. * test(workflows): exercise RunState.load in shared-validation test, fix __init__ empty-string asymmetry Copilot's review on this PR pointed out that test_init_and_load_share_validation claimed to verify both entry points share the same validation rules but never actually called RunState.load — only __init__ and the shared _validate_run_id helper. A regression in load (e.g. someone deleting the cls._validate_run_id(run_id) call before the path is built) would slip through even though __init__ and the helper stayed aligned, defeating the whole point of the test. Tightening the test surfaced a real asymmetry the previous version was silently masking: self.run_id = run_id or str(uuid.uuid4())[:8] The truthiness fallback meant RunState(run_id="") silently substituted a UUID and skipped validation, while RunState.load("", project_root) correctly rejected the empty string. The two entry points diverged on the empty-string vector. That is exactly the drift the test name claimed to defend against — and the original test missed it. Changes ------- * engine.py: __init__ now distinguishes run_id is None (caller omitted it → auto-generate UUID) from an empty string (caller provided it → must validate like any other value). Both paths still flow through _validate_run_id, but only the explicit-None case auto-generates. * test_workflows.py: test_init_and_load_share_validation is now parametrized over one representative vector per category from test_load_rejects_path_traversal (parent traversal, embedded separator, leading non-alphanumeric, empty string) and asserts that all three entry points — __init__, _validate_run_id, and load — reject the same input. Adding load to the assertion is the substantive fix Copilot asked for; keeping __init__ and the helper alongside it makes any future drift between the three immediately observable instead of having to read three separate tests. Verification ------------ pytest tests/test_workflows.py — 168 passed (was 165 before the parametrize expansion; __init__ empty-string vector would have failed the new test against the old engine code, confirming the asymmetry was real).	2026-06-03 14:26:07 -05:00
Huy Do	1732b9b62e	feat(workflows): allow resume to accept updated workflow inputs (#2815 ) `workflow resume` now accepts `--input key=value` (the same flag and parsing as `workflow run`, via a shared `_parse_input_values` helper). Supplied values are merged over the run's persisted inputs and re-resolved through the existing typed-validation path (`_resolve_inputs`), so a resumed/re-run step sees the updated inputs and ill-typed values fail fast. Keys not supplied keep their persisted values; resuming without `--input` is unchanged. Reference docs updated. Distinct from #2405 (file-reference inputs at run time): this is about supplying inputs at resume time, reusing the existing input model. Closes #2812. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 12:04:07 -05:00
Huy Do	7bab0568c5	feat(workflows): add continue_on_error step field for non-halting failures (#2663 ) * feat(workflows): add continue_on_error step field Adds an optional `continue_on_error: bool` field on every step. When set to `true` and the step fails, the engine records the result (`exit_code`, `stderr` on `steps.<id>.output` plus `status` as a sibling key on `steps.<id>`) and continues to the next sibling step instead of halting the run. Downstream `if`, `switch`, or `gate` steps can then branch on `{{ steps.<id>.output.exit_code }}` to route the recovery path. Engine details -------------- `WorkflowEngine._execute_steps` now consults the step config when a step returns `StepStatus.FAILED`: - Gate aborts (`output.aborted`) always halt the run — operator decisions take precedence over the flag. - Otherwise, if `continue_on_error` is the literal `True`, log a `step_continue_on_error` event and proceed to the next sibling. The runtime check uses identity comparison (`is True`) rather than truthiness, so truthy non-bool values like the string `"true"` cannot silently change run semantics even if a caller bypasses `validate_workflow()`. - Otherwise, behave as before: log `step_failed`, set `RunStatus.FAILED`, and return. Validation ---------- `_validate_steps` rejects non-bool values for `continue_on_error`. Coerced strings like `"true"` are not accepted so authoring mistakes surface at validation time rather than silently changing run semantics. Tests ----- `TestContinueOnError` in `tests/test_workflows.py` (8 tests): - `test_undeclared_failure_halts_run` — default halt behaviour. - `test_declared_and_fired_continues_run` — flag + fail → continue. - `test_declared_but_step_succeeded_is_noop` — flag + success → no-op. - `test_if_branch_routes_around_failure` — end-to-end recovery. - `test_gate_abort_still_halts_with_continue_on_error` — abort always halts. - `test_validation_rejects_non_bool_continue_on_error` — `"true"` rejected at validation. - `test_validation_accepts_bool_continue_on_error` — `true`/`false` pass cleanly. - `test_engine_ignores_truthy_non_bool_continue_on_error` — defense-in-depth: engine ignores string `"true"` even when validation is bypassed. Rebased onto current upstream/main (post #2664 merge); the new `TestContinueOnError` class sits immediately after upstream's `TestContextRunId` so the two feature suites coexist cleanly. Closes #2591. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): restore runtime context section, clarify gate prompt Two Copilot findings on `d0b9e00`: 1. The `### Runtime Context` documentation for `{{ context.* }}` was lost during the rebase onto current main (the squash dropped the anchor where #2664 had added it). Restored under `## Expressions` so users can find `context.run_id` semantics and examples. 2. The continue_on_error example gate had message "Retry or skip?" but used the default `options: [approve, reject]` with `on_reject: skip`, which implied an automatic retry path that gates do not provide. Reworded the message to match the actual approve/reject semantics and added an explicit note that retry requires either custom gate options + downstream branching or a wrapper loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify continue_on_error scope — returned FAILED only Copilot finding on `d0b9e00`: The README's "Error Handling" intro implied `continue_on_error` covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns `StepResult(status=FAILED, ...)`. Exceptions raised out of `step_impl.execute()` propagate to `WorkflowEngine.execute()`, where the catch-all logs `workflow_failed` and re-raises — the step result is never recorded, and the flag is never consulted. Audited the whole PR diff for the same overclaim: 1. workflows/README.md — main fix. Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" and promoted the parenthetical structural-validation note into the Notes block. Added a new "Scope: returned failures only" note that names the exception path explicitly and tells step authors how to bring the flag into scope for exceptional code (catch internally and return FAILED with the failure encoded in `output`). 2. tests/test_workflows.py — section comment used "when an executable step fails", same ambiguity. Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope. 3. src/specify_cli/workflows/engine.py — already correct ("any step that returns FAILED" in the validator comment; "lets the pipeline route around the failure" in the execute path). No change. Engine semantics and test bodies are unchanged. Docs-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify on_reject:skip semantics — engine returns COMPLETED, not auto-skip Copilot finding on `b8982a7`: The README example's gate message said "reject to skip the rest of this branch", and the explanatory paragraph claimed [approve, reject] map to "continue" vs "skip the rest of this branch". The engine does not implement automatic branch-skipping. `on_reject: skip` returns `StepStatus.COMPLETED` (gate/__init__.py:65-66); the next sibling step runs unconditionally unless the author wires a downstream `if` reading `{{ steps.<gate-id>.output.choice }}`. Two fixes: 1. Restructured the YAML example so it actually demonstrates the manual-branching pattern: added a `recover` if-step after the gate that conditions on `steps.review.output.choice == 'approve'`. Now the example shows the real workflow author's responsibility instead of implying the engine does it. 2. Replaced the trailing paragraph with three precise notes: - both gate options return COMPLETED; `on_reject: skip` controls abort behaviour only, not sibling-skipping - all three `on_reject` values enumerated with their actual engine semantics (FAILED+aborted / COMPLETED / PAUSED) - the original retry-loop guidance retained as the third bullet Updated the gate message in the example to match — "reject to leave the failure recorded and move on" instead of "reject to skip the rest of this branch". Audited the whole PR diff for the same overclaim: no other instance. Engine semantics, validation, and test bodies are unchanged. Docs-only. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify gate's role — surfaces, doesn't programmatically branch Audit follow-up to `393ac6b` — three sites repeated the same minor overclaim about gates being one of the "branch on it" step types alongside `if` and `switch`: 1. workflows/README.md (the "downstream `if`, `switch`, or `gate` steps can branch on it" sentence introducing the example) 2. engine.py:236 (validator inline comment) 3. engine.py:657 (execute-path inline comment) A `gate` step does not have a `condition` or `expression` field — it only evaluates expressions for `message` and `show_file` (gate/__init__.py:29,36). Programmatic branching happens in `if`/`switch`; a gate surfaces the value to a human operator via message interpolation, and the operator's choice is recorded in `output.choice` for a subsequent `if`/`switch` to route on. Reworded all three sites consistently: "a downstream `if` or `switch` can branch on it (or a `gate` can surface it to the operator via message interpolation)". The README example already demonstrates this distinction — the gate carries `{{ }}` template variables in its message and the `recover` if-step downstream is what actually branches on the choice. Engine semantics, validation, and test bodies are unchanged. Docs-only on the README; comment-only on engine.py. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): use qualified StepStatus.* instead of bare FAILED/COMPLETED/PAUSED Three Copilot inline comments on workflows/README.md lines 226, 282, 288 flagged that ``StepResult(status=FAILED, ...)`` is not valid Python — ``StepResult.status`` is a ``StepStatus`` enum value, so the documented form should be ``StepStatus.FAILED``. Audited the whole PR diff for the same shorthand. The bare unqualified form appears in three files added/modified by this PR: 1. workflows/README.md (6 sites) — three ``StepResult(status=FAILED, ...)`` parentheticals, plus the on_reject Notes bullet listing the three step statuses (``FAILED``, ``COMPLETED``, ``PAUSED``). 2. tests/test_workflows.py (4 sites) — section header for TestContinueOnError, two test-method docstrings, one inline comment about a gate's TTY-fallback behaviour. 3. src/specify_cli/workflows/engine.py (1 site) — the validator inline comment added in `d0b9e00` said "returns FAILED" where the engine code itself uses ``StepStatus.FAILED``. All 11 sites normalised to the qualified ``StepStatus.<name>`` form so the docs / test docstrings / inline comments match what readers will actually find in the engine code and the tests. Engine semantics, validation, and test bodies are unchanged. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-02 10:10:07 -05:00
Pedro Barbosa	44aac9f6e4	feat: add native Cline integration (#2508 ) * test: strip ansi to make asserts work * feat: add native Cline integration	2026-06-01 11:20:48 -05:00
Copilot	cc3d828227	Add support for SPECKIT_WORKFLOW_RUN_ID override (#2742 ) * Initial plan * feat: support SPECKIT_WORKFLOW_RUN_ID override * docs: clarify run_id env var precedence wording --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-05-29 10:50:00 -05:00
Huy Do	c6afe4cde1	feat(workflows): expose {{ context.run_id }} template variable (#2664 ) * feat(workflows): expose `{{ context.run_id }}` template variable Closes #2590. Surfaces the engine-assigned run id (the same 8-character hex string Spec Kit prints as `Run ID:` at the end of `workflow run`) as a workflow template variable so YAML authors can reference it from shell `run:`, command `input.args:`, switch `expression:`, and any other field that already evaluates `{{ ... }}` templates. ### Why The run id is the natural join key between a Spec Kit workflow run and downstream artifacts, telemetry, or per-run scratch state. Today the operator sees it in stdout but workflows themselves cannot reference it — there was no way to stamp a log line, name a scratch directory, or tag an artifact with the same id Spec Kit assigned. The three motivating use cases from the issue: 1. Telemetry / observability — stamp logs and events with the run id so external systems can join workflow runs to downstream artifacts. 2. Per-run scratch / isolation — interactive operator commands that need their own state directory under `/tmp/run-<id>/`. 3. Run-id in artifact metadata — stable join key from artifact back to the producing run. ### Implementation `StepContext.run_id` is already populated by `WorkflowEngine` in both `execute()` and `resume()`. The only gap was the template namespace builder. `_build_namespace` (in `workflows/expressions.py`) now adds a `context` key alongside the existing `inputs`, `steps`, `item`, and `fan_in` namespaces: ```python ns["context"] = {"run_id": run_id} ``` The value is always present (even outside a run) and falls back to an empty string when no run is active. Workflows referencing `{{ context.run_id }}` therefore never error — a hard requirement from the issue's acceptance criteria for dry-run, validation, and ad-hoc evaluator usage. ### Default behaviour preserved Workflows that do not reference `{{ context.run_id }}` are byte-equivalent to before this change. The `context` namespace is added unconditionally to keep template resolution branch-free, but its presence has no observable effect when nothing references it. ### Tests `TestExpressions` (unit-level) gains three tests: - `test_context_run_id_resolves` — direct lookup against a `StepContext(run_id=...)`. - `test_context_run_id_defaults_to_empty_when_unset` — graceful default outside a run context. - `test_context_run_id_string_interpolation` — mixed template (e.g. `"RUN_ID={{ context.run_id }}"`). `TestContextRunId` (end-to-end) covers the three step types the acceptance criteria called out: - `test_shell_run_resolves_run_id` — `run:` field substitution, verified via captured stdout. - `test_command_input_args_resolves_run_id` — `input.args:` resolution, captured in step output even when CLI dispatch is unavailable (the artifact-metadata use case). - `test_switch_expression_matches_on_run_id` — switch matches against the resolved value, proving the run id is a first-class value in the expression engine, not just an interpolation token. - `test_workflow_without_context_reference_unchanged` — locks the byte-equivalent default required by the issue. ### Docs `workflows/README.md` gains a "Runtime Context" subsection under "Expressions" documenting the new namespace and the three canonical use patterns (telemetry, per-run scratch, artifact metadata). * test(workflows): drop inline double-quotes in run_id shell tests `test_shell_run_resolves_run_id` and `test_switch_expression_matches_on_run_id` used `run: 'echo "RUN_ID={{ context.run_id }}"'` with inner double-quotes around the echo argument. Bash/sh strips those quotes before invoking echo, but cmd.exe (used on Windows when `shell=True`) treats them as literal characters and emits `"RUN_ID=abc12345"` — failing the exact-match assertion. Linux passed; all three Windows-latest matrix entries failed with `assert '"RUN_ID=abc12345"' == 'RUN_ID=abc12345'`. Resolve by dropping the inner double-quotes (the value has no spaces or shell metacharacters) and wrapping the YAML scalar in plain double-quotes the same way other shell-step tests in this file do (e.g. `run: "echo b-saw-..."`). Behaviour-equivalent on POSIX, portable to cmd.exe.	2026-05-27 13:00:58 -05:00
LahkLeKey	0ae451f697	Add util for windows sub-process (#2598 ) * Add util for windows sub-process * Use platform-aware Copilot executable in subprocess calls * Update test_workflows.py	2026-05-26 08:25:15 -05:00
Manfred Riem	616eba6a57	fix: while/do-while loop condition reads stale iteration-0 step output (#2662 ) * fix: while/do-while loop condition reads stale iteration-0 step output After executing namespaced loop body steps, copy each iteration's results back to the original unprefixed step key so that evaluate_condition() sees the latest values instead of stale iteration-0 data. Fixes #2592 * address review: cross-platform tests, preserve iteration-0 history - Rewrite shell scripts in tests to use Python via script files instead of POSIX syntax, so they pass on Windows CI. - Snapshot iteration-0 nested-step results under a namespaced key (parent:child:0) before the first copy-back overwrite, preserving complete per-iteration history for debugging. * address review: skip copy-back on paused/failed iterations Move the status check before the copy-back so that partial results from paused or failed nested steps (e.g., a gate awaiting input) do not overwrite the unprefixed key. This preserves correct resume behavior. * address review: quote paths in test shell commands Quote both the Python executable and script file paths in the run: commands to handle spaces in paths on Windows. * address review: execute loop body with original IDs Instead of namespacing step IDs for execution and copying results back, execute the loop body with original (unprefixed) step IDs so results naturally land at the right keys. Snapshot previous iteration results to namespaced keys (parent:child:N) for history only. This fixes multi-step loop bodies where step B references step A's output within the same iteration — previously step B would see stale data until the copy-back ran after the entire iteration. * address review: namespaced execution with per-step copy-back Revert to namespaced step IDs for execution (preserving unique log entries and state keys per iteration) but copy each step's result back to the unprefixed key immediately after it completes. This preserves backward compatibility (same namespaced key format, same log IDs) while fixing both the condition evaluation bug and inter-step references within multi-step loop bodies. * address review: alias after status check, add multi-step body test - Move per-step aliasing below the PAUSED/FAILED/ABORTED status check so partial results from incomplete steps are not aliased back to the unprefixed key. - Add test_while_loop_multi_step_body_inter_step_refs to exercise a multi-step loop body where step B reads step A's output within the same iteration, verifying per-step aliasing works correctly. Addresses feedback from @doquanghuy (items 2 & 4) and Copilot review on commit `9d0a222`. * address review: stable fallback IDs, expression-based inter-step test - Use enumerate() for stable fallback IDs when loop body steps lack an explicit id (step-0, step-1, etc. instead of always step-0). - Rewrite multi-step body test so step B uses expression substitution ({{ steps.step-a.output.stdout }}) instead of reading the counter file directly, making it a true regression test for per-step aliasing.	2026-05-21 12:25:03 -05:00
darion-yaphet	3d50f85875	fix(cli): clarify exception diagnostics (#2602 ) Consolidate the CLI diagnostic plan, implementation, and test hardening into one reviewable change. The CLI now reports phase and target context for broad failure paths while preserving existing fail-fast behavior for real setup failures and warning-only behavior for optional best-effort work. The workflow unit tests also avoid discovering real local agent CLIs, so developer machines with tools such as gemini installed do not hang pytest during metadata-only assertions. Constraint: CLI setup failures must remain fail-fast, while optional preset and cleanup paths should continue with clear warnings. Rejected: Replace broad handlers across the whole codebase in one pass \| too broad for a targeted CLI diagnostic fix Rejected: Add runtime timeouts to workflow agent dispatch \| dispatch may legitimately be long-running and the observed hang was test isolation Confidence: high Scope-risk: moderate Directive: Keep future best-effort CLI warnings tied to the failed phase and target so users can diagnose setup state. Tested: uvx ruff check src/; uv run pytest tests/integrations/test_cli.py -v; uv run pytest tests/test_workflows.py::TestCommandStep::test_step_override_integration tests/test_workflows.py::TestPromptStep::test_execute_with_step_integration tests/test_workflows.py::TestPromptStep::test_execute_with_model -vv; uv run pytest Not-tested: Real Nacos/PG/Redis-style external service failure injection; real interactive workflow dispatch against installed gemini CLI	2026-05-20 21:19:48 -05:00
Quratulain-bilal	409ec59704	fix(workflow): support integration: auto to follow project's initialized AI (#2421 ) * fix(workflow): support integration: auto to follow project's initialized AI Closes #2406 (squashed) * fix(workflow): combine JSONDecodeError and UnicodeDecodeError handling Address Copilot feedback: UnicodeDecodeError can be raised by both read_text() and json.loads(), so combining the handlers ensures both cases produce a consistent, clear error message. * fix(workflows): honor integration_state schema guard and modern state in 'integration: auto' Three Copilot follow-ups on PR #2421: 1. engine.py:799 — `_load_project_integration` was bypassing the same schema guard `_read_integration_json` enforces. It now reads the schema field directly, returns None on a future schema (so the workflow falls back to the literal 'auto' default rather than guessing), and routes through `normalize_integration_state` / `default_integration_key` so modern installs that record `default_integration` / `installed_integrations` (without the legacy top-level `integration` field) resolve correctly. 2. test_workflows.py — added two regression cases: - `integration: auto` resolves a modern normalized state file - `integration: auto` falls back when the state file declares a newer `integration_state_schema` than this CLI supports 3. test_cli.py — added a CLI-level regression for the `UnicodeDecodeError` branch in `_read_integration_json` to match the existing malformed-JSON coverage. * refactor(integration): extract shared try_read_integration_json helper Address Copilot review on PR #2421: Both `_read_integration_json` (CLI) and `_load_project_integration` (workflow engine) were parsing `.specify/integration.json` independently, duplicating the schema guard and risking drift between the two readers. Extract the parse + schema validation into a single low-level helper `try_read_integration_json` in `integration_state.py` that returns either the normalized state or a structured `IntegrationReadError`. Both callers now delegate to this helper: - CLI keeps its loud-fail UX: each error kind ("decode", "os", "not_object", "schema_too_new") is translated into the existing console message + typer.Exit(1). - Engine keeps its silent fallback: any error simply returns None so `integration: auto` falls back to the workflow's literal default. This eliminates the divergence Copilot flagged without changing observable behavior for either caller. * fix(integration): distinguish missing file from non-regular path Address Copilot review on PR #2421: `try_read_integration_json` was collapsing two distinct cases into a single `(None, None)` return: 1. `.specify/integration.json` truly missing — silent fallback is correct. 2. Path exists but is a directory, socket, or other non-regular file — this is a misconfiguration the CLI should surface loudly. Split the check: `exists()` falsey returns `(None, None)`; existing-but- not-a-regular-file returns `(None, IntegrationReadError(kind="os", ...))` so the CLI's loud-fail path produces an actionable error while the engine still treats it as a fallback to the workflow's literal default. * docs(workflow): clarify version pin, advisory integrations list, enum exemption - workflow.yml: fix comment that said 0.8.3 was first release with auto resolution; the pin is >=0.8.5 so the comment now matches the pin. - workflow.yml: clarify that requires.integrations.any is an advisory, non-exhaustive compatibility hint, not a closed set. - engine.py: clarify that the auto-sentinel exemption only skips enum membership; declared type is still enforced through _coerce_input. * fix(workflow): resolve auto sentinel for provided values; report stat errors Two Copilot findings fixed: 1. _resolve_inputs only resolved the ``integration: auto`` sentinel when it came from the input default. A caller explicitly providing ``{"integration": "auto"}`` (which the workflow prompt advertises as a valid value) bypassed _resolve_default and the literal "auto" leaked to dispatch. Provided values now go through the same resolution path as defaults, and the enum-membership exemption applies in both cases. Regression test added. 2. try_read_integration_json used Path.exists() / Path.is_file() as a pre-check. Both return False on some OSErrors (e.g. permission errors during stat), which silently treated an unreadable-but-present file as missing — the engine fell back without warning and the CLI failed to surface the loud error. The pre-check is gone: read_text() is attempted directly, FileNotFoundError means missing (silent fallback), IsADirectoryError and other OSErrors become loud IntegrationReadError. * fix(workflow): enforce declared type for string inputs, reject bool-as-number Two Copilot findings fixed: 1. _coerce_input previously coerced/validated only ``number`` and ``boolean`` types, so ``type: string`` silently accepted any Python value (numbers, lists, dicts). A YAML authoring mistake like ``type: string`` + ``default: 5`` slipped through. Strings are now required to actually be strings; non-strings raise ValueError, which surfaces as an ``invalid default`` error from validate_workflow. 2. ``type: number`` accepted ``default: true`` because ``bool`` is a subclass of ``int`` (``float(True) == 1.0``). Bools are now rejected explicitly in the number path so the YAML mistake fails fast. The boolean path is also tightened to reject non-bool / non-string values for symmetry. Comment on the auto-sentinel enum exemption updated to reflect the stronger guarantee. Regression tests added for both rejections. * fix(cli): drop unused normalize_integration_state import to satisfy ruff CI's `uvx ruff check src/` flagged this as F401: the symbol was imported under a private alias but never referenced. Tests stay green after removal.	2026-05-15 16:03:33 -05:00
Manfred Riem	370b5b4890	fix(copilot): use --yolo to grant all permissions in non-interactive mode (#2298 ) * fix(copilot): use --yolo to grant all permissions in non-interactive mode The Copilot CLI's --allow-all-tools flag only covers tool execution permissions but does not grant path or URL access. When the Copilot agent autonomously runs shell commands (e.g. npm run build) during workflow execution, the CLI blocks path access and cannot prompt for approval in non-interactive mode, producing: Permission denied and could not request permission from user Replace --allow-all-tools with --yolo (equivalent to --allow-all-tools --allow-all-paths --allow-all-urls) to grant all three permission types. Rename the opt-out env var from SPECKIT_ALLOW_ALL_TOOLS to SPECKIT_COPILOT_ALLOW_ALL to match the formal --allow-all alias and scope it to the Copilot integration. Fixes #2294 * review: deprecate SPECKIT_ALLOW_ALL_TOOLS, rename to SPECKIT_COPILOT_ALLOW_ALL_TOOLS Address Copilot review feedback: - Honour the old SPECKIT_ALLOW_ALL_TOOLS env var as a fallback with a DeprecationWarning so existing opt-outs are not silently ignored. - Rename the new canonical env var to SPECKIT_COPILOT_ALLOW_ALL_TOOLS. - New var takes precedence when both are set. - Use monkeypatch in tests to avoid flakiness from ambient env vars. - Add tests for deprecation warning, precedence, and opt-out paths. * review: use UserWarning instead of DeprecationWarning DeprecationWarning is suppressed by default in Python, so users relying on the old SPECKIT_ALLOW_ALL_TOOLS env var would never see the deprecation notice during normal CLI runs. Switch to UserWarning which is always shown. Update test to also assert the warning category.	2026-04-21 11:41:52 -05:00
Ayesha Khalid	b4c4e86cbc	fix(integrations): strip UTF-8 BOM when reading agent context files (#2283 ) * fix(integrations): strip UTF-8 BOM when reading agent context files * test(integrations): add BOM regression tests for context file read/write * test(workflows): mock shutil.which in tests that assume CLI is absent * test(integrations): remove unused manifest variable in BOM test	2026-04-20 16:00:20 -05:00
Manfred Riem	02a1d610df	docs: add workflows reference, reorganize into docs/reference/, and add --version flag (#2244 ) * docs: add workflows reference, reorganize into docs/reference/, and add --version flag - Move integrations.md, extensions.md, presets.md into docs/reference/ - New docs/reference/workflows.md: command reference for all workflow commands, built-in SDD Cycle workflow with Mermaid diagram, step types, expressions, input types, state/resume, and FAQ - Rename workflow input feature_name to spec with prompt 'Describe what you want to build' to match speckit.specify command terminology - Add --version / -V flag to root specify command with tests - Update docs/toc.yml, README.md links, and docs/upgrade.md cross-reference to use reference/ paths - Add workflow command to README CLI reference table * docs: update speckit_version requirement to >=0.7.2 in workflow example	2026-04-16 13:34:08 -05:00
Copilot	a00e679918	Add workflow engine with catalog system (#2158 ) * Initial plan * Add workflow engine with step registry, expression engine, catalog system, and CLI commands Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Add comprehensive tests for workflow engine (94 tests) Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address review feedback: do-while condition preservation and URL scheme validation Agent-Logs-Url: https://github.com/github/spec-kit/sessions/72a7bb5d-071f-4d67-a507-7e1abae2384d Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address review feedback, add CLI dispatch, interactive gates, and docs Review comments (7/7): - Add explanatory comment to empty except block - Implement workflow catalog download with cleanup on failure - Add input type coercion for number/boolean/enum - Fix example workflow to remove non-existent output references - Fix while_loop and if_then condition defaults (string 'false' → bool False) - Fix resume step index tracking with step_offset parameter CLI dispatch: - Add build_exec_args() and dispatch_command() to IntegrationBase - Override for Claude (skills: /speckit-specify), Gemini (-m flag), Codex (codex exec), Copilot (--agent speckit.specify) - CommandStep invokes installed commands by name via integration CLI - Add PromptStep for arbitrary inline prompts (10th step type) - Stream CLI output live to terminal (no silent blocking) - Remove timeout when streaming (user can Ctrl+C) - Ctrl+C saves state as PAUSED for clean resume Interactive gates: - Gate steps prompt [1] approve [2] reject in TTY - Fall back to PAUSED in non-interactive environments - Resume re-executes the gate for interactive prompting Documentation: - workflows/README.md — user guide - workflows/ARCHITECTURE.md — internals with Mermaid diagrams - workflows/PUBLISHING.md — catalog submission guide Tests: 94 → 122 workflow tests, 1362 total (all passing) * Fix ruff lint errors: unused imports, f-string placeholders, undefined name * Address second review: registry-backed validation, shell failures, loop/fan-out execution, URL validation - VALID_STEP_TYPES now queries STEP_REGISTRY dynamically - Shell step returns FAILED on non-zero exit code - Persist workflow YAML in run directory for reliable resume - Resume loads from run copy, falls back to installed workflow - Engine iterates while/do-while loops up to max_iterations - Engine expands fan-out per item with context.item - HTTPS URL validation for catalog workflow installs (HTTP allowed for localhost) - Fix catalog merge priority docstring (lower number wins) - Fix dispatch_command docstring (no build_exec_args_for_command) - Gate on_reject=retry pauses for re-prompt on resume - Update docs to 10 step types, add prompt step to tables and README * Potential fix for pull request finding 'Empty except' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Address third review: fan-out IDs, catalog guards, shell coercion, docs - Fan-out generates unique per-item step IDs and collects results - Catalog merge skips non-dict workflow entries (malformed data guard) - Shell step coerces run_cmd to str after expression evaluation - urlopen timeout=30 for catalog workflow installs - yaml.dump with sort_keys=False, allow_unicode=True for catalog configs - Document streaming timeout as intentionally unbounded (user Ctrl+C) - Document --allow-all-tools as required for non-interactive + future enhancement - Update test docstring and PUBLISHING.md to 10 step types with prompt * Validate final URL after redirects in catalog fetch urlopen follows redirects, so validate the response URL against the same HTTPS/localhost rules to prevent redirect-based downgrade attacks. * Address fourth review: filter arg eval, tags normalization, install redirect check - Filter arguments now evaluated via _evaluate_simple_expression() so default(42) returns int not string - Tags normalized: non-list/non-string values handled gracefully - Install URL redirect validation (same as catalog fetch) - Remove unused 'skipped' variable in catalog config parsing - Author 'github' → 'GitHub' in example workflow - Document nested step resume limitation (re-runs parent step) * Add explanatory comment to empty except ValueError block * Address fifth review: expression parsing, fan-out output, URL install, gate options - Move string literal parsing before operator detection in expressions so quoted strings with operators (e.g. 'a in b') are not mis-parsed - Fan-out: remove max_concurrency from persisted output, fix docstring to reflect sequential execution - workflow add: support URL sources with HTTPS/redirect validation, validate workflow ID is non-empty before writing files - Deduplicate local install logic via _validate_and_install_local() - Remove 'edit' gate option from speckit workflow (not implemented) * Add comments to empty except ValueError blocks in URL install * Address sixth review: operator precedence, fan_in cleanup, registry resilience, docs - Fix or/and operator precedence (or parsed first = lower precedence) - Restore context.fan_in after fan-in step completes - Catch JSONDecodeError in registry load for corrupted files - Replace print() with on_step_start callback (library-safe) - Gate validation warns when on_reject set but no reject option - Shell step: document shell=True security tradeoff - README: sdd-pipeline → speckit, parallel → sequential for fan-out - ARCHITECTURE.md: parallel → fan-out/fan-in in diagram * Address seventh review: string literal before pipe, type annotations, validate on install - Move string literal check above pipe filter parsing so 'a \| b' works - Fix type annotations: input_values list[str] \| None, run_id str \| None - Run validate_workflow() before installing from local path/URL - Remove duplicate string literal check from expression parser * Address eighth review: fan-out namespaced IDs, early return, catalog validation - Fan-out per-item step IDs use _fanout_{step_id}_{base}_{idx} namespace to avoid collisions with user-defined step IDs - Early return after fan-out loop when state is paused/failed/aborted - Catalog installs parse + validate downloaded YAML before registering, using definition metadata instead of catalog entry for registry * Address ninth review: populate catalog, fix indentation, priority, README - Add speckit workflow entry to catalog.json so it's discoverable - Fix shell step output dict indentation - Catalog add_catalog priority derived from max existing + 1 - README Quick Start clarified with install + local file examples * Address tenth review: max_iterations validation, catalog config guard, version alignment - Validate max_iterations is int >= 1 in while and do-while steps - Guard add_catalog against corrupted config (non-dict/non-list) - Align speckit_version requirement to >=0.6.1 (current package version) - Fan-out template validation uses separate seen_ids set to avoid false duplication errors with user-defined step IDs * Address eleventh review: command step fails without CLI, ID mismatch warning, state persistence - Command step returns FAILED when CLI not installed (was silent COMPLETED) - Catalog install warns on workflow ID vs catalog key mismatch - Engine persists state.save() before returning on unknown step type - Update tests to expect FAILED for command steps without CLI - Integration tests use shell steps for CLI-independent execution * Address twelfth review: type annotations, version examples, streaming docs, requires - Fix workflow_search type annotations (str \| None) - PUBLISHING.md: speckit_version >=0.15.0 → >=0.6.1 - Document that exit_code is captured and referenceable by later steps - Mark requires as declared-but-not-enforced (planned enhancement) - Note full stdout/stderr capture as planned enhancement * Enforce catalog key matches workflow ID (fail instead of warn) * Bundle speckit workflow: auto-install during specify init - Add workflows/speckit to pyproject.toml force-include for wheel builds - Add _locate_bundled_workflow() helper (mirrors _locate_bundled_extension) - Auto-install speckit workflow during specify init (after git extension) - Update all integration file inventory tests to expect workflow files * Address fourteenth review: prompt fails without CLI, resolved step data, fan-out normalization - PromptStep returns FAILED when CLI not installed (was silent COMPLETED) - Engine step_data prefers resolved values from step output - Fan-out normalizes output.results=[] for empty item lists - subprocess.run inherits stdout/stderr (no explicit sys.stdout) - Registry tests use issubset for extensibility * Address fifteenth review: fan_in docstring, gate defaults, validation guards, reserved prefix - FanInStep docstring: aggregate-only, no blocking semantics - FanInStep: guard output_config as dict, handle None - Gate validate: use same default options as execute - Validate inputs is dict and steps is list before iterating - Reserve _fanout_ prefix in step ID validation - PUBLISHING.md: remove unenforced checklist items, add _fanout_ note * Address sixteenth review: docs regex, fan_in try/finally, hyphenated dot-path keys - PUBLISHING.md: update ID regex docs to match implementation (single-char OK) - FanInStep: wrap expression evaluation in try/finally for context.fan_in - Expression dot-path: allow hyphens in keys before list index (e.g. run-tests[0]) * Make speckit workflow integration-agnostic, document Copilot CLI requirement - Workflow integration selectable via input (default: claude) - Each command step uses {{ inputs.integration }} instead of hardcoded copilot - Copilot docstring documents CLI requirement for workflow dispatch - Added install_url for Copilot CLI docs * Address seventeenth review: project checks, catalog robustness - Add .specify/ project check to workflow run/resume/status/search/info - remove_catalog validates config shape (dict + list) before indexing - _fetch_single_catalog validates response is a dict - _get_merged_workflows raises when all catalogs fail to fetch - add_catalog guards against non-dict catalog entries in config * Address eighteenth review: condition coercion, gate abort result, while default, cache guard, resume log - evaluate_condition treats plain 'false'/'true' strings as booleans - Gate abort returns StepResult(FAILED) instead of raising exception so step output is persisted in state for inspection - while_loop max_iterations optional (default 10), validation aligned - Catalog cache fallback catches invalid JSON gracefully - resume() appends workflow_finished log entry like execute() * Address nineteenth review: allow-all-tools opt-in, empty catalogs, abort dead code, while docstring - --allow-all-tools controlled by SPECKIT_ALLOW_ALL_TOOLS env var (default: 1) Set to 0 to disable automatic tool approval for Copilot CLI - Empty catalogs list falls back to built-in defaults (not an error) - Remove unreachable WorkflowAbortError catches from execute/resume (gate abort now returns StepResult(FAILED) instead of raising) - while_loop docstring updated: max_iterations is optional (default 10) * Address twentieth review: gate abort maps to ABORTED status, do-while max_iterations optional - Engine detects output.aborted from gate step and sets RunStatus.ABORTED (was unreachable — gate abort returned FAILED but status was always FAILED) - do-while max_iterations now optional (default 10), aligned with while_loop - do-while docstring and validation updated accordingly * Coerce default_options to dict, align bundled workflow ID regex with validator * Gate validates string options, prompt uses resolved integration, loop normalizes max_iterations * Use parentId:childId convention for nested step IDs - Fan-out per-item IDs use parentId:templateId:index (e.g. parallel:impl:0) - Reserve ':' in user step IDs (validation rejects) - Replaces _fanout_ prefix with cleaner namespacing - Expressions like {{ steps.parallel:impl:0.output.file }} work naturally * Validate workflow version is semantic versioning (X.Y.Z) * Schema version validation, strict semver, load_workflow docstring, preserve max_concurrency - Validate schema_version is '1.0' (reject unknown future schemas) - Strict semver regex: ^\d+\.\d+\.\d+$ (rejects 1.0.0beta etc.) - load_workflow docstring: 'parsed' not 'validated' - Keep max_concurrency in fan-out output (was dropped) - do_while docstring: engine re-evaluates step_config condition - ARCHITECTURE.md: document nested resume limitation * Path traversal prevention, loop step ID namespacing - RunState validates run_id is alphanumeric+hyphens (no path separators) - workflow_add validates catalog source doesn't escape workflows_dir - Loop iterations namespace nested step IDs as parentId:childId:iteration so multiple iterations don't overwrite each other in context/state --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>	2026-04-14 10:11:56 -05:00

19 Commits