github-spec-kit

mirror of https://github.com/github/spec-kit.git synced 2026-07-03 20:36:23 +08:00

Author	SHA1	Message	Date
Ali jawwad	d378485696	fix(workflows): reject infinite number-input default instead of raising OverflowError (#3199 ) WorkflowEngine._coerce_input normalizes a whole-valued number to int via int(value). For an infinite float (e.g. a 'type: number' input with YAML 'default: .inf') int(inf) raises OverflowError, which is not in the except (ValueError, TypeError) tuple. validate_workflow eager-coerces declared defaults and is documented to RETURN a list of errors, but it only catches ValueError -- so the OverflowError escaped and validate_workflow raised instead of reporting, breaking its contract. (NaN already surfaced cleanly because int(nan) raises ValueError.) Add OverflowError to the except tuple so an infinite default surfaces as the same clean 'expected a number' ValueError as NaN, consistent with the function's existing fail-fast-on-authoring-mistakes design. Finite values (5.0 -> 5, 3.5 -> 3.5) are unaffected. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:05:51 -05:00
Ali jawwad	96f73d192c	fix(scripts): emit 'Copied plan template' status in setup-plan.ps1 (parity with bash) (#3198 ) setup-plan.sh prints 'Copied plan template to $IMPL_PLAN' after copying the template (to stderr in --json mode, stdout otherwise), but the PowerShell twin emitted nothing on the successful-copy path -- only the 'Plan already exists' skip message and the 'Plan template not found' warning existed. So the two scripts had a divergent status-output contract on a fresh run. Emit the same message after WriteAllText, routed like the sibling skip message ([Console]::Error.WriteLine in -Json so stdout stays pure JSON, Write-Output in text mode). Mirrors the bash wording and stream routing exactly. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 09:54:38 -05:00
Ali jawwad	2a9db1d350	fix(workflows): make expression operator/literal parsing quote-aware (#3197 ) _evaluate_simple_expression split on operator keywords using naive str.find/split, so a keyword INSIDE a quoted operand was treated as an operator: `inputs.mode == 'read and write'` split on the inner ' and ' and evaluated as `(inputs.mode == 'read) and (write')`. The literal short-circuit was also too greedy -- `'a' == 'b'` matched startswith("'")/endswith("'") and was stripped to the garbage truthy string `a' == 'b`, so `'done' == 'failed'` evaluated truthy and gated the wrong branch. Add a quote/bracket-aware _find_top_level helper (mirroring the existing _split_top_level_commas) and use it for the and/or/comparison/in/not-in splits; tighten the literal short-circuit to fire only when the opening quote's match is the final char. The docstring already lists comparisons + and/or/not + in/not-in + string literals as supported, so this restores the documented contract. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 09:53:35 -05:00
Ali jawwad	fd185c1fd8	fix(scripts): honor explicit -Number 0 in PowerShell create-new-feature (parity with bash) (#3196 ) Get-BranchName used `[long]$Number = 0` as both the default and the 'auto-detect' sentinel (`if ($Number -eq 0)`), so an explicitly-passed `-Number 0` was indistinguishable from 'not supplied' and silently auto-incremented. The bash twin keys off whether the value is non-empty (`[ -z "$BRANCH_NUMBER" ]`), so `--number 0` is honored and yields FEATURE_NUM 000 -- a cross-platform divergence for identical input. Use $PSBoundParameters.ContainsKey('Number') instead, so an explicit value (including 0) is honored and only a missing -Number auto-detects -- mirroring bash. This also aligns the -Timestamp+-Number warning, which bash emits for `--number 0 --timestamp` (non-empty check) but PowerShell previously skipped. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 09:52:22 -05:00
Manfred Riem	c49966da4d	fix(claude): stop forking /speckit-analyze to prevent long-session freezes (#3188 ) PR #2511 added `context: fork` + `agent: general-purpose` to the generated speckit-analyze SKILL.md on the assumption that its heavy reads collapse to a short summary. In practice /speckit-analyze returns a 300-500 line report that is injected back into the main conversation. In long sessions each subsequent fork inherits that growing context, compounding overhead until the chat freezes (#3185). Empty FORK_CONTEXT_COMMANDS so no command opts into context: fork, restoring direct in-session execution for analyze. The injection mechanism is retained so a command can be re-enabled once it genuinely returns a compact result. Fixes #3185 Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-26 12:11:54 -05:00
Amirreza Alibeigi	49cc05384a	fix: derive plan path from feature.json in update-agent-context (#3069 ) * fix: derive plan path from feature.json in update-agent-context When `plan_path` is omitted, prefer `.specify/feature.json` (written by /speckit-specify) over the mtime heuristic. The old approach picked the most recently modified `specs//plan.md`, which could inject an unrelated plan into CLAUDE.md if another spec's plan was touched after the active feature directory was created but before its own plan.md existed. Bash: handle both relative and absolute feature_directory values, normalizing absolute paths back to project-relative for the context file. Fall back to mtime only when feature.json is absent or the derived plan.md does not yet exist. PowerShell: same logic, PS 5.1-compatible (nested Join-Path, IsPathRooted guard to avoid Unix Join-Path mis-joining absolute ChildPaths, manual prefix-strip instead of GetRelativePath). Fixes #3067 fix: address Copilot review feedback on update-agent-context - bash: add explicit encoding="utf-8" to feature.json open() call - powershell: replace GetRelativePath (.NET 5+ only) with manual prefix-strip in mtime fallback for PS 5.1 compatibility - tests: add coverage for absolute feature_directory values (under and outside PROJECT_ROOT) * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * test: replace time.sleep with os.utime and strengthen PS normalization assertion * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: normalize trailing slash and guard non-string feature_directory in PS script * Fix: use .resolve().as_posix(). Valid. The PS tests run on Windows where str(tmp_path) uses backslashes, but the PS script normalizes output to forward slashes. Assertions like assert str(tmp_path) not in ctx become false negatives on Windows CI. Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: use context manager for feature.json open() in bash heredoc * test: add PS coverage for absolute feature_directory outside project root * fix: guard null feature_directory, re-check empty after trailing-slash strip, fix blank line * test: add stale plan to absolute-path tests so feature.json preference is actually exercised * test: convert absolute paths to MSYS2 style for Git-for-Windows bash compatibility * fix: revert PS test to native path, fix bash outside-root assertion for Git bash * fix: use _to_bash_path in not-in assertion for Git bash Windows compat * fix: add ConvertFrom-Json fallback in PS script, write test config as JSON * fix: use OS-appropriate StringComparison in PS prefix-strip (matches common.ps1) * fix: emit project-relative POSIX path from mtime fallback; use upstream test helpers * fix: write config as JSON directly, drop _install_agent_context_config * fix: normalize backslashes to forward slashes in feature_directory before path ops * fix: treat drive-qualified paths (C:/...) as absolute after backslash normalization * fix: resolve symlinks when computing relative plan path; use UTF8 encoding in PS ConvertFrom-Yaml path * fix: use bash-side path for outside-root case to avoid WindowsPath backslashes * fix: use .as_posix() instead of PurePosixPath() to avoid backslashes on native Windows Python * fix: resolve ./.. segments in PS feature_directory via GetFullPath before relativizing * fix: replace $IsWindows guard with OSVersion.Platform check for PS 5.1 StrictMode compat * fix: guard empty relDir to avoid leading slash in PlanPath when feature_directory is project root * fix: remove unused PurePosixPath import; fix stale PS comment after ConvertFrom-Json fallback was added * fix: use cand.as_posix() for outside-root path instead of raw bash-side argv --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-26 12:07:44 -05:00
Si Zengyu	1add20341d	fix(extensions,presets,workflows): resolve private GHES release assets via /api/v3 (#3157 ) * feat(auth): add github_provider_hosts() to enumerate GHES hosts from auth.json Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous) * fix(extensions): resolve GHES release assets via /api/v3 Generalizes resolve_github_release_asset_api_url to GitHub Enterprise Server hosts (gated by auth.json github hosts), fixing private GHES extension/preset downloads. github/spec-kit#3147 Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous) * fix(extensions,presets): pass auth.json github hosts into release resolver Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous) * docs(auth): document GHES private catalog + release-asset auth Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous) * fix(presets,workflows): pass auth.json github hosts into remaining release resolvers Wires preset add --from and workflow add through github_provider_hosts() so private GHES release assets resolve via /api/v3 there too. github/spec-kit#3147 Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous) * test(presets): use module-level io.BytesIO in GHES preset test Addresses Copilot review on PR #3157: drop unnecessary __import__("io") in test_preset_add_from_ghes_release_url_resolves_via_api_v3 since io is already imported at module level. * fix(github-http): pass through GHES asset API URLs by path shape Addresses Copilot review on PR #3157. A direct GHES /api/v3 release asset URL was only returned as already-resolved when its host was in the allowlist; otherwise the resolver returned None and the caller downloaded the same URL without 'Accept: application/octet-stream', fetching JSON metadata instead of the binary. Gate the passthrough on path shape alone, mirroring the github.com case. This is safe: passthrough returns the input URL unchanged and the caller fetches it either way, so no new request to an arbitrary host is induced; the token stays independently gated by auth.json in open_url. The allowlist remains the anti-SSRF gate on the tag-lookup resolving path. Add test_passthrough_for_unlisted_ghes_api_asset_url.	2026-06-25 10:44:30 -05:00
Ali jawwad	9fe1c4cc5c	fix(scripts): keep PowerShell branch-name acronym match case-sensitive (parity with bash) (#3129 ) * fix(scripts): keep PowerShell branch-name acronym match case-sensitive Get-BranchName keeps a sub-3-character word only when it appears as an UPPERCASE acronym in the description. The bash twin checks this case-sensitively (grep "\b${word^^}\b" / grep -qw -- "${word^^}"), but the PowerShell twin used -match, which is case-INSENSITIVE, so it kept EVERY short word regardless of case -- contradicting its own comment and diverging from bash. The same description then produced different spec-directory and branch names on Windows/PowerShell vs macOS/Linux (e.g. "Add go support" -> 001-go-support instead of 001-support), desyncing specs/, feature.json, and git branches across a mixed-OS team. Use the case-sensitive -cmatch so a short word is kept only for a genuine uppercase acronym, matching bash. Applied to both the core scripts/powershell/create-new-feature.ps1 and the git extension's create-new-feature-branch.ps1. Add bash + PowerShell regression tests (core and git-extension) asserting a lowercase short word is dropped while an uppercase acronym is kept. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: fix article grammar in branch-name docstrings Address review: 'an UPPERCASE acronym' -> 'an acronym in UPPERCASE' across the four branch-name case-sensitivity test docstrings (the indefinite article reads cleanly before 'acronym'). Docstring-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 07:56:59 -05:00
meymchen	dc840f07d0	feat(integration): update Kimi integration for Kimi Code CLI (#2979 ) * feat(integration): update Kimi integration for Kimi Code CLI Update the Kimi integration to target the new Kimi Code CLI (MoonshotAI/kimi-code) layout: - Change skills directory from .kimi/skills/ to .kimi-code/skills/ - Change context file from KIMI.md to AGENTS.md - Extend --migrate-legacy to move old .kimi/skills/ installs and migrate KIMI.md user content to AGENTS.md - Clean up leftover legacy .kimi/skills/ directories on teardown - Update devcontainer installer to @moonshot-ai/kimi-code - Update docs and tests Relates to #1532 * fix(integration): align Kimi dispatch and harden legacy migration - Override build_command_invocation to emit /skill:speckit-<stem> so dispatched commands match Kimi Code CLI's native slash syntax. - Skip symlinked .kimi/skills directories during legacy migration and teardown to avoid operating on files outside the project. - Remove kimi from the multi-install-safe integrations table. - Add tests for command invocation and symlink safety. * fix(integration): resolve custom context markers in Kimi legacy migration Use IntegrationBase._resolve_context_markers() when migrating legacy KIMI.md content so that projects with customized context_markers in .specify/extensions/agent-context/agent-context-config.yml have the managed section stripped with the correct markers instead of the hard-coded defaults. Adds a test verifying custom markers are respected during --migrate-legacy. * fix(integration): harden Kimi legacy migration against symlinked paths * fix(kimi): guard symlinked SKILL.md during migration and teardown * docs(kimi): mention KIMI.md→AGENTS.md migration in --migrate-legacy help The --migrate-legacy help text listed only the skills directory move and dotted→hyphenated renaming, but the flag also migrates KIMI.md user content into AGENTS.md. Align the help with the actual behavior, docs, and tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(kimi): validate legacy migration destination; clarify docstrings Address Copilot review feedback on PR #2979: - setup(): gate skills migration on _is_safe_legacy_dir(new_skills_dir) as well as the source. base setup() already rejects a destination that escapes the project root, but an in-tree symlinked .kimi-code/skills (e.g. -> .) could still misdirect the move; this gives the destination the same symlink-component protection as the source. - _migrate_legacy_kimi_dotted_skills: rewrite docstring as a compatibility shim describing same-path delegation to _migrate_legacy_kimi_skills_dir. - test_presets: clarify that the dotted-skill test exercises legacy naming under the current .kimi-code/ base, not the legacy .kimi/ location. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(kimi): harden legacy KIMI.md→AGENTS.md context migration - Skip context-file migration when the agent-context extension is disabled, matching upsert/remove_context_section opt-out behavior so an opted-out project's KIMI.md/AGENTS.md are left untouched. - Safely skip (instead of raising) on filesystem edge cases: unreadable or non-UTF-8 KIMI.md, and AGENTS.md existing as a non-file/unwritable. - Refuse to migrate a corrupted managed section (single marker, or end before start) so a partial managed block is never copied into AGENTS.md; KIMI.md is preserved for manual repair. Add regression tests for all three cases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Approve fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * chore(kimi): revert CHANGELOG.md edit (auto-generated) The CHANGELOG is generated from merged PR titles, so a hand-written entry is redundant; it was also placed under the already-released 0.10.2 section, which would make those release notes historically inaccurate. Revert to match main per maintainer feedback. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(kimi): skip symlink-safety tests when symlinks are unavailable The Kimi legacy-migration safety tests create symlinks to assert that migration/teardown never follow them out of the project. Symlink creation fails on Windows without the create-symlink privilege and in some restricted CI sandboxes, so these tests errored during setup instead of skipping. Wrap every symlink_to() call in a shared _symlink_or_skip() helper that pytest.skip()s on OSError/NotImplementedError, matching the guard pattern already used by one of these tests. Verified on Windows: the 6 symlink tests now skip cleanly (51 passed, 6 skipped) instead of erroring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(kimi): reject symlinked skills destination before install Add a destination symlink pre-check in KimiIntegration.setup() before super().setup() writes any SKILL.md. The base class only rejects a destination that escapes project_root after resolve(), so an in-tree symlinked .kimi-code/.kimi-code/skills (e.g. `-> .`) would still misdirect writes into an unintended in-tree location (./skills/). Extract the symlink-component walk into a shared _has_symlinked_component() helper and reuse it from _is_safe_legacy_dir(). Add a regression test. Also clarify that --migrate-legacy only migrates KIMI.md -> AGENTS.md when the agent-context extension is enabled, in the CLI help text and the integration docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Refactor formatting and simplify logic in Kimi integration * fix(kimi): reject symlinked target dir during legacy skills migration When the migration destination already exists, guard against a symlinked (or non-directory) target_dir before comparing SKILL.md bytes, so the comparison never follows a link outside the project root. Also skip a missing/non-file target SKILL.md explicitly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-24 15:22:08 -05:00
Ali jawwad	fdaaf18371	fix(workflows): preserve commas inside quoted list-literal elements (#3134 ) * fix(workflows): preserve commas inside quoted list-literal elements The simple-expression evaluator parsed a list literal with a naive `inner.split(",")`, which splits on commas inside quoted strings (and nested brackets). So `{{ ["a, b", "c"] }}` evaluated to three items (`["a", "b", "c"]`) instead of two, silently corrupting `fan-out` `items:` and any list expression that contains a comma inside a quoted element. Split list-literal elements on top-level commas only, ignoring commas inside quotes or nested brackets, via a small `_split_top_level_commas` helper. Plain and empty lists are unchanged. Add tests covering quoted commas, nested lists, and the existing plain/empty cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): cover single-quoted and nested list literals Address review: extend the list-literal regression test to assert single-quoted elements with commas and nested lists parse correctly, alongside the existing double-quoted cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 15:10:02 -05:00
Pascal THUET	e5df517ddc	ci: pin actions to commit SHAs and add shellcheck (#3126 ) * ci: pin actions to commit SHAs and add shellcheck Pin actions/github-script in catalog-assign.yml to a full commit SHA; all other workflows were already pinned. Add a repo-wide regression test that every workflow `uses:` ref is pinned to a 40-char commit SHA. Add a shellcheck job to lint.yml (--severity=error over scripts/bash/.sh) and document the local command in CONTRIBUTING.md. ci: use repo-standard actions/checkout v7.0.0 in shellcheck job * ci: shellcheck all tracked shell scripts Assisted-by: Codex (model: GPT-5, autonomous) * ci: address workflow hygiene review feedback Assisted-by: Codex (model: GPT-5, autonomous)	2026-06-24 15:08:16 -05:00
Zied Jlassi	b042d2a843	feat(extensions): verify catalog archive sha256 before install (#3080 ) * feat(extensions): verify catalog archive sha256 before install Extension and preset archives were downloaded over HTTPS and unpacked (with Zip-Slip protection) but their bytes were never checked against a known digest. Trust rested entirely on TLS and the integrity of the release host, so a tampered or swapped archive from a compromised third-party release would be installed silently. Maintainers do not audit extension code, so consumer-side integrity is the only available defence. Catalog entries may now pin an optional `sha256` digest. When present, the downloaded archive is verified before it is written to disk and installed; a mismatch aborts with a clear error. Entries without `sha256` keep working unchanged (a DEBUG line records that the download was unverified), so the change is backwards compatible. The check runs on both download paths (extensions and presets) via a single shared helper so the two stay in parity. - Add `verify_archive_sha256` helper in shared_infra (digest match, `sha256:` prefix, case-insensitive; DEBUG log when no digest declared) - Enforce it in ExtensionCatalog.download_extension and PresetCatalog.download_pack, before the archive is written to disk - Document the optional `sha256` field in the publishing guides - Tests: helper unit tests + matching/mismatch/no-digest on both paths Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> Assisted-by: AI * fix(extensions): harden sha256 parsing and tidy download test mocks Follow-up to the review on #3080: - shared_infra.verify_archive_sha256: strip only a literal `sha256:` algorithm prefix (case-insensitive) instead of `split(':', 1)[-1]`, which silently dropped any prefix — so `md5:<64-hex>` was accepted as if it were a valid SHA-256. Validate that the declared value is exactly 64 hex characters and raise a clear error otherwise, and compare with `hmac.compare_digest` for a constant-time check. Add tests covering a malformed digest and a non-`sha256:` prefix (both previously accepted). - Download test helpers: configure the context-manager mock via `__enter__.return_value`/`__exit__.return_value` rather than assigning a `lambda s: s`, which is clearer and independent of the invocation arity. Assisted-by: AI Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com> * fix(extensions): reject a declared-but-empty sha256 instead of skipping verification verify_archive_sha256 skipped on any falsy expected value, so a present-but-empty digest (e.g. sha256: "" reached via ...get("sha256")) silently disabled the integrity check instead of surfacing the authoring error. Guard on expected is None so only an absent digest skips; blank/whitespace/bare-prefix values fall through to the 64-hex validation and are rejected. Adds a regression test. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> * docs(shared_infra): clarify _SHA256_HEX_RE accepts and normalizes uppercase The comment described the regex as matching '64 lowercase' hex characters, but verify_archive_sha256 lowercases the declared value (raw.lower()) before matching, so an uppercase digest is accepted and normalized rather than rejected. Clarify the comment to avoid misleading future readers. Addresses Copilot review feedback on shared_infra.py. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> * test(presets): cover the no-sha256 backwards-compatible path Address Copilot review: download_pack's optional sha256 verification was tested for match/mismatch but not the backwards-compatible path where a catalog entry has no sha256 (pack_info.get("sha256") is None). Add a no-sha256 test mirroring the extensions coverage so the helper never silently becomes mandatory for presets. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> --------- Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>	2026-06-24 14:52:24 -05:00
Zied Jlassi	f846d6526c	fix(workflows): validate requires keys and reject phantom permissions gate (#3079 ) * fix(workflows): validate requires keys and reject phantom permissions gate A workflow's `requires` block was parsed but its keys were never validated, so a typo or an unsupported key was silently ignored. Most importantly, authors could write `requires.permissions.shell: true` expecting a runtime capability gate — but no such gate exists: a `shell` step always runs with the user's privileges. The declaration gave a false sense of sandboxing. `validate_workflow` now accepts only the recognised keys (`speckit_version`, `integrations`, `tools`, `mcp`) and rejects anything else, with an explicit error for `requires.permissions` pointing authors to `gate` steps for approval. Docs and the model comment are updated to state that `requires` is advisory, not a security boundary. - Reject non-mapping `requires`, unknown keys, and `requires.permissions` - Clarify workflows reference + PUBLISHING.md shell-step guidance - Tests for valid keys, non-mapping, unknown key, and permissions Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> Assisted-by: AI * fix(workflows): address review feedback on requires validation Follow-up to the review on #3079: - Guard `requires` validation on `is not None` instead of truthiness so a falsy non-mapping value (e.g. `requires: []` or `requires: ''`) is reported as an error instead of being silently skipped; `requires:` (YAML null) is still treated as an omitted block. Add a regression test. - Reword the workflows security note so `requires.permissions` is shown as rejected/unsupported rather than as a valid example of `requires`. - Standardize on US spelling (`_RECOGNIZED_REQUIRES_KEYS`, "recognized") to match the surrounding code and ease searching. - Tighten the permissions-rejection test to assert on specific message markers (`requires.permissions` and the `gate` guidance) so it fails if the validation path or wording drifts. Assisted-by: AI Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com> * fix(workflows): scope requires validation to workflow keys (drop tools/mcp) tools and mcp belong to the bundle manifest requires schema (bundler/models/manifest.py, resolved in bundler/services/resolver.py), not the workflow requires validated here. Drop them from _RECOGNIZED_REQUIRES_KEYS and revert the PUBLISHING.md claim that this PR had introduced, so workflow requires only recognizes speckit_version and integrations. This keeps the existing docs accurate and resolves the inline doc-consistency review comments. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> * refactor(workflows): type WorkflowDefinition.requires as Any pre-validation self.requires holds the raw parsed value, which before validate_workflow() runs may be a non-mapping (None for a bare 'requires:', a list for 'requires: []', etc.). Annotating it dict[str, Any] was misleading for editors/type-checkers; use Any and document that validate_workflow() enforces the mapping shape. Addresses Copilot review feedback on engine.py. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> * fix(workflows): reject YAML-null requires: as a non-mapping Address Copilot review: validate requires the same way as inputs. A bare requires: parses as YAML null and was previously treated as an omitted block, which is inconsistent with inputs and lets a stray requires: line be silently ignored. Drop the is-not-None guard and check isinstance(..., dict) directly: an omitted block still defaults to {} (valid), but a present-but-non-mapping value -- YAML null, [] or '' -- is now an authoring error that surfaces. Tests: add YAML-null rejection + an omitted-is-still-valid guard test. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> --------- Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com> Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>	2026-06-24 14:49:43 -05:00
Quratulain-bilal	37e0e71b4e	fix(scripts): use case-sensitive match for acronym retention in PS branch names (#3130 ) The branch-name generator keeps a short (<3 char) word only when it appears in uppercase in the description, treating it as an acronym (the comment says as much). The bash script uses a case-sensitive grep for this, but the PowerShell script used -match, which is case-insensitive by default. As a result every short non-stop word was retained on PowerShell even when lowercase, so the same description produced different branch names across the two shells (e.g. 'go AI now' -> 001-go-ai-now on PS vs 001-ai-now on bash). Switch to -cmatch so the check is case-sensitive and the two shells agree. Adds parity tests covering a dropped lowercase short word and a kept uppercase acronym.	2026-06-24 14:02:41 -05:00
Omar	44ef11aa18	feat(integrations): add omp support (#3107 ) * feat(integrations): add omp support * Update updated_at timestamp * refactor(integrations): delegate omp build_exec_args to base, register in issue templates Inherit MarkdownIntegration.build_exec_args so omp picks up shared CLI contract changes (requires_cli gating, extra-args ordering, --model handling) automatically; only specialize the --mode json flag. Also add Oh My Pi / omp to the issue-template agent lists so test_issue_template_agent_lists_match_runtime_integrations passes. * fix(integrations): use --print + positional prompt for omp argv OMP's CLI parser treats `-p`/`--print` as a boolean (one-shot mode) and consumes the prompt as a positional message; the previous inherited `-p <prompt>` shape worked by accident only because `-p` ignores its next token. Build the argv explicitly with flags first and the prompt as a trailing positional, matching upstream args.ts.	2026-06-24 13:44:34 -05:00
Ali jawwad	034fbfcbb4	fix: render valid TOML when a command body contains backslashes (#3135 ) render_toml_command() emitted the body inside a multiline basic TOML string ("""..."""), which processes backslash escape sequences. A command body containing a backslash — e.g. a Windows path like C:\Users\... whose \U reads as an invalid unicode escape — therefore produced unparseable TOML ("Invalid hex value"), so the generated Gemini/Tabnine command file failed to load. A body ending in a backslash also silently ate the closing newline via TOML line-continuation. Route bodies containing a backslash to the multiline literal form ('''...'''), which does not process escapes, or to the escaped basic string when both triple-quote styles are present. Mirrors the escaping already done by base.py's TomlIntegration. Add tests covering a Windows path, a trailing backslash, and the backslash + both-triple-quote-styles fallback. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 13:13:44 -05:00
Pascal THUET	8e76ff3d5c	harden: reject shell=True in run_command (#3132 ) run_command() forwarded shell= straight to subprocess.run, so a caller passing shell=True would invoke a shell. Reject shell=True with ValueError (keeping the parameter for signature compatibility) and drop shell= from both subprocess.run calls. Enable ruff S602/S604/S605 to flag any future shell=True reintroduction, annotate the one intentional workflow shell sink with # noqa: S602, and document the shell-step execution risk in workflows/PUBLISHING.md.	2026-06-24 13:05:21 -05:00
Quratulain-bilal	0ef53eb91f	fix(scripts): send check-prerequisites.ps1 errors to stderr (#3123 ) * fix(scripts): send check-prerequisites.ps1 errors to stderr The validation errors and run-hints in check-prerequisites.ps1 were written with Write-Output, so they went to stdout. This script is usually run with -Json and its stdout parsed by the agent, so an error (e.g. missing plan.md) leaves the parser with an error string instead of JSON. The bash counterpart already writes these to stderr (>&2), as do the sibling PowerShell scripts (setup-tasks.ps1, common.ps1's Get-FeaturePathsEnv). Switch the six error/hint lines to [Console]::Error.WriteLine so stdout stays clean and the two shells match. * test(scripts): assert check-prerequisites errors stay on stderr Per the #3122 bug assessment, tighten the failure-path tests so they verify stdout stays clean (empty / valid JSON) and the error text only appears on stderr, instead of checking the combined stdout+stderr string. Covers all three PowerShell validation paths (missing feature dir, missing plan.md, missing tasks.md with -RequireTasks) and the bash counterpart. The two new error-routing tests fail on the pre-fix script (errors on stdout) and pass after it.	2026-06-23 14:09:17 -05:00
JinHyuk Sung	0c975bbef7	fix: write Codex dev skills as files (#2988 ) * fix: write Codex dev skills as files * fix: route codex dev symlink policy through metadata * fix: replace codex dev symlinks on refresh * fix: migrate codex dev skill symlinks * fix: avoid inactive shared skill dev symlinks * fix: preserve unrelated dev skill symlinks	2026-06-23 10:55:38 -05:00
github-actions[bot]	45423d6bc6	[extension] Update Spec Kit Preview extension to v1.1.0 and sync Firebender agent lists (#3116 ) * Update Spec Kit Preview extension to v1.1.0 Update preview extension submitted by @bigsmartben to: - extensions/catalog.community.json (version, name, description, download_url, commands, tags, updated_at) - docs/community/extensions.md community extensions table (name, description, alphabetical order) Closes #3109 Assisted-by: GitHub Copilot (model: claude-sonnet-4.6, autonomous) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Sync issue templates with firebender integration Assisted-by: GitHub Copilot (model: GPT-5, autonomous) --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com>	2026-06-23 09:32:16 -05:00
Ali jawwad	0a126256e0	feat: add Firebender integration (Android Studio / IntelliJ) (#3077 ) * feat: add Firebender integration (Android Studio / IntelliJ) Firebender (https://firebender.com/) is an AI coding agent for Android Studio and IntelliJ. It reads project-local custom slash commands from .firebender/commands/.mdc and project rules from .firebender/rules/.mdc. Add a FirebenderIntegration (MarkdownIntegration) that installs the speckit command templates as .mdc command files and writes the managed context section into .firebender/rules/specify-rules.mdc. command_filename is overridden so init-time commands also use the .mdc extension Firebender requires. Register it in the integration registry, add the catalog entry and docs row, and add an integration test covering the .mdc command output. Closes #1548 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat: address review - bump catalog updated_at and list firebender as multi-install safe Bump the catalog top-level updated_at to reflect the new entry, and add firebender (with its .firebender/commands + .firebender/rules/specify-rules.mdc isolation paths) to the 'currently declared multi-install safe integrations' table in the docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 08:01:00 -05:00
WOLIKIMCHENG	ac4f646144	chore: sync issue template agent lists (#3052 ) * chore: sync issue template agent lists * test: harden agent template consistency check * test: harden agent template drift checks --------- Co-authored-by: root <kinsonnee@gmail.com>	2026-06-23 07:41:58 -05:00
José Villaseñor Montfort	e5a03bffc8	fix(shared-infra): remove stale managed scripts the core no longer ships (#3076 ) (#3098 ) * fix(shared-infra): remove stale managed scripts the core no longer ships (#3076) install_shared_infra never removed shared scripts a prior (pre-refactor) install recorded but the current core no longer ships — e.g. the legacy scripts/<variant>/update-agent-context.sh, superseded by the bundled agent-context extension. On a legacy project the orphan lingers and crashes when it sources a refreshed common.sh (HAS_GIT unbound under set -u). Apply the stale-removal that integration_upgrade already performs to install_shared_infra: manifest-tracked scripts the current bundle no longer produces are removed, but only managed copies (hash matches the manifest); user-customized files, symlinks, and recovered entries are preserved. Guarded so a missing/empty source can't trigger mass deletion, and the safe-destination check prevents unlinking through a symlinked ancestor. Add IntegrationManifest.remove(); drop the stale update-agent-context.sh reference in CONTRIBUTING.md. AI assistance: implemented with Claude Code (Anthropic); reviewed and validated locally (ruff clean, full suite 4176 passed, manual CLI repro). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(shared-infra): harden stale-cleanup per review (empty source + orphan manifest) - Set scripts_scanned only after a real source file is seen, so an empty variant source can't trigger mass deletion of tracked scripts. - Prune a stale manifest entry even when its file is already gone from disk, keeping the manifest consistent (previously left tracked forever). - Add a test for each edge case. Addresses the Copilot review comments on #3098. AI assistance: Claude Code (Anthropic), reviewed/validated locally (ruff clean, full suite 4178 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(shared-infra): guard unsafe manifest keys in stale-cleanup (review) - Skip absolute / '..' manifest keys before any filesystem access in stale-cleanup, so a corrupted/hand-edited manifest can't make it touch paths outside the project root (mirrors IntegrationManifest.check_modified / uninstall). - Clarify the scripts_scanned comment: the safety hinge is that flag, not seen_rels (which also holds template paths). - Add a containment test: a traversal manifest key is skipped, its target untouched. Addresses the second round of Copilot review on #3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4179 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(manifest): make remove() reject absolute/.. keys like its siblings (review) IntegrationManifest.remove() now applies the same lexical validation and normalization as record_existing() / is_recovered(): absolute paths and '..' segments are rejected (return False) instead of being used verbatim as a key. Keeps the manifest API consistent. Adds tests (valid drop + no-op, absolute rejected, traversal rejected). Addresses the third round of Copilot review on #3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4182 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(shared-infra): validate stale-cleanup keys for containment, not just lexically (review) The stale-script cleanup guarded manifest keys with a lexical check only (is_absolute() / ".." segments). On Windows a drive-relative key such as "C:tmp\\file" is not is_absolute(), yet joining it onto the project path discards the root — so cleanup could stat/unlink outside the project before _ensure_safe_shared_destination raised, and a corrupted manifest key turned into an install-time hard failure (ValueError) instead of being skipped. Reuse the canonical containment helper (_validate_rel_path, the same one IntegrationManifest.is_recovered / remove use): after the fast lexical reject, resolve the join and confirm it stays within the project root; a key that still escapes is skipped, never unlinked, never fatal. Adds a regression test that forces _validate_rel_path to reject a managed key (portably simulating the Windows drive-relative escape) and asserts the install skips it without failing and still installs the real scripts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 07:36:54 -05:00
Pascal THUET	ce01877610	fix: register enabled extensions for agent on integration use/upgrade (#2949 ) * fix: register enabled extensions for agent on integration install/upgrade install and upgrade only set up the integration's own core commands; only switch re-registered the enabled extensions' commands for the target agent. A second integration added via install (or refreshed via upgrade) was therefore silently missing the extension commands the existing agents already had (e.g. the bundled agent-context extension). Extract switch's registration into a shared _register_extensions_for_agent helper and call it from install and upgrade too, so every installed agent ends up with every enabled extension's commands — full parity with switch. Closes #2886 * test: pin skills-mode secondary-agent registration; document #2948 limitation Extension skill rendering is scoped to the active agent (init-options track a single ai / ai_skills pair), so a skills-mode agent registered while not active (e.g. Copilot --skills installed as a secondary integration) gets command files rather than skills. install/upgrade match extension add here; only switch renders skills, because it activates the target first. Add a regression test pinning this behavior and document the limitation on the shared helper. Per-agent skills parity is tracked separately in #2948. * fix: don't re-render the active agent's skills when registering a non-active agent register_enabled_extensions_for_agent runs an active-agent-scoped skills pass (_register_extension_skills resolves the skills dir from init-options["ai"], ignoring the passed agent). Routing install/upgrade of a secondary integration through it re-rendered the active skills-mode agent's extension skills as a side effect — resurrecting skill files the user had deliberately deleted. Gate the skills pass on the target being the active agent; switch is unaffected because it activates the target first. Also harden the skills-mode install test (assert a core skill so --skills is load-bearing, drop a vacuous registered_skills assertion) and add a regression test. Surfaced by review of the PR; skills parity for non-active agents stays tracked in #2948. * refactor: share the extension-op scaffold and run (un)registration post-commit Review cleanups, no behavior change on the success path: - Extract the best-effort ExtensionManager scaffold (lazy import, instantiate, except -> _print_cli_warning) into _best_effort_extension_op. Both _register_extensions_for_agent and a new _unregister_extensions_for_agent delegate to it, removing the duplicate block left inline in switch. - Invoke the best-effort extension registration AFTER the install/switch/upgrade try/except has committed, so a failure in it can never trigger the rollback (install and switch teardown on except). * docs: clarify extension registration parity scope * fix(integrations): defer extension registration until use * fix(tests): remove redundant shutil import * fix(integrations): backfill extensions for installed switch targets	2026-06-22 17:48:55 -05:00
darion-yaphet	826e193cee	refactor: move extension command handlers to extensions/_commands.py (PR-7/8) (#3014 ) * refactor: move extension command handlers to extensions/_commands.py (PR-7/8) Convert the flat extensions.py module into an extensions/ package and extract all extension_app and catalog_app command handlers plus their private helpers (_resolve_installed_extension, _resolve_catalog_extension, _print_extension_info) out of __init__.py into the new extensions/_commands.py, mirroring the domain-dir layout used for presets/_commands.py (PR-6) and integrations/_commands.py (PR-5). - extensions.py -> extensions/__init__.py (pure rename, 99%); intra-module relative imports bumped from `.x` to `..x` since they reference root siblings. - Root helpers (_require_specify_project, _locate_bundled_extension, load_init_options, _display_project_path) are reached through thin shims that re-fetch from the parent package at call time, so test monkeypatching of specify_cli.<helper> keeps working unchanged. - __init__.py drops ~1444 lines (3511 -> 2067); CLI surface preserved via register(app). No behavior change. Full suite failure set is identical before/after (82 pre-existing env failures, 0 new). * fix(extensions): preserve per-command path in update backup for skills agents Skills agents (extension == "/SKILL.md") name every command file SKILL.md, each in its own per-command subdir (e.g. speckit-plan/SKILL.md). The update backup keyed the backup path on cmd_file.name alone, so all of an agent's skill files collided onto a single backup path — each shutil.copy2 overwrote the previous one, and rollback restored one skill's content over all the others, corrupting or losing the rest. Mirror the real on-disk layout by using cmd_file.relative_to(commands_dir), keeping each backup path unique. This also makes backed_up_command_files values unique so restore copies the correct content back to each command. Add a regression test asserting two distinct skill files survive a backup -> failed-update -> rollback cycle with their own content. * style(extensions): use yaml.safe_dump when writing catalog config The catalog add/remove handlers wrote the integration catalog config with yaml.dump. Switch to yaml.safe_dump to align with the SafeDumper used by the presets commands and to refuse emitting !!python/object tags if a non-basic value ever reaches the config dict. Output is unchanged for the current basic-type payload (str/int/bool/dict/ list) — this is a defensive/consistency change, not a behavioral fix. * fix(extensions): correct _print_cli_warning import path in skill registration register_enabled_extensions_for_agent imported _print_cli_warning from `.` (the extensions package), but the helper lives in the parent specify_cli package. The wrong level raised ImportError inside the error handlers, aborting extension/skill registration on the first failure instead of warning and continuing. Use `..` to match the other parent-package imports. * fix(extensions): escape untrusted values in Rich markup output User-provided arguments and extension/catalog metadata (names, descriptions, versions, IDs, paths) were interpolated into Rich markup strings without escaping. Values containing markup sequences (e.g. [red]...) would be parsed as markup, allowing output injection that could corrupt or mislead CLI messages. Wrap all such interpolations with rich.markup.escape across the extension/catalog command handlers: list, search, info (_print_extension_info), add (including --dev paths), remove, enable, disable, set-priority, update, and the ambiguous-match resolvers (error strings and Table rows). Reuse the already-computed safe_extension where available. Escaping is a no-op for benign strings, so normal output is unchanged. * Prevent Rich markup injection in extension CLI output User-controlled catalog URLs and extension IDs are rendered through Rich-enabled console paths, so every remaining output-only interpolation now escapes markup while leaving stored values and filesystem behavior unchanged. Regression tests cover catalog add, install hints, remove hints, and state command messages with bracketed markup-like values. * Prevent markup injection from exception text Rich markup remains enabled for styled CLI messages, so exception text and config path labels must be escaped before rendering. YAML parser errors, URL validation failures, download errors, and extension validation errors can include user-controlled catalog or manifest values. Constraint: Preserve existing exception handling and user-facing error paths Rejected: Disable Rich markup for these messages \| existing output intentionally uses markup for labels and styling Confidence: high Scope-risk: narrow Directive: Escape user-controlled exception text before interpolating into Rich-rendered strings Tested: .venv/bin/python -m pytest tests/test_extensions.py -q Co-authored-by: OmX <omx@oh-my-codex.dev> * Prevent path and manifest review regressions Catalog path labels are rendered through Rich markup and downloaded update manifests are trusted long enough to validate extension IDs. Escape displayed project paths before rendering, and reject non-mapping extension.yml payloads before ID validation so bad archives fail with a clear rollback reason. --------- Co-authored-by: OmX <omx@oh-my-codex.dev>	2026-06-22 13:40:57 -05:00
meymchen	6a3ee9b64e	feat: add ZCode (Z.AI) integration (#3063 ) * feat: add ZCode (Z.AI) integration Add a skills-based integration for ZCode, Z.AI's Claude-Code-style agent. ZCode uses the same SKILL.md layout as Claude Code, so spec-kit installs workflows into .zcode/skills/speckit-<name>/SKILL.md, invoked in chat as $speckit-<name>. - ZcodeIntegration(SkillsIntegration) with .zcode/ folder and --skills option - Register in INTEGRATION_REGISTRY - Catalog entry (tags: cli, skills, z-ai) - Tests via SkillsIntegrationTests mixin - Document in integrations reference and README Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix: render $speckit-* invocations for ZCode skills ZCode is documented as a skills agent invoked with $speckit-<command>, but the central invocation rendering only special-cased codex, so specify init Next Steps and extension hooks rendered the dotted /speckit.<command> form instead. Centralize the $speckit-* decision in a DOLLAR_SKILLS_AGENTS set with an is_dollar_skills_agent() helper, and route both init Next Steps and HookExecutor._render_hook_invocation through it. Add ZCode invocation regression tests mirroring the existing Codex/Kimi coverage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 12:14:18 -05:00
Austin Z.	bbdf1b8f40	fix(agent-context): support multiple context files safely (#2969 ) * fix(agent-context): support multiple context files safely * fix(agent-context): harden context file validation * fix(agent-context): preserve disabled context target * fix(agent-context): address review follow-ups * fix(agent-context): dedupe PowerShell context files * fix(agent-context): align context file dedupe * fix(agent-context): align bash context file dedupe * fix(agent-context): preserve disabled display target * fix(agent-context): require yaml-capable updater python * fix(agent-context): preserve context files config * fix(agent-context): align context file fallbacks * fix(agent-context): share context file resolution --------- Co-authored-by: AustinZ21 <AustinZ21@users.noreply.github.com>	2026-06-22 12:10:55 -05:00
Manfred Riem	79a34b892d	fix(presets): use _repo_root() for bundled-core source-checkout fallback (#3086 ) (#3091 ) * fix(presets): use _repo_root() for bundled-core source-checkout fallback The tier-5 fallback in PresetResolver.resolve() and _find_bundled_core() computed the repo root as Path(__file__).parent.parent.parent. After presets.py was moved to presets/__init__.py (#2826) that chain is one level short, resolving to src/ and looking for src/templates/commands/<cmd>.md, which never exists. As a result, wrap-strategy presets found no core base layer in source/editable installs. Use the shared _repo_root() helper so both fallbacks resolve against the actual repo-root templates/ tree. Wheel installs were unaffected (core_pack path), so this only impacts source/editable checkouts. Refs #3086 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(presets): restore dropped def for oserror-manifest test A prior edit accidentally removed the def test_resolve_extension_command_via_manifest_skips_oserror_manifests line, orphaning its body inside the new bundled-core test. Restore the test definition so pytest collects it again. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(presets): move bundled-core tests into TestPresetResolver The two tier-5 fallback regression tests exercise collect_all_layers() and resolve(), not resolve_core(), so they belong in TestPresetResolver rather than TestResolveCore. Relocate them for clearer suite navigation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-22 11:33:44 -05:00
Huy Do	1cb935997c	fix: fail loudly on an unknown workflow expression filter (#3074 ) * fix: fail loudly on an unknown workflow expression filter The expression evaluator's filter dispatch fell through to `return value` for any unregistered filter, so a typo'd or unsupported filter such as `{{ items \| length }}` rendered the value unchanged with no error and the run completed — a silent wrong result. Raise a clear ValueError instead, naming the offending filter and the valid ones, mirroring the strict handling already used for `from_json`. The five registered filters (default/join/map/contains/from_json) are unchanged; the `name(arg)` form of an unknown filter is now caught too. * fix: distinguish a misused registered filter from an unknown one; cover map Address the review feedback on the unknown-filter fail-loud path: - A registered filter used in an unsupported form (e.g. `\| join` or `\| map` with no argument) raised the misleading "unknown filter '<name>'" — the filter is registered, the syntax isn't. It now raises a message naming it as a known filter misused. A new `_REGISTERED_FILTERS` constant drives the distinction. - `test_registered_filters_unaffected` now also exercises `map('attr')`, which it previously claimed to cover but didn't. Add `test_registered_filter_unsupported_form_raises` to pin the new path. * fix: include the no-arg default form in the filter-error hint Copilot review: the hint listed default('x') but omitted the valid no-argument default form (\| default), which this module supports.	2026-06-22 10:44:23 -05:00
Manfred Riem	902f5431f9	Harden command registration path handling (#3088 ) * fix: validate command 'file' field against path traversal in registrar CommandRegistrar.register_commands() read each command body from source_dir / cmd_file without validating the manifest 'file' field, unlike the parallel skill and preset readers which already reject absolute paths and '..' traversal. A malicious extension/preset/bundle manifest with file: ../../../etc/passwd (or an absolute path) could read arbitrary host files verbatim into a generated agent command at a predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22). Add the same containment guard at the command read site and reject a traversal/absolute 'file' at manifest-load time in ExtensionManifest._validate() for defense-in-depth, plus regression tests for both the read path and the manifest validator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test/fix: address review — robust absolute-path test and tolerant reads - register_commands(): use is_file() instead of exists() and skip the command if read_text() raises (directory or non-UTF8 file), aligning with the other command/skill readers. - Traversal tests: point the absolute-path payload at the real temp secret.txt (guaranteed to exist on all platforms) instead of /etc/passwd, so the absolute-path guard is genuinely exercised and the test fails if it regresses, rather than passing because the target happens not to exist (e.g. on Windows runners). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: rename traversal fixtures to avoid CodeQL secret-storage false positive The regression fixtures named an out-of-tree file secret.txt with TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic treated that read content as sensitive and followed the static path into the pre-existing write_text sinks in _write_registered_output, raising false 'clear-text storage of sensitive information' alerts on PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER and drop /etc/passwd payloads; the test semantics (a file outside source_dir must never be read into a generated command) are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: reject Windows drive-relative 'file' values in traversal guards is_absolute() is False for Windows drive-relative paths like C:outside.txt, which contain no '..' yet resolve against the process CWD on that drive — bypassing the containment guard on Windows. Evaluate the 'file' value under PureWindowsPath as well so both the registrar runtime guard and the manifest-load validator reject drive letters (and backslash '..' segments) cross-platform. Extend the regression tests with drive-relative cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False (no drive), so the prior native-Path check let a leading-slash 'file' value through and the manifest validator did not raise. Evaluate the value under both PurePosixPath and PureWindowsPath and reject any non-empty anchor — covering POSIX-absolute, Windows drive-relative, Windows absolute, and rooted-without-drive — in both the registrar guard and the manifest validator. The registrar join now uses the raw 'file' string so native separators are handled by the resolve()/relative_to() containment check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: validate command 'file' field against path traversal in registrar CommandRegistrar.register_commands() read each command body from source_dir / cmd_file without validating the manifest 'file' field, unlike the parallel skill and preset readers which already reject absolute paths and '..' traversal. A malicious extension/preset/bundle manifest with file: ../../../etc/passwd (or an absolute path) could read arbitrary host files verbatim into a generated agent command at a predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22). Add the same containment guard at the command read site and reject a traversal/absolute 'file' at manifest-load time in ExtensionManifest._validate() for defense-in-depth, plus regression tests for both the read path and the manifest validator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test/fix: address review — robust absolute-path test and tolerant reads - register_commands(): use is_file() instead of exists() and skip the command if read_text() raises (directory or non-UTF8 file), aligning with the other command/skill readers. - Traversal tests: point the absolute-path payload at the real temp secret.txt (guaranteed to exist on all platforms) instead of /etc/passwd, so the absolute-path guard is genuinely exercised and the test fails if it regresses, rather than passing because the target happens not to exist (e.g. on Windows runners). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: rename traversal fixtures to avoid CodeQL secret-storage false positive The regression fixtures named an out-of-tree file secret.txt with TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic treated that read content as sensitive and followed the static path into the pre-existing write_text sinks in _write_registered_output, raising false 'clear-text storage of sensitive information' alerts on PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER and drop /etc/passwd payloads; the test semantics (a file outside source_dir must never be read into a generated command) are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: reject Windows drive-relative 'file' values in traversal guards is_absolute() is False for Windows drive-relative paths like C:outside.txt, which contain no '..' yet resolve against the process CWD on that drive — bypassing the containment guard on Windows. Evaluate the 'file' value under PureWindowsPath as well so both the registrar runtime guard and the manifest-load validator reject drive letters (and backslash '..' segments) cross-platform. Extend the regression tests with drive-relative cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False (no drive), so the prior native-Path check let a leading-slash 'file' value through and the manifest validator did not raise. Evaluate the value under both PurePosixPath and PureWindowsPath and reject any non-empty anchor — covering POSIX-absolute, Windows drive-relative, Windows absolute, and rooted-without-drive — in both the registrar guard and the manifest validator. The registrar join now uses the raw 'file' string so native separators are handled by the resolve()/relative_to() containment check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: harden register_commands inputs and tighten manifest 'file' validation Address review feedback on #3088: - register_commands(): skip non-string/empty 'file' values instead of raising TypeError, and hoist source_dir.resolve() out of the per-command loop. - ExtensionManifest._validate(): reject 'file' values with leading/trailing whitespace with a clear ValidationError instead of a confusing missing-file failure later. - tests: add non-string 'file' and whitespace cases; use yaml.safe_dump with explicit utf-8 encoding in the manifest validation test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: align runtime '..' policy, correct comment, dedupe test helper Address review feedback on #3088: - register_commands(): also reject '..' segments under both POSIX and Windows semantics, keeping runtime policy consistent with ExtensionManifest._validate() and the skill/preset readers (not just relying on the resolve()/relative_to() containment backstop). - Replace the version-dependent is_absolute() claim in the extensions.py comment with the actual portability rationale (native Path is OS- dependent; C:foo is anchored but not absolute). - Extract the duplicated leak-detection assertion into _assert_no_marker_leak() and add an in-bounds '..' payload that exercises the new runtime '..' rejection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Extract shared path-safety policy and warn on unreadable command files Introduce relative_extension_path_violation() in _utils.py as the single source of truth for the extension-relative `file` path-safety policy, and use it from both the runtime registrar guard (agents.py) and the manifest-load validator (extensions.py) so the two cannot drift. Warn (instead of silently skipping) when an in-bounds command file exists but cannot be read/decoded, surfacing misconfigured extensions. Add unit tests for the shared helper, a read-skip warning test, and make the in-bounds `..` test create its target file so the skip is attributable to the `..` rejection rather than file absence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Retrigger CI Empty commit to re-trigger code scanning / CodeQL analysis on the PR merge ref. Assisted-by: GitHub Copilot CLI (model: Claude Opus 4.8, autonomous) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-22 10:25:29 -05:00
Ali jawwad	f9c6cf83e5	fix(presets): preserve argument-hint in preset SKILL.md generation (#2978 ) * fix(presets): preserve argument-hint in preset SKILL.md generation Preset-provided and extension-override commands that declare `argument-hint:` in their frontmatter had it dropped from the generated Claude SKILL.md, and it was re-dropped when a preset was removed and its overridden skill restored. This is the preset-side analog of the extension fix in #2903 / #2916. Factor the argument-hint carry-over into a shared CommandRegistrar.apply_argument_hint() helper and apply it at the four preset skill-generation sites (register, reconcile override-restore, and the core/extension unregister-restore paths). The extension path from The helper writes argument-hint into the frontmatter dict before serialization (so a folded multi-line description cannot be split into invalid YAML) and only for integrations that support it (those exposing inject_argument_hint -- currently Claude), leaving build_skill_frontmatter's shared shape unchanged for every other agent. Core templates carry no argument-hint, so the core-restore path is a no-op. No behavior change for non-Claude agents or the core path. Add regression tests covering a folding description (Claude) and the non-Claude gate (codex). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(presets): address review - guard skill_frontmatter type and tighten apply_argument_hint annotations Add a symmetric isinstance(skill_frontmatter, dict) guard so the helper stays a safe no-op if a caller passes a non-dict, and annotate the parameters as Dict[str, Any] with an optional integration to match real call-site usage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 09:52:13 -05:00
Huy Do	f5f76160a3	feat: surface gate detail in the workflow run/resume --json payload (#2965 ) * feat: surface gate detail in the workflow run/resume --json payload A paused run was indistinguishable from any other pause in the machine-readable outcome, and the gate's prompt/options/choice never left the human-facing stream. Record each step's type in the run state's step results (one engine line) and, when the run sits at a gate, add a gate block (step_id/message/options/choice) to the payload so orchestrators can drive review gates without parsing stdout. Reference implementation for the proposal in #2964. Addresses #2964 * fix(workflow): only surface gate detail in --json when the run is paused Address review (#2965): _gate_outcome() emitted a gate block whenever current_step_id pointed at a gate step. Since RunState.current_step_id is never cleared on completion, a completed/failed run whose last step was a gate leaked stale gate detail in run/resume/status --json. Guard on status == paused. Also assert CLI success in the _run_json test helper before JSON-parsing, and add direct coverage for the suppression guard. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(workflows): surface gate block on aborted runs; stabilize message Address Copilot review: - `_gate_outcome` now also surfaces the gate block when a run is `aborted` by a gate rejection (`on_reject: abort`), not only when `paused`. Abort is the only path that sets ABORTED and it leaves current_step_id on the gate, so an orchestrator can read the recorded `choice` for the stop. - Coerce `message` to a string (it may be a non-string YAML literal that GateStep only coerces for interpolation) so the JSON schema stays stable. - Tests: add a CLI-level aborted-path test, a message-coercion test, and extend the suppression test to allow `aborted`; share the run helper via `_invoke_json` to avoid duplicating the invoke boilerplate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): assert clean exit in gate-abort JSON test Address Copilot review: the gate-abort test parsed stdout without first asserting the CLI exited cleanly, so an invoke failure would surface as an opaque JSON decode error. Route it through `_run_json` (which asserts exit_code == 0 before parsing) and drop the now-redundant `_invoke_json` helper — a gate abort emits the payload and returns, so the run exits 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix: use result.output in run-helper assert; document step_data shape Address Copilot review: - `_run_json` asserted with `result.stdout` in the message, but under `--json` step output is redirected off stdout — the useful diagnostics live on `result.output`. Switch the assertion message to `result.output` (the JSON parse still reads stdout), matching the other CLI tests. - `StepContext.steps` documented a 5-key entry shape; the engine now also persists `type` and `status`. Update the docstring to the canonical 7-key shape so step authors/debuggers see the real record. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): align gate-abort JSON test with aborted→exit-1 After rebasing onto main, a gate abort now emits the --json payload and then exits non-zero (`_run_outcome_exit_code` maps aborted → 1, from the merged exit-code work). Give `_run_json` an `expected_exit` parameter (default 0) so the abort case asserts exit 1 while the paused/completed cases stay at 0 — keeping a single shared helper rather than duplicating the invoke boilerplate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): backward-compat gate detection + normalize gate options Address Copilot review: - A run paused by an older version has no persisted step `type`, so `_gate_outcome` would never surface its gate block on resume. Add `_is_gate_step`: prefer the `type` field, but when it is absent fall back to the gate's unique output signature (`on_reject`, written only by GateStep). A record with a different known `type` is still not a gate. - Normalize `options` to a list of strings (mirroring the `message` coercion) so an unvalidated workflow with non-string options can't destabilize the JSON schema. - Tests: options coercion, type-less gate detection, and a type-less non-gate negative case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): normalize non-list gate options to a stable list[str] Address Copilot review: the prior options normalization only mapped a `list`, returning the raw value for any other shape (scalar/tuple), which contradicted the "stable list[str]" intent. Extract `_normalize_gate_options`: None stays None; list/tuple maps each element through str; any other scalar becomes a single-element list (a bare string is one option, never iterated character-by-character). The emitted schema is now always list[str] \| None. Extend the options test to cover list, tuple, bare string, numeric scalar, and None. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): normalize gate choice to str; portable plain-gate test Address Copilot review: - `_gate_outcome` normalized `message` and `options` but passed `choice` through as-is; an unvalidated gate can record a non-string `choice`, which contradicts the stable-schema rationale. Coerce `choice` to `str \| None` (None still means "no decision yet"), consistent with the other two fields. Adds a focused choice-coercion test. - The plain (no-gate) test workflow used `run: "true"`, which fails under cmd.exe on Windows (ShellStep uses shell=True). Use the cross-platform `run: "exit 0"` (matching the exit-code suite's workflows). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-22 07:05:54 -05:00
Manfred Riem	487af97864	feat: add `specify bundle` command (#3070 ) * docs: dogfood Spec Kit — bundler SDD artifacts + constitution Scaffold Spec Kit (--integration copilot) and run the full SDD workflow against the `specify bundle` subcommand feature: - spec.md (4 user stories, 31 FRs, 8 success criteria) + clarifications - plan.md, research.md, data-model.md, contracts/, quickstart.md - tasks.md (43 dependency-ordered tasks, organized by user story) - Spec Kit Constitution v1.0.0 (code quality, testing, UX, performance, dependency/security principles) derived from deep codebase analysis - plan Constitution Check + tasks grounded against the ratified principles Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(bundler): add `specify bundle` subcommand for role-based setups Implements the Spec Kit Bundler as a `specify bundle ...` subcommand group that calls existing primitive machinery in-process with zero new dependencies, per the v1.0.0 constitution (Principles I-V). Adds the `specify_cli.bundler` package (models, services, lib helpers) and the `commands/bundle` Typer group wiring search, info, list, install, update, remove, validate, build, init, and catalog list/add/remove (with --json and --offline). Includes manifest/catalog schemas, version + integration-clash gating, discovery-only refusal, idempotent install with atomic rollback, non-collateral removal, and offline-first catalog resolution. Ships an 82-test suite (contract/unit/integration), four sample role bundles (product-manager, business-analyst, security-researcher, developer), README "Bundles" docs, and an AGENTS.md pitfall on the test-venv gotcha. Marks tasks T001-T043 complete and records follow-ups T044 (live in-process primitive dispatch) and T045 (install from a local artifact path). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(contributing): document running the full test suite via project .venv Add a "Running the full test suite" subsection under Automated checks covering `uv pip install -e ".[test]"` + `.venv/bin/python -m pytest`, with the shared/global editable-install contamination caveat that mirrors the AGENTS.md pitfall. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(bundler): wire real in-process primitive install + local-artifact install Closes the two follow-ups left after the initial bundler landing. T044 — DefaultPrimitiveInstaller now performs real installs through existing machinery instead of raising "use the primitive command" errors: - presets/extensions install via their reusable managers (install_from_directory / install_from_zip); bundled assets install fully offline, catalog assets are fetched only when the network is allowed. - workflows/steps delegate to the existing `workflow add` / `workflow step add` command callables in-process (project root as cwd), avoiding any duplicated download/validation logic (Principle I). - `--offline` is threaded through DefaultPrimitiveInstaller(allow_network=…) so network-only kinds refuse with an actionable message rather than silently reaching out. T045 — `specify bundle install` now accepts a local path (a built .zip artifact, a bundle directory, or a bundle.yml) and installs directly without consulting the catalog stack; bundle-ids still resolve via the stack. Adds 13 tests (routing, offline gating, local-source resolution, and an end-to-end offline build → install → list → remove of the bundled agent-context extension). Bundler suite: 95 passing; ruff clean. Marks T044 and T045 complete in tasks.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(bundler): append Phase 8 convergence tasks from converge assessment Ran the converge command: assessed the codebase against spec.md, plan.md, tasks.md, and the v1.0.0 constitution. Appended 7 traceable gap-closure tasks (T046–T052) as a new "Phase 8: Convergence" section. Append-only — no existing tasks were modified and no application code was changed. Findings: 1 CRITICAL (Constitution III — bundle group undocumented under docs/reference/), 3 HIGH (FR-005/SC-007 validate references; FR-009/SC-002 info expansion; FR-012 install-time init), 3 MEDIUM (FR-013 integration precedence; FR-020 surface overlaps; FR-028 update refresh). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Implement Phase 8 convergence tasks (T046–T052) Close the gaps the converge command found between the bundler spec/plan/ constitution and the code: - T046: add docs/reference/bundles.md documenting the full `specify bundle` command group; link it from docs/reference/overview.md (Constitution III). - T047: wire a reference checker into `bundle validate` (services/references.py); online runs fail and name unresolved component references, offline runs warn. - T048: expand `bundle info` to enumerate the full component set (versions, preset priority/strategy) plus the bundle integration — info == install. - T049/T050: `bundle install`/`bundle init` now scaffold an uninitialized project via the existing `specify init` machinery, choosing the integration by precedence (override → bundle-declared → Copilot + OS default script type). - T051: surface foreseeable component overlaps during info and install. - T052: `bundle update` refreshes already-installed components via a new refresh path in install_bundle, preserving primitive-level overrides. Adds unit/contract/integration coverage (107 tests pass). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * converge: append Phase 9 (T053) — surface bundle trust indicator Re-run of converge after Phase 8. The seven Phase 8 tasks are verified closed. One residual partial gap remains: the `verified`/trust indicator (FR-010, FR-027) is exposed only in `bundle info --json`, absent from `bundle search` (the primary discovery surface) and `bundle info` text. Appended as a single new task for implement to complete. Append-only; no code changed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Implement T053 — surface bundle trust indicator in discovery `bundle search` (text + JSON) and `bundle info` (text + JSON) now expose each catalog entry's verification/trust level (verified vs community), so users can judge a bundle's trust before installing, per FR-010 / FR-027. Previously `verified` was only present in `bundle info --json`. Adds contract coverage; 108 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: dogfood Spec Kit — bundler SDD artifacts + constitution Scaffold Spec Kit (--integration copilot) and run the full SDD workflow against the `specify bundle` subcommand feature: - spec.md (4 user stories, 31 FRs, 8 success criteria) + clarifications - plan.md, research.md, data-model.md, contracts/, quickstart.md - tasks.md (43 dependency-ordered tasks, organized by user story) - Spec Kit Constitution v1.0.0 (code quality, testing, UX, performance, dependency/security principles) derived from deep codebase analysis - plan Constitution Check + tasks grounded against the ratified principles Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(bundler): add `specify bundle` subcommand for role-based setups Implements the Spec Kit Bundler as a `specify bundle ...` subcommand group that calls existing primitive machinery in-process with zero new dependencies, per the v1.0.0 constitution (Principles I-V). Adds the `specify_cli.bundler` package (models, services, lib helpers) and the `commands/bundle` Typer group wiring search, info, list, install, update, remove, validate, build, init, and catalog list/add/remove (with --json and --offline). Includes manifest/catalog schemas, version + integration-clash gating, discovery-only refusal, idempotent install with atomic rollback, non-collateral removal, and offline-first catalog resolution. Ships an 82-test suite (contract/unit/integration), four sample role bundles (product-manager, business-analyst, security-researcher, developer), README "Bundles" docs, and an AGENTS.md pitfall on the test-venv gotcha. Marks tasks T001-T043 complete and records follow-ups T044 (live in-process primitive dispatch) and T045 (install from a local artifact path). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(contributing): document running the full test suite via project .venv Add a "Running the full test suite" subsection under Automated checks covering `uv pip install -e ".[test]"` + `.venv/bin/python -m pytest`, with the shared/global editable-install contamination caveat that mirrors the AGENTS.md pitfall. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(bundler): wire real in-process primitive install + local-artifact install Closes the two follow-ups left after the initial bundler landing. T044 — DefaultPrimitiveInstaller now performs real installs through existing machinery instead of raising "use the primitive command" errors: - presets/extensions install via their reusable managers (install_from_directory / install_from_zip); bundled assets install fully offline, catalog assets are fetched only when the network is allowed. - workflows/steps delegate to the existing `workflow add` / `workflow step add` command callables in-process (project root as cwd), avoiding any duplicated download/validation logic (Principle I). - `--offline` is threaded through DefaultPrimitiveInstaller(allow_network=…) so network-only kinds refuse with an actionable message rather than silently reaching out. T045 — `specify bundle install` now accepts a local path (a built .zip artifact, a bundle directory, or a bundle.yml) and installs directly without consulting the catalog stack; bundle-ids still resolve via the stack. Adds 13 tests (routing, offline gating, local-source resolution, and an end-to-end offline build → install → list → remove of the bundled agent-context extension). Bundler suite: 95 passing; ruff clean. Marks T044 and T045 complete in tasks.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(bundler): append Phase 8 convergence tasks from converge assessment Ran the converge command: assessed the codebase against spec.md, plan.md, tasks.md, and the v1.0.0 constitution. Appended 7 traceable gap-closure tasks (T046–T052) as a new "Phase 8: Convergence" section. Append-only — no existing tasks were modified and no application code was changed. Findings: 1 CRITICAL (Constitution III — bundle group undocumented under docs/reference/), 3 HIGH (FR-005/SC-007 validate references; FR-009/SC-002 info expansion; FR-012 install-time init), 3 MEDIUM (FR-013 integration precedence; FR-020 surface overlaps; FR-028 update refresh). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Implement Phase 8 convergence tasks (T046–T052) Close the gaps the converge command found between the bundler spec/plan/ constitution and the code: - T046: add docs/reference/bundles.md documenting the full `specify bundle` command group; link it from docs/reference/overview.md (Constitution III). - T047: wire a reference checker into `bundle validate` (services/references.py); online runs fail and name unresolved component references, offline runs warn. - T048: expand `bundle info` to enumerate the full component set (versions, preset priority/strategy) plus the bundle integration — info == install. - T049/T050: `bundle install`/`bundle init` now scaffold an uninitialized project via the existing `specify init` machinery, choosing the integration by precedence (override → bundle-declared → Copilot + OS default script type). - T051: surface foreseeable component overlaps during info and install. - T052: `bundle update` refreshes already-installed components via a new refresh path in install_bundle, preserving primitive-level overrides. Adds unit/contract/integration coverage (107 tests pass). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * converge: append Phase 9 (T053) — surface bundle trust indicator Re-run of converge after Phase 8. The seven Phase 8 tasks are verified closed. One residual partial gap remains: the `verified`/trust indicator (FR-010, FR-027) is exposed only in `bundle info --json`, absent from `bundle search` (the primary discovery surface) and `bundle info` text. Appended as a single new task for implement to complete. Append-only; no code changed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Implement T053 — surface bundle trust indicator in discovery `bundle search` (text + JSON) and `bundle info` (text + JSON) now expose each catalog entry's verification/trust level (verified vs community), so users can judge a bundle's trust before installing, per FR-010 / FR-027. Previously `verified` was only present in `bundle info --json`. Adds contract coverage; 108 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address PR review — annotations, Windows paths, HTTPS, errors, reproducible builds Resolves automated review feedback on github/spec-kit#3070: - validator: drop redundant string-quoting on ReferenceChecker's `str \| None` return so the annotation evaluates as a real union under `from __future__ import annotations`. - adapters: normalize Windows drive-letter paths (e.g. C:\...) to the local-file branch so offline file catalogs resolve on Windows. - adapters: enforce HTTPS (HTTP only for localhost) and require a host on remote catalog URLs before any network call, mirroring specify_cli.catalogs URL validation (MITM/downgrade protection). - adapters: pass `origin` to loads_json for local files and HTTP payloads so JSON parse errors name the real source instead of <string>. - manifest: parse component `priority` defensively, raising an actionable BundlerError on non-integer values instead of a raw ValueError. - packager: write zip members with a fixed timestamp + permissions so identical inputs yield byte-for-byte identical artifacts (genuinely reproducible builds), and strengthen the determinism test accordingly. Adds regression tests for priority validation, plain-HTTP/host rejection, and byte-level artifact reproducibility (111 bundler tests pass; ruff clean). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address PR review round 2 — nested output dir + file:// URLs - packager: when --output points inside the bundle directory, exclude the whole output subtree from collection so previously-built artifacts are never re-packaged (prevents broken reproducibility and unbounded growth). - adapters: resolve file:// catalog URLs via url2pathname and preserve netloc, so Windows file URLs (file:///C:/...) and UNC shares (file://server/share) resolve correctly instead of dropping the host or producing /C:/x. Adds regression tests for nested-output exclusion and file:// resolution (113 bundler tests pass; ruff clean). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address PR review round 3 — discovery UX + hardening - bundle search/info: fall back to the built-in/user catalog stack instead of requiring a Spec Kit project, so discovery works in a fresh directory (and the README/quickstart examples now match actual behavior). install still auto-initializes a project as before. - packager: traverse with os.walk(followlinks=False) and prune symlinked directories before descending, so a symlink-to-dir can no longer pull in out-of-tree files (which previously turned "skip symlinks" into a hard ensure_within() failure and did extra filesystem work). - records: parse contributed-component priority defensively, raising an actionable BundlerError on a corrupt records file instead of leaking a raw ValueError/traceback. - installer: give install_bundle's manifest parameter an explicit BundleManifest \| None type for a clearer, safer service API. Adds regression tests for project-less search/info, symlinked-dir pruning, and corrupt-priority records (117 bundler tests pass; ruff clean). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address PR review round 4 + markdownlint exclusions Review fixes: - bundle info: expand the manifest regardless of install policy so discovery-only bundles remain inspectable (only install is refused). - _download_manifest: handle local .zip download_url by extracting bundle.yml (via _local_manifest_source), and add a real remote HTTPS fetch path using the shared authenticated, redirect-validated open_url client (HTTPS enforced on the initial URL and every redirect; offline still refuses). - _run_init: thread the --offline flag through to the init callback so `bundle install/init --offline` never performs network init. - conflict.ConflictReport: use field(default_factory=list) and drop the None + __post_init__ workaround. - CatalogSource.from_dict: parse priority defensively, raising an actionable BundlerError naming the source + offending value instead of a raw ValueError. markdownlint: - Exclude .specify/, .github/, and specs/ (and their subdirectories) from markdownlint so the in-flight dogfooding scaffolding doesn't trip the linter. Adds regression tests for discovery-only info, local-zip download_url, and non-integer catalog priority (120 bundler tests pass; ruff clean; the PR's own markdown lints clean). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address PR review round 5 + ignore generated files in whitespace check Review fixes: - packager: exclude any prior build artifact for this bundle (matching <id>-.zip), not just the current output path, so older artifacts next to bundle.yml are never re-packaged. - docs(bundles): correct the note — `search` and `info` work without a project (they fall back to the built-in/user catalog stack); only list/update/remove/ catalog require an initialized project. CI / generated files: - .gitattributes: mark the generated dogfooding scaffolding (.specify/, the speckit .github agent/prompt files, copilot-instructions.md, specs/) with -whitespace so `git diff --check` (the Lint workflow's whitespace gate) stops flagging emitted trailing whitespace. These files are produced by `specify init` and are scrubbed before merge. Adds a regression test for prior-artifact exclusion (121 bundler tests pass; ruff clean). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> fix(bundler): collision-resistant catalog ids, canonical local paths, explicit uninstalled result Addresses review round 6 (PR #3070): - catalog_config._derive_id now combines host label with the URL path stem so multiple catalogs from the same host get distinct, stable default ids. - add_source canonicalizes local file paths to absolute before persisting, so project config no longer depends on the caller's cwd. - InstallResult gains a dedicated `uninstalled` list; remove_bundle no longer overloads `installed` for removals, and the CLI prints from `uninstalled`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): confine config writes, guard indeterminate integration, fix validate docs Addresses review round 7 (PR #3070): - save_records and catalog_config._write now pass within=project_root to dump_json/dump_yaml, refusing symlinked .specify paths that escape the project (defense-in-depth, matching the rest of the codebase). - resolve_install_plan now fails when a bundle pins an integration but the project's active integration cannot be determined and no explicit --integration override was given, instead of silently adopting the bundle's required integration (FR-019 guard). CLI passes integration_explicit. - docs/reference/bundles.md: corrected the validate semantics to describe the actual best-effort online behavior (unreachable catalogs warn, not fail). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): Windows path handling + review round 8 hardening Fix Windows CI failures: - is_safe_relpath now rejects POSIX-absolute (/abs) and Windows drive-absolute (C:\x, UNC) paths on every OS, instead of passing them through on Windows where os.path.isabs('/abs') is False and Path('/abs').parts yields '\\'. - _download_manifest treats a Windows drive-letter download_url (C:\bundle.yml, which urlparse reads as scheme 'c') as a local file, fixing the empty component set in `bundle info` on Windows. Address review round 8 (PR #3070): - Bundled workflows now install under --offline (locate via _locate_bundled_workflow) instead of being refused unconditionally. - bundle update preserves the original installed_at timestamp on refresh (import find_record; reuse the existing record's timestamp). - _derive_id lowercases the host label so 'Example.com' and 'example.com' produce the same deterministic id. - CatalogEntry.from_dict validates 'tags' is a list and 'verified' is a real boolean, raising BundlerError on invalid untrusted shapes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): normalize SemVer prerelease spellings before version parsing Addresses review round 9 (PR #3070): parse_version and is_semver now apply the same prerelease normalization (mirroring specify_cli._version._normalize_tag) so SemVer spellings like 1.2.3-rc1 / 1.2.3-alpha1 validate and compare consistently across is_semver, parse_version, and satisfies. Leading 'v' is also stripped. Keeps the manifest validator and constraint checks in agreement. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): no collateral removal + enforce manifest-pinned versions Addresses review round 10 (PR #3070): - install_bundle records only the components this bundle actually contributed: freshly-installed components, plus pre-existing ones already owned by this bundle (refresh) or a sibling bundle (shared/refcounted). A component that is installed on disk but tracked by no bundle was installed independently and is no longer attributed, so `bundle remove` won't uninstall it (FR-022). - preset/extension/workflow install paths now verify the active catalog's advertised version matches the manifest-pinned component.version before downloading/installing, raising BundlerError on mismatch so bundles stay reproducible. When a catalog advertises no version the pin can't be enforced and installation proceeds. Added regression tests: independent pre-existing component survives removal; version-mismatch refusal (helper + workflow path). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root (#2892) * feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root Resolve an explicit SPECIFY_INIT_DIR project override once in the core get_repo_root / Get-RepoRoot, so a non-interactive / CI caller can target a member project (the directory containing .specify/) from a monorepo root without cd. Strict by design: the path must exist and contain .specify/, otherwise it hard-errors with no silent fallback. - Single resolver in core; the git feature-branch script inherits it by sourcing core, with no per-extension copies. - PS resolver verifies the resolved path is a directory (Resolve-Path also succeeds for files) so a file value errors as "not an existing directory". - get_feature_paths splits decl/assignment so a SPECIFY_INIT_DIR failure propagates instead of being masked by `local`. - create-new-feature-branch: when core is absent (only git-common loaded) and SPECIFY_INIT_DIR is set, hard-error rather than silently using the git root. - Document SPECIFY_INIT_DIR and SPECIFY_FEATURE_DIRECTORY in the core reference. - Tests for valid/relative/trailing-slash/file/missing/no-.specify targets, feature-axis composition, the no-core guard, and a PowerShell mirror. * fix: guard SPECIFY_INIT_DIR with stale core scripts * docs: clarify SPECIFY_FEATURE_DIRECTORY precedence wording * fix: normalize trailing slash in PowerShell SPECIFY_INIT_DIR resolver Resolve-Path preserves a trailing separator from its input, so a SPECIFY_INIT_DIR ending in a slash returned a root that didn't match the bash resolver (whose `cd && pwd` strips it). That broke test_ps_trailing_slash_tolerated on the CI runners, which do have pwsh. Trim it with TrimEndingDirectorySeparator (no-op on a bare root or a path with no trailing separator). Also fix the misleading test comment: the PowerShell mirror runs on the CI ubuntu/windows runners (they ship pwsh), it is not skipped there. * test: normalize bash path expectations on Windows * docs: clarify SPECIFY_INIT_DIR root helpers * chore: sync dogfooded .specify core scripts with SPECIFY_INIT_DIR Mirror the SPECIFY_INIT_DIR resolver (resolve_specify_init_dir in common.sh) into the committed dogfooding .specify/scripts/bash copies so the git extension's create-new-feature-branch.sh finds an up-to-date common.sh instead of failing with "requires updated Spec Kit core scripts". Fixes the test_init_dir.py CI failures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): harden remote catalog fetch and config parsing - adapters: route catalog HTTP fetches through the shared authenticated client (authentication.http.open_url) so auth.json tokens apply and the Authorization header is stripped on cross-host/downgrade redirects. Reject any redirect that leaves HTTPS via a redirect_validator and re-validate the final URL after redirects, closing the urlopen auto-redirect MITM/downgrade gap. - catalog_config._read: raise an actionable BundlerError when the config top level is not a mapping, 'catalogs' is not a list, or an entry is not a mapping, instead of letting list(<str>) produce a downstream AttributeError. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): tighten record read confinement, policy gate, and precedence Addresses review 4534504799: - records.load_records: confine the read via ensure_within(project_root, ...) so a symlinked/traversal-escaping .specify cannot read arbitrary files outside the project (matches the write path's within= guard). - catalog_config._slug: lowercase so derived catalog ids are deterministic across platforms and case-variant duplicates can't slip past the case-sensitive dup check. - installer.install_bundle: reword the docstring's misleading "atomic on failure" claim to describe the real scoped guarantee (record written only on full success; rollback limited to newly-installed components). - bundle update: enforce the source install_policy like install, refusing to update from a discovery-only source (FR-025). - catalog source precedence: the CLI now passes ~/.specify as the user config dir so project > user > built-in precedence is actually reachable (previously the user scope was silently ignored). - .gitattributes: scope the specs whitespace exemption to the generated dogfooding feature dir (specs/001-spec-kit-bundler/) instead of all of specs/. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): no collateral refresh, catalog id integrity, loud info Addresses review 4534571362: - installer: in refresh mode (bundle update) only re-apply already- installed components that this bundle (or a sibling) owns. Components installed independently and tracked by no bundle are now skipped, never refreshed, so update cannot make collateral changes (FR-022). - catalog.load_catalog_payload: validate each entry's own id is present and matches its enclosing bundles key, rejecting catalogs that would otherwise list a spoofed or unresolvable id. - bundle info: stop swallowing manifest download failures. If the manifest can't be resolved (e.g. --offline against an https download_url or a download failure), surface the error and exit non-zero instead of silently degrading to catalog `provides` counts, preserving the "info == what install applies" guarantee. Added regressions: refresh leaves independently-installed components untouched, catalog id key/field mismatch + missing id rejection, and info exits non-zero when the manifest is unresolvable offline. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): confine catalog-config and integration-marker reads Addresses review 4534716790: two more state reads bypassed the symlink/path-escape confinement that records and the write paths already enforce. - catalog_config._read: validate the config path with ensure_within(project_root, ...) before exists()/read, so a symlinked .specify resolving outside project_root is rejected instead of read. - lib.project.active_integration: confine the .specify/integration.json read the same way; an out-of-tree escape is treated as "not determinable" (returns None) rather than followed. Added regressions covering both via a symlinked .specify pointing outside the project root. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): validate manifest tags, disambiguate derived ids by full host Addresses review 4534768419: - manifest.from_dict: reject a non-list `tags` (e.g. a bare string) instead of splitting it character-by-character, matching the catalog parser and the schema contract (tags = list of strings). - catalog_config._derive_id: derive ids from the full host (TLD included) so example.com and example.net no longer collide on the same id. Updated the affected id assertions. - CHANGELOG: call out the new `specify bundle` command group in the unreleased section (the PR's headline user-facing feature). - .gitattributes: clarify the specs whitespace exemption — the dogfooding feature dir is scrubbed before merge (not retained), so it doesn't weaken checks for kept docs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(gitattributes): retain whitespace exemption for constitution.md The project constitution (.specify/memory/constitution.md) is the one dogfooding artifact carried forward past the pre-merge scrub. Give it its own standalone whitespace exemption so it survives removal of the broader .specify/** generated-scaffolding exemption. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): accurate uninstall count, confine catalog read, safe bundle id Addresses review 4534812056: - installer.remove_bundle: only count a component as uninstalled when installer.remove() actually ran; components already absent on disk are reported as skipped, keeping the uninstalled count accurate. - catalog.load_source_stack: confine the project-scoped .specify config read with ensure_within, so a symlinked .specify/ resolving outside the project root is refused (consistent with the bundler's other guarded reads). - manifest: enforce a filesystem-safe slug for bundle.id in structural validation; packager.build_bundle adds an ensure_within defense-in-depth check so a crafted id can never push the artifact outside the output dir. Also reverts the CHANGELOG entry (the changelog is updated separately). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): validate requires/provides shapes in manifest and catalog Addresses review 4534855443: - manifest: validate requires.tools and requires.mcp as list-of-strings via a shared _parse_str_list helper (also reused for tags), so a bare string like `tools: docker` is rejected with an actionable BundlerError instead of being split character-by-character. - catalog.CatalogEntry.from_dict: validate that `requires` and `provides` are mappings before accessing them, so an untrusted catalog payload with `requires: "..."` raises a named BundlerError rather than escaping as a raw AttributeError traceback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): require README.md when building a bundle artifact Addresses review 4534938014: build_bundle now fails early with an actionable error when README.md is missing, matching the documented artifact contract (manifest + README) instead of silently producing a bundle with no human-facing description. Also reverts CHANGELOG.md to the upstream/main copy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): validate record shapes; drop stale install --refresh claim Addresses review 4534969692: - records.InstalledBundleRecord.from_dict: hard-error when contributed_components is not a list, instead of iterating a corrupt bare string character-by-character. - records.load_records: validate the top-level 'bundles' field is a list and fail with a clear BundlerError when a corrupt file makes it a mapping/string. - PR description: remove the inaccurate "supports --refresh" note from `bundle install` (refresh is the `bundle update` path); docs already omit it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): refuse symlinked .specify, reject bad url schemes, IPv6 ids Addresses review 4534997724: - lib.project.find_project_root: a symlinked .specify is no longer accepted as a project root (is_dir() follows symlinks), matching the confinement the rest of the CLI applies and avoiding confusing downstream failures. - catalog_config.add_source: reject unsupported url schemes (ssh://, ftp://, ...) up front instead of silently treating them as local paths; local paths containing ':' but not '://' are still allowed. - catalog_config._derive_id: derive the host via urlparse().hostname so IPv6 literals, credentials, and ports no longer corrupt the derived id. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): strict semver, narrow artifact skip, preserve priority 0 Addresses review 4535084048: - versioning.is_semver: enforce a full MAJOR.MINOR.PATCH SemVer (with optional pre-release/build) via a dedicated regex, instead of accepting any packaging.version.Version-parseable string (e.g. "1", "1.0"). This makes BundleManifest.structural_errors() reject non-semver versions. - packager: narrow the prior-artifact skip pattern to semver-named zips (<id>-<x.y.z>.zip) so legitimate assets like <id>-assets.zip are still packaged. - primitives (preset + extension install): use an explicit `is None` check so an intentional priority of 0 is preserved instead of being replaced by the default. Adds regressions: non-semver rejection ("1"/"1.0"/"1.2.3.4"), asset-not- excluded vs semver-artifact-excluded, and priority-0 pass-through. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): artifact regex for prerelease+build; clarify integration/priority docs Addresses review 4535132279: - packager: the prior-artifact skip regex now matches semver names carrying both a prerelease and build-metadata segment (e.g. 1.0.0-rc1+build5), so such an existing artifact is excluded rather than re-packaged — keeping builds bounded/deterministic, consistent with is_semver(). - docs/reference/bundles.md: correct the install integration wording. --integration selects the integration when initializing a new project and confirms the target when a pinned bundle's active integration can't be determined; it does NOT override a bundle that targets a specific integration (a mismatch aborts with no changes). - examples/security-researcher README: reword the preset priority note in terms of the numeric comparison (ascending priority order) to avoid inverting the meaning. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): --integration can't bypass clash guard; honest rollback docs Addresses review 4535159341: - bundle install: for an already-initialized project, the project's recorded active integration is now authoritative. --integration no longer overrides it (which let a copilot project install a claude-pinned bundle via `--integration claude`, bypassing the FR-019 clash guard). The override still selects the integration at init time and confirms the target only when the active integration cannot be determined. - docs/reference/bundles.md: reword the install guarantee to match the implementation — no provenance record is written unless the install fully succeeds, and rollback of this run's components is best-effort (removal errors are swallowed, so partial on-disk state may remain). Dropped the inaccurate "atomic / rolls back everything" claim. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): validate component kind/id when loading records Addresses review 4535194606: _component_from_dict now rejects a contributed component whose 'kind' is not a supported component kind or whose 'id' is empty, raising a BundlerError that explicitly flags the records file as corrupt. Previously such a record loaded successfully and only failed later (e.g. in primitive_manager() during bundle remove/update) with a less actionable error. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): address review 4535234003 (7 findings) - versioning: tolerate an uppercase `V` prefix in `_normalize_semver` and `is_semver`, mirroring specify_cli._version tag normalization (V -> v) so `V1.2.3` parses and validates consistently. - validator: import BundlerError and narrow the speckit_version constraint except clause to `BundlerError` only, so programming errors are no longer masked behind an "invalid constraint" message. - bundle update: accept `--integration` and thread it through resolve_install_plan the same way `bundle install` does (override used only when the active integration can't be auto-detected), so integration-pinned bundles can be updated where `.specify/integration.json` is missing/unreadable. - bundle validate: fold reference warnings into `report.warnings` so the ValidationReport is the single warning channel at the CLI layer. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(bundler): make update --integration help assertion ANSI-safe Rich can split the "--integration" option label with ANSI escape codes between the two leading dashes, so the literal substring check failed under CI's terminal settings. Match the un-split option word instead, mirroring how test_bundle_help_lists_all_commands checks bare command names. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): preserve exec bits in artifacts; document install-time pins Addresses review 4535280786: - packager.build_bundle: no longer forces every ZIP member to 0644, which stripped the executable bit from bundled scripts (e.g. extension hook scripts) and could break them after extraction. Permissions are now normalized reproducibly to 0755 when the source file has any execute bit set, otherwise 0644 — identical inputs still yield byte-for-byte identical artifacts. - installer.install_bundle + docs/reference/bundles.md: document that version pins are enforced install-time only. Because primitive is_installed checks are id-based (not version-aware), an already-present component is skipped during install without comparing its on-disk version to the manifest pin; pins are guaranteed applied only on a real install or `bundle update` refresh. Added a regression asserting executable sources map to 0755 and plain files to 0644 in the built artifact. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(bundler): skip exec-bit packager test on Windows Windows filesystems do not carry Unix execute bits, so chmod(0o755) is a no-op and the source file reports no execute bit — the packager then correctly stores the member as 0644. The assertion that an executable source maps to 0755 is only meaningful on POSIX, so skip it on nt rather than asserting platform-specific behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): normalize prerelease spellings inside version constraints Addresses review 4535327154: parse_version() normalized SemVer prerelease spellings (e.g. 1.2.3-rc1 -> 1.2.3rc1) but parse_constraint() passed the constraint to packaging.SpecifierSet unmodified, so ">=1.2.3-rc1" raised InvalidSpecifier even though the same spelling is accepted for installed versions. parse_constraint() now normalizes the version portion of each comma-separated clause via the shared _normalize_semver helper, so prerelease handling is consistent across versions and constraints. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(bundler): validate schema versions and required record identity fields Addresses review 4535351596: - records.load_records: validate the on-disk 'schema_version' (required; forward-compatible across same-major minor bumps) and fail fast with an actionable error on a missing/unknown version, rather than silently parsing a possibly-incompatible format and risking incorrect bundle attribution/removal. - records.InstalledBundleRecord.from_dict: treat missing 'bundle_id' or 'version' as corruption and raise BundlerError, instead of coercing them to empty strings that let later list/remove/update operations behave unpredictably. - catalog_config._read: validate 'schema_version' when present (same-major compatibility) and fail fast on an unsupported version so an incompatible future config shape can't be mis-parsed into a wrong effective catalog stack. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(bundler): scrub generated dogfooding scaffold before merge The bundler feature was developed by dogfooding Spec Kit on itself. Now that the work is complete, remove all generated scaffolding so it does not land in the repository on merge: - specs/001-spec-kit-bundler/ (spec, plan, research, data-model, contracts, quickstart, tasks, checklists) - .specify/ (extensions, integrations, scripts, templates, workflows, feature/init/integration metadata) - .github/agents/speckit..agent.md, .github/prompts/speckit..prompt.md, and .github/copilot-instructions.md (Copilot integration scaffold) Retained: .specify/memory/constitution.md — the single dogfooding artifact carried forward — with its whitespace exemption in .gitattributes. .gitattributes and .markdownlint-cli2.jsonc are reverted to the upstream baseline (plus the constitution whitespace exemption), dropping the now-moot exemptions for the removed scaffold. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Pascal THUET <pascal.thuet@arte.tv>	2026-06-19 17:07:20 -05:00
Pascal THUET	d9370d909d	fix: isolate per-extension failures so one bad extension can't drop the rest (#2951 ) * fix: isolate per-extension failures in register_enabled_extensions_for_agent The per-extension loop had no error isolation: if registering one enabled extension raised (e.g. an OSError writing a command file), the loop aborted and the exception propagated, so every subsequent enabled extension was silently skipped. Callers wrap the whole call in a single best-effort try/except, so the wholesale abort surfaced as one warning while the command still exited 0 — leaving the agent with only a prefix of its extensions. Wrap the per-extension body in try/except: warn (naming the extension) and continue, so one bad extension can no longer drop the others. Add a regression test that forces the first-iterated extension to raise and asserts the rest still register. Closes #2950 * fix(extensions): preserve command registry when skills fail * fix: clarify skill registration warning	2026-06-19 12:41:02 -05:00
Pascal THUET	a17a658bbd	feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root (#2892 ) * feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root Resolve an explicit SPECIFY_INIT_DIR project override once in the core get_repo_root / Get-RepoRoot, so a non-interactive / CI caller can target a member project (the directory containing .specify/) from a monorepo root without cd. Strict by design: the path must exist and contain .specify/, otherwise it hard-errors with no silent fallback. - Single resolver in core; the git feature-branch script inherits it by sourcing core, with no per-extension copies. - PS resolver verifies the resolved path is a directory (Resolve-Path also succeeds for files) so a file value errors as "not an existing directory". - get_feature_paths splits decl/assignment so a SPECIFY_INIT_DIR failure propagates instead of being masked by `local`. - create-new-feature-branch: when core is absent (only git-common loaded) and SPECIFY_INIT_DIR is set, hard-error rather than silently using the git root. - Document SPECIFY_INIT_DIR and SPECIFY_FEATURE_DIRECTORY in the core reference. - Tests for valid/relative/trailing-slash/file/missing/no-.specify targets, feature-axis composition, the no-core guard, and a PowerShell mirror. * fix: guard SPECIFY_INIT_DIR with stale core scripts * docs: clarify SPECIFY_FEATURE_DIRECTORY precedence wording * fix: normalize trailing slash in PowerShell SPECIFY_INIT_DIR resolver Resolve-Path preserves a trailing separator from its input, so a SPECIFY_INIT_DIR ending in a slash returned a root that didn't match the bash resolver (whose `cd && pwd` strips it). That broke test_ps_trailing_slash_tolerated on the CI runners, which do have pwsh. Trim it with TrimEndingDirectorySeparator (no-op on a bare root or a path with no trailing separator). Also fix the misleading test comment: the PowerShell mirror runs on the CI ubuntu/windows runners (they ship pwsh), it is not skipped there. * test: normalize bash path expectations on Windows * docs: clarify SPECIFY_INIT_DIR root helpers	2026-06-19 12:05:42 -05:00
Ed Harrod	98ee02a98b	feat(claude): run /analyze in a forked subagent (#2511 ) * claude: run /analyze in a forked subagent /analyze is explicitly read-only and produces a compact analysis report from heavy artefact reads (spec.md, plan.md, tasks.md). It matches the canonical use case for context: fork — bulk inputs that collapse to a short summary, no need for conversation history. Forking keeps the artefact contents out of the main conversation context, which is the concern raised in #752. Done as a per-command opt-in via FORK_CONTEXT_COMMANDS so other spec-kit commands (which are interactive or have side effects) are unaffected. Refs #752 * claude: apply per-command frontmatter on every skill-generation path argument-hint and fork context were injected only in setup(), so skills produced via post_process_skill_content() directly (presets, extensions) lost them - e.g. a preset overriding speckit-analyze dropped context: fork. Move the per-command injection into post_process_skill_content(), deriving the command stem from the frontmatter name, so all generation paths stay consistent. setup() now just calls post_process_skill_content(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * claude: drop redundant post-process loop from setup SkillsIntegration.setup() already runs post_process_skill_content() on every SKILL.md before writing it, and that method now applies the argument-hint and fork-context injection. The per-file re-process loop in ClaudeIntegration.setup() was therefore a no-op, so inherit the base setup() directly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-19 10:28:45 -05:00
LuoHui1	4eda983950	fix: count worktree branches in git extension numbering (#3054 ) * fix: count worktree branches in git extension numbering * fix: preserve literal plus branch prefixes	2026-06-18 09:40:32 -05:00
Ben Buttigieg	0c29d890ab	feat: add /speckit.converge command (#3001 ) * Add /speckit.converge SDD artifacts and project scaffolding Dogfood the converge feature through Spec Kit's own workflow: - spec.md, plan.md, tasks.md, research, data-model, contracts, quickstart - requirements checklist for the feature - ratified constitution v1.0.0 (.specify/memory) - Specify project scaffolding (.specify/, .github agent + prompt files) Defines a built-in /speckit.converge command that assesses spec/plan/tasks against the codebase and appends remaining work as new tasks (no git, no change tracking, append-only). Implementation not yet started. Excludes unrelated working-tree changes to agents.py, extensions.py, test_extensions.py, catalog.community.json, and README.md. * Implement /speckit.converge command Add the built-in converge command that assesses the codebase against a feature's spec.md, plan.md, and tasks.md and appends remaining unbuilt work as new traceable tasks to tasks.md (append-only; no git, no change tracking). - templates/commands/converge.md: full command body (load artifacts, assess code, classify findings missing/partial/contradicts/unrequested, append '## Phase N — Convergence' tasks with source-ref + gap-type, read-only guardrails, converged branch, handoff, before/after_converge hooks) - Register converge as a core command across all enumeration sites (SKILL_DESCRIPTIONS, _FALLBACK_CORE_COMMAND_NAMES, ARGUMENT_HINTS, and the integration test command lists incl. copilot/generic file inventories) - init.py Next Steps panel + README Core Commands table - tasks.md: T001-T024 complete (T025 manual quickstart pending) Full suite green: 2343 passed. * Record quickstart validation results for /speckit.converge (T025) All six quickstart scenarios validated (GitHub Copilot agent, macOS/zsh): S1 gap->appended traceable task, S2 implement+re-converge, S3 converged leaves tasks.md unchanged, S4 read-only boundaries, S5 missing-prereq stop, S6 cross- integration install (copilot + windsurf). Automated suite: 2343 passed. * Record 2026-06-16 re-verification results for /speckit.converge (T025) * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Fix integration upgrade deleting settings.json and dropping script +x Two upgrade-path bugs surfaced during converge E2E validation: - copilot upgrade stale-deleted .vscode/settings.json because setup() only tracks the file when it creates it; on upgrade the pre-existing file is merged and left untracked, so Phase 2 stale cleanup removed it. Add an integration-level stale_cleanup_exclusions() hook (CopilotIntegration returns {.vscode/settings.json}) and subtract it from stale_keys. - shared .specify/scripts/.sh lost their execute bit because the managed refresh rewrites them with the bundled source mode (often 0o644) and nothing restored perms. Call ensure_executable_scripts() after the managed-refresh block (POSIX only). Add regression tests in TestIntegrationUpgrade covering both fixes (validated to fail without the fixes). fix: resolve markdownlint errors in PR files Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: clean up runtime state files from PR Remove .specify state files that are per-project runtime artifacts: - feature.json, init-options.json, integration.json - manifest files, extension registry, bug artifacts These are generated by 'specify init' and should not be committed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: fold converge artifacts from #3003 and #3005 - Add speckit.converge Copilot agent and prompt files (#3003) - Add regression test for Claude argument hints (#3005) - Remove invalid converge entry from Claude argument hints - Fix documentation removing branch-prefix fallback claims Supersedes: #3003, #3005 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove non-converge specify scaffolding from PR Remove .specify/ artifacts, non-converge .github/agents and prompts, and copilot-instructions.md that were generated by 'specify init' and are not part of the converge command feature. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove SDD spec artifacts from PR Remove specs/001-converge-command/ — the spec/plan/tasks/research SDD artifacts produced while building this feature. spec-kit does not track a specs/ directory on main (those are outputs of running the workflow on the repo, not part of the shipped tool). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove generated Copilot converge command files Remove .github/agents/speckit.converge.agent.md and .github/prompts/speckit.converge.prompt.md — these are generated by 'specify init --integration copilot' from templates/commands/converge.md (all __SPECKIT_COMMAND___/{SCRIPT} tokens are resolved). main tracks no .github/agents or .github/prompts files; the template is the source of truth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> chore: split out unrelated integration-upgrade fix Move the stale_cleanup_exclusions / executable-bit upgrade fix (base.py, copilot, _migrate_commands.py, test_integration_subcommand.py) out of this PR into its own change. This PR is now scoped purely to the /speckit.converge command. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add converge to core command template ordering converge is a core command in SKILL_DESCRIPTIONS but was missing from _CORE_COMMAND_TEMPLATE_ORDER, so it sorted with the fallback rank. Add it after 'implement' to keep core-command ordering consistent across integrations. Addresses review feedback on #3001. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: make converge findings example neutral Replace the self-referential sample evidence text in the Convergence Findings table with a neutral placeholder so agents are less likely to copy nonsensical template-specific findings into real output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs: clarify converge scope and hook outcome wording - Remove FR-specific parenthetical from code-scope rule so it doesn't imply a hard FR-001 reference exists in every feature - Replace unsupported 'pass outcome to hook context' instruction with explicit in-session outcome reporting before hook listing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: align converge task example with tasks format Use (no colon) in the convergence task example so it matches tasks-template formatting and downstream expectations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Clarification of usage Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs: align converge phase/task-id format with tasks template - Use (colon) for consistency with tasks template - Clarify appended task IDs must be zero-padded ( style) - Update checklist example to a concrete zero-padded ID () Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: standardize converge phase heading format Use consistently in converge.md (including the append-only contract section) to match Step 7 and tasks template style. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 14:47:00 -05:00
Ben Buttigieg	84db931f18	fix: preserve .vscode/settings.json and script +x bit on integration upgrade (#3020 ) * fix: preserve .vscode/settings.json and script +x bit on integration upgrade During 'specify integration upgrade', Phase 2 stale-cleanup removes files present in the old manifest but absent from the new one. Copilot's setup() merges into an existing .vscode/settings.json and stops tracking it, so the file was being deleted on upgrade (destroying user settings). Add a stale_cleanup_exclusions() hook that integrations use to protect such conditionally-tracked merge targets. Also restore the executable bit on shared .sh scripts after the managed-refresh step on POSIX. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review on stale-cleanup fix - Normalize stale_cleanup_exclusions() to POSIX before subtracting from manifest keys, so exclusions built with os.path.join / backslashes still match on Windows. - Strengthen test_upgrade_preserves_existing_vscode_settings to add a user-defined key and assert it survives the upgrade (via --force, exercising the merge + stale-cleanup path) instead of the brittle after == before check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 14:22:04 -05:00
Huy Do	affbf5ead5	feat(workflows): add from_json expression filter (#2961 ) * feat(workflows): add from_json expression filter Step outputs captured as strings could never become typed values in templates - the filter set was default/join/map/contains only, so e.g. a fan-out items: could never consume a step's JSON stdout. Add an arg-less from_json pipe filter with parse-or-raise semantics: invalid JSON or non-string input raises a clear ValueError rather than passing through silently. Fixes #2960 * fix(expressions): make from_json strict — reject any arguments Address review (#2961): from_json('x') and from_json() previously fell through to a silent passthrough of the unparsed value. Reject any parenthesized form with a clear error so mis-wired templates fail loudly. Rename test to ...parses_object (JSON under test is an object) and add coverage for the strict no-arguments behavior. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(workflows): document the from_json expression filter Address Copilot review: the user-facing filter references omitted the newly added `from_json` filter. Add it to the ARCHITECTURE.md filter table (with the `{{ steps.emit.output.stdout \| from_json }}` example) and to the filter enumerations in workflows/README.md and docs/reference/workflows.md so the docs match the evaluator's capabilities. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): make from_json strictness reject trailing tokens; fix docstring Address Copilot review: - Strictness only rejected parenthesized forms, so typos like `\| from_json)` or `\| from_json extra` still fell through to the unknown-filter path and silently returned the unparsed value. Match on the leading filter token and require the whole filter to be exactly `from_json`, so every mis-wired form raises. Extend the rejection test to cover the trailing-token cases. - The module docstring claimed "no imports", which is misleading now that the module imports `json`. Reword to state the actual sandbox guarantee: templates cannot do file I/O, import modules, or run arbitrary code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 13:43:26 -05:00
Copilot	00bff788c9	Add `init` workflow step to bootstrap projects like `specify init` (#2838 ) * Initial plan * Add init workflow step to bootstrap projects like `specify init` * Address review: simplify stderr capture and extract VALID_SCRIPT_TYPES * Address review: fail fast on non-empty dir, stdout fallback, README force fix * Populate exit_code/stdout/stderr in non-empty-dir fast-fail * fix: address three unresolved review comments in InitStep - Use `with os.scandir(...)` context manager so the iterator is always closed even when `any()` short-circuits, preventing file-descriptor leaks in long-running workflow runs. - Guard `os.chdir(prev_cwd)` in the `finally` block with a try/except so an `OSError` (e.g. directory deleted) doesn't bypass returning the captured `StepResult`. - Reject non-string `script` values in `validate()` with a clear error message, rather than silently passing them through to become `--script True` at runtime. * Potential fix for pull request finding 'Empty except' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix: remove no_git and branch_numbering options removed upstream The --no-git and --branch-numbering flags were removed from `specify init` on main. Update InitStep to drop these unsupported config fields and fix tests accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review — integration defaults, integration_options, engine-owned dirs - Apply DEFAULT_INIT_INTEGRATION fallback when neither step config nor workflow context provides an integration, so output.integration always reflects the actual integration used. - Add integration_options config field to support --integration-options passthrough (required for generic integration and --skills mode). - Exclude .specify/ from the non-empty directory fast-fail check so that here: true works when the engine has already created its run-state directory before steps execute. - Note: mix_stderr=False is not needed — Click 8.2+ captures stderr separately by default and the existing try/except handles access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: implicitly add --force when only engine-owned dirs exist When the workflow engine creates .specify/workflows/runs/ before steps execute, the directory is technically non-empty. Previously, specify init would prompt for confirmation (hanging in unattended mode) unless the user explicitly set force: true. Now the step detects that only engine-owned directories (.specify/) are present and implicitly adds --force so init proceeds without user interaction. Also fixes the test to exercise the implicit-force path rather than passing force: True explicitly (which bypassed the check entirely). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: derive VALID_SCRIPT_TYPES from shared constant, fail fast on OSError, include all resolved fields in output - Derive VALID_SCRIPT_TYPES from SCRIPT_TYPE_CHOICES in _agent_config so the valid set cannot drift from the specify init CLI. - Fail fast with a clear error when os.scandir() raises OSError (e.g. permission denied) instead of silently treating the directory as empty. - Include preset, force, and ignore_agent_tools in all output dicts (both fast-fail and normal paths) for consistent interpolation and debugging downstream. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: populate stderr from stdout on older Click, fix force comment wording - When Click does not expose result.stderr (older versions where stderr is mixed into stdout), use stdout as stderr on non-zero exit so workflows can consistently read steps.<id>.output.stderr for errors. - Update README inline comment for force: wording to say 'when target directory already exists' rather than 'non-empty directory', matching the actual specify init behavior for the project: form. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: build argv flags before early returns, use any() for dir scan - Move argv flag-building (--integration, --script, --preset, --ignore-agent-tools) before the non-empty-dir and OSError early returns so output['argv'] always reflects the complete command. - --force is appended after the check since it may be set implicitly. - Replace list comprehension with any() generator expression to short-circuit without allocating a full list of DirEntry objects. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: only treat .specify as engine-owned when it is a real directory A file or symlink named .specify should not be excluded from the non-empty check. Use entry.is_dir(follow_symlinks=False) to ensure only an actual directory is considered engine-owned content. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: guard implicit force for engine dirs only, fix integration fallback order - Only set implicit --force when engine-owned directories (.specify/) are actually present. A completely empty directory no longer gets --force added unnecessarily. - Fix integration resolution precedence: resolve step config expression first, then fall back to workflow default (also resolved), then to DEFAULT_INIT_INTEGRATION. Previously, a step expression resolving to falsy would bypass the workflow default entirely. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 11:46:51 -05:00
Huy Do	071f784dfa	feat(workflows): opt-in output_format: json exposes parsed shell stdout as output.data (#2963 ) * feat(workflows): opt-in output_format: json exposes parsed shell stdout as output.data No step that runs external code could hand a typed value to a later step, so e.g. a fan-out could never consume a runtime-computed collection. With output_format: json declared, stdout is parsed and exposed under output.data (raw keys unchanged); a parse failure fails the step with a clear error. Without the key, behavior is unchanged. Reference implementation for the proposal in #2962. Addresses #2962 * test(shell): emit JSON via sys.executable for cross-platform output_format tests Address review (#2963): replace non-portable echo '{...}' (Windows cmd.exe keeps the single quotes, breaking JSON) with the established '"{py}" "{script}"' pattern using sys.executable + a temp script, so the output_format tests pass on the Windows CI matrix. Also make the validate test's run inert (exit 0). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 08:09:17 -05:00
Huy Do	1ee2b626a8	fix: non-zero exit code when a workflow run ends failed or aborted (#2959 ) * fix: non-zero exit code when a workflow run ends failed or aborted workflow run and workflow resume printed Status: failed (or emitted the --json payload) and exited 0, so scripts and CI could not rely on the process exit code. Map terminal outcomes: failed\|aborted -> 1, completed\|paused -> 0, on both the text and --json paths. The previous exit-0-on-failed behavior was pinned by test_workflow_run_failing_yaml_without_project; the pin is updated to the new contract. Fixes #2958 * test: portable exit-code step commands + cover resume failed->exit-1 Address review (#2959): replace non-portable run: 'true'/'false' with 'exit 0'/'exit 1' (Windows cmd.exe has no true/false builtins under shell=True), and add an end-to-end 'workflow resume' test asserting a resumed failed run exits non-zero. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 08:02:00 -05:00
Seiya Kojima	811a3aa447	fix(skills): preserve non-ASCII characters in skill frontmatter (#2917 ) * fix(skills): preserve non-ASCII chars in skill frontmatter Skill SKILL.md frontmatter descriptions containing non-ASCII characters were escaped to \uXXXX / \xXX sequences because yaml.safe_dump() was called without allow_unicode=True. - Add allow_unicode=True to the 7 skill/command frontmatter safe_dump sites (extensions, presets, claude integration) - Add regression tests for the render and extension-install paths Follows the approach of #1936; encoding="utf-8" is already set on the affected write paths, so no encoding change is needed here. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(_utils): add dump_frontmatter helper Centralize skill/command frontmatter YAML serialization into a single _utils.dump_frontmatter helper so no call site can drop allow_unicode or diverge on formatting. Route the 7 existing sites through it and drop a now-unused local yaml import. Switch the extension test fixtures to yaml.safe_dump for parity with the production safe-dump/safe-load codepaths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 07:57:54 -05:00
Jina Park	de18d21b1c	fix: prevent extension self-install from deleting source dir (#2990 ) (#2991 ) * fix: prevent extension self-install from deleting source dir (#2990) `specify extension add <path> --dev --force` permanently deleted the extension directory without registering it when the source path resolved to the extension's own install location (`.specify/extensions/<id>`). With `--force`, `install_from_directory()` removed the existing installation (the source) and then `shutil.copytree()` tried to copy from the now-deleted directory, destroying it and crashing. Add a guard that fails fast with a clear ValidationError when the resolved source path equals the install destination, before any destructive operation runs. Includes a regression test asserting the directory and its contents survive. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix: harden extension self-install guard --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 07:56:17 -05:00
Manfred Riem	75aee19c6e	fix: disable Rich Live transient mode on Windows to prevent PS 5.1 hang (#2938 ) * fix: disable Rich Live transient mode on Windows to prevent PS 5.1 hang PowerShell 5.1's legacy console host does not reliably support VT escape sequences. Rich's Live(transient=True) attempts cursor restoration on context exit, which hangs indefinitely on that console. Set transient=False when sys.platform == 'win32' in both init.py (progress tracker) and _console.py (select_with_arrows). The only cosmetic effect is that progress output remains visible after completion on Windows. Fixes #2927 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: address review feedback on test quality - Use captured['transient'] instead of .get() for clearer KeyError on failure - Source guards now assert both the platform check AND transient=_transient usage - Remove unused imports (MagicMock retained as it's used, removed pytest) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: use regex in source guards for resilience to formatting changes Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: use single DOTALL regex to verify assignment flows into Live() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: skip duplicate tracker print on Windows when transient=False When transient is False, Rich leaves the Live output on screen. The subsequent console.print(tracker.render()) would duplicate it. Gate it behind _transient so Windows users see the tracker exactly once. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 06:48:42 -05:00
Copilot	c52ccd7dc7	Add workflow step catalog — community-installable step types (#2394 ) * Initial plan * Add workflow step catalog: StepRegistry, StepCatalog, CLI commands, and tests Agent-Logs-Url: https://github.com/github/spec-kit/sessions/2885e646-477d-4df8-b9a3-06d8cb29e748 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Potential fix for pull request finding 'An assert statement has a side-effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Address PR review: path traversal, cache robustness, collision check, failed-to-load display - Add resolve()+relative_to() path traversal guards in workflow_step_add and workflow_step_remove to prevent directory escape via step_id - Harden _is_url_cache_valid in both StepCatalog and WorkflowCatalog to coerce fetched_at to float and catch TypeError/ValueError - Check STEP_REGISTRY and StepRegistry before installing to prevent collisions with built-in step types or already-installed steps - Show 'Custom (installed, failed to load)' section in workflow step list for steps in the registry that failed to load into STEP_REGISTRY * Fix StepRegistry shape validation and StepCatalog empty-YAML handling Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Polish: rename _default to default_registry, strengthen unreadable-file test Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address PR review: atomic install, hostname validation, cache resilience, no dynamic imports in list/info Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix shutil.move with existing step_dir: remove before move to avoid subdirectory nesting Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Call load_custom_steps at execution time; enforce hostname in _safe_fetch and _validate_url Agent-Logs-Url: https://github.com/github/spec-kit/sessions/73865880-fb25-4061-a43e-4e4b4d1c4de6 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Wrap YAML parsing in try/except; atomic step install via os.rename() under same fs Agent-Logs-Url: https://github.com/github/spec-kit/sessions/ff915bc5-ec7e-4e6a-b505-35f5795250df Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Validate YAML root is a dict in _load_catalog_config and workflow_step_add; fix WorkflowCatalog hostname validation Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix load_custom_steps() package imports and add reserved step ID validation * Move _re/_sys imports out of loop and _RESERVED_STEP_IDS to module level * Address review: collision-resistant module names, extra_files support, remove orphan dir * Harden extra_files: warn on non-dict, resolve symlinks in path traversal check * Switch _safe_fetch and StepCatalog._fetch_single_catalog to use open_url for auth consistency * Harden step_id validation against path-segment tricks; raise on StepRegistry.save() OSError * Clean up sys.modules on broken step packages; handle StepValidationError in step add/remove * Address review thread: int-coerce priorities, sys.modules cleanup, _require_specify_project, registry-first remove * fix: normalize workflow step catalog metadata fallbacks * fix: address latest workflow step and catalog review findings * Handle non-string extra_files keys in workflow step add * Harden StepRegistry symlink reads and extra_files path/URL validation * Harden custom step loader and step remove against symlinks and OSError * Fix StepCatalog.search() to coerce non-string fields before joining * Fix WorkflowCatalog YAML parsing error handling and isinstance checks * Harden step registry save and custom step/catalog ID handling * Harden cache validation and staging OSError handling * Address review: reorder symlink guard and split mixed test - Move symlink-parent check before is_dir() in load_custom_steps() so we never stat an external target through a symlink - Split test_get_merged_steps_normalizes_list_ids_to_strings into two focused tests: one for list-id normalization, one for get_step_info return values Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review: symlink-before-stat in loader, restore registry on rmtree failure - load_custom_steps(): check is_symlink() before is_dir() on step directories so symlinked entries are skipped without statting external targets - workflow_step_remove: restore the registry entry when shutil.rmtree() fails so filesystem and registry state stay consistent and a future 'step add' isn't blocked Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Harden step_id validation and file-write error handling - _validate_step_id_or_exit: reject whitespace-only/padded IDs, Windows-invalid characters (<>:"\|?), control characters, trailing dots/spaces, and Windows reserved device names (con, nul, etc.) - Wrap step.yml/__init__.py staging writes in OSError handler - Wrap extra_files disk writes (mkdir + write_bytes) in OSError handler that names the failing relative path - Registry rollback on rmtree failure: restore verbatim metadata and emit a warning if the restore itself fails Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Address review: cache symlink guard, verbatim registry rollback, Windows test fix - StepCatalog: add _is_cache_path_safe() guard that checks for symlinks in .specify/workflows/steps/.cache path; skip cache reads and writes when any component is symlinked to prevent writes outside project root - Registry rollback: write metadata directly to registry.data['steps'] and call save() instead of using add() which overwrites timestamps - temp_dir fixture: use ignore_errors=True on Windows to avoid flaky teardown from locked file handles (WinError 32) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Simplify exec_module call by removing redundant nested try/except Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix empty YAML tolerance in WorkflowCatalog.add_catalog, scope ignore_errors to Windows - WorkflowCatalog.add_catalog(): treat None from yaml.safe_load() (empty file) as an empty mapping instead of raising 'corrupted' - temp_dir fixture: limit ignore_errors to sys.platform == 'win32' so real cleanup issues surface on non-Windows platforms Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Chain exceptions in _load_catalog_config for both catalog classes Add 'from exc' to preserve root cause in tracebacks while keeping clean user-facing messages. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make default catalog tests hermetic by isolating HOME Monkeypatch Path.home() to project_dir and clear catalog env vars so tests don't break on machines with a real ~/.specify/step-catalogs.yml or ~/.specify/workflow-catalogs.yml. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix falsy ID handling in _get_merged_steps for list-based catalogs Check for None explicitly instead of using 'or' which drops valid falsy IDs like 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Compare reserved step IDs case-insensitively for filesystem safety On case-insensitive filesystems (Windows, common macOS), variants like STEP-REGISTRY.JSON would collide with the actual registry file. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add explanatory comments to intentional empty except blocks Document why cache-read failures are silently ignored in both WorkflowCatalog and StepCatalog _fetch_single_catalog methods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-16 18:03:45 -05:00
Pascal THUET	9cd20c6c25	feat(dev): add integration scaffolder (#2685 ) * feat(dev): add integration scaffolder * fix(dev): address integration scaffold review feedback * fix(dev): address scaffold follow-up review * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(dev): default scaffolded integrations to multi_install_safe = False The scaffold template emitted `multi_install_safe = True` alongside a placeholder `context_file = "AGENTS.md"`. Registered as-is, that violates the registry contract (test_safe_integrations_have_distinct_context_files): codex already pairs AGENTS.md with multi_install_safe = True, so the generated boilerplate would collide on first registration. Default the scaffold to False (matching IntegrationBase) so generated code is registry-test-friendly out of the box; contributors opt in once they pick a unique context_file. Aligns the generated test skeleton and both scaffold tests, which previously contradicted each other (one expected True, one False). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): harden scaffold writes and accept case-insensitive --type - Guard scaffold_integration() against symlinked target directories: walk each path component under the repo root and refuse symlinked dirs, then confirm the write destination resolves inside the repo (mirrors the manifest directory guard). Prevents scaffolding outside the repo when a contributor's integrations/tests path is symlinked. - Make the `--type` click.Choice case-insensitive so `--type YAML` is accepted, matching scaffold_integration()'s strip()/lower() normalization instead of rejecting at the CLI layer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): report scaffold filesystem failures as a clean CLI error The `dev integration scaffold` command only caught FileExistsError/ValueError, so an OSError raised during mkdir()/write_text() (permission denied, read-only checkout, a path component that is a file, ...) bubbled up as a traceback instead of a clean error + exit code. Broaden the handler to OSError (which also covers FileExistsError) and add coverage for the filesystem-error path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): move scaffold command under integration * fix(dev): roll back partial scaffold writes * fix(dev): correct lint docs and generated test docstring - local-development.md: ruff check src/ is enforced in CI, not absent - scaffolded test docstring: drop misleading 'scaffold' wording * fix(scaffold): create only leaf integration directory --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 17:48:40 -05:00
Alicia Sykes	6d057b6239	fix(tests): don't run PowerShell tests via WSL-interop powershell.exe (#2971 ) * fix(tests): don't run PowerShell tests via WSL-interop powershell.exe * fix(tests): applies copilot feedback, with rename	2026-06-16 17:36:24 -05:00
Ahmet TOK	1150d32aee	Add Zed integration (#2780 ) * feat: add Zed integration * fix: update integrations stats grid to 31 for consistency * fix: address Copilot review feedback - Remove non-actionable --skills flag from ZedIntegration (Zed is always skills-based, like Agy) - Align zed_skill_mode predicate with ai_skills for consistency across init output and hook rendering - Consolidate claude/cursor/zed slash-skill return blocks in _render_hook_invocation to reduce duplication - Override test_options_include_skills_flag for Zed (no --skills flag) * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address Copilot review round 2 - Make zed_skill_mode unconditional in hook rendering (Zed is always skills-based, no --skills option) - Add test_init_persists_ai_skills_for_zed that exercises the actual CLI init path and verifies HookExecutor renders /speckit-plan without manual init-options manipulation * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address copilot review feedback for zed integration - Update integration count from 31 to 33 in docs/index.md (32 integrations + Generic) - Make zed_skill_mode unconditional to match extensions.py behavior - Consolidate slash-skill integrations into a set for consistency - Move os import to module level in test_integration_zed.py * fix: refine slash-skill logic and ai-skills validation - Fix slash-skill integrations: Claude/Cursor require ai_skills=true; Zed/Agy/Devin are always skills - Allow --ai-skills with --integration (not just --ai) to fix validation error * fix: remove unused variables and update ai-skills help text - Add agy_skill_mode and devin_skill_mode variables to fix F841 lint error - Use all skill mode variables in the slash-skill conditional check - Update --ai-skills help text to reflect it works with --integration too * fix: add trae_skill_mode to hook invocation for consistency Trae is a SkillsIntegration like Zed/Agy/Devin, so it should also be treated as always-skills-based in hook invocation rendering. * fix: make Agy always skills-based for consistency AgyIntegration is a SkillsIntegration subclass with no --skills option, so it should be treated as always skills-based (like Zed, Devin, Trae). This aligns init.py skill mode detection with extensions.py hook rendering. * fix: gate agy_skill_mode and refactor _render_hook_invocation to use sets Addressed Copilot review comments: - Restored _is_skills_integration guard on agy_skill_mode in init.py to be defensive about runtime integration type. - Refactored _render_hook_invocation() in extensions.py to use always_slash/conditional_slash frozensets instead of individual per-agent booleans, eliminating unused variables (F841) and making it harder for conditions to drift between integrations. - Centralized slash-skill determination so adding a new unconditional slash-skill integration is a one-key addition. * fix: address latest Copilot review comments - Added copilot to CONDITIONAL_SLASH_AGENTS for consistent hook invocation rendering with init.py - Moved always_slash/conditional_slash frozensets to module scope to avoid per-call reallocation - Replaced manual os.chdir() with monkeypatch.chdir() in test - Overrode test_options_include_skills_flag for Zed (no --skills) * fix: address latest Copilot review comments - Removed redundant local import yaml in _register_extension_skills (yaml is already imported at module scope) - Split --ai-skills usage hint into two separate print statements for better readability - Changed integrations count from '33' to '30+' to avoid future drift * fix: re-add _is_skills_integration definition lost in merge The _is_skills_integration variable was accidentally dropped during the web UI merge resolution of upstream/main's removal of legacy --ai flags. Re-added the definition via isinstance(resolved_integration, SkillsIntegration) check so that skill-mode booleans work correctly. * fix: gate zed_skill_mode on _is_skills_integration for consistency Aligns zed_skill_mode with the other skills-based agents (codex, claude, cursor-agent, copilot) which all use _is_skills_integration gating. Since ZedIntegration extends SkillsIntegration, behavior is unchanged. * fix: remove unused claude_skill_mode and cursor_skill_mode locals in _render_hook_invocation These variables became unused after the refactor to ALWAYS_SLASH_AGENTS / CONDITIONAL_SLASH_AGENTS sets. Claude and Cursor-Agent are now handled by the CONDITIONAL_SLASH_AGENTS path, so the separate boolean locals are dead code. Fixes ruff F841 and addresses Copilot review feedback that was repeated across multiple review rounds. * fix: align agy/trae invocation format in init next-steps with hook rendering and build_command_invocation - Moved agy and trae from '-<name>' (dollar/Codex format) to '/speckit-<name>' (slash format) in _display_cmd() to match: - HookExecutor._render_hook_invocation() (ALWAYS_SLASH_AGENTS for trae, CONDITIONAL_SLASH_AGENTS for agy) - SkillsIntegration.build_command_invocation() (default: /speckit-<name>) - The '$' prefix is specific to Codex; all other skills agents use '/'. * fix: address Copilot review comments on hook invocation consistency - Add is_slash_skills_agent() helper to extensions.py to centralize the agent-to-invocation-format mapping, reducing drift risk between HookExecutor._render_hook_invocation() and init.py _display_cmd() - Use the shared helper in both locations; init.py now imports and delegates to is_slash_skills_agent() instead of maintaining its own per-agent boolean matrix - Fix test_hooks_render_skill_invocation to use ai_skills=False, proving Zed renders /speckit-<name> unconditionally - Add parameterized TestSlashSkillsSets covering all agents in ALWAYS_SLASH_AGENTS and CONDITIONAL_SLASH_AGENTS with ai_skills both true and false * fix: address Copilot review comments on type safety and test API - Make is_slash_skills_agent() accept str \| None to match its call sites (init_options.get("ai") can return None) - Refactor TestSlashSkillsSets to use public execute_hook() API instead of private _render_hook_invocation() method * fix: address Copilot review comments on typing and naming clarity - Add from __future__ import annotations to extensions.py so PEP 604 unions (str \| None) are safe regardless of Python version - Add clarifying _ai_skills_enabled local variable in init.py's _display_cmd() to make the semantic meaning explicit when passing it to is_slash_skills_agent() * fix: move invocation-style logic into shared _invocation_style module - Extract ALWAYS_SLASH_AGENTS, CONDITIONAL_SLASH_AGENTS, and is_slash_skills_agent() from extensions.py into new _invocation_style.py module, eliminating the awkward init.py -> extensions.py import dependency for invocation-style decision logic - Both HookExecutor._render_hook_invocation() and init.py _display_cmd() now import from the shared module instead of one subsystem importing from the other - Revert /SKILL.md change: the leading slash is semantically significant (path component vs filename suffix) * fix: add None guard before i.options() in test_options_include_skills_flag get_integration() returns IntegrationBase \| None, so i.options() is a type error without a None check. * fix: override test_options_include_skills_flag for Zed (always skills, no --skills flag) Zed is always skills-based and doesn't expose a --skills option. Override the inherited base test to assert --skills is absent. * fix: rename test and skip inherited test_options_include_skills_flag for Zed - Skip inherited test_options_include_skills_flag (not applicable — Zed is always skills-based with no --skills flag) - Add test_options_do_not_include_skills_flag with correct name matching the assertion (--skills is absent) * fix: add defensive non-string check in is_slash_skills_agent Reject non-string values for selected_ai to prevent TypeError from set membership checks when persisted init-options contain corrupted data (e.g. list or dict instead of string). --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-16 17:29:08 -05:00

1 2 3 4 5

231 Commits