github-spec-kit

mirror of https://github.com/github/spec-kit.git synced 2026-07-03 12:28:06 +08:00

Author	SHA1	Message	Date
Pascal THUET	a17a658bbd	feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root (#2892 ) * feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root Resolve an explicit SPECIFY_INIT_DIR project override once in the core get_repo_root / Get-RepoRoot, so a non-interactive / CI caller can target a member project (the directory containing .specify/) from a monorepo root without cd. Strict by design: the path must exist and contain .specify/, otherwise it hard-errors with no silent fallback. - Single resolver in core; the git feature-branch script inherits it by sourcing core, with no per-extension copies. - PS resolver verifies the resolved path is a directory (Resolve-Path also succeeds for files) so a file value errors as "not an existing directory". - get_feature_paths splits decl/assignment so a SPECIFY_INIT_DIR failure propagates instead of being masked by `local`. - create-new-feature-branch: when core is absent (only git-common loaded) and SPECIFY_INIT_DIR is set, hard-error rather than silently using the git root. - Document SPECIFY_INIT_DIR and SPECIFY_FEATURE_DIRECTORY in the core reference. - Tests for valid/relative/trailing-slash/file/missing/no-.specify targets, feature-axis composition, the no-core guard, and a PowerShell mirror. * fix: guard SPECIFY_INIT_DIR with stale core scripts * docs: clarify SPECIFY_FEATURE_DIRECTORY precedence wording * fix: normalize trailing slash in PowerShell SPECIFY_INIT_DIR resolver Resolve-Path preserves a trailing separator from its input, so a SPECIFY_INIT_DIR ending in a slash returned a root that didn't match the bash resolver (whose `cd && pwd` strips it). That broke test_ps_trailing_slash_tolerated on the CI runners, which do have pwsh. Trim it with TrimEndingDirectorySeparator (no-op on a bare root or a path with no trailing separator). Also fix the misleading test comment: the PowerShell mirror runs on the CI ubuntu/windows runners (they ship pwsh), it is not skipped there. * test: normalize bash path expectations on Windows * docs: clarify SPECIFY_INIT_DIR root helpers	2026-06-19 12:05:42 -05:00
Ed Harrod	98ee02a98b	feat(claude): run /analyze in a forked subagent (#2511 ) * claude: run /analyze in a forked subagent /analyze is explicitly read-only and produces a compact analysis report from heavy artefact reads (spec.md, plan.md, tasks.md). It matches the canonical use case for context: fork — bulk inputs that collapse to a short summary, no need for conversation history. Forking keeps the artefact contents out of the main conversation context, which is the concern raised in #752. Done as a per-command opt-in via FORK_CONTEXT_COMMANDS so other spec-kit commands (which are interactive or have side effects) are unaffected. Refs #752 * claude: apply per-command frontmatter on every skill-generation path argument-hint and fork context were injected only in setup(), so skills produced via post_process_skill_content() directly (presets, extensions) lost them - e.g. a preset overriding speckit-analyze dropped context: fork. Move the per-command injection into post_process_skill_content(), deriving the command stem from the frontmatter name, so all generation paths stay consistent. setup() now just calls post_process_skill_content(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * claude: drop redundant post-process loop from setup SkillsIntegration.setup() already runs post_process_skill_content() on every SKILL.md before writing it, and that method now applies the argument-hint and fork-context injection. The per-file re-process loop in ClaudeIntegration.setup() was therefore a no-op, so inherit the base setup() directly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-19 10:28:45 -05:00
LuoHui1	4eda983950	fix: count worktree branches in git extension numbering (#3054 ) * fix: count worktree branches in git extension numbering * fix: preserve literal plus branch prefixes	2026-06-18 09:40:32 -05:00
Ben Buttigieg	0c29d890ab	feat: add /speckit.converge command (#3001 ) * Add /speckit.converge SDD artifacts and project scaffolding Dogfood the converge feature through Spec Kit's own workflow: - spec.md, plan.md, tasks.md, research, data-model, contracts, quickstart - requirements checklist for the feature - ratified constitution v1.0.0 (.specify/memory) - Specify project scaffolding (.specify/, .github agent + prompt files) Defines a built-in /speckit.converge command that assesses spec/plan/tasks against the codebase and appends remaining work as new tasks (no git, no change tracking, append-only). Implementation not yet started. Excludes unrelated working-tree changes to agents.py, extensions.py, test_extensions.py, catalog.community.json, and README.md. * Implement /speckit.converge command Add the built-in converge command that assesses the codebase against a feature's spec.md, plan.md, and tasks.md and appends remaining unbuilt work as new traceable tasks to tasks.md (append-only; no git, no change tracking). - templates/commands/converge.md: full command body (load artifacts, assess code, classify findings missing/partial/contradicts/unrequested, append '## Phase N — Convergence' tasks with source-ref + gap-type, read-only guardrails, converged branch, handoff, before/after_converge hooks) - Register converge as a core command across all enumeration sites (SKILL_DESCRIPTIONS, _FALLBACK_CORE_COMMAND_NAMES, ARGUMENT_HINTS, and the integration test command lists incl. copilot/generic file inventories) - init.py Next Steps panel + README Core Commands table - tasks.md: T001-T024 complete (T025 manual quickstart pending) Full suite green: 2343 passed. * Record quickstart validation results for /speckit.converge (T025) All six quickstart scenarios validated (GitHub Copilot agent, macOS/zsh): S1 gap->appended traceable task, S2 implement+re-converge, S3 converged leaves tasks.md unchanged, S4 read-only boundaries, S5 missing-prereq stop, S6 cross- integration install (copilot + windsurf). Automated suite: 2343 passed. * Record 2026-06-16 re-verification results for /speckit.converge (T025) * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Fix integration upgrade deleting settings.json and dropping script +x Two upgrade-path bugs surfaced during converge E2E validation: - copilot upgrade stale-deleted .vscode/settings.json because setup() only tracks the file when it creates it; on upgrade the pre-existing file is merged and left untracked, so Phase 2 stale cleanup removed it. Add an integration-level stale_cleanup_exclusions() hook (CopilotIntegration returns {.vscode/settings.json}) and subtract it from stale_keys. - shared .specify/scripts/.sh lost their execute bit because the managed refresh rewrites them with the bundled source mode (often 0o644) and nothing restored perms. Call ensure_executable_scripts() after the managed-refresh block (POSIX only). Add regression tests in TestIntegrationUpgrade covering both fixes (validated to fail without the fixes). fix: resolve markdownlint errors in PR files Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: clean up runtime state files from PR Remove .specify state files that are per-project runtime artifacts: - feature.json, init-options.json, integration.json - manifest files, extension registry, bug artifacts These are generated by 'specify init' and should not be committed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: fold converge artifacts from #3003 and #3005 - Add speckit.converge Copilot agent and prompt files (#3003) - Add regression test for Claude argument hints (#3005) - Remove invalid converge entry from Claude argument hints - Fix documentation removing branch-prefix fallback claims Supersedes: #3003, #3005 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove non-converge specify scaffolding from PR Remove .specify/ artifacts, non-converge .github/agents and prompts, and copilot-instructions.md that were generated by 'specify init' and are not part of the converge command feature. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove SDD spec artifacts from PR Remove specs/001-converge-command/ — the spec/plan/tasks/research SDD artifacts produced while building this feature. spec-kit does not track a specs/ directory on main (those are outputs of running the workflow on the repo, not part of the shipped tool). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: remove generated Copilot converge command files Remove .github/agents/speckit.converge.agent.md and .github/prompts/speckit.converge.prompt.md — these are generated by 'specify init --integration copilot' from templates/commands/converge.md (all __SPECKIT_COMMAND___/{SCRIPT} tokens are resolved). main tracks no .github/agents or .github/prompts files; the template is the source of truth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> chore: split out unrelated integration-upgrade fix Move the stale_cleanup_exclusions / executable-bit upgrade fix (base.py, copilot, _migrate_commands.py, test_integration_subcommand.py) out of this PR into its own change. This PR is now scoped purely to the /speckit.converge command. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add converge to core command template ordering converge is a core command in SKILL_DESCRIPTIONS but was missing from _CORE_COMMAND_TEMPLATE_ORDER, so it sorted with the fallback rank. Add it after 'implement' to keep core-command ordering consistent across integrations. Addresses review feedback on #3001. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: make converge findings example neutral Replace the self-referential sample evidence text in the Convergence Findings table with a neutral placeholder so agents are less likely to copy nonsensical template-specific findings into real output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs: clarify converge scope and hook outcome wording - Remove FR-specific parenthetical from code-scope rule so it doesn't imply a hard FR-001 reference exists in every feature - Replace unsupported 'pass outcome to hook context' instruction with explicit in-session outcome reporting before hook listing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: align converge task example with tasks format Use (no colon) in the convergence task example so it matches tasks-template formatting and downstream expectations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Clarification of usage Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs: align converge phase/task-id format with tasks template - Use (colon) for consistency with tasks template - Clarify appended task IDs must be zero-padded ( style) - Update checklist example to a concrete zero-padded ID () Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: standardize converge phase heading format Use consistently in converge.md (including the append-only contract section) to match Step 7 and tasks template style. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 14:47:00 -05:00
Ben Buttigieg	84db931f18	fix: preserve .vscode/settings.json and script +x bit on integration upgrade (#3020 ) * fix: preserve .vscode/settings.json and script +x bit on integration upgrade During 'specify integration upgrade', Phase 2 stale-cleanup removes files present in the old manifest but absent from the new one. Copilot's setup() merges into an existing .vscode/settings.json and stops tracking it, so the file was being deleted on upgrade (destroying user settings). Add a stale_cleanup_exclusions() hook that integrations use to protect such conditionally-tracked merge targets. Also restore the executable bit on shared .sh scripts after the managed-refresh step on POSIX. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review on stale-cleanup fix - Normalize stale_cleanup_exclusions() to POSIX before subtracting from manifest keys, so exclusions built with os.path.join / backslashes still match on Windows. - Strengthen test_upgrade_preserves_existing_vscode_settings to add a user-defined key and assert it survives the upgrade (via --force, exercising the merge + stale-cleanup path) instead of the brittle after == before check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 14:22:04 -05:00
Huy Do	affbf5ead5	feat(workflows): add from_json expression filter (#2961 ) * feat(workflows): add from_json expression filter Step outputs captured as strings could never become typed values in templates - the filter set was default/join/map/contains only, so e.g. a fan-out items: could never consume a step's JSON stdout. Add an arg-less from_json pipe filter with parse-or-raise semantics: invalid JSON or non-string input raises a clear ValueError rather than passing through silently. Fixes #2960 * fix(expressions): make from_json strict — reject any arguments Address review (#2961): from_json('x') and from_json() previously fell through to a silent passthrough of the unparsed value. Reject any parenthesized form with a clear error so mis-wired templates fail loudly. Rename test to ...parses_object (JSON under test is an object) and add coverage for the strict no-arguments behavior. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(workflows): document the from_json expression filter Address Copilot review: the user-facing filter references omitted the newly added `from_json` filter. Add it to the ARCHITECTURE.md filter table (with the `{{ steps.emit.output.stdout \| from_json }}` example) and to the filter enumerations in workflows/README.md and docs/reference/workflows.md so the docs match the evaluator's capabilities. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): make from_json strictness reject trailing tokens; fix docstring Address Copilot review: - Strictness only rejected parenthesized forms, so typos like `\| from_json)` or `\| from_json extra` still fell through to the unknown-filter path and silently returned the unparsed value. Match on the leading filter token and require the whole filter to be exactly `from_json`, so every mis-wired form raises. Extend the rejection test to cover the trailing-token cases. - The module docstring claimed "no imports", which is misleading now that the module imports `json`. Reword to state the actual sandbox guarantee: templates cannot do file I/O, import modules, or run arbitrary code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 13:43:26 -05:00
Copilot	00bff788c9	Add `init` workflow step to bootstrap projects like `specify init` (#2838 ) * Initial plan * Add init workflow step to bootstrap projects like `specify init` * Address review: simplify stderr capture and extract VALID_SCRIPT_TYPES * Address review: fail fast on non-empty dir, stdout fallback, README force fix * Populate exit_code/stdout/stderr in non-empty-dir fast-fail * fix: address three unresolved review comments in InitStep - Use `with os.scandir(...)` context manager so the iterator is always closed even when `any()` short-circuits, preventing file-descriptor leaks in long-running workflow runs. - Guard `os.chdir(prev_cwd)` in the `finally` block with a try/except so an `OSError` (e.g. directory deleted) doesn't bypass returning the captured `StepResult`. - Reject non-string `script` values in `validate()` with a clear error message, rather than silently passing them through to become `--script True` at runtime. * Potential fix for pull request finding 'Empty except' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix: remove no_git and branch_numbering options removed upstream The --no-git and --branch-numbering flags were removed from `specify init` on main. Update InitStep to drop these unsupported config fields and fix tests accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review — integration defaults, integration_options, engine-owned dirs - Apply DEFAULT_INIT_INTEGRATION fallback when neither step config nor workflow context provides an integration, so output.integration always reflects the actual integration used. - Add integration_options config field to support --integration-options passthrough (required for generic integration and --skills mode). - Exclude .specify/ from the non-empty directory fast-fail check so that here: true works when the engine has already created its run-state directory before steps execute. - Note: mix_stderr=False is not needed — Click 8.2+ captures stderr separately by default and the existing try/except handles access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: implicitly add --force when only engine-owned dirs exist When the workflow engine creates .specify/workflows/runs/ before steps execute, the directory is technically non-empty. Previously, specify init would prompt for confirmation (hanging in unattended mode) unless the user explicitly set force: true. Now the step detects that only engine-owned directories (.specify/) are present and implicitly adds --force so init proceeds without user interaction. Also fixes the test to exercise the implicit-force path rather than passing force: True explicitly (which bypassed the check entirely). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: derive VALID_SCRIPT_TYPES from shared constant, fail fast on OSError, include all resolved fields in output - Derive VALID_SCRIPT_TYPES from SCRIPT_TYPE_CHOICES in _agent_config so the valid set cannot drift from the specify init CLI. - Fail fast with a clear error when os.scandir() raises OSError (e.g. permission denied) instead of silently treating the directory as empty. - Include preset, force, and ignore_agent_tools in all output dicts (both fast-fail and normal paths) for consistent interpolation and debugging downstream. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: populate stderr from stdout on older Click, fix force comment wording - When Click does not expose result.stderr (older versions where stderr is mixed into stdout), use stdout as stderr on non-zero exit so workflows can consistently read steps.<id>.output.stderr for errors. - Update README inline comment for force: wording to say 'when target directory already exists' rather than 'non-empty directory', matching the actual specify init behavior for the project: form. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: build argv flags before early returns, use any() for dir scan - Move argv flag-building (--integration, --script, --preset, --ignore-agent-tools) before the non-empty-dir and OSError early returns so output['argv'] always reflects the complete command. - --force is appended after the check since it may be set implicitly. - Replace list comprehension with any() generator expression to short-circuit without allocating a full list of DirEntry objects. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: only treat .specify as engine-owned when it is a real directory A file or symlink named .specify should not be excluded from the non-empty check. Use entry.is_dir(follow_symlinks=False) to ensure only an actual directory is considered engine-owned content. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: guard implicit force for engine dirs only, fix integration fallback order - Only set implicit --force when engine-owned directories (.specify/) are actually present. A completely empty directory no longer gets --force added unnecessarily. - Fix integration resolution precedence: resolve step config expression first, then fall back to workflow default (also resolved), then to DEFAULT_INIT_INTEGRATION. Previously, a step expression resolving to falsy would bypass the workflow default entirely. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 11:46:51 -05:00
Huy Do	071f784dfa	feat(workflows): opt-in output_format: json exposes parsed shell stdout as output.data (#2963 ) * feat(workflows): opt-in output_format: json exposes parsed shell stdout as output.data No step that runs external code could hand a typed value to a later step, so e.g. a fan-out could never consume a runtime-computed collection. With output_format: json declared, stdout is parsed and exposed under output.data (raw keys unchanged); a parse failure fails the step with a clear error. Without the key, behavior is unchanged. Reference implementation for the proposal in #2962. Addresses #2962 * test(shell): emit JSON via sys.executable for cross-platform output_format tests Address review (#2963): replace non-portable echo '{...}' (Windows cmd.exe keeps the single quotes, breaking JSON) with the established '"{py}" "{script}"' pattern using sys.executable + a temp script, so the output_format tests pass on the Windows CI matrix. Also make the validate test's run inert (exit 0). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 08:09:17 -05:00
Huy Do	1ee2b626a8	fix: non-zero exit code when a workflow run ends failed or aborted (#2959 ) * fix: non-zero exit code when a workflow run ends failed or aborted workflow run and workflow resume printed Status: failed (or emitted the --json payload) and exited 0, so scripts and CI could not rely on the process exit code. Map terminal outcomes: failed\|aborted -> 1, completed\|paused -> 0, on both the text and --json paths. The previous exit-0-on-failed behavior was pinned by test_workflow_run_failing_yaml_without_project; the pin is updated to the new contract. Fixes #2958 * test: portable exit-code step commands + cover resume failed->exit-1 Address review (#2959): replace non-portable run: 'true'/'false' with 'exit 0'/'exit 1' (Windows cmd.exe has no true/false builtins under shell=True), and add an end-to-end 'workflow resume' test asserting a resumed failed run exits non-zero. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-17 08:02:00 -05:00
Seiya Kojima	811a3aa447	fix(skills): preserve non-ASCII characters in skill frontmatter (#2917 ) * fix(skills): preserve non-ASCII chars in skill frontmatter Skill SKILL.md frontmatter descriptions containing non-ASCII characters were escaped to \uXXXX / \xXX sequences because yaml.safe_dump() was called without allow_unicode=True. - Add allow_unicode=True to the 7 skill/command frontmatter safe_dump sites (extensions, presets, claude integration) - Add regression tests for the render and extension-install paths Follows the approach of #1936; encoding="utf-8" is already set on the affected write paths, so no encoding change is needed here. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(_utils): add dump_frontmatter helper Centralize skill/command frontmatter YAML serialization into a single _utils.dump_frontmatter helper so no call site can drop allow_unicode or diverge on formatting. Route the 7 existing sites through it and drop a now-unused local yaml import. Switch the extension test fixtures to yaml.safe_dump for parity with the production safe-dump/safe-load codepaths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 07:57:54 -05:00
Jina Park	de18d21b1c	fix: prevent extension self-install from deleting source dir (#2990 ) (#2991 ) * fix: prevent extension self-install from deleting source dir (#2990) `specify extension add <path> --dev --force` permanently deleted the extension directory without registering it when the source path resolved to the extension's own install location (`.specify/extensions/<id>`). With `--force`, `install_from_directory()` removed the existing installation (the source) and then `shutil.copytree()` tried to copy from the now-deleted directory, destroying it and crashing. Add a guard that fails fast with a clear ValidationError when the resolved source path equals the install destination, before any destructive operation runs. Includes a regression test asserting the directory and its contents survive. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix: harden extension self-install guard --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-17 07:56:17 -05:00
Manfred Riem	75aee19c6e	fix: disable Rich Live transient mode on Windows to prevent PS 5.1 hang (#2938 ) * fix: disable Rich Live transient mode on Windows to prevent PS 5.1 hang PowerShell 5.1's legacy console host does not reliably support VT escape sequences. Rich's Live(transient=True) attempts cursor restoration on context exit, which hangs indefinitely on that console. Set transient=False when sys.platform == 'win32' in both init.py (progress tracker) and _console.py (select_with_arrows). The only cosmetic effect is that progress output remains visible after completion on Windows. Fixes #2927 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: address review feedback on test quality - Use captured['transient'] instead of .get() for clearer KeyError on failure - Source guards now assert both the platform check AND transient=_transient usage - Remove unused imports (MagicMock retained as it's used, removed pytest) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: use regex in source guards for resilience to formatting changes Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: use single DOTALL regex to verify assignment flows into Live() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: skip duplicate tracker print on Windows when transient=False When transient is False, Rich leaves the Live output on screen. The subsequent console.print(tracker.render()) would duplicate it. Gate it behind _transient so Windows users see the tracker exactly once. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-17 06:48:42 -05:00
Copilot	c52ccd7dc7	Add workflow step catalog — community-installable step types (#2394 ) * Initial plan * Add workflow step catalog: StepRegistry, StepCatalog, CLI commands, and tests Agent-Logs-Url: https://github.com/github/spec-kit/sessions/2885e646-477d-4df8-b9a3-06d8cb29e748 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Potential fix for pull request finding 'An assert statement has a side-effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * Address PR review: path traversal, cache robustness, collision check, failed-to-load display - Add resolve()+relative_to() path traversal guards in workflow_step_add and workflow_step_remove to prevent directory escape via step_id - Harden _is_url_cache_valid in both StepCatalog and WorkflowCatalog to coerce fetched_at to float and catch TypeError/ValueError - Check STEP_REGISTRY and StepRegistry before installing to prevent collisions with built-in step types or already-installed steps - Show 'Custom (installed, failed to load)' section in workflow step list for steps in the registry that failed to load into STEP_REGISTRY * Fix StepRegistry shape validation and StepCatalog empty-YAML handling Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Polish: rename _default to default_registry, strengthen unreadable-file test Agent-Logs-Url: https://github.com/github/spec-kit/sessions/0dca6393-f5a9-40de-bb5c-77ba6af033d2 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Address PR review: atomic install, hostname validation, cache resilience, no dynamic imports in list/info Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix shutil.move with existing step_dir: remove before move to avoid subdirectory nesting Agent-Logs-Url: https://github.com/github/spec-kit/sessions/3e18fef0-e2e6-4b3e-9e8d-9adb1e5e464e Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Call load_custom_steps at execution time; enforce hostname in _safe_fetch and _validate_url Agent-Logs-Url: https://github.com/github/spec-kit/sessions/73865880-fb25-4061-a43e-4e4b4d1c4de6 Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Wrap YAML parsing in try/except; atomic step install via os.rename() under same fs Agent-Logs-Url: https://github.com/github/spec-kit/sessions/ff915bc5-ec7e-4e6a-b505-35f5795250df Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Validate YAML root is a dict in _load_catalog_config and workflow_step_add; fix WorkflowCatalog hostname validation Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> * Fix load_custom_steps() package imports and add reserved step ID validation * Move _re/_sys imports out of loop and _RESERVED_STEP_IDS to module level * Address review: collision-resistant module names, extra_files support, remove orphan dir * Harden extra_files: warn on non-dict, resolve symlinks in path traversal check * Switch _safe_fetch and StepCatalog._fetch_single_catalog to use open_url for auth consistency * Harden step_id validation against path-segment tricks; raise on StepRegistry.save() OSError * Clean up sys.modules on broken step packages; handle StepValidationError in step add/remove * Address review thread: int-coerce priorities, sys.modules cleanup, _require_specify_project, registry-first remove * fix: normalize workflow step catalog metadata fallbacks * fix: address latest workflow step and catalog review findings * Handle non-string extra_files keys in workflow step add * Harden StepRegistry symlink reads and extra_files path/URL validation * Harden custom step loader and step remove against symlinks and OSError * Fix StepCatalog.search() to coerce non-string fields before joining * Fix WorkflowCatalog YAML parsing error handling and isinstance checks * Harden step registry save and custom step/catalog ID handling * Harden cache validation and staging OSError handling * Address review: reorder symlink guard and split mixed test - Move symlink-parent check before is_dir() in load_custom_steps() so we never stat an external target through a symlink - Split test_get_merged_steps_normalizes_list_ids_to_strings into two focused tests: one for list-id normalization, one for get_step_info return values Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review: symlink-before-stat in loader, restore registry on rmtree failure - load_custom_steps(): check is_symlink() before is_dir() on step directories so symlinked entries are skipped without statting external targets - workflow_step_remove: restore the registry entry when shutil.rmtree() fails so filesystem and registry state stay consistent and a future 'step add' isn't blocked Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Harden step_id validation and file-write error handling - _validate_step_id_or_exit: reject whitespace-only/padded IDs, Windows-invalid characters (<>:"\|?), control characters, trailing dots/spaces, and Windows reserved device names (con, nul, etc.) - Wrap step.yml/__init__.py staging writes in OSError handler - Wrap extra_files disk writes (mkdir + write_bytes) in OSError handler that names the failing relative path - Registry rollback on rmtree failure: restore verbatim metadata and emit a warning if the restore itself fails Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Address review: cache symlink guard, verbatim registry rollback, Windows test fix - StepCatalog: add _is_cache_path_safe() guard that checks for symlinks in .specify/workflows/steps/.cache path; skip cache reads and writes when any component is symlinked to prevent writes outside project root - Registry rollback: write metadata directly to registry.data['steps'] and call save() instead of using add() which overwrites timestamps - temp_dir fixture: use ignore_errors=True on Windows to avoid flaky teardown from locked file handles (WinError 32) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Simplify exec_module call by removing redundant nested try/except Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix empty YAML tolerance in WorkflowCatalog.add_catalog, scope ignore_errors to Windows - WorkflowCatalog.add_catalog(): treat None from yaml.safe_load() (empty file) as an empty mapping instead of raising 'corrupted' - temp_dir fixture: limit ignore_errors to sys.platform == 'win32' so real cleanup issues surface on non-Windows platforms Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Chain exceptions in _load_catalog_config for both catalog classes Add 'from exc' to preserve root cause in tracebacks while keeping clean user-facing messages. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make default catalog tests hermetic by isolating HOME Monkeypatch Path.home() to project_dir and clear catalog env vars so tests don't break on machines with a real ~/.specify/step-catalogs.yml or ~/.specify/workflow-catalogs.yml. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix falsy ID handling in _get_merged_steps for list-based catalogs Check for None explicitly instead of using 'or' which drops valid falsy IDs like 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Compare reserved step IDs case-insensitively for filesystem safety On case-insensitive filesystems (Windows, common macOS), variants like STEP-REGISTRY.JSON would collide with the actual registry file. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add explanatory comments to intentional empty except blocks Document why cache-read failures are silently ignored in both WorkflowCatalog and StepCatalog _fetch_single_catalog methods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-16 18:03:45 -05:00
Pascal THUET	9cd20c6c25	feat(dev): add integration scaffolder (#2685 ) * feat(dev): add integration scaffolder * fix(dev): address integration scaffold review feedback * fix(dev): address scaffold follow-up review * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(dev): default scaffolded integrations to multi_install_safe = False The scaffold template emitted `multi_install_safe = True` alongside a placeholder `context_file = "AGENTS.md"`. Registered as-is, that violates the registry contract (test_safe_integrations_have_distinct_context_files): codex already pairs AGENTS.md with multi_install_safe = True, so the generated boilerplate would collide on first registration. Default the scaffold to False (matching IntegrationBase) so generated code is registry-test-friendly out of the box; contributors opt in once they pick a unique context_file. Aligns the generated test skeleton and both scaffold tests, which previously contradicted each other (one expected True, one False). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): harden scaffold writes and accept case-insensitive --type - Guard scaffold_integration() against symlinked target directories: walk each path component under the repo root and refuse symlinked dirs, then confirm the write destination resolves inside the repo (mirrors the manifest directory guard). Prevents scaffolding outside the repo when a contributor's integrations/tests path is symlinked. - Make the `--type` click.Choice case-insensitive so `--type YAML` is accepted, matching scaffold_integration()'s strip()/lower() normalization instead of rejecting at the CLI layer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): report scaffold filesystem failures as a clean CLI error The `dev integration scaffold` command only caught FileExistsError/ValueError, so an OSError raised during mkdir()/write_text() (permission denied, read-only checkout, a path component that is a file, ...) bubbled up as a traceback instead of a clean error + exit code. Broaden the handler to OSError (which also covers FileExistsError) and add coverage for the filesystem-error path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(dev): move scaffold command under integration * fix(dev): roll back partial scaffold writes * fix(dev): correct lint docs and generated test docstring - local-development.md: ruff check src/ is enforced in CI, not absent - scaffolded test docstring: drop misleading 'scaffold' wording * fix(scaffold): create only leaf integration directory --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 17:48:40 -05:00
Alicia Sykes	6d057b6239	fix(tests): don't run PowerShell tests via WSL-interop powershell.exe (#2971 ) * fix(tests): don't run PowerShell tests via WSL-interop powershell.exe * fix(tests): applies copilot feedback, with rename	2026-06-16 17:36:24 -05:00
Ahmet TOK	1150d32aee	Add Zed integration (#2780 ) * feat: add Zed integration * fix: update integrations stats grid to 31 for consistency * fix: address Copilot review feedback - Remove non-actionable --skills flag from ZedIntegration (Zed is always skills-based, like Agy) - Align zed_skill_mode predicate with ai_skills for consistency across init output and hook rendering - Consolidate claude/cursor/zed slash-skill return blocks in _render_hook_invocation to reduce duplication - Override test_options_include_skills_flag for Zed (no --skills flag) * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address Copilot review round 2 - Make zed_skill_mode unconditional in hook rendering (Zed is always skills-based, no --skills option) - Add test_init_persists_ai_skills_for_zed that exercises the actual CLI init path and verifies HookExecutor renders /speckit-plan without manual init-options manipulation * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address copilot review feedback for zed integration - Update integration count from 31 to 33 in docs/index.md (32 integrations + Generic) - Make zed_skill_mode unconditional to match extensions.py behavior - Consolidate slash-skill integrations into a set for consistency - Move os import to module level in test_integration_zed.py * fix: refine slash-skill logic and ai-skills validation - Fix slash-skill integrations: Claude/Cursor require ai_skills=true; Zed/Agy/Devin are always skills - Allow --ai-skills with --integration (not just --ai) to fix validation error * fix: remove unused variables and update ai-skills help text - Add agy_skill_mode and devin_skill_mode variables to fix F841 lint error - Use all skill mode variables in the slash-skill conditional check - Update --ai-skills help text to reflect it works with --integration too * fix: add trae_skill_mode to hook invocation for consistency Trae is a SkillsIntegration like Zed/Agy/Devin, so it should also be treated as always-skills-based in hook invocation rendering. * fix: make Agy always skills-based for consistency AgyIntegration is a SkillsIntegration subclass with no --skills option, so it should be treated as always skills-based (like Zed, Devin, Trae). This aligns init.py skill mode detection with extensions.py hook rendering. * fix: gate agy_skill_mode and refactor _render_hook_invocation to use sets Addressed Copilot review comments: - Restored _is_skills_integration guard on agy_skill_mode in init.py to be defensive about runtime integration type. - Refactored _render_hook_invocation() in extensions.py to use always_slash/conditional_slash frozensets instead of individual per-agent booleans, eliminating unused variables (F841) and making it harder for conditions to drift between integrations. - Centralized slash-skill determination so adding a new unconditional slash-skill integration is a one-key addition. * fix: address latest Copilot review comments - Added copilot to CONDITIONAL_SLASH_AGENTS for consistent hook invocation rendering with init.py - Moved always_slash/conditional_slash frozensets to module scope to avoid per-call reallocation - Replaced manual os.chdir() with monkeypatch.chdir() in test - Overrode test_options_include_skills_flag for Zed (no --skills) * fix: address latest Copilot review comments - Removed redundant local import yaml in _register_extension_skills (yaml is already imported at module scope) - Split --ai-skills usage hint into two separate print statements for better readability - Changed integrations count from '33' to '30+' to avoid future drift * fix: re-add _is_skills_integration definition lost in merge The _is_skills_integration variable was accidentally dropped during the web UI merge resolution of upstream/main's removal of legacy --ai flags. Re-added the definition via isinstance(resolved_integration, SkillsIntegration) check so that skill-mode booleans work correctly. * fix: gate zed_skill_mode on _is_skills_integration for consistency Aligns zed_skill_mode with the other skills-based agents (codex, claude, cursor-agent, copilot) which all use _is_skills_integration gating. Since ZedIntegration extends SkillsIntegration, behavior is unchanged. * fix: remove unused claude_skill_mode and cursor_skill_mode locals in _render_hook_invocation These variables became unused after the refactor to ALWAYS_SLASH_AGENTS / CONDITIONAL_SLASH_AGENTS sets. Claude and Cursor-Agent are now handled by the CONDITIONAL_SLASH_AGENTS path, so the separate boolean locals are dead code. Fixes ruff F841 and addresses Copilot review feedback that was repeated across multiple review rounds. * fix: align agy/trae invocation format in init next-steps with hook rendering and build_command_invocation - Moved agy and trae from '-<name>' (dollar/Codex format) to '/speckit-<name>' (slash format) in _display_cmd() to match: - HookExecutor._render_hook_invocation() (ALWAYS_SLASH_AGENTS for trae, CONDITIONAL_SLASH_AGENTS for agy) - SkillsIntegration.build_command_invocation() (default: /speckit-<name>) - The '$' prefix is specific to Codex; all other skills agents use '/'. * fix: address Copilot review comments on hook invocation consistency - Add is_slash_skills_agent() helper to extensions.py to centralize the agent-to-invocation-format mapping, reducing drift risk between HookExecutor._render_hook_invocation() and init.py _display_cmd() - Use the shared helper in both locations; init.py now imports and delegates to is_slash_skills_agent() instead of maintaining its own per-agent boolean matrix - Fix test_hooks_render_skill_invocation to use ai_skills=False, proving Zed renders /speckit-<name> unconditionally - Add parameterized TestSlashSkillsSets covering all agents in ALWAYS_SLASH_AGENTS and CONDITIONAL_SLASH_AGENTS with ai_skills both true and false * fix: address Copilot review comments on type safety and test API - Make is_slash_skills_agent() accept str \| None to match its call sites (init_options.get("ai") can return None) - Refactor TestSlashSkillsSets to use public execute_hook() API instead of private _render_hook_invocation() method * fix: address Copilot review comments on typing and naming clarity - Add from __future__ import annotations to extensions.py so PEP 604 unions (str \| None) are safe regardless of Python version - Add clarifying _ai_skills_enabled local variable in init.py's _display_cmd() to make the semantic meaning explicit when passing it to is_slash_skills_agent() * fix: move invocation-style logic into shared _invocation_style module - Extract ALWAYS_SLASH_AGENTS, CONDITIONAL_SLASH_AGENTS, and is_slash_skills_agent() from extensions.py into new _invocation_style.py module, eliminating the awkward init.py -> extensions.py import dependency for invocation-style decision logic - Both HookExecutor._render_hook_invocation() and init.py _display_cmd() now import from the shared module instead of one subsystem importing from the other - Revert /SKILL.md change: the leading slash is semantically significant (path component vs filename suffix) * fix: add None guard before i.options() in test_options_include_skills_flag get_integration() returns IntegrationBase \| None, so i.options() is a type error without a None check. * fix: override test_options_include_skills_flag for Zed (always skills, no --skills flag) Zed is always skills-based and doesn't expose a --skills option. Override the inherited base test to assert --skills is absent. * fix: rename test and skip inherited test_options_include_skills_flag for Zed - Skip inherited test_options_include_skills_flag (not applicable — Zed is always skills-based with no --skills flag) - Add test_options_do_not_include_skills_flag with correct name matching the assertion (--skills is absent) * fix: add defensive non-string check in is_slash_skills_agent Reject non-string values for selected_ai to prevent TypeError from set membership checks when persisted init-options contain corrupted data (e.g. list or dict instead of string). --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-16 17:29:08 -05:00
Huy Do	36fd5f6f49	fix: fail loudly when a fan-out 'items' expression does not resolve to a list (#2957 ) A non-list result from the items expression is a wiring error (the template did not resolve to a collection); silently fanning out over zero items hides it until a confusing downstream failure. Fail the step with an error naming the expression instead. An explicit empty list remains valid input. Fixes #2956	2026-06-16 15:33:11 -05:00
darion-yaphet	f20e8ee6f7	refactor: move preset command handlers to presets/_commands.py (PR-6/8) (#2826 ) * refactor(presets): convert presets.py module to presets/ package Pure structural move to mirror integrations/. presets.py becomes presets/__init__.py with relative imports rebased one level deeper. No behavior change; public import surface (from .presets import ...) preserved. Prepares for co-locating preset command handlers in PR-6/8. * refactor: move preset command handlers to presets/_commands.py (PR-6/8) Cut the preset_app / preset_catalog_app Typer groups and all 12 command handlers out of __init__.py into presets/_commands.py, exposing register(app) — mirrors the integration co-location from PR-5. __init__.py now registers via _register_preset_cmds(app), dropping ~620 lines (3282 -> 2663). Handlers lazy-import root helpers (_require_specify_project, get_speckit_version, _locate_bundled_preset, _display_project_path) via 'from .. import' so test monkeypatching of specify_cli.<helper> keeps working. _locate_bundled_preset kept as an explicit re-export in __init__.py for that resolution path. CLI surface and public imports unchanged. Full suite: 3162 passed, 40 skipped.	2026-06-16 14:52:12 -05:00
Ali jawwad	63a2a17305	fix(extensions): preserve argument-hint in extension Claude SKILL.md (#2916 ) Extension-provided commands that declare `argument-hint:` in their frontmatter had that field dropped from the generated Claude `.claude/skills/<name>/SKILL.md`, while core template commands keep it. The extension skill generator built the frontmatter via the shared build_skill_frontmatter() (name/description/compatibility/metadata only) and never forwarded argument-hint. Carry argument-hint from the parsed source command frontmatter into the skill frontmatter dict before serialization, gated on the integration exposing inject_argument_hint so only argument-hint-aware agents (Claude) receive the key and build_skill_frontmatter's shared shape stays unchanged for every other agent. The value is injected into the dict rather than via the string-based inject_argument_hint helper, so a folded multi-line description cannot be split into invalid YAML. Add regression tests covering a folding description (Claude) and the non-Claude gate (kimi). Closes #2903 Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 07:28:39 -05:00
darion-yaphet	36ad3cde1b	fix(presets): harden preset URL installs against unsafe redirects (#2911 ) * Harden preset URL installs against unsafe redirects Preset URL installs already rejected non-HTTPS source URLs, but the authenticated opener follows redirects. Validate the final response URL before writing the ZIP, preserve GitHub release asset URL resolution after the preset command module split, stream the response to disk, and keep catalog config serialization on safe YAML output. Constraint: open_url follows redirects, so source URL validation alone does not constrain the downloaded target Rejected: Keep response.read() for simplicity \| large preset downloads should not be buffered entirely in memory Confidence: high Scope-risk: narrow Directive: Keep preset URL policy aligned with workflow installer redirect validation Tested: uvx ruff check src/specify_cli/__init__.py src/specify_cli/presets/__init__.py src/specify_cli/presets/_commands.py tests/test_presets.py Tested: uv run pytest tests/test_presets.py -q Not-tested: Real network redirect integration against a live HTTP server Co-authored-by: OmX <omx@oh-my-codex.dev> * Reject malformed preset download URLs Preset downloads should fail early when a URL lacks a hostname, even if the scheme is HTTPS. The redirect error now describes any disallowed target instead of implying that only non-HTTPS redirects are blocked. * Prevent credentialed preset redirects from downgrading transport Preset URL downloads already checked the final URL after urllib followed redirects, but that was too late for authenticated requests because same-host redirects could preserve Authorization during the redirect itself. The authenticated HTTP helper now supports an opt-in redirect validator, and preset downloads use it to reject disallowed redirect targets before following them. The redirect auth handlers also stop preserving credentials across HTTPS to non-HTTPS downgrades as defense in depth. * test(presets): 修复 URL 解析测试 mock 缺少 redirect_validator 参数重定向安全加固为 open_url 新增 redirect_validator 参数，两处 fake_open_url mock 签名未同步导致 TypeError。补齐参数后全部 3717 个测试通过。 --------- Co-authored-by: OmX <omx@oh-my-codex.dev>	2026-06-11 07:21:50 -05:00
Manfred Riem	5ae7ff53d0	fix: skip recovered files during refresh_managed overwrite check (#2918 ) (#2919 ) _is_managed() in install_shared_infra now consults manifest.is_recovered() before treating a hash-matching file as managed. Files marked recovered (pre-existing on disk, not installed by Spec Kit) are no longer overwritten by integration use/switch even when their hash matches the manifest entry. This closes the gap documented in the manifest API: callers using refresh_managed MUST check is_recovered first. Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-10 16:39:57 -05:00
Manfred Riem	40e48ed22c	feat: add category and effect as first-class fields in extension schema (#2899 ) * feat: add category and effect as first-class fields in extension schema Add `category` and `effect` as optional fields in the extension schema (`extension.yml`) and community catalog (`catalog.community.json`). Schema changes: - Valid categories: docs, code, process, integration, visibility - Valid effects: read-only, read-write - Both fields are optional (backward-compatible with existing extensions) - Validation raises ValidationError for invalid values when present Propagation: - Added `category` and `effect` to all 108 entries in catalog.community.json (populated from the existing docs/community/extensions.md table) - Updated extension template with commented category/effect fields - Updated add-community-extension skill with new JSON template fields - Updated `specify extension info` CLI output to display category/effect - Added properties to ExtensionManifest class Tests: - test_valid_category: all 5 category values pass - test_valid_effect: both effect values pass - test_invalid_category: invalid value raises ValidationError - test_invalid_effect: invalid value raises ValidationError - test_category_and_effect_optional: omitting fields still works Closes #2874 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: make category free-form, keep effect validated Category is a free-form string (only validated as non-empty when present), while effect remains restricted to 'read-only' or 'read-write'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address PR review feedback - Add type guard before 'in' check for effect to prevent TypeError on unhashable YAML values (list/dict) - Comment out category/effect in template so authors must opt in - Use VALID_EFFECTS constant in test instead of hard-coded values Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: update category docstring to reflect free-form semantics Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: clarify canonical extension effect values --------- Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-10 07:44:27 -05:00
Quratulain-bilal	de88c23bb6	fix(catalogs): validate extension and preset catalog payload shape (#2621 ) * fix(catalogs): validate extension and preset catalog payload shape `ExtensionCatalog._fetch_single_catalog` and `PresetCatalog._fetch_single_catalog` only check that the `extensions` / `presets` key is present in the parsed catalog JSON. They don't check that the value is a JSON object, and they don't check that the root is a JSON object at all. A malformed (or compromised) upstream catalog returning: {"schema_version": "1.0", "extensions": []} passes both `"extensions" not in catalog_data` and the subsequent `response.read()` JSON parse, gets cached on disk, and then crashes deep inside `_get_merged_extensions` (resp. `_get_merged_packs`) with: AttributeError: 'list' object has no attribute 'items' instead of the existing user-facing `ExtensionError("Invalid catalog format from <url>")` / `PresetError("Invalid preset catalog format")` that the surrounding code is clearly trying to produce. The sibling integration-catalog reader already validates this — see `src/specify_cli/integrations/catalog.py` where the fetch path explicitly checks both `isinstance(catalog_data, dict)` and `isinstance(catalog_data.get("integrations"), dict)` before returning. This change mirrors that pattern in the extension and preset readers so the three catalog fetchers stay consistent and a malformed upstream surfaces as the user-facing error instead of a raw Python traceback. Adds parametrized regression tests covering: - root payload is not a JSON object (list, str, int, null) - root is a dict but `extensions` / `presets` value is the wrong type (list, str, null, int) All eight bad-payload shapes now raise the expected catalog error. * fix(catalogs): skip non-mapping entries during extension and preset merge Addresses Copilot review feedback on this PR. `_fetch_single_catalog` now validates that the ``extensions`` / ``presets`` value is a mapping, but it doesn't (and shouldn't) validate every entry inside that mapping. A payload like: {"schema_version": "1.0", "extensions": {"good": {...}, "bad": []}} passes the fetch-level guard, then later crashes inside ``_get_merged_extensions`` (resp. ``_get_merged_packs``) at ``{*ext_data, ...}`` with ``TypeError: 'list' object is not a mapping``. The sibling integration-catalog reader at ``src/specify_cli/integrations/catalog.py:245`` handles this with a per-entry ``isinstance(integ_data, dict)`` skip during merge, so one malformed entry doesn't poison an otherwise valid catalog. This change mirrors that pattern in the extension and preset mergers and adds regression tests asserting that valid entries continue to merge while malformed siblings are silently dropped. fix(catalogs): validate cached extension and preset payload shape Addresses Copilot review feedback on this PR (round 2). The earlier commits in this branch added payload-shape validation on the network fetch path. The cache-hit path still returned ``json.loads(cache_file.read_text())`` directly without re-checking the shape, so a cache poisoned by an older spec-kit version (or a manual edit, or an upstream that briefly served a bad payload before the network guards landed) would re-crash every invocation of ``_get_merged_extensions`` / ``_get_merged_packs`` with ``AttributeError: 'list' object has no attribute 'items'`` despite the cache being "valid" by age. Extracts the shape validation into ``_validate_catalog_payload`` on both ``ExtensionCatalog`` and ``PresetCatalog``, and calls it from both the cache-load and network-fetch branches of ``_fetch_single_catalog``. If the cached payload fails validation, the cache read is treated like a ``json.JSONDecodeError`` — the cached value is discarded and the function falls through to the network fetch, which refreshes the cache with a clean payload on success. Never propagates ``AttributeError`` to the caller. Regression tests parametrize the four root-bad-type variants plus three ``extensions``/``presets``-bad-type variants per file, asserting that a poisoned cache silently recovers via network refetch and returns the freshly-fetched payload. * fix(catalogs): include URL in missing-keys error to match sibling branches Addresses Copilot review feedback on this PR (round 3). ``_validate_catalog_payload`` advertises in its docstring that the catalog URL is included in error messages "so the user can tell which catalog in a multi-catalog stack is malformed" — but the missing-keys branch raised ``PresetError("Invalid preset catalog format")`` without the URL, breaking that contract and making multi-catalog debugging harder. The root-bad-type and nested-bad-type branches in the same helper already include the URL; this commit brings the middle branch in line. For consistency, the same fix is applied to the legacy single-catalog fetch paths in ``ExtensionCatalog.fetch_catalog`` and ``PresetCatalog.fetch_catalog`` (where the URL was likewise dropped from the missing-keys error). The existing regex matchers in the regression tests target the ``"Invalid (preset )?catalog format"`` prefix, which is preserved verbatim before the ``from <url>`` suffix — no test changes needed. * fix(catalogs): broaden cache except tuples and reuse validator in fetch_catalog Addresses Copilot review feedback on this PR (round 4): 1. ``ExtensionCatalog.fetch_catalog`` and ``PresetCatalog.fetch_catalog`` — the legacy single-catalog methods — still only checked key presence. A payload like ``42`` (root non-object) crashed with ``TypeError: argument of type 'int' is not iterable`` during the ``"schema_version" in catalog_data`` check, and an entry mapping of the wrong type crashed downstream. Both now reuse ``_validate_catalog_payload`` so the network-side behaviour of the legacy methods stays consistent with the multi-catalog ``_fetch_single_catalog`` path. (Copilot #3335623482, #3335623556.) 2. The cache-read ``except`` tuples in ``_fetch_single_catalog`` and ``fetch_catalog`` were too narrow. ``read_text`` can raise ``OSError`` (permissions / disk / handle limit) or ``UnicodeError`` (cache file written by an older client in a different encoding) in addition to ``json.JSONDecodeError``. Without those in the tuple, an unreadable cache crashed the caller instead of falling through to the network refetch the cache contract documents. Both sites now catch ``(json.JSONDecodeError, OSError, UnicodeError, <DomainError>)``. (Copilot #3335623588, #3335623608.) 3. While here, pinned ``encoding="utf-8"`` on every cache ``read_text`` call so cache files written by an older Windows client (with a non-UTF-8 default locale) decode the same way on a newer client. Regression tests: - ``test_fetch_catalog_rejects_malformed_payload`` — 7 parametrized payloads per file covering root-non-object + nested-bad-type variants asserting ``fetch_catalog`` raises the named domain error. - ``test_fetch_catalog_recovers_from_unreadable_cache`` — writes ``b"\xff\xfe\x00not-utf-8"`` to the cache file and asserts ``fetch_catalog`` silently falls through to the mocked network and returns the freshly-fetched payload. * fix(catalogs): harden cache-validity checks and pin UTF-8 on writes The cache-best-effort contract added in `7f44b25` was incomplete on two points raised by Copilot: 1. The cache-validity helpers (is_cache_valid / _is_url_cache_valid, plus the inline metadata-age check inside _fetch_single_catalog for per-URL caches) read the metadata file without specifying an encoding and only caught json.JSONDecodeError / ValueError / KeyError / TypeError. A metadata file written by a tool using the system locale codec, or one whose handle is briefly unavailable, would raise UnicodeDecodeError / OSError and propagate past the read-side try/except in fetch_catalog — the very crash the read-side guard was meant to prevent. The validity checks now read with encoding="utf-8" and treat OSError / UnicodeError as cache-invalid, matching the documented contract. 2. The network-fetch path wrote the cache and metadata files with bare write_text(...), picking up the platform default encoding. The read path was already pinned to UTF-8 (and the integrations/catalog.py:193-203 sibling writes UTF-8 too), so on hosts whose default codec isn't UTF-8 the write/read pair could disagree and force an unnecessary refetch on every invocation. All four write_text calls now pass encoding="utf-8" so the cache survives a round trip on any platform. Also rewords the misleading # Fetch from network comment in extensions.fetch_catalog — it sat above the cache-check block, which read as if the cache step had been skipped. Tests ----- Adds two parametrized regression tests per catalog: * test_fetch_catalog_recovers_from_unreadable_metadata plants non-UTF-8 bytes in the metadata file, asserts is_cache_valid() returns False (rather than raising), and confirms fetch_catalog falls through to the network instead of crashing. * test_fetch_catalog_writes_cache_as_utf8 round-trips a payload containing a non-ASCII identifier (café) through the public fetch path and reads the cache back with read_text(encoding="utf-8"), catching encoding drift at the byte level rather than relying on the system codec to happen to be UTF-8. Both pairs follow the established sibling-file symmetry — the extension and preset suites stay in lock-step. * test(catalogs): assert UTF-8 write encoding by recording write_text kwargs Copilot's review on this PR caught that test_fetch_catalog_writes_cache_as_utf8 claimed to validate UTF-8 at the byte level but actually only round-tripped a non-ASCII string through json.dumps/read_text. Because json.dumps defaults to ensure_ascii=True, 'café' was serialized as the all-ASCII escape 'caf\u00e9' before reaching write_text — the bytes on disk were identical regardless of the encoding kwarg, so a locale-encoded write would have round-tripped just fine. The drift guard the test name advertised was not actually being enforced. Rewriting these tests to observe the production code's argument directly: each test now monkey-patches pathlib.Path.write_text with a recorder that captures the encoding kwarg for every call, runs the production fetch, and asserts every write into the cache directory passed encoding='utf-8'. That is the substantive thing the regression guard cares about — non-ASCII payload tricks were the wrong lever to pull, because json.dumps was masking the encoding choice before write_text ever ran. Both tests verified locally against the current production code (492 passed in the extensions+presets suites) and confirmed to fail against a synthetic no-encoding write (the recorder records None instead of 'utf-8', the assertion catches it). Same change applied symmetrically to test_extensions.py and test_presets.py to keep the sibling files in lockstep with the production code paths in extensions.py and presets.py. * fix(catalogs): catch AttributeError on non-mapping cache metadata; drop stale line refs Copilot's review on the previous push pointed out that the cache-validity helpers still had a gap: metadata.get("cached_at", "") assumes metadata is a dict, but json.loads happily parses a file containing [] / "oops" / 42 / true / null into a non-mapping. The except tuple covered json.JSONDecodeError, OSError, UnicodeError, ValueError, KeyError and TypeError but not AttributeError, so a valid-JSON-but-non-dict metadata payload would still crash the caller instead of degrading to "cache invalid" as the docstring promised. This affected four cache-validity sites — symmetric across the two catalog modules: * extensions.py — inline per-URL metadata-age check in _fetch_single_catalog * extensions.py — is_cache_valid (legacy default-URL path) * presets.py — _is_url_cache_valid * presets.py — is_cache_valid All four except tuples now include AttributeError with a comment naming the exact failure (metadata.get(...) on a non-mapping) so the next reader doesn't have to reconstruct the reasoning. Separately, Copilot flagged that several comments hard-coded a line range from a sibling file (integrations/catalog.py:193-203) — those references will go stale the moment that file changes. Replaced the hard-coded ranges with file-only references (integrations/catalog.py) so the pointer stays accurate as that file evolves. Same change applied to both modules. Tests ----- test_is_cache_valid_handles_non_mapping_metadata is added to both test_extensions.py and test_presets.py, parametrized over the five JSON non-mapping root types ([], "oops", 42, true, null). Each variant plants the metadata file with that exact content and asserts is_cache_valid() returns False without raising. The parametrize covers every JSON type the public spec allows at the root, so a regression that drops AttributeError from any except tuple is caught against every observable shape rather than relying on the next reviewer to remember the .get / non-mapping interaction. pytest tests/test_extensions.py tests/test_presets.py — 502 passed (was 492 before; the parametrize adds five vectors per file). * fix(catalogs): make cache writes best-effort to match read-side contract	2026-06-09 07:22:49 -05:00
Pascal THUET	f65d9f9382	feat(integration): add status reporting (#2674 ) * feat(integration): add status reporting * docs(integration): include status in query command docstring * fix(integration): handle Windows extended-length paths in status containment On Windows, os.readlink() (and sometimes Path.resolve()) return paths with the \\?\ extended-length prefix. Comparing such a target against a plain project root via Path.relative_to() spuriously fails, so an in-project dangling symlink was classified as `invalid` instead of `missing` — failing test_status_treats_dangling_symlink_as_missing and the windows-style variant on the Windows CI runners. Centralize the containment check in _is_within_project() and strip the \\?\ / \\?\UNC\ prefix from both sides before relative_to(). Add portable regression tests for the prefix-stripping helper and the containment contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * test(integration): restore top-level pytest import after rebase A three-way merge / rebase onto main silently dropped the module-level `import pytest` from test_integration_subcommand.py: main reorganized the import block without it (using only a local `import pytest as _pytest`), while this branch added top-level fixtures and `pytest.skip`/`pytest.raises` usage. The overlapping import-hunk edits resolved by dropping the import, breaking collection with `NameError: name 'pytest' is not defined` on every runner. Re-add the import in the third-party group. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(integration): fix Windows UNC path assertion in status helper test `test_strip_extended_length_prefix_normalizes_windows_paths` compared the str() form of the helper's output against a hand-built string. On Windows, pathlib renders a UNC root with a trailing separator (`\\server\share\`), so the exact string match failed there (`\\server\share\` != `\\server\share`) even though `_strip_extended_length_prefix` behaves correctly — the trailing separator is irrelevant to the `relative_to` containment check it feeds. Compare Path objects (semantic equality) instead of exact strings so the assertion holds on both POSIX and Windows. No production code change needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(integration): make shared-manifest remediation specify --integration The fallback `_manifest_suggestion` for the shared `speckit` manifest (used when no usable default integration is recorded) suggested `specify init --here --force`, which can trigger interactive integration selection. For CI/agent consumers of `integration status`, surface an explicit `--integration <key>` placeholder, matching the file's existing `<key>` suggestion style. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-09 07:20:43 -05:00
Copilot	927f54feea	feat: make git extension opt-in and remove --no-git at v0.10.0 (#2873 ) * feat(init)!: make git extension opt-in and remove --no-git at v0.10.0 - Remove --no-git parameter from specify init command - Remove git extension auto-installation from init flow - Git repository initialization (git init) still runs when git is available - Remove --no-git from all test invocations across the test suite - Update docs to reflect opt-in git extension behavior - Replace TestGitExtensionAutoInstall with TestGitExtensionOptIn tests BREAKING CHANGE: specify init no longer auto-installs the git extension. Use `specify extension add git` to install it explicitly. The --no-git flag has been removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor(scripts): remove git operations from core scripts Git functionality is now entirely managed by the git extension. Core scripts only handle directory-based feature creation and numbering. - Remove has_git(), check_feature_branch(), git branch creation from core - Simplify number detection to use only spec directory scanning - Remove HAS_GIT output from get_feature_paths() - Remove git remote fetching and branch querying - Keep BRANCH_NAME output key for backward compatibility Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: remove all git operations from core - Remove is_git_repo() and init_git_repo() dead code from _utils.py - Remove --branch-numbering from init command - Remove git from 'specify check' (now extension-only) - Update docs: git is optional prerequisite, check command description - Fix tests to reflect no-git-in-core reality (fallback to main) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor(scripts): remove directory scanning and branch fallback from core Core scripts now resolve feature context exclusively from: 1. SPECIFY_FEATURE env var (set by git extension) 2. .specify/feature.json (persisted by specify command) Removed find_feature_dir_by_prefix() and directory scanning heuristics — these are the git extension's responsibility. Scripts error clearly when no feature context is available. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: introduce feature_numbering, deprecate branch_numbering in init-options - specify command template now reads feature_numbering (preferred) with fallback to branch_numbering (deprecated) from init-options.json - Git extension reads git-config.yml > feature_numbering > branch_numbering - init now writes feature_numbering: sequential to init-options.json - Deprecation warning emitted when branch_numbering is used as fallback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: remove trailing whitespace in common.ps1 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(scripts): persist SPECIFY_FEATURE_DIRECTORY env var to feature.json When SPECIFY_FEATURE_DIRECTORY is set, get_feature_paths() now writes the value to .specify/feature.json so future sessions without the env var can still resolve the feature directory. The write is idempotent — it skips when the file already contains the same value. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address review feedback — error messages and docs - Update error messages in common.sh and common.ps1 to reference SPECIFY_FEATURE_DIRECTORY instead of SPECIFY_FEATURE (which no longer resolves feature directories) - Fix get_current_branch comment (returns empty string, not error) - Update upgrade.md to reference SPECIFY_FEATURE_DIRECTORY with correct example paths - Update local-development.md troubleshooting: replace stale 'Git step skipped' row with actionable git extension guidance Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(scripts): harden feature.json persistence - Use json_escape in printf fallback when jq is unavailable (common.sh) - Replace utf8NoBOM encoding with UTF8Encoding($false) for PowerShell 5.1 compatibility (common.ps1) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor(scripts): remove dead feature_json_matches_feature_dir functions These guards are no longer needed since the branch-name validation they protected against has been removed from check-prerequisites. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor(git-ext): rename create-new-feature to create-new-feature-branch The git extension's script only creates the git branch — rename it to reflect that responsibility. The core create-new-feature.sh/.ps1 handles feature directory creation and feature.json persistence. Also includes fixes from review feedback: - common.sh: _persist_feature_json uses json_escape fallback - common.ps1: Save-FeatureJson uses UTF8Encoding for PS 5.1 compat - common.ps1: case-sensitive path stripping on non-Windows - create-new-feature.sh/ps1: output both SPECIFY_FEATURE and SPECIFY_FEATURE_DIRECTORY - setup-tasks.sh: fix stale 'Validate branch' comment Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(tests): update references to renamed git extension scripts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(tests): remove duplicate EXT_CREATE_FEATURE assignments Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-09 06:13:07 -05:00
WOLIKIMCHENG	d8a81b23b5	test(workflows): cover executable override fallback preflight (#2843 )	2026-06-08 15:33:37 -05:00
Seiya Kojima	4ec4635dd1	feat(extensions): per-event hook lists with priority ordering (#2798 ) * feat(extensions): per-event hook lists with priority ordering The manifest validator restricted each hook event to a single mapping, even though HookExecutor stores entries as a list per event. This blocked an extension from running multiple commands on one event (e.g. a verification step plus a doc-generation step after speckit.plan), and get_hooks_for_event returned entries in raw insertion order with no way to influence execution order across or within extensions. This change: 1. Validator: accept hooks.<event> as either a single mapping or a list of mappings. Each entry is validated individually and may carry an optional integer `priority` (>= 1, default 10; bool rejected). 2. Command-ref normalization: apply rename / alias->canonical rewriting to every entry in the list, not just the head. 3. register_hooks: expand list entries, persist `priority`, and purge-and-replace all entries owned by the extension on each event so a reinstall whose shape changed (single<->list, or a shorter list) leaves no orphaned entries behind. 4. get_hooks_for_event: sort enabled entries by `priority` ascending with a stable sort (ties keep insertion order). The existing normalize_priority helper is reused as the sort key so corrupted on-disk values fall back to the default instead of raising. Backward compatible: existing single-mapping manifests parse and register unchanged with priority defaulting to 10. The extension-level `priority` used by preset/template resolution is independent of the new hook-entry `priority`. Implements #2378 * fix(extensions): harden register_hooks per PR review - Skip non-dict hook entries before .get() so a manifest that bypasses validation can't crash register_hooks with AttributeError. - Normalize `priority` on save via normalize_priority so the on-disk config stays clean, mirroring the read-side defense in get_hooks_for_event. - Tests: cover the non-dict-entry skip and add encoding="utf-8" to the new tests' manifest writes. * fix(extensions): purge dropped-event hook orphans on reinstall register_hooks only purged events the new manifest still declared, so an extension that dropped an event on reinstall left stale entries for it in the project config. Purge this extension's entries from undeclared events (and prune emptied events) before registering; scoped to this extension, and a no-op for the install/update flow where unregister_hooks runs first. * fix(extensions): reject boolean priority and complete orphan purge - normalize_priority falls back to default for bool values - dedup deletes duplicate commands before re-insert for last-wins ties - register_hooks purges orphans even when all hooks are dropped * docs(extensions): document per-event hook lists and priority - EXTENSION-API-REFERENCE: hook event accepts a mapping or list; add priority field reference and last-wins dedup note - EXTENSION-DEVELOPMENT-GUIDE: add list-form example with priority * docs(extensions): show both single and list hook forms in schema snippet * docs(extensions): reference DEFAULT_HOOK_PRIORITY in normalize_priority normalize_priority hard-coded the default as the literal 10 in both its signature and docstring, duplicating DEFAULT_HOOK_PRIORITY. Reference the constant in the signature and drop the literal from the docstring so the default has a single source of truth.	2026-06-08 08:03:46 -05:00
Copilot	7106858c4e	feat!: remove legacy --ai, --ai-commands-dir, and --ai-skills flags (0.10.0) (#2872 ) * Initial plan * feat!: remove legacy --ai, --ai-commands-dir, and --ai-skills flags at 0.10.0 * refactor(tests): rename stale test_ai_help_* methods to test_agent_config_* * fix: address review — derive agent folder for generic integration and remove redundant test - Security notice now falls back to integration_parsed_options['commands_dir'] when AGENT_CONFIG folder is None (generic integration). - Remove test_agent_config_includes_kiro_cli which duplicates the assertion in test_runtime_config_uses_kiro_cli_and_removes_q. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: scrub all remaining --ai flag references from source and tests - Remove dead AI_ASSISTANT_ALIASES, AI_ASSISTANT_HELP, and _build_ai_assistant_help() from _agent_config.py - Update comments/docstrings in extensions.py, presets.py, and integration subpackages to reference 'skills mode' or '--integration' instead of the removed flags - Fix catalog.json generic integration description - Update test docstrings/comments in test_extension_skills.py, test_extensions.py, and test_presets.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: remove legacy --ai flag rejection tests The flags are fully removed from the CLI; typer handles unknown options generically. No custom rejection logic exists to test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * revert: remove manual CHANGELOG.md entry CHANGELOG is generated automatically; manual edits should not be made. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: make generic catalog description self-explanatory Include the required --commands-dir sub-option in the description so readers don't need to look up integration docs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(tests): rename duplicate test classes to avoid shadowing The rename from TestAutoPromote to TestIntegration collided with the existing TestIntegration(SkillsIntegrationTests) base classes, causing the shared test suites to be silently overwritten. Rename the CLI init flow classes to TestInitFlow instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-05 14:56:28 -05:00
Manfred Riem	60302fefec	feat(extensions): add bundled bug triage workflow extension (#2871 ) * feat(extensions): add bundled bug triage workflow extension (#2870) Add a bundled 'bug' extension providing a three-stage bug triage workflow: - speckit.bug.assess: triage a bug report (pasted text or URL), locate suspected code paths, and propose a remediation - speckit.bug.fix: apply the proposed remediation and record what changed - speckit.bug.test: validate the fix and record the verification result Each bug gets its own directory under .specify/bugs/<slug>/ with one Markdown report per stage (assessment.md, fix.md, test.md). The slug is the only handle the three commands share; existing bug directories are never overwritten. Mirrors the layout of the existing bundled extensions (git, agent-context): - extensions/bug/extension.yml, README.md, commands/ - extensions/catalog.json: register 'bug' (alphabetical, between agent-context and git) - pyproject.toml: add wheel mapping to specify_cli/core_pack/extensions/bug Closes #2870 * address Copilot review on #2871 - speckit.bug.assess.md: drop POSIX-specific 'mkdir -p' example; reword the prerequisite to describe the requirement (ensure BUG_DIR exists) without assuming a specific shell. - speckit.bug.fix.md: fix the slug-resolution fallback wording. It listed '.specify/bugs//assessment.md' but then keyed off whether 'exactly one bug directory' existed; now it correctly keys off whether exactly one matching 'assessment.md' was found and uses the slug from its parent directory. - tests/extensions/bug/test_bug_extension.py: add a smoke test analogous to the agent-context extension's coverage. Validates the bundled layout, catalog registration, '_locate_bundled_extension("bug")' resolution, and that 'ExtensionManager.install_from_directory' installs the three commands. All 333 tests in tests/extensions/, tests/test_extensions.py, and tests/test_extension_registration.py pass. address Copilot review on #2871 (round 2) - Import _locate_bundled_extension from the public 'specify_cli' package (it is re-exported in __init__.py) instead of the private 'specify_cli._assets' module, so the test does not depend on internal module layout. - Clarify module docstring: install_from_directory is called with register_commands=False, so commands are copied and recorded in the installed manifest but not registered with AI agents. Wording updated to avoid implying otherwise. * address Copilot review on #2871 (round 3) - tests/extensions/bug/test_bug_extension.py: read extension.yml as UTF-8 explicitly to avoid platform-dependent default encoding (notably on Windows). Matches how the README is read in the same module. - extensions/bug/commands/speckit.bug.assess.md: add a 'Safety When Fetching URLs' section. Instructs the agent to treat fetched page content as untrusted input (no obeying embedded prompt-injection directives), forbids supplying credentials/secrets that a page asks for, scopes the fetch to the URL the user provided (no following redirects to other resources), and requires suspicious content to be quoted verbatim under an 'Unverified' heading rather than acted on. - extensions/catalog.json: bump 'updated_at' to today (2026-06-05) so consumers that cache by this field invalidate when 'bug' is added. - extensions/bug/README.md: minor grammar fix ('a reproduction that was not actually performed'). All 251 tests in tests/extensions/bug/, tests/test_extensions.py, and tests/test_extension_registration.py pass. * speckit.bug.assess: add URL Trust Policy for fetched bug-report URLs Builds on the 'Safety When Fetching URLs' section by adding a tiered classification rule the agent applies before any fetch: 1. Refuse outright (no fetch, no prompt) for non-http(s) schemes, loopback, link-local, RFC1918 private space, and known cloud instance-metadata endpoints (169.254.169.254, metadata.google.internal, 100.100.100.200, metadata.azure.com). This closes the SSRF / internal-recon vector opened by 'paste any URL'. 2. Fetch silently for an explicit allowlist of widely-used public bug-report sources (github, gitlab, bitbucket, atlassian.net, linear, stackoverflow/stackexchange, sentry). This preserves the paste-a-URL ergonomics the workflow is built for. 3. Otherwise prompt once in interactive mode (default 'no', naming the resolved host explicitly); in automated mode skip the fetch and record '[UNVERIFIED - fetch skipped: host not on safe list: <host>]' in assessment.md so a human can decide later. In every case, assessment.md records the verbatim URL, the resolved host, and which branch of the policy was taken (allowlisted / confirmed-by-user / auto-refused: <reason>) so the per-bug directory's audit trail is complete. Preflight HEAD probes are explicitly forbidden since the probe itself is the request the policy gates. Execution step 1 now defers to the policy before fetching. * speckit.bug.assess: remove 'post-redirect-resolution' inconsistency The URL Trust Policy explicitly forbids following redirects, but the audit-trail bullet asked the agent to record the host 'post-redirect-resolution', which contradicted that rule and could lead agents to follow redirects unintentionally to determine what to log. Reword both call sites to refer to the host parsed from the URL the user supplied (no resolution implied): - Tier-3 interactive prompt: '...naming the host parsed from the URL explicitly...' - Recorded fields: 'The host parsed from that URL (no redirect following - see the rule above).' No behavior change; clarification only.	2026-06-05 12:37:25 -05:00
lselvar	f512b8b0d1	fix: resolve GitHub release asset API URL for private repo preset and workflow downloads (#2855 ) * fix: resolve GitHub release asset API URL for private repo preset and workflow downloads - Add shared `resolve_github_release_asset_api_url` utility to `_github_http.py` for reuse across preset and workflow download paths - Apply the same private-repo fix from PR #2792 (extensions) to: - `PresetCatalog.download_pack` — ZIP downloads via catalog `download_url` - `preset add --from <url>` — ZIP downloads from a direct URL - `workflow add <url>` — workflow YAML downloads from a direct URL - `workflow add <id>` (catalog) — workflow YAML downloads via catalog `url` - For browser release URLs (`github.com/…/releases/download/…`), the asset is resolved via the GitHub REST API and downloaded with `Accept: application/octet-stream` - Direct REST API asset URLs (`api.github.com/…/releases/assets/<id>`) are downloaded directly with `Accept: application/octet-stream` - Auth is preserved end-to-end through the existing `open_url` infrastructure - Update `test_download_pack_sends_auth_header` and add `test_download_pack_accepts_direct_github_rest_asset_url` to cover both paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: URL-encode tag in release API URL to handle special characters Encode the tag as a path segment (using quote with safe='') when building the releases/tags/<tag> API URL. This prevents malformed URLs when tags contain reserved characters like '/' or '#'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: add CLI-level tests for preset add --from GitHub release URL resolution Adds regression tests covering: - resolve_github_release_asset_api_url unit tests (passthrough, resolution, network error, URL encoding of special chars in tags) - CLI-level 'preset add --from <github-release-url>' end-to-end flow - CLI-level 'preset add --from <api-asset-url>' direct passthrough Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: deduplicate release URL resolution; fix test issues - ExtensionCatalog._resolve_github_release_asset_api_url now delegates to the shared helper in _github_http.py (also gains URL-encoding fix) - Remove unused 'io' import from test_github_http.py - Remove duplicate 'provides' dict keys accidentally added to test_presets.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: align resolver timeout with download timeout; add workflow CLI tests - Pass timeout=30 to resolve_github_release_asset_api_url in both workflow add paths so worst-case latency matches the download timeout - Add CLI-level regression tests for 'workflow add <url>' covering browser URL resolution and direct API asset URL passthrough Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: remove unused urllib.request import; add catalog workflow test - Remove unused 'import urllib.request' in preset add --from path - Add CLI test for catalog-based 'workflow add <id>' with GitHub release URL resolution Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * style: remove unused MagicMock imports from tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-05 10:41:40 -05:00
Huy Do	005c80a9c7	fix(workflows): render gate show_file contents in the interactive prompt (#2810 ) * fix(workflows): render gate show_file contents in the interactive prompt The gate step read and recorded `show_file` but never displayed its contents at the interactive prompt, so the operator approved/rejected without seeing the referenced file. Render the file inside the prompt when stdin is a TTY, with a graceful notice for missing/unreadable files. Non-interactive PAUSED behaviour, exit codes, resume semantics, and no-`show_file` output are unchanged. Closes #2809. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): keep gate _prompt signature stable and harden show_file reads The gate prompt rendered show_file by passing it as a third positional argument to _prompt. A test that stubs _prompt with a two-argument lambda (test_gate_abort_still_halts_with_continue_on_error) then failed once the branch caught up to main, because the call site passed three arguments to the two-argument stub. Compose the show_file material into the displayed message in execute() and keep _prompt to its (message, options) contract. Display data no longer widens the interactive seam, so stubbing _prompt stays stable and future review material can be added without breaking callers. _prompt now renders a multi-line message inside the gate box. Also catch ValueError in _read_show_file so a path the OS rejects outright (e.g. an embedded NUL byte) degrades to a notice instead of crashing the prompt, matching the helper's stated contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): coerce gate prompt message to str before rendering The multi-line render loop split the message on newlines, which assumes a str. A non-string message (e.g. a YAML numeric literal) previously rendered fine through the old f-string and would now raise on .split. Coerce with str() to preserve that tolerance, and add a regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): make gate stdin handling robust; tidy compose_prompt typing Address review feedback on the gate tests and helper: - Swap the gate module's sys.stdin for a fixed-isatty stub (shared _StubStdin / _force_gate_stdin helpers) instead of setattr on sys.stdin.isatty, which is not assignable under some pytest capture modes. This also forces the non-interactive tests to a non-TTY so they cannot block on input() when run in a real terminal. - The non-interactive show_file test now hard-fails if _read_show_file is called, proving the file is not read on the PAUSED path. - _compose_prompt accepts a non-string message (e.g. a YAML numeric literal) and always returns str via str(message), keeping its annotation and docstring accurate; the redundant coercion in _prompt is removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): strip control chars from gate show_file; default tests non-TTY Address review feedback: - _read_show_file strips C0 control characters (except tab) from each line, so a show_file containing ANSI escape sequences (e.g. \x1b[2J) cannot clear the screen or spoof the prompt/options when rendered to a terminal. - Add an autouse fixture on TestGateStep that defaults every gate test to a non-TTY stdin, so no test can drop into the interactive prompt and block on input() when the suite runs under a real TTY. Interactive tests opt back in via _force_gate_stdin(tty=True); the now-redundant explicit non-TTY calls were removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(workflows): localize gate stdin patch to the gate module's sys _force_gate_stdin rebinds the gate module's `sys` name to a stand-in whose stdin has a fixed isatty() and which delegates every other attribute to the real sys, instead of mutating the process-wide sys.stdin. This keeps the patch local to the gate module and leaves real stdin untouched. The gate abort test, which used the same process-wide swap, now shares the helper, so the pattern exists in exactly one place. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): sanitize the displayed gate show_file path, not just content Control characters were stripped from show_file contents but the path was still printed verbatim as the header (`f"{show_file}:"`) and echoed in the read-error notice, so a show_file path containing ANSI escapes could still inject terminal sequences. Centralize stripping in `_sanitize_for_display` and apply it to every show_file-derived string that reaches the terminal — the displayed path, each file line, and the error notice — while still opening the file with the original path. Add a test for path sanitization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(workflows): inline control-char stripping, drop the helper Reuse the existing _CONTROL_CHARS regex directly at the three display sites instead of wrapping it in a one-line helper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): also strip LF and C1 controls from gate show_file display The control-char class skipped LF (so an embedded newline in a show_file path could break the boxed layout) and the C1 range (so \x9b CSI and other 8-bit controls survived). Widen the class to [\x00-\x08\x0a-\x1f\x7f-\x9f] (still keeping tab). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 08:04:52 -05:00
Samir Abed	34ce66139e	feat: add support for rovodev (#2539 ) * feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev * fixup! feat: add support for rovodev	2026-06-04 11:34:05 -05:00
Huy Do	141119efea	feat(workflows): add JSON output for workflow run resume and status (#2814 ) * feat(workflows): add --json output to workflow run, resume, and status Adds an opt-in `--json` flag to `workflow run`, `workflow resume`, and `workflow status` that emits a single machine-readable object (run_id, workflow_id, status, current step; status also reports per-step states and a runs list) for automation and external orchestrators. JSON is written via a small `_emit_workflow_json` helper using plain stdout, so Rich markup, highlighting, and line-wrapping can never alter the emitted object. Default human-readable output and exit codes are unchanged when `--json` is omitted. Reference docs updated. Closes #2811. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(workflows): keep --json stdout clean while steps write output Suppressing the banner and the step-start callback was not enough to guarantee a single parseable JSON object on stdout: individual steps still write there while the engine runs. The gate step prints its prompt, and the prompt step runs a CLI subprocess that inherits the process's stdout file descriptor — either can corrupt the JSON stream for interactive runs or integration-backed workflows. Wrap engine.execute()/engine.resume() in a file-descriptor-level redirect (dup2) when --json is set, so both Python-level writes and inherited-fd subprocess output go to stderr while stdout carries only the emitted JSON. Step progress stays visible on stderr. status does not run the engine, so it is unaffected. Tests cover both pollution channels (a Python print and a real subprocess) via fd-level capture, and the inactive no-op path. Docs note the stdout/stderr split. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(workflows): fix stray escape sequence in --json redirect comments The redirect helper's docstring and its test comment wrote ``print``\s, which renders as "print\s" rather than "prints". Replace with plain "prints". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-04 11:11:39 -05:00
minbang	a9a759450d	fix: recover active skills registration for extensions (#2803 ) Extension command registration now resolves the active skills directory before writing command artifacts. This lets initialized skills-backed agents recover a missing active skills directory while preserving the existing preset registration behavior. Add regression coverage for missing active skills directories, shared skills directories, and symlinked parent guards. Fixes #2769. Co-authored-by: OpenAI Codex <codex@openai.com>	2026-06-04 09:53:31 -05:00
One-TheOnly	8e5643d4ff	fix(cursor-agent): enable headless CLI dispatch end-to-end (-p --trust --approve-mcps --force + Windows .cmd shim resolution) (#2631 ) * fix(cursor-agent): enable CLI dispatch via ``-p --trust`` headless mode Restores the ability for ``specify workflow run`` to dispatch the cursor-agent CLI, complementing the existing in-IDE skill flow. Without this fix, ``specify workflow run speckit --input integration=cursor-agent ...`` fails with a misleading ``CLI not found or not installed`` error even when the CLI is installed (since cursor-agent had ``requires_cli=False`` and an unset ``build_exec_args``). The cursor-agent CLI (>= 2026.05.16) supports headless execution via ``-p`` (print mode with full tool access including write/shell) and ``--trust`` (bypass Workspace Trust prompt). Without ``--trust`` the CLI exits non-zero in non-TTY contexts (verified locally). Changes to ``src/specify_cli/integrations/cursor_agent/__init__.py``: * ``config.requires_cli``: ``False`` -> ``True`` * ``config.install_url``: ``None`` -> Cursor CLI docs URL * Override ``build_exec_args()`` to emit ``[cursor-agent, -p, --trust, <prompt>, ...]`` with optional ``--model`` and ``--output-format json`` flags, mirroring the shape used by ``claude``/``codex``/``gemini``. Tests: * 34 existing cursor-agent tests still pass. * 6 new tests in ``TestCursorAgentCliDispatch`` pin ``requires_cli``, ``install_url``, and the exact argv shape (default, text-output, with-model, and the hyphenated skill invocation form ``/speckit-<name>``). * Full repo: 1085 / 1085 passed, no regressions. Fixes #2629 Co-authored-by: Cursor <cursoragent@cursor.com> * fix(integrations): resolve ``.cmd``/``.bat`` shims before subprocess.run On Windows, ``shutil.which`` honors ``PATHEXT`` and locates wrappers like ``cursor-agent.cmd`` and ``codex.cmd``, but Python's ``subprocess.run`` calls ``CreateProcess`` which does not consult ``PATHEXT`` and therefore fails with ``WinError 2`` on a bare argv like ``[cursor-agent, ...]``. Resolve ``exec_args[0]`` via ``shutil.which`` in ``IntegrationBase.dispatch_command`` so ``.cmd``/``.bat`` shims work transparently. On POSIX this is a no-op for absolute paths and a harmless lookup otherwise. Verified locally on Windows 10 + cursor-agent 2026.05.16: without this fix, ``specify workflow run speckit --input integration=cursor-agent`` fails with ``FileNotFoundError`` even after the cursor-agent integration starts producing valid exec args (per the prior commit on this branch). Tests: * New: 2 cursor-agent tests pin the shim-resolution + passthrough behavior (``test_dispatch_command_resolves_cmd_shim_for_subprocess`` and ``test_dispatch_command_passthrough_when_shutil_which_finds_nothing``). * Updated: ``tests/test_workflows.py::TestCommandStep::test_dispatch_with_mock_cli`` was mocking ``shutil.which`` only at the ``command`` step level and not at the ``base`` level, which made it environment-sensitive (fails locally when the real ``claude`` CLI is on PATH). Added the matching base-level patch and updated the argv-assertion to reflect the resolved path. ``test_dispatch_failure_returns_failed_status`` gets the same patch for consistency. * Full repo: 2867 passed, 0 regression from this PR. The 12 remaining pre-existing failures are unrelated Windows ``symlink`` privilege failures (``WinError 1314``) on a non-admin Windows runner. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cursor-agent): inject --approve-mcps --force for headless MCP/tool access The previous commit (`1c55988`) wired up ``-p --trust`` so the CLI launches in headless mode without the Workspace Trust prompt, but that alone is not enough to let ``specify workflow run`` drive a real speckit feature end-to-end with cursor-agent on Windows. Two more flags are required: * ``--approve-mcps``: without it, every MCP server configured in ``.cursor/mcp.json`` stays ``not loaded (needs approval)``, and any tool call against them is silently dropped. We hit this immediately trying to read a DingTalk PRD from a remote MCP server during the ``/speckit-specify`` step. * ``--force``: without it, the agent halts on the first tool-call approval prompt (the tool call gets rejected and the workflow exits non-zero with a misleading message). With ``--force`` cursor-agent matches the implicit "trusted environment" semantics that ``claude -p`` and ``codex --exec`` already have by default -- which is the right semantics for an unattended ``specify workflow run`` invocation. Verified end-to-end on Windows 10 + cursor-agent 2026.05.16-0338208: * ``cursor-agent -p --trust --approve-mcps --force --output-format text`` + a ``/speckit-specify`` prompt that included a DingTalk URL produced a full spec.md (31.5 KB) plus checklists/requirements.md in ~10.7 min, reading the source PRD through the ``dingtalk-doc`` remote MCP server, deciding the ``specs/`` subpath itself, and updating ``.specify/feature.json`` and ``specs/menu-dictionary.md`` along the way -- no human-in-the-loop, no source PRD ever touched the filesystem. * Without ``--approve-mcps`` the same prompt errors with the tool call rejected message; without ``--force`` the agent stops at the first non-MCP tool call. Tests: * ``test_build_exec_args_`` updated to pin the new four-flag prefix. New ``test_build_exec_args_contains_mandatory_headless_flags`` asserts the four flags are always present together. * ``test_dispatch_command_resolves_cmd_shim_for_subprocess`` updated to match the new argv layout. * All 43 cursor-agent tests pass; no other tests touched. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(cursor-agent): express dispatch support via build_exec_args() instead of requires_cli Co-authored-by: Cursor <cursoragent@cursor.com> * test(cursor-agent): use urlparse hostname check and cover dispatch without requires_cli Co-authored-by: Cursor <cursoragent@cursor.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: 刘一 <liuyi@oureman.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-04 09:48:33 -05:00
Copilot	40d832f90a	Allow `specify workflow run` to execute YAML files without a project (#2825 ) * Initial plan * feat: add --workflow option to init command for post-init workflow execution * chore: remove unused import in test file * refactor: allow workflow run without project when given a YAML file path Instead of adding --workflow to init, make `specify workflow run ./file.yml` work without requiring a .specify/ project directory. When the source is a YAML file that exists on disk, cwd is used as the project root. When it's a workflow ID, the .specify/ project requirement is preserved. * Handle standalone workflow path edge cases * Fix USERPROFILE env var portability and docs notation * Fix workflow YAML path detection to require regular files * Harden workflow run against unsafe .specify paths --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-04 07:09:54 -05:00
Grissiom.GuRui	659a41a6cc	feat(extensions): add --force flag to extension add for overwrite reinstall (#2530 ) * feat(extensions): add --force flag to extension add for overwrite reinstall Add --force support to `specify extension add` that allows overwriting an already-installed extension without manually removing it first. - install_from_directory() and install_from_zip() accept force=True, automatically calling remove() before installation - The --force CLI flag works with all install modes (--dev, --from URL, bundled, and catalog) - Config files (-config.yml) are preserved across force reinstall - Error message suggests --force when extension is already installed - 6 new tests covering unit and CLI force reinstall flows fix: address PR review feedback on --force implementation - Remove unused `backup_config_dir` variable assignment (Ruff F841) - Defer `remove()` until after `_validate_install_conflicts()` to prevent data loss if validation fails mid-reinstall - Use `TemporaryDirectory` instead of `NamedTemporaryFile` in ZIP test to avoid Windows file-locking failures * fix: only restore config backup when --force actually triggers a remove When --force is used but the extension is not already installed, the backup restore/cleanup should not run. Previously it could resurrect stale config files from a previous removal and delete the backup directory unnecessarily. * fix: address Copilot review feedback on --force implementation - Clear stale backup dir before remove() so only fresh backups are restored - Restore only config files (-config.yml, -config.local.yml) from backup - Remove trailing \n from --force console message (console.print adds newline) * fix: handle non-directory paths in backup cleanup/restore - Use is_dir() before rmtree/iterdir on backup path to avoid crashes when .backup/<id> exists as a file or symlink - Remove unused manifest1 variable in test_install_force_reinstall * fix: handle symlinks in backup cleanup/restore and correct CLI message - Check is_symlink() before is_dir() in backup cleanup and restore: Path.is_dir() follows symlinks (returns True for symlink-to-dir) but shutil.rmtree() raises OSError on symlinks. Handle symlinks by unlinking them instead. - Skip symlink entries during config file restore. - Change --force dev-install message from "Reinstalling" to "Installing [...] (will overwrite if already installed)" because --force also works for first-time installs.	2026-06-03 16:34:24 -05:00
Pascal THUET	4028c50af8	fix: render script command hints with active agent separator (#2649 ) * fix script command hints for agent separators * Address command hint review feedback * chore: remove whitespace-only PR churn * test: fix PowerShell command hint invocation * fix: preserve hyphens in script command hints * fix: render managed script command hints	2026-06-03 16:24:13 -05:00
darion-yaphet	67fecd357a	chore(tests): fix ruff lint violations in tests/ (#2827 ) Clear pre-existing lint debt flagged by repo-wide `ruff check` (the lint config only scopes src/, so tests/ had drifted). No behavior change. - F401/F541: drop unused imports and redundant f-string prefixes (autofix) - E741: rename ambiguous `l` to `ln` in comprehensions - E702: split semicolon-joined statements onto separate lines - F841: drop unused bindings while keeping the side-effecting calls (_minimal_feature, install_from_directory) Full suite: 3344 passed, 40 skipped. ruff check (repo-wide): clean.	2026-06-03 16:02:26 -05:00
Quratulain-bilal	bb2b49d0ae	fix(workflows): validate run_id in RunState.load before touching the … (#2813 ) * fix(workflows): validate run_id in RunState.load before touching the filesystem ``RunState.load(run_id, project_root)`` interpolates ``run_id`` directly into ``project_root / ".specify" / "workflows" / "runs" / run_id`` and then calls ``state_path.exists()`` and ``json.load`` on the result. The run_id is reachable from user input via ``specify workflow resume <run_id>`` (CLI argument) and via ``SPECKIT_WORKFLOW_RUN_ID`` (env var override on the engine's run path), so a value like ``../escape`` turns ``runs_dir`` into ``.specify/workflows/escape/`` and: * ``state_path.exists()`` becomes a file-existence oracle for any path the process can read. * if a ``state.json`` exists at the traversed location (planted by a malicious dependency, a misconfigured shared workspace, or an older spec-kit version that happened to write there), ``json.load`` parses it and the workflow resumes under the attacker-chosen ``workflow_id`` / step state. * a subsequent ``state.save()`` then writes back to the traversed location, persisting the corruption. ``RunState.__init__`` already validates ``run_id`` against ``r'^[a-zA-Z0-9][a-zA-Z0-9_-]$'`` — but that check runs on ``state_data["run_id"]`` after* ``load`` has already done the file lookup, which is too late to prevent the disclosure. This change extracts the pattern into a class-level constant ``_RUN_ID_PATTERN`` and a single ``_validate_run_id`` classmethod so ``__init__`` and ``load`` cannot drift, then calls the validator at the top of ``load`` before any path is built. Mirrors the precedent in ``src/specify_cli/agents.py::_ensure_within_directory`` (used at line 437 of that file) which guards extension-install paths against the same threat model. Regression tests parametrize 9 traversal vectors (``../escape``, ``..``, ``../../etc/passwd``, ``foo/bar``, ``foo\bar``, ``.hidden``, ``-flag``, ``foo\x00bar``, empty) and plant a malicious ``state.json`` outside ``runs/`` so a missing guard would surface as a successful load rather than the ambiguous ``FileNotFoundError``. A second test asserts ``__init__`` and ``load`` reject the same representative malformed ID, so future changes to one path can't silently drift from the other. * test(workflows): exercise RunState.load in shared-validation test, fix __init__ empty-string asymmetry Copilot's review on this PR pointed out that test_init_and_load_share_validation claimed to verify both entry points share the same validation rules but never actually called RunState.load — only __init__ and the shared _validate_run_id helper. A regression in load (e.g. someone deleting the cls._validate_run_id(run_id) call before the path is built) would slip through even though __init__ and the helper stayed aligned, defeating the whole point of the test. Tightening the test surfaced a real asymmetry the previous version was silently masking: self.run_id = run_id or str(uuid.uuid4())[:8] The truthiness fallback meant RunState(run_id="") silently substituted a UUID and skipped validation, while RunState.load("", project_root) correctly rejected the empty string. The two entry points diverged on the empty-string vector. That is exactly the drift the test name claimed to defend against — and the original test missed it. Changes ------- * engine.py: __init__ now distinguishes run_id is None (caller omitted it → auto-generate UUID) from an empty string (caller provided it → must validate like any other value). Both paths still flow through _validate_run_id, but only the explicit-None case auto-generates. * test_workflows.py: test_init_and_load_share_validation is now parametrized over one representative vector per category from test_load_rejects_path_traversal (parent traversal, embedded separator, leading non-alphanumeric, empty string) and asserts that all three entry points — __init__, _validate_run_id, and load — reject the same input. Adding load to the assertion is the substantive fix Copilot asked for; keeping __init__ and the helper alongside it makes any future drift between the three immediately observable instead of having to read three separate tests. Verification ------------ pytest tests/test_workflows.py — 168 passed (was 165 before the parametrize expansion; __init__ empty-string vector would have failed the new test against the old engine code, confirming the asymmetry was real).	2026-06-03 14:26:07 -05:00
김준호	ac2cb5daf5	feat(cli): implement specify self upgrade (#2475 ) * feat(cli): implement specify self upgrade * fix(cli): normalize self-upgrade prerelease tags * fix(cli): tighten self-upgrade diagnostics * fix(cli): harden self-upgrade verification parsing * fix(cli): sanitize self-check fallback tags * fix(cli): harden self-check release display * fix(cli): validate resolved upgrade tags * fix(cli): tolerate invalid install metadata * test(cli): align upgrade network mocks * fix(cli): respect relative installer paths * fix(cli): tighten upgrade failure handling * fix(cli): align installer path diagnostics * fix(cli): validate release and version output * fix(cli): clarify source checkout guidance * fix(cli): harden upgrade detection helpers * fix(cli): avoid echoing invalid release tags * fix(cli): tolerate argv path resolve failures * chore: remove self-upgrade formatting-only diffs * fix: address self-upgrade review feedback * fix: address self-upgrade review followups * fix: address self-upgrade review edge cases * fix: address self-upgrade review docs * fix: refine self-upgrade review followups * fix: address self-upgrade review cleanup * fix: handle self-upgrade review edge cases * fix: address self-upgrade review nits * fix: address follow-up self-upgrade review * fix: resolve self-upgrade review and Windows CI failures - README: promote "Optional Commands" to ### so it is a sibling of "Core Commands" under "Available Slash Commands" (consistent heading levels; avoids the h2->h4 jump a revert would create). - _version: allow --tag prerelease/dev and build-metadata suffixes to compose (e.g. v1.0.0-rc1+build.42), matching PEP 440 / semver; the Version() check still enforces canonical validity. - tests: compare resolved argv0 as Path objects instead of POSIX strings so the assertion holds on Windows; skip the relative-installer-path executable-bit tests on Windows via a new requires_posix marker (they rely on chmod/X_OK semantics and chdir-into-tmp teardown that do not hold there). Add a combined prerelease+build-metadata tag test. * fix: address second self-upgrade review round - self_check: clarify that the "up to date" branch is reached only for parseable latest tags (the unparseable case returns earlier), so the InvalidVersion fallback assumption is not reintroduced. - self_upgrade: compare target/current as Version instances directly instead of re-parsing the canonical strings through _is_newer; the empty-current case stays explicit via the not-None guard. - tests: document the intentional broad GH_/GITHUB_ env scrub with a test asserting non-credential context vars (GH_HOST, GITHUB_REPOSITORY, …) are stripped from the installer subprocess env — a deliberate fail-safe that also catches credential-adjacent names without a recognized suffix. * fix: address third self-upgrade review round - self_upgrade: unify the no-op short-circuits on packaging Version equality instead of canonical-string equality. Version("1.0") equals Version("1.0.0") but their str() forms differ, so the old check could misreport an equal install as "already on latest release or newer". Both the unpinned and pinned branches now use Version comparison. - self_upgrade: compare the verified version as a parsed Version against the target so a non-version verifier result is a mismatch (exit 2) rather than a coincidental canonical-string match. - resolver: map HTTP 429 (Too Many Requests / secondary rate limit) to the rate-limited category so users get the same actionable token hint as 403. - _is_github_credential_env_key: document the precise (intentionally broad) scrub matching contract in the docstring. - tests: add a trailing-zero Version-equality regression test and a parametrized HTTP-status categorization test (429 -> rate limited; 404/502 -> verbatim). * fix: address fourth self-upgrade review round - self_upgrade: label a pinned target older than the installed version as "Downgrading" rather than "Upgrading" so `--tag <older>` is not mistaken for a forward upgrade. - resolver: drop the unused `typing.Optional` import and annotate the `--tag` option as `str \| None`, consistent with the rest of the module (verified Typer resolves it on the supported Python versions). - _is_github_credential_env_key: add `_PASSWORD` and `_CREDENTIALS` to the recognized credential suffixes and document that only these shapes are scrubbed (not blanket coverage). - tests: assert the precise exit code (1) for the re-raised transient OSError path; skip the InvalidMetadataError test on Pythons where the real exception is absent instead of fabricating it; update the pinned downgrade test to expect the "Downgrading" label. * fix: accept uppercase V prefix in --tag Fold a leading uppercase `V` (a common paste) to the canonical lowercase `v` before validating `--tag`. The remainder of the tag stays case-sensitive on purpose: the validated value is used verbatim as a git ref, which is case-sensitive on GitHub, so rewriting label/build-metadata casing could point at a tag that does not exist. Adds a normalization test.	2026-06-03 12:04:54 -05:00
Huy Do	1732b9b62e	feat(workflows): allow resume to accept updated workflow inputs (#2815 ) `workflow resume` now accepts `--input key=value` (the same flag and parsing as `workflow run`, via a shared `_parse_input_values` helper). Supplied values are merged over the run's persisted inputs and re-resolved through the existing typed-validation path (`_resolve_inputs`), so a resumed/re-run step sees the updated inputs and ill-typed values fail fast. Keys not supplied keep their persisted values; resuming without `--input` is unchanged. Reference docs updated. Distinct from #2405 (file-reference inputs at run time): this is about supplying inputs at resume time, reusing the existing input model. Closes #2812. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 12:04:07 -05:00
WOLIKIMCHENG	6d511acfb9	fix(plan): clarify quickstart validation guide scope (#2805 ) Co-authored-by: root <kinsonnee@gmail.com>	2026-06-03 08:07:42 -05:00
lselvar	c9c02ae790	fix: resolve GitHub release asset API URL for private repo extension downloads (#2792 ) * fix: resolve GitHub release asset API URL for private repo downloads For private or SSO-protected GitHub repos, browser release download URLs redirect to HTML/SSO instead of the ZIP asset. This commit resolves the asset via the GitHub REST API and downloads with Accept: application/octet-stream, falling back to the original URL if the API call fails. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: support direct GitHub REST release asset URLs in extension downloads When a catalog download_url is already a GitHub REST release asset URL (https://api.github.com/repos/<owner>/<repo>/releases/assets/<id>), skip the release metadata lookup and download directly with Accept: application/octet-stream. This complements the browser URL resolution from the previous commit, covering catalogs that reference the REST API directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-02 17:15:36 -05:00
lselvar	d79a514b30	fix: remove unsupported mode: frontmatter from Copilot skills mode (fixes #2799 ) (#2819 ) VS Code Copilot Agent Skills do not support the `mode:` frontmatter field. The generated SKILL.md files included `mode: speckit.<stem>` injected by CopilotIntegration.post_process_skill_content(), which had no effect in VS Code and could cause confusion. Simplify post_process_skill_content to delegate directly to _CopilotSkillsHelper without injecting mode:. Update tests to assert mode: is absent from generated skill frontmatter. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-02 17:14:08 -05:00
darion-yaphet	ee17b04784	refactor(integrations): co-locate integration commands in integrations/ domain dir (PR-5/8) (#2720 ) * refactor(integrations): co-locate integration commands in integrations/ domain dir - Remove commands/ stubs (handlers will live in domain dirs) - Move all integration CLI handlers out of __init__.py into integrations/ - Split into focused modules under integrations/: _helpers.py (340 lines) — domain helpers _install_commands.py (306 lines) — install / uninstall _migrate_commands.py (487 lines) — switch / upgrade _query_commands.py (442 lines) — list / use / search / info / catalog _commands.py (34 lines) — app objects + register() - __init__.py reduced by ~1400 lines; integration block replaced with register() call - Fix patch paths in tests to new module locations * fix(integrations): restore original integration list output in refactor Preserve the CLI Required column, post-table default/installed summary, and no-installed guidance that were dropped during the no-behavior-change refactor of integration list into _query_commands.py. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(integrations): restore _clear/_update_init_options public imports The refactor that split integration commands moved _clear_init_options_for_integration and _update_init_options_for_integration into integrations/_helpers.py, but tests still import them from the top-level specify_cli package, causing ImportError. Re-export them with explicit aliases at the end of __init__.py to preserve the public import surface. --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-02 12:21:19 -05:00
Huy Do	7bab0568c5	feat(workflows): add continue_on_error step field for non-halting failures (#2663 ) * feat(workflows): add continue_on_error step field Adds an optional `continue_on_error: bool` field on every step. When set to `true` and the step fails, the engine records the result (`exit_code`, `stderr` on `steps.<id>.output` plus `status` as a sibling key on `steps.<id>`) and continues to the next sibling step instead of halting the run. Downstream `if`, `switch`, or `gate` steps can then branch on `{{ steps.<id>.output.exit_code }}` to route the recovery path. Engine details -------------- `WorkflowEngine._execute_steps` now consults the step config when a step returns `StepStatus.FAILED`: - Gate aborts (`output.aborted`) always halt the run — operator decisions take precedence over the flag. - Otherwise, if `continue_on_error` is the literal `True`, log a `step_continue_on_error` event and proceed to the next sibling. The runtime check uses identity comparison (`is True`) rather than truthiness, so truthy non-bool values like the string `"true"` cannot silently change run semantics even if a caller bypasses `validate_workflow()`. - Otherwise, behave as before: log `step_failed`, set `RunStatus.FAILED`, and return. Validation ---------- `_validate_steps` rejects non-bool values for `continue_on_error`. Coerced strings like `"true"` are not accepted so authoring mistakes surface at validation time rather than silently changing run semantics. Tests ----- `TestContinueOnError` in `tests/test_workflows.py` (8 tests): - `test_undeclared_failure_halts_run` — default halt behaviour. - `test_declared_and_fired_continues_run` — flag + fail → continue. - `test_declared_but_step_succeeded_is_noop` — flag + success → no-op. - `test_if_branch_routes_around_failure` — end-to-end recovery. - `test_gate_abort_still_halts_with_continue_on_error` — abort always halts. - `test_validation_rejects_non_bool_continue_on_error` — `"true"` rejected at validation. - `test_validation_accepts_bool_continue_on_error` — `true`/`false` pass cleanly. - `test_engine_ignores_truthy_non_bool_continue_on_error` — defense-in-depth: engine ignores string `"true"` even when validation is bypassed. Rebased onto current upstream/main (post #2664 merge); the new `TestContinueOnError` class sits immediately after upstream's `TestContextRunId` so the two feature suites coexist cleanly. Closes #2591. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): restore runtime context section, clarify gate prompt Two Copilot findings on `d0b9e00`: 1. The `### Runtime Context` documentation for `{{ context.* }}` was lost during the rebase onto current main (the squash dropped the anchor where #2664 had added it). Restored under `## Expressions` so users can find `context.run_id` semantics and examples. 2. The continue_on_error example gate had message "Retry or skip?" but used the default `options: [approve, reject]` with `on_reject: skip`, which implied an automatic retry path that gates do not provide. Reworded the message to match the actual approve/reject semantics and added an explicit note that retry requires either custom gate options + downstream branching or a wrapper loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify continue_on_error scope — returned FAILED only Copilot finding on `d0b9e00`: The README's "Error Handling" intro implied `continue_on_error` covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns `StepResult(status=FAILED, ...)`. Exceptions raised out of `step_impl.execute()` propagate to `WorkflowEngine.execute()`, where the catch-all logs `workflow_failed` and re-raises — the step result is never recorded, and the flag is never consulted. Audited the whole PR diff for the same overclaim: 1. workflows/README.md — main fix. Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" and promoted the parenthetical structural-validation note into the Notes block. Added a new "Scope: returned failures only" note that names the exception path explicitly and tells step authors how to bring the flag into scope for exceptional code (catch internally and return FAILED with the failure encoded in `output`). 2. tests/test_workflows.py — section comment used "when an executable step fails", same ambiguity. Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope. 3. src/specify_cli/workflows/engine.py — already correct ("any step that returns FAILED" in the validator comment; "lets the pipeline route around the failure" in the execute path). No change. Engine semantics and test bodies are unchanged. Docs-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify on_reject:skip semantics — engine returns COMPLETED, not auto-skip Copilot finding on `b8982a7`: The README example's gate message said "reject to skip the rest of this branch", and the explanatory paragraph claimed [approve, reject] map to "continue" vs "skip the rest of this branch". The engine does not implement automatic branch-skipping. `on_reject: skip` returns `StepStatus.COMPLETED` (gate/__init__.py:65-66); the next sibling step runs unconditionally unless the author wires a downstream `if` reading `{{ steps.<gate-id>.output.choice }}`. Two fixes: 1. Restructured the YAML example so it actually demonstrates the manual-branching pattern: added a `recover` if-step after the gate that conditions on `steps.review.output.choice == 'approve'`. Now the example shows the real workflow author's responsibility instead of implying the engine does it. 2. Replaced the trailing paragraph with three precise notes: - both gate options return COMPLETED; `on_reject: skip` controls abort behaviour only, not sibling-skipping - all three `on_reject` values enumerated with their actual engine semantics (FAILED+aborted / COMPLETED / PAUSED) - the original retry-loop guidance retained as the third bullet Updated the gate message in the example to match — "reject to leave the failure recorded and move on" instead of "reject to skip the rest of this branch". Audited the whole PR diff for the same overclaim: no other instance. Engine semantics, validation, and test bodies are unchanged. Docs-only. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): clarify gate's role — surfaces, doesn't programmatically branch Audit follow-up to `393ac6b` — three sites repeated the same minor overclaim about gates being one of the "branch on it" step types alongside `if` and `switch`: 1. workflows/README.md (the "downstream `if`, `switch`, or `gate` steps can branch on it" sentence introducing the example) 2. engine.py:236 (validator inline comment) 3. engine.py:657 (execute-path inline comment) A `gate` step does not have a `condition` or `expression` field — it only evaluates expressions for `message` and `show_file` (gate/__init__.py:29,36). Programmatic branching happens in `if`/`switch`; a gate surfaces the value to a human operator via message interpolation, and the operator's choice is recorded in `output.choice` for a subsequent `if`/`switch` to route on. Reworded all three sites consistently: "a downstream `if` or `switch` can branch on it (or a `gate` can surface it to the operator via message interpolation)". The README example already demonstrates this distinction — the gate carries `{{ }}` template variables in its message and the `recover` if-step downstream is what actually branches on the choice. Engine semantics, validation, and test bodies are unchanged. Docs-only on the README; comment-only on engine.py. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(workflows): use qualified StepStatus.* instead of bare FAILED/COMPLETED/PAUSED Three Copilot inline comments on workflows/README.md lines 226, 282, 288 flagged that ``StepResult(status=FAILED, ...)`` is not valid Python — ``StepResult.status`` is a ``StepStatus`` enum value, so the documented form should be ``StepStatus.FAILED``. Audited the whole PR diff for the same shorthand. The bare unqualified form appears in three files added/modified by this PR: 1. workflows/README.md (6 sites) — three ``StepResult(status=FAILED, ...)`` parentheticals, plus the on_reject Notes bullet listing the three step statuses (``FAILED``, ``COMPLETED``, ``PAUSED``). 2. tests/test_workflows.py (4 sites) — section header for TestContinueOnError, two test-method docstrings, one inline comment about a gate's TTY-fallback behaviour. 3. src/specify_cli/workflows/engine.py (1 site) — the validator inline comment added in `d0b9e00` said "returns FAILED" where the engine code itself uses ``StepStatus.FAILED``. All 11 sites normalised to the qualified ``StepStatus.<name>`` form so the docs / test docstrings / inline comments match what readers will actually find in the engine code and the tests. Engine semantics, validation, and test bodies are unchanged. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-02 10:10:07 -05:00
Eldar Shlomi	39921ddd3b	fix(shared-infra): record skipped files in speckit.manifest.json (#2483 ) * fix(shared-infra): record skipped files in speckit.manifest.json `install_shared_infra` skipped files that already existed on disk when `force=False`, but the skip branches in both the scripts loop and the templates loop only appended to `skipped_files` without calling `manifest.record_existing`. So when the function ran with a fresh manifest against an already-populated `.specify/` tree (e.g. after the manifest was deleted, corrupted, or extracted out of band), every file went down the skip path, `planned_copies` / `planned_templates` stayed empty, and `manifest.save()` wrote an empty `files` field — leaving the integration believing nothing was installed. Record every skipped file in the manifest, but only when it is not already tracked. This preserves the original hash for files that were previously recorded so `check_modified()` (used by `integration use` to decide whether a user has customized a template) keeps working correctly. Add `TestSpeckitManifestRecordsSkippedFiles` in `tests/integrations/test_integration_claude.py` covering both the fresh-skip path and the recover-after-lost-manifest path. Fixes #2107 * fix(shared-infra): guard manifest.record_existing against non-file dst Address Copilot review feedback on PR #2483. The previous fix called ``manifest.record_existing(rel_skip)`` from the skip branch of both loops in ``install_shared_infra``, which would crash with ``IsADirectoryError`` (or another ``OSError``) if a directory or other non-regular-file happened to exist at the expected destination path — since ``record_existing`` opens the file to compute its SHA-256. Three coordinated fixes: 1. ``IntegrationManifest.record_existing`` now validates its precondition: it raises ``ValueError`` if the path is a symlink or is not a regular file. The docstring already promised "an already-existing file"; this enforces it. The symlink check runs on the un-resolved path because ``_validate_rel_path`` calls ``resolve()``, which would silently follow the symlink. Mirrors the existing ``_ensure_safe_manifest_destination`` precedent in the same module. 2. In ``install_shared_infra``'s scripts and templates skip branches, guard the ``record_existing`` call with ``dst.is_file()`` and wrap it in ``try/except (OSError, ValueError)``. A directory collision, permission error, or TOCTOU race no longer aborts the whole install — the user gets a per-path warning, the path still surfaces in ``skipped_files``, and the rest of the install continues. 3. ``_read_manifest_files`` in the regression test no longer falls back to ``data.get("_files")`` (Copilot's low-confidence finding): the silent fallback could mask a schema regression where the public ``files`` key is renamed. It now asserts ``"files" in data`` and that the value is a dict. Add two regression tests in ``TestSpeckitManifestRecordsSkippedFiles`` covering the directory-at-destination edge case for both the scripts loop and the templates loop. Both verify (a) install does not crash, (b) the non-file path is not recorded in the manifest, and (c) the path still surfaces in the user-visible warning. The "shared infrastructure file(s)" warning text is changed to "path(s)" so it remains accurate when non-file entries appear in the list. Refs #2107 * fix(manifest): lexical pre-check for record_existing + add error-case tests Address Copilot review (2026-05-11, review id 4266902103): 1. `record_existing` was calling `(self.project_root / rel).is_symlink()` BEFORE validating containment. For absolute paths or paths containing `..`, this performed a filesystem stat outside the project root before `_validate_rel_path()` raised. Add a cheap lexical pre-check that delegates to `_validate_rel_path()` for the canonical error messages, so the symlink stat only ever runs on paths that are already lexically inside the project root. 2. Add focused unit tests in `tests/integrations/test_manifest.py` for the symlink and non-regular-file error paths, including: - symlink target rejection - dangling symlink rejection (caught by the symlink guard before the is_file check) - directory path rejection (is_file == False) - missing-path rejection (is_file == False) - absolute-path lexical pre-check The Copilot reviewer noted these guards had no focused coverage in `test_manifest.py`, only via the `test_integration_claude.py` regression test. 3. The third Copilot finding (repeated `dict(self._files)` copies via `manifest.files` in the skip branches) is already resolved on this branch by using `prior_hashes` — the function-scope snapshot taken at the top of `install_shared_infra` — for the membership check, instead of `manifest.files`. AI disclosure: drafted with assistance from Claude (Opus 4.7). * fix(manifest): track recovered files separately + symlink-ancestor + canonical-path guards Address Copilot review id 4309888722 (2026-05-18) on PR #2483: 1. Recovery semantics (shared_infra.py:371, 412) — install_shared_infra now passes ``recovered=True`` when re-recording a skipped existing file. This flag funnels into a new ``recovered_files`` array in the manifest JSON, so a future ``refresh_managed`` run can distinguish "hash I produced" from "hash I observed on a file that may be a user customization" and avoid silent overwrite without ``--refresh-shared-infra``. Schema is purely additive: ``files: dict[str, str]`` is unchanged; the new ``recovered_files: list[str]`` is omitted when empty. 2. Symlinked ancestor (manifest.py:172) — ``record_existing`` now walks every component of the rel path and rejects any symlinked ancestor, not just a symlinked leaf. Catches ``linked_dir/file.txt`` where ``linked_dir`` is a symlink, which previously slipped past the leaf-only ``is_symlink()`` check and was resolved through by ``_validate_rel_path``. Mirrors the component-walk pattern in ``_ensure_safe_manifest_directory``. 3. Misleading "escapes project root" message (manifest.py:168) — paths like ``dir/../file.txt`` normalize inside the project, so the old message lied about what was wrong. New message: "Manifest paths must be canonical; '..' segments are not allowed". Still rejects (canonical keys are required so ``check_modified``/``uninstall`` cannot key the same file under two paths). Tests: 7 new test methods across TestManifestRecoveredFiles and TestRecordExistingNewGuards covering all 4 Copilot findings. Full suite passes locally. 🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7). * fix(manifest): normalize is_recovered input through _validate_rel_path Address Copilot review comment id 4309888722 round-5 (2026-05-21) on PR #2483: ``is_recovered()`` previously checked ``self._recovered_files`` membership with bare ``Path(rel).as_posix()``, while ``record_existing()`` stores keys via ``_validate_rel_path(rel, root).relative_to(root).as_posix()``. The two normalizations disagreed on absolute paths and paths that escape the project root — ``is_recovered`` would silently return False for inputs that ``record_existing`` would have refused entirely. The fix routes ``is_recovered`` through the same ``_validate_rel_path`` pipeline; ``ValueError`` from the validator is caught and converted to False so query semantics stay exception-free (Python ``__contains__`` convention). Tests: 2 new methods in ``TestManifestRecoveredFiles``: - ``test_is_recovered_absolute_path_returns_false`` - ``test_is_recovered_escaping_path_returns_false`` 🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7). * fix(manifest): clear recovered marker on managed re-record + reject '..' in is_recovered Address Copilot Round-7 review comments on PR #2483: 1. record_existing(recovered=False) and record_file now BOTH discard the path from _recovered_files. The marker is meant to flag "we observed this file but cannot vouch it's a managed baseline" — once the same path is re-recorded as managed (either explicitly or by writing fresh bytes), the marker is stale and must clear so refresh_managed and future is_recovered queries return the truthful answer. 2. is_recovered now applies the same canonical-key guard as record_existing (rejects absolute paths and '..' segments lexically before delegating to _validate_rel_path). Such paths can never be stored keys, so the query correctly returns False without depending on _validate_rel_path semantics that diverged from record_existing's stricter contract. record_file docstring updated to mention the side-effect on recovered markers. Tests: 3 new methods in TestManifestRecoveredFiles covering record_existing(false) clearing, record_file clearing, and is_recovered dotdot rejection. * test(manifest): update is_recovered comments to reflect Round-7 lexical guard Round 8 — addresses Copilot review comment on tests/integrations/test_manifest.py:362. After Round-7 (`1dbf0c2`), is_recovered() rejects absolute paths and '..' segments up front via a lexical guard, returning False without calling _validate_rel_path at all. The test comments still described the prior "_validate_rel_path raises; we catch" code path, which is misleading for readers. Updated comments in both: - test_is_recovered_absolute_path_returns_false (Copilot's exact target) - test_is_recovered_escaping_path_returns_false (same comment-class issue; fixed preemptively to avoid a Round-9 finding on the same drift) Pure documentation change. Test assertions and behavior unchanged; all manifest tests still green. * fix(manifest): document OS errors on record_existing + filter orphan recovered_files on load Round 9 — addresses Copilot review on PR #2483: 1. record_existing's docstring now documents OSError/PermissionError as possible raises (in addition to ValueError) — the implementation has always been able to raise them from is_symlink, is_file, or the file-read used to hash, but the contract did not reflect that. Callers should be prepared for both surfaces. 2. load() now filters recovered_files entries that don't correspond to keys in files. An externally-edited or partially-corrupted manifest can deserialize with orphan recovered paths; rather than reject the whole manifest (too strict on the upgrade path), we drop the orphans and let the inconsistency self-correct on the next save(). is_recovered then returns the truthful False for the orphan. Tests: new test_load_filters_recovered_files_not_in_files asserting an orphan recovered entry is dropped on load.	2026-06-02 08:06:31 -05:00
Quratulain-bilal	442a581358	fix(cli): pin UTF-8 encoding on init-options and .extensionignore I/O (#2686 ) * fix(cli): pin UTF-8 encoding on init-options and .extensionignore I/O ``Path.read_text`` / ``Path.write_text`` default to the system locale codec, which is cp1252 / gb2312 / cp932 on Windows. Two user-facing file paths in spec-kit were calling them without an explicit ``encoding=`` argument: - ``src/specify_cli/__init__.py:400,412`` — ``save_init_options`` / ``load_init_options`` for ``.specify/init-options.json``. A peer machine with a different default locale (or a UTF-8 Unix CI runner reading a file written on a cp1252 Windows host) cannot decode the file, raising ``UnicodeDecodeError``. ``UnicodeDecodeError`` is a subclass of ``ValueError`` — not ``OSError`` / ``json.JSONDecodeError`` — so the existing fall-back ``except`` tuple in ``load_init_options`` also misses it and the error propagates raw to the CLI. - ``src/specify_cli/extensions.py:764`` — ``.extensionignore`` pattern reader. The very next line already normalises backslashes "so Windows-authored files work", proving the codebase expects Windows authors to write this file. Multibyte UTF-8 patterns (Chinese filenames, accented directory names) silently mojibake when the host locale is not UTF-8, so the patterns fail to match and unintended files are shipped with the extension. The sibling integration-catalog reader at ``src/specify_cli/integrations/catalog.py:150,156,193,202,374`` already pins ``encoding="utf-8"`` everywhere. PR #2280 fixed the symmetric PowerShell-template BOM bug. This change brings the two remaining drifted paths in line with that precedent. Regression tests: - ``tests/test_presets.py::TestInitOptions`` — parametrized non-ASCII round-trip (CJK, Latin-1, Greek, emoji) plus a corrupted-file case that asserts the existing "fall back to {}" contract still holds when a peer file contains bytes invalid as UTF-8. - ``tests/test_extensions.py::TestExtensionIgnore`` — Japanese (``ドキュメント/``) and Latin-1 (``café/``) ignore patterns correctly exclude their directories during install. * fix(cli): wrap .extensionignore decode error and tighten UTF-8 contract Addresses Copilot review feedback on this PR. Three issues, three fixes: 1. ``save_init_options`` now writes JSON with ``ensure_ascii=False``. Without that flag, ``json.dumps`` emits ASCII-only ``\uXXXX`` escapes, which means the ``encoding="utf-8"`` pin on the surrounding ``Path.write_text`` makes no observable difference for any value we currently write. Flipping ``ensure_ascii`` makes the non-ASCII bytes hit the file directly, so the encoding pin becomes the thing that decides between cp1252 garbage and clean UTF-8 on Windows. The comment above the call now describes the real reason instead of the previously-misleading rationale Copilot flagged. 2. ``test_save_load_round_trip_preserves_non_ascii`` was a no-op under the old ``ensure_ascii=True`` writer (Copilot's second comment). Added ``test_save_writes_real_utf8_bytes`` that asserts the on-disk bytes contain the UTF-8 encoding of ``café`` (``0xC3 0xA9``), not its JSON escape form ``é``. Removing either ``ensure_ascii=False`` or ``encoding="utf-8"`` from the writer now breaks this test — the contract is pinned. 3. ``.extensionignore`` reader wraps ``UnicodeDecodeError`` as ``ValidationError`` with a pointer to the offending byte (Copilot's third comment). Mirrors ``ExtensionManifest._load_yaml``'s existing handler for ``extension.yml``. Adds ``test_extensionignore_invalid_utf8_raises_validation_error`` asserting installation aborts with the wrapped error instead of a raw Python traceback.	2026-06-02 07:19:11 -05:00
WOLIKIMCHENG	14da893e4f	fix(copilot): resolve active spec template (#2765 ) Co-authored-by: root <kinsonnee@gmail.com>	2026-06-01 14:49:02 -05:00

1 2 3 4

197 Commits