352 Commits

Author SHA1 Message Date
WOLIKIMCHENG
d982c2f67f feat(copilot): warn before skills default rollout (#3256)
* feat(copilot): default to skills mode

* feat(copilot): warn before skills default rollout

* Make Copilot skills warning test less brittle

---------

Co-authored-by: root <kinsonnee@gmail.com>
2026-07-01 11:35:53 -05:00
Ali jawwad
2d56dfd73d fix(integrations): cline hook note collapses onto instruction at EOF (#3263)
The hook-note injection regex matches the line terminator via
(\r\n|\n|$), so the captured eol group is empty when the instruction
is the final line of a file with no trailing newline. The cline
integration emitted the note with that empty eol, mashing the note text
and the instruction onto a single line. Default eol to '\n', matching
the agy integration twin which already guards this case.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 15:44:49 -05:00
darion-yaphet
810d6fcfe1 refactor: move workflow command handlers to workflows/_commands.py (PR-8/8) (#3159)
* refactor: move workflow command handlers to workflows/_commands.py (PR-8/8)

Final PR of the __init__.py split. Moves the workflow command group out of
__init__.py into the existing workflows/ package, completing the domain-dir
layout established in PR-5 (integrations), PR-6 (presets) and PR-7
(extensions).

- New workflows/_commands.py holds the four Typer apps (workflow / catalog /
  step / step-catalog), all 25 command handlers, the six workflow-only
  helpers (_parse_input_values, _workflow_run_payload, _emit_workflow_json,
  _stdout_to_stderr_when, _validate_step_id_or_exit,
  _resolve_steps_base_dir_or_exit), and a register(app) entry point.
- workflows is already a package, so no rename is needed; intra-package
  imports change from `.workflows.x` to `.x`. The only root-helper dep
  (_require_specify_project) is reached through a call-time shim so test
  monkeypatching of specify_cli._require_specify_project keeps working.
- __init__.py drops ~1445 lines (2066 -> 621); the workflow group is
  re-attached via register(app). Dead `contextlib` import removed.
- tests/test_workflows.py: import the now-relocated _stdout_to_stderr_when
  helper from its new home (workflows._commands) instead of the package root.

No behavior change. Full suite green (3847 passed), ruff clean.

* Prevent workflow state writes through symlinked storage

Workflow commands persist run state under .specify/workflows/runs, so the command-local project shim now rejects symlinked workflow storage before any workflow command proceeds. The standalone YAML path uses the same guard because it intentionally bypasses the normal project requirement while still creating workflow state under the current directory.

Constraint: Local YAML workflow runs do not require an existing .specify project directory but still create .specify/workflows/runs state

Rejected: Guard only .specify in the file-source path | .specify/workflows and runs can independently redirect writes

Confidence: high

Scope-risk: narrow

Directive: Keep workflow storage symlink checks centralized before constructing WorkflowEngine

Tested: .venv/bin/python -m pytest tests/test_workflow_run_without_project.py tests/test_workflows.py::TestWorkflowAddSymlinkGuard -v

Tested: .venv/bin/python -m py_compile src/specify_cli/workflows/_commands.py tests/test_workflow_run_without_project.py tests/test_workflows.py

Not-tested: Ruff lint; ruff is not installed in the repo virtualenv

Assisted-by: OpenAI Codex (model: GPT-5, autonomous)

* fix(workflows): pass github_hosts allowlist to GHES release asset resolver

workflow add resolved GitHub release download URLs without forwarding the
github_provider_hosts() allowlist, so resolve_github_release_asset_api_url
never treated any host as GHES. This regressed GitHub Enterprise Server
release asset resolution and diverged from presets/extensions, which already
pass github_hosts. Forward github_provider_hosts() at both the direct-URL and
catalog install call sites. The allowlist remains the anti-SSRF gate.

* fix(workflows): reject symlinked/traversal <id> dir on workflow install

Local/URL and catalog installs wrote to .specify/workflows/<id>/workflow.yml
without guarding the <id> segment. A pre-planted symlink at <id> or
<id>/workflow.yml let mkdir+copy/download follow it and write outside the
project root; a non-directory <id> made mkdir raise unhandled.

Add _safe_workflow_id_dir() to reject path traversal, symlinked or
non-directory <id>, and a symlinked workflow.yml leaf before any write.
Fold the catalog branch's existing traversal check into the helper.

* fix(workflows): harden _safe_workflow_id_dir output and leaf checks

- Reorder symlink/non-directory check before resolve() so a symlinked
  <id> reports as symlinked instead of misleading "Invalid workflow ID"
- Reject a pre-existing <id>/workflow.yml that is not a file, avoiding an
  unhandled IsADirectoryError on later write/copy2
- Escape workflow_id in Rich output to prevent markup injection; escape
  the repr (not the raw id) so repr-added backslashes cannot re-expose
  brackets, matching extensions/_commands.py hardening
- Add tests for workflow.yml-as-directory and markup-escaped invalid id

* Avoid stale lint failures from config helper imports

Move PyYAML loading into the helpers that read and write agent-context configuration, and replace the broad Any annotation with object. The runtime behavior stays the same while the module no longer exposes top-level imports that can be flagged as unused when CI analyzes a narrower code shape.

* Prevent workflow commands from targeting reserved storage

Workflow install and removal paths are derived from workflow IDs before any catalog download, local copy, or directory deletion. Validate that IDs are single workflow-id path segments and reject names reserved for workflow runtime storage so commands cannot target .specify/workflows/runs or .specify/workflows/steps.
2026-06-30 11:03:54 -05:00
Ben Buttigieg
36501d459f chore: retire Roo Code integration — extension shut down (#3167) (#3212)
* chore: retire roo integration — extension shut down (#3167)

Remove the Roo Code integration after the extension was shut down: subpackage,
registry entry, catalog entry, docs, tests, and issue-template options.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: remove stale Roo Code mention in upgrade guide

Assisted-by: GitHub Copilot (model: gpt-5.3-codex, autonomous)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove leftover Roo Code references after merge

Drop roo from presets/ARCHITECTURE.md example and the agent-context
defaults map; these came in from main and were flagged by review.

Assisted-by: GitHub Copilot (model: claude-opus-4.8, autonomous)

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-30 10:24:04 -05:00
Ali jawwad
c5ac90b245 fix(bundle): allow 'catalog remove' by the same relative path used to add (#3242)
* fix(bundle): allow 'catalog remove' by the same relative path used to add

add_source canonicalizes a local catalog path to an absolute url before persisting it, but remove_source compared only the raw input against the stored id/url. So 'bundle catalog remove ./cat.json' could not undo 'bundle catalog add ./cat.json' -- the stored url was absolute, the removal target relative, and they never matched ('No project-scoped catalog source found'). Match the canonicalized form too (a no-op for ids and remote urls), so a local source is removable by the same path it was added with.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(bundle): match catalog removal target exactly first, canonical only as fallback

Address Copilot review: canonicalizing the removal target unconditionally could let 'remove <id>' also delete a different source whose url equals that id's canonicalized path (ids are treated as local paths by _canonicalize_url, empty scheme). Try an exact id/url match first; only fall back to a canonicalized-url match when no exact match is found, so relative-path removal still works without collateral deletion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 10:21:53 -05:00
Ali jawwad
3571ba72d8 fix(workflows): reject bool max_iterations in while/do-while validation (#3237)
* fix(workflows): reject bool max_iterations in while/do-while validation

while/do-while validate() checked 'not isinstance(max_iter, int) or max_iter < 1'. Since bool is a subclass of int, isinstance(True, int) is True and True < 1 is False, so 'max_iterations: true' passed validation and then ran as a single iteration (range(True) == range(1)) instead of being reported as a type error. Reject bools explicitly, matching the fail-fast-on-bool handling already used for number inputs and gate options.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: assert empty error list for the valid do-while max_iterations case

Address Copilot review: the accepted-config assertion only checked that no error mentioned 'max_iterations', which could let an unrelated validation error pass unnoticed. For a known-good config, assert the entire error list is empty (consistent with the other validate tests in this file).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 10:17:05 -05:00
Dyan Galih
6fb7e77b3e fix: allow prerelease spec-kit versions in compatibility checks (#2695)
* docs: generate integrations reference from catalog

* refactor: integrate table rendering into specify integration search --markdown

- Remove standalone scripts/generate_integrations_reference.py
- Strip doc injection machinery from catalog_docs.py; keep only table rendering
- Wire render_integrations_table() into existing --markdown flag of integration search
- Remove old simple markdown table block from integration_search (was Name|ID|Version|Description|Author)
- Simplify tests: drop subprocess/doc-path tests, keep table rendering and metadata tests
- Clean up docs/reference/integrations.md: remove generated markers, update note

* fix: address Copilot review feedback on catalog_docs and integration_search

- Warn when --markdown is combined with filters (query/--tag/--author) which are
  silently ignored; catch ValueError/FileNotFoundError and surface clean error
  via console instead of raw traceback (r3244821516)
- Add coverage enforcement in list_integrations_for_docs(): raises ValueError
  with actionable message if any registry key is missing from INTEGRATION_DOC_URLS,
  preventing silently incomplete doc tables (r3244821589)
- Rename test to accurately reflect sources: label derives from registry config,
  URL comes from INTEGRATION_DOC_URLS doc map — not solely from registry (r3244821607)
- Simplify test dict construction to idiomatic dict comprehension (r3244821619)

* fix: add sync test, INTEGRATIONS_REFERENCE_PATH constant, and fix naming

* revert: restore docs/reference/integrations.md to upstream/main; remove sync test (GH Actions job will handle)

* fix: remove dead INTEGRATIONS_REFERENCE_PATH, drop URL-length padding, fix docstring, drop FileNotFoundError

* fix: send --markdown warnings/errors to stderr, rename test for clarity

* fix: detect stale doc-map keys, test _render_cell escaping, strengthen header assertion

* refactor: promote _render_cell to public render_cell function

* test: mock registry and doc maps to avoid brittle live registry coupling

* refactor: flatten patches, remove unused imports, fix trailing whitespace, optimize missing calculation

* refactor: make validation non-fatal, fix context manager syntax, add CLI tests

* fix: improve docstring clarity, test robustness, and exception handling

* fix: improve test assertions, disable warnings by default, enhance exception handling

* fix: make CLI tests deterministic and improve config access resilience

* fix: remove extra blank line, add stale keys validation, add regression test for docs sync

* Fix 5 remaining feedback items:
- Rename _get_mocked_cli_runner() to _get_catalog_docs_patches() for clarity
- Use ExitStack context manager for guaranteed patch cleanup
- Add explicit UTF-8 encoding to file reads
- Skip doc sync test gracefully when docs aren't present
- Remove exception chaining from typer.Exit to avoid noisy tracebacks

* address all outstanding copilot review feedback on PR 2563

* Address Copilot feedback: escape URLs in markdown links, deduplicate cell rendering, fix table parser for escaped pipes

* Address 3 new Copilot feedback: add URL escaping test, fix parse_first_markdown_table for escaped pipes, guard community tests with skip

* Address 3 new Copilot feedback: escape id field, remove unused alias, escape integration URLs

* Address 3 new Copilot feedback: fix comment name, include all integrations in list

* Fix architectural issue: escape raw fields before composing Markdown to prevent double-escaping

* Deduplicate _escape_url_for_markdown_link and add URL escaping test

* Address 4 new Copilot feedback: add trailing newline, fix test helper ExitStack, update warning message

* Address 4 new Copilot feedback: make escape function public, fix error message, validate test rows, prevent double newline

* Update error message in test_missing_catalog_file for clarity

* Remove obsolete integrations sync test

* keep integrations docs in sync

* fix: allow prerelease spec-kit versions in compatibility checks

Allow prerelease/dev builds to satisfy extension and preset compatibility
checks when their version number falls within the required specifier range.
Also harden the integrations docs rendering helpers and add regression
coverage for the markdown table parsing and version gating paths.

Tests: pytest -q; python3 -m compileall -q .; black/flake8 unavailable
Reference: branch 002-generate-integrations-docs; source patch /tmp/spec-kit-changes.patch

* fix: isolate prerelease compatibility gate changes

Keep the prerelease/version compatibility fix on its own branch and remove
the unrelated integrations docs updates that belong with PR 2563.

Tests: full suite passed on the prerelease branch before splitting; docs branch covered by targeted docs tests
Reference: upstream/main; source patch /tmp/spec-kit-changes.patch

* Address PR 2695 feedback: Centralize prerelease policy and add boundary test

* Address remaining Copilot PR feedback: revert docs and add preset prerelease tests

* Remove unreachable raise CompatibilityError

* Fix PEP8 E302 and E303 formatting issues
2026-06-30 09:41:57 -05:00
Ben Buttigieg
c47dd2b812 chore: retire Windsurf integration — absorbed into Cognition Devin (#3168) (#3213)
* chore: retire windsurf integration — absorbed into Cognition Devin (#3168)

windsurf.com now permanently redirects to devin.ai/desktop following
acquisition. Remove subpackage, registry/catalog entries, docs, and tests;
re-point sample-agent test fixtures to Kilo Code.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: remove stale Windsurf support references

Assisted-by: GitHub Copilot (model: gpt-5.3-codex, autonomous)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: fix Kilo Code command path in upgrade guide

Assisted-by: GitHub Copilot (model: gpt-5.3-codex, autonomous)

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

* chore: align integration lists after rebase

Assisted-by: GitHub Copilot (model: gpt-5.3-codex, autonomous)

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

* docs: align kilocode example with runtime behavior

Assisted-by: GitHub Copilot (model: gpt-5.3-codex, autonomous)

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-30 08:49:49 -05:00
Huy Do
20f430686c feat(workflows): honor max_concurrency in fan-out via a bounded thread pool (#3224)
* feat(workflows): honor max_concurrency in fan-out via a bounded thread pool

* feat(workflows): address review — sliding-window fan-out, locked output, faithful halt

Address the reviewer feedback on the bounded fan-out concurrency:

- Sliding submission window: keep at most `workers` items in flight and stop
  launching new items once the run is halting, instead of submitting all items
  up front (which let the pool keep starting queued work after a halt).
- Faithful halt prefix: attribute a halt to the specific item whose own
  recorded result halted the run (replaying the sequential break condition,
  honoring continue_on_error/aborted), not the shared run status a later
  concurrent item may have flipped. The returned prefix now includes the actual
  halting item, matching the sequential path. An item that fails before
  recording a result (e.g. an unknown step type) is attributed too, since every
  item runs the same template.
- Lock the parent fan-out output mutation: route the post-fan-out
  step_results[...]['output'] update through a new RunState.set_step_output()
  under the run lock, so it cannot race a concurrent save().
- Docstring: describe int() coercion accurately (numeric strings / floats are
  honored; only non-coercible or <= 1 runs sequentially).

Tests: add concurrent halt-includes-halting-item, continue_on_error-does-not-
truncate, and unknown-template-type-matches-sequential coverage; make the
timing test use a monotonic clock with a looser threshold to avoid CI flakiness.

* feat(workflows): address second review pass — concurrency hardening

- append_log: serialize the log_entries append + log.jsonl write under a
  dedicated RunState._log_lock so concurrent fan-out workers can't interleave
  or corrupt log lines (kept separate from the state lock; never nested).
- _run_fan_out.run_item: read the item output back through the item_ctx it
  executed against rather than the outer context closure — clearer and robust
  if StepContext ever stops sharing the steps dict by reference.
- StepBase: document the thread-safety contract — STEP_REGISTRY holds one shared
  instance per type, so concurrent fan-out invokes execute() on the same object;
  implementations must be stateless/thread-safe (the built-ins already are).
- test_concurrency_is_real: prove parallelism deterministically with a
  threading.Barrier (sequential execution can't clear it) instead of a
  wall-clock timing assertion.

* feat(workflows): address review — stamp updated_at under lock, clarify cancel semantics

- RunState.save(): move the updated_at timestamp assignment inside the run lock
  so the timestamp matches the snapshot the thread serializes and concurrent
  savers don't race on it.
- _run_fan_out docstring: clarify that on a halt only not-yet-started items are
  cancelled; items already running finish but their outputs are ignored
  (Future.cancel() can't stop running work, and the pool joins on exit).

* feat(workflows): serialize on_step_start callback under a lock

The concurrent fan-out path invokes _execute_steps from worker threads, which
calls the engine's on_step_start callback (the CLI sets it to a console.print
lambda). Concurrent invocation could interleave/garble progress output. Guard
the call with a WorkflowEngine._callback_lock so callbacks are serialized;
the lock is uncontended for sequential runs.

* feat(workflows): re-raise worker exceptions in-place to preserve traceback

In _run_fan_out's concurrent path, a worker exception was stashed in first_exc
and re-raised after the loop. Re-raise it from within the except block with a
bare `raise` (after cancelling outstanding futures) so the original traceback is
preserved, and drop the now-unneeded first_exc variable. The ThreadPoolExecutor
__exit__ still joins any already-running workers before the exception escapes.

* feat(workflows): lock final fan-out status, drop redundant output write, bound workers

Address third review pass:

- Remove the unlocked `context.steps[step_id]["output"] = …` writes in the
  fan-out parent update. context.steps[step_id] is the same dict object that
  set_step_output() updates under the run lock, so the direct (unsynchronized)
  mutation was redundant.
- Preserve sequential halt semantics under concurrency: a later in-flight item
  could overwrite state.status after the halting item was identified. _run_fan_out
  now derives the halting item's run status (item_halt_status, replacing the bool
  item_halted) and restores it after the pool joins, so the final status is the
  first halting item's outcome.
- Bound the pool: workers = min(max_concurrency, len(items)) and early-return for
  empty items, so a user-controlled max_concurrency can't over-allocate threads.

Add coverage that an earlier PAUSED item's status wins over a later concurrent
FAILED item.

* feat(workflows): avoid unlocked context.steps writes when it aliases step_results

On a resume run, StepContext is built with steps=state.step_results, so the two
direct `context.steps[...] = ...` writes mutated the shared dict outside the run
lock and could race save(). Route both through a new _record_result helper that
mirrors into context.steps only when it is a distinct object (a fresh run) and
otherwise relies solely on record_step_result's locked write.
2026-06-30 08:23:27 -05:00
Ben Buttigieg
28a38af6c1 chore: retire iflow integration — product discontinued (#3166) (#3211)
Remove the iFlow CLI integration whose product was shut down: subpackage,
registry entry, catalog entry, docs, tests, and issue-template options.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-30 07:30:52 -05:00
Noor ul ain
cb7c36c95b fix: reject host-less catalog URLs in base and preset validators (#3209) (#3227)
`CatalogStackBase._validate_catalog_url` (inherited by `IntegrationCatalog`)
and `PresetCatalog._validate_catalog_url` checked `parsed.netloc`, which is
truthy for host-less URLs like `https://:8080` (port only) or `https://user@`
(userinfo only). Such URLs slipped past validation despite the error message
promising "a valid URL with a host", then failed later with a confusing fetch
error.

Switch both validators to `parsed.hostname` (None for those inputs), matching
the workflow, step, and bundler catalog validators that already do this.

Add regression tests covering port-only and userinfo-only URLs for both
validators.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 07:18:39 -05:00
Ben Buttigieg
b1bd9180ca fix(goose): repoint install_url and docs to goose-docs.ai (#3171) (#3215)
* fix(goose): repoint install_url and docs to goose-docs.ai (#3171)

Goose moved to the Agentic AI Foundation; docs moved from block.github.io/goose
to goose-docs.ai. Update install_url and the docs reference link.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(goose): restore table column alignment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:43:06 -05:00
Ali jawwad
c5fb3dc86f fix(bundle): send command errors to stderr so --json stdout stays parseable (#3235)
The bundle command group's _fail() helper is documented as printing 'to stderr', and the module contract is 'human logs go to stderr/console' while --json 'emits machine-readable data on stdout'. But it called console.print(), and the shared console writes to STDOUT, so every bundle error (every command routes through _fail) landed on stdout -- corrupting the JSON stream that --json consumers parse.

Add a stderr-bound err_console to _console.py (its documented role as the single Console source) and use it in _fail. stdout now carries only the JSON payload.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 15:46:56 -05:00
Manfred Riem
53d9543355 feat: make agent-context extension a full opt-in (#3097)
* docs: add Spec Kit spec for agent-context full opt-in

Use Spec Kit's own specify workflow to author the spec that makes the
agent-context extension a full opt-in, removing all agent-context
configuration/support from the Python codebase and removing the
deprecation message. Force-added despite specs/ being gitignored; the
generated artifact will be purged prior to merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add Spec Kit plan artifacts for agent-context full opt-in

Phase 0/1 of the SDD plan workflow: plan.md, research.md, data-model.md,
quickstart.md, and contracts/cli-behavior.md. Constitution Check is a
documented no-op (repo has no ratified constitution). Force-added despite
specs/ being gitignored; generated artifacts will be purged prior to merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: correct Constitution Check against ratified v1.0.0

Earlier draft wrongly treated the gate as a no-op; the fork's main is 16
commits behind upstream/main, which carries .specify/memory/constitution.md.
Re-evaluate the feature against Principles I-V (all PASS) and note that
Principle I mandates keeping context_file as a declared class attribute,
validating the R1 metadata decision.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: refresh plan artifacts against synced upstream/main

After syncing fork main to upstream and rebasing, re-scan the current
agent-context surface. Upstream generalized the single context_file into a
plural context_files concept with new resolver helpers
(_resolve_context_files, _resolve_context_file_values,
_format_context_file_values) and upsert/remove now loop over multiple
files. Update research.md, data-model.md, contracts, quickstart grep
guards, and the plan summary to cover the expanded removal scope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add Spec Kit tasks for agent-context full opt-in

Phase 2 of SDD: dependency-ordered tasks.md (30 tasks) organized by the
three user stories, with mandatory test tasks (Constitution Principle II)
and a foundational phase decoupling __CONTEXT_FILE__ resolution from the
extension config. Includes the extension self-seeding task (T015) and a
static guard test (T002) enforcing zero agent-context references in the CLI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat!: remove agent-context lifecycle from the Specify CLI

Make the agent-context extension a full opt-in. The CLI no longer
installs the extension during init, writes agent-context-config.yml,
or creates/updates/removes the managed Spec Kit section in agent
context files. Context-section upsert/remove, marker resolution,
extension-enabled gating, the config helpers, and the obsolete inline
deprecation warning are all removed. Integration context_file stays as
inert metadata; __CONTEXT_FILE__ now resolves from registry metadata.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(agent-context): self-seed context file from the active integration

When agent-context-config.yml has no context_file/context_files, the
bundled bash and PowerShell update scripts now resolve the context file
from the active integration in .specify/init-options.json via the
integration registry, so the extension no longer depends on the CLI
writing its config.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test+docs: update suite and docs for agent-context opt-in

Update integration/extension tests to expect no agent-context install,
config, or context-section writes during init. Add a static guard test
(test_agent_context_cli_free.py) asserting the CLI source is free of
agent-context lifecycle symbols, plus backward-compatibility tests for
legacy projects. Refresh AGENTS.md, the extension README, and add a
CHANGELOG entry describing the opt-in behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(agent-context): warn on self-seed failure, correct docs, speed up guard test

Address PR review feedback:
- Self-seed scripts (bash + PowerShell) now emit an actionable warning when
  an active integration is configured but specify_cli cannot be imported by
  the chosen Python (e.g. pipx installs), or when the integration declares no
  context file, instead of silently falling through to 'nothing to do'.
- Correct the extension README disable note: command rendering never reads the
  extension config; __CONTEXT_FILE__ is always substituted from integration
  metadata, so a stale context_files value cannot affect rendering.
- Cache CLI source reads in the static guard test via a module-scoped fixture
  so the directory walk happens once instead of once per forbidden symbol.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(agent-context): ship self-owned per-agent context-file defaults

The extension now bundles agent-context-defaults.json (key→context_file
map) and self-seeds from it, dropping any dependency on the Specify CLI
registry. Both the bash and PowerShell update scripts read the bundled
JSON map keyed by the active integration from init-options.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat!: remove all agent-context state from the Specify CLI

Strip every context_file reference from the CLI: the field on all 35
integration classes, the IntegrationBase plumbing (process_template
param/step, _context_file_display, docstrings), the __CONTEXT_FILE__
resolution in agents.py, the legacy context_file/context_markers
popping in _helpers.py, and the context_file template in
integration_scaffold.py. Also drop the Agent context update step and
__CONTEXT_FILE__ placeholder from templates/commands/plan.md.

The agent-context extension now solely owns all context-file knowledge,
including the per-agent default mapping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: drop context_file coverage and guard against CLI reintroduction

Remove CONTEXT_FILE attrs and context_file assertions across the base
mixins, all 35 per-integration test files, shared integration tests, and
conftest stubs. Rewrite the base-mixin context tests to assert no managed
section is written and no __CONTEXT_FILE__ placeholder survives. Extend
the CLI-free static guard to forbid context_file, __CONTEXT_FILE__, and
_context_file_display in src/specify_cli, and have the extension tests
copy the bundled defaults JSON so self-seed runs without the CLI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: reflect full removal of agent-context state from the CLI

Update AGENTS.md (integration examples, required-fields table, context
behavior section, pitfalls), CHANGELOG, and the SDD spec artifacts
(FR-007, SC-002, data-model) to state that the CLI carries no
context_file and the extension fully owns the per-agent default mapping
via agent-context-defaults.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: align SDD artifacts with full context_file removal

Update research.md (R1, R2, R4, summary table), contracts/cli-behavior.md
(C3, C5), tasks.md (Phase 2, T026, notes), plan.md (Principle I, source
map), and checklists/requirements.md so the spec artifacts reflect the
implemented decision: the CLI carries no context_file attribute or
__CONTEXT_FILE__ resolution, and the per-agent defaults map lives in the
extension. Resolves PR review #4548130110.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub stale context-file mentions from CLI docstrings

Update the multi_install_safe docstring (drop the removed "context file"
invariant), the RovoDev setup docstring (no longer upserts a context
section), the Copilot module docstring (drop the context-file line), and
tighten the _update_init_options_for_integration note. Pure docstring
changes — no behavioral impact. Resolves PR review #4548237085.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test+docs: harden agent-context test helper and fix stale docs

- base.py: document multi_install_safe as an optional subclass attribute
  in the IntegrationBase docstring.
- test_cli.py: clarify the init-options assertion is guarding against
  leftover legacy agent-context keys, not relocation.
- test_extension_agent_context.py: _install_agent_context_config now
  asserts the bundled agent-context-defaults.json exists and always
  copies it, so self-seeding tests fail loudly instead of silently
  skipping when the map is missing.
- test_integration_cursor_agent.py: drop Path/IntegrationManifest imports
  left unused after removing the context-section frontmatter tests.

Resolves PR review #4548293116.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove gitignored SDD artifacts from specs/

The specs/001-agent-context-full-optin/ artifacts were force-added for
dogfooding visibility, but specs/ is gitignored and these were always
intended to be purged before merge. Remove them so merging does not add
an intentionally-untracked directory to repo history.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: keep CHANGELOG.md identical to upstream

CHANGELOG.md is auto-generated at release time, so the branch should not
carry a manual entry. Restore it to match upstream/main exactly.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: preserve Cursor .mdc frontmatter in agent-context updater scripts

The bundled agent-context updater scripts wrote the managed section as
plain text. For Cursor-style `.mdc` targets this dropped the required
`---\nalwaysApply: true\n---` frontmatter, reintroducing the rule-loading
bug originally fixed in #1699. Port the `_ensure_mdc_frontmatter` logic
into both the bash and PowerShell updaters: prepend frontmatter when
missing, repair `alwaysApply` when set to the wrong value, and leave
non-`.mdc` targets untouched. Add regression tests covering both shells.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: scope CLI-free guard to agent-context-specific symbols

Drop the bare "context_file" substring from FORBIDDEN_SYMBOLS so the
guard no longer fails on unrelated future CLI fields named context_file.
The list still covers agent-context-specific identifiers (__CONTEXT_FILE__,
_context_file_display, _resolve_context_files, _resolve_context_file_values).

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: harden agent-context bash self-seed against malformed init JSON

Two robustness fixes in the embedded Python self-seed logic:
- Coerce the integration value from init-options.json to a string only when
  it is actually a string; otherwise treat it as unset so a corrupted
  dict/list value degrades to the existing nothing-to-do behavior instead of
  breaking the agents-map lookup.
- Normalize agent-context-defaults.json: only use 'agents' when both the JSON
  root and the 'agents' value are dicts, so a wrong-shaped (but valid) JSON
  falls back to the warning path instead of raising on .get.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: correct PowerShell hyphenated key lookup and regex replace count

- Self-seed now reads the defaults mapping via
  $defaults.agents.PSObject.Properties[$integrationKey].Value instead of
  member access ($defaults.agents.$integrationKey), which parsed hyphenated
  keys like 'cursor-agent'/'kiro-cli' as subtraction and failed to resolve.
- Replace the static [regex]::Replace(..., 1) call, whose trailing 1 was
  interpreted as RegexOptions.IgnoreCase rather than a replacement count, with
  an instance Regex whose Replace(input, replacement, 1) limits to the first
  match as intended.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make bash .mdc frontmatter guard case-insensitive

The bash updater only injected Cursor .mdc frontmatter when ctx_path ended
in lowercase '.mdc', so a mixed/upper-case extension (e.g. specify-rules.MDC)
was skipped and Cursor would not auto-load the rule file. Compare against the
casefolded path. The PowerShell variant already uses -match, which is
case-insensitive by default, so no change is needed there.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: document separator-agnostic agent-context update invocation

The README hard-coded the dot-notation slash command
(/speckit.agent-context.update), which hyphen-separator agents like Forge and
Cline do not recognize. Document the canonical command ID plus both slash
invocations so users copy the form their agent accepts.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 15:27:26 -05:00
Ali jawwad
876dca8659 fix(workflows): gate validate() must not crash on non-string options (#3233)
GateStep.validate() reports non-string options as an error, but then -- when on_reject is 'abort'/'retry' -- still runs the reject-choice check 'any(o.lower() in ... for o in options)'. For a non-string option (e.g. options: [123]) o.lower() raised AttributeError, which escaped validate() and broke validate_workflow's documented 'return a list of errors, never raise' contract. Guard the check so it only runs when every option is a string (the non-string case is already reported above).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 15:07:50 -05:00
Ali jawwad
9ece347a77 fix(workflows): make pipe-filter detection quote-aware in expressions (#3232)
_evaluate_simple_expression used 'if "|" in expr' / expr.split("|", 1) to detect a filter pipe, so a literal '|' inside a quoted operand (e.g. inputs.x == 'a|b') was mistaken for a filter separator and raised a spurious ValueError ('unknown filter') instead of comparing the string. Use the existing quote/bracket-aware _find_top_level helper (added for the operator-splitting fix) so only a top-level pipe is treated as a filter separator.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 14:55:45 -05:00
Huy Do
3036fe6954 fix(workflows): reject a fan-in wait_for that names an unknown step at validation (#3225)
* fix(workflows): reject a fan-in wait_for that names an unknown step at validation

* fix(workflows): reject fan-in wait_for self-reference and non-string entries

Address review feedback on the fan-in wait_for validator:

- A fan-in's own id is added to seen_ids before the wait_for check, so
  `wait_for: [<self>]` passed validation while producing a silent empty
  join at runtime. Reject self-references explicitly.
- Non-string entries (e.g. YAML `wait_for: [123]`) were skipped by the
  isinstance(str) guard and validated even though they can never match a
  real step id. Flag them as wiring errors.

Add coverage for both cases.
2026-06-29 14:52:08 -05:00
Manfred Riem
bbc5f176e3 fix(extensions): apply GHES auth and resolve release assets for extension add --from (#3217)
* fix(extensions): apply GHES auth and resolve release assets for --from

The 'specify extension add --from <url>' path fetched ZIPs via a bare
open_url with no GitHub release-asset resolution and no Accept header,
diverging from the catalog download path. Against GHES it received an
HTML login page and failed obscurely with zipfile.BadZipFile.

Route --from through ExtensionCatalog so configured GHES credentials
apply and release-download URLs resolve via /api/v3, and reject non-ZIP
content with a clear error pointing at auth.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(extensions): use zipfile.is_zipfile for --from content guard

Replace the weak zip_data.startswith(b"PK") prefix check with
zipfile.is_zipfile() on a BytesIO so any non-ZIP payload (not just
those lacking the PK magic) is rejected with the friendly error before
install_from_zip can raise BadZipFile. Addresses PR review feedback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 11:31:10 -05:00
Ben Buttigieg
ac47178f65 fix(pi): repoint install_url to @earendil-works/pi-coding-agent (#3169) (#3214)
The @mariozechner/pi-coding-agent npm package is deprecated in favor of
@earendil-works/pi-coding-agent. Pi Coding Agent is still active under the
new org, so update the install_url rather than removing the integration.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 11:25:38 -05:00
Quratulain-bilal
5bdcb4ad14 fix(catalogs): reject host-less catalog URLs in base and preset validators (#3210)
the shared CatalogStackBase validator and PresetCatalog validator
checked parsed.netloc to enforce 'a valid URL with a host'. but netloc
is truthy for host-less URLs like https://:8080 or https://user@, so
those slipped through even though they have no host - contradicting the
error message. the workflow, step, and bundler validators already check
parsed.hostname (which is None in those cases); this aligns the two
stragglers with that. add regression tests covering port-only and
userinfo-only URLs.
2026-06-29 10:29:14 -05:00
WOLIKIMCHENG
9a40ed0b6e fix: update CodeBuddy install docs URL (#3187)
* fix: update CodeBuddy install docs URL

* test: assert codebuddy integration is registered before checking install_url

---------

Co-authored-by: root <kinsonnee@gmail.com>
2026-06-29 10:26:10 -05:00
Ali jawwad
d378485696 fix(workflows): reject infinite number-input default instead of raising OverflowError (#3199)
WorkflowEngine._coerce_input normalizes a whole-valued number to int via int(value). For an infinite float (e.g. a 'type: number' input with YAML 'default: .inf') int(inf) raises OverflowError, which is not in the except (ValueError, TypeError) tuple. validate_workflow eager-coerces declared defaults and is documented to RETURN a list of errors, but it only catches ValueError -- so the OverflowError escaped and validate_workflow raised instead of reporting, breaking its contract. (NaN already surfaced cleanly because int(nan) raises ValueError.)

Add OverflowError to the except tuple so an infinite default surfaces as the same clean 'expected a number' ValueError as NaN, consistent with the function's existing fail-fast-on-authoring-mistakes design. Finite values (5.0 -> 5, 3.5 -> 3.5) are unaffected.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 10:05:51 -05:00
Ali jawwad
2a9db1d350 fix(workflows): make expression operator/literal parsing quote-aware (#3197)
_evaluate_simple_expression split on operator keywords using naive str.find/split, so a keyword INSIDE a quoted operand was treated as an operator: `inputs.mode == 'read and write'` split on the inner ' and ' and evaluated as `(inputs.mode == 'read) and (write')`. The literal short-circuit was also too greedy -- `'a' == 'b'` matched startswith("'")/endswith("'") and was stripped to the garbage truthy string `a' == 'b`, so `'done' == 'failed'` evaluated truthy and gated the wrong branch.

Add a quote/bracket-aware _find_top_level helper (mirroring the existing _split_top_level_commas) and use it for the and/or/comparison/in/not-in splits; tighten the literal short-circuit to fire only when the opening quote's match is the final char. The docstring already lists comparisons + and/or/not + in/not-in + string literals as supported, so this restores the documented contract.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 09:53:35 -05:00
Manfred Riem
c49966da4d fix(claude): stop forking /speckit-analyze to prevent long-session freezes (#3188)
PR #2511 added `context: fork` + `agent: general-purpose` to the generated
speckit-analyze SKILL.md on the assumption that its heavy reads collapse to a
short summary. In practice /speckit-analyze returns a 300-500 line report that
is injected back into the main conversation. In long sessions each subsequent
fork inherits that growing context, compounding overhead until the chat
freezes (#3185).

Empty FORK_CONTEXT_COMMANDS so no command opts into context: fork, restoring
direct in-session execution for analyze. The injection mechanism is retained
so a command can be re-enabled once it genuinely returns a compact result.

Fixes #3185

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-26 12:11:54 -05:00
Si Zengyu
1add20341d fix(extensions,presets,workflows): resolve private GHES release assets via /api/v3 (#3157)
* feat(auth): add github_provider_hosts() to enumerate GHES hosts from auth.json

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(extensions): resolve GHES release assets via /api/v3

Generalizes resolve_github_release_asset_api_url to GitHub Enterprise
Server hosts (gated by auth.json github hosts), fixing private GHES
extension/preset downloads. github/spec-kit#3147

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(extensions,presets): pass auth.json github hosts into release resolver

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* docs(auth): document GHES private catalog + release-asset auth

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(presets,workflows): pass auth.json github hosts into remaining release resolvers

Wires preset add --from and workflow add through github_provider_hosts()
so private GHES release assets resolve via /api/v3 there too. github/spec-kit#3147

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* test(presets): use module-level io.BytesIO in GHES preset test

Addresses Copilot review on PR #3157: drop unnecessary __import__("io")
in test_preset_add_from_ghes_release_url_resolves_via_api_v3 since io is
already imported at module level.

* fix(github-http): pass through GHES asset API URLs by path shape

Addresses Copilot review on PR #3157. A direct GHES /api/v3 release asset
URL was only returned as already-resolved when its host was in the
allowlist; otherwise the resolver returned None and the caller downloaded
the same URL without 'Accept: application/octet-stream', fetching JSON
metadata instead of the binary.

Gate the passthrough on path shape alone, mirroring the github.com case.
This is safe: passthrough returns the input URL unchanged and the caller
fetches it either way, so no new request to an arbitrary host is induced;
the token stays independently gated by auth.json in open_url. The
allowlist remains the anti-SSRF gate on the tag-lookup resolving path.

Add test_passthrough_for_unlisted_ghes_api_asset_url.
2026-06-25 10:44:30 -05:00
meymchen
dc840f07d0 feat(integration): update Kimi integration for Kimi Code CLI (#2979)
* feat(integration): update Kimi integration for Kimi Code CLI

Update the Kimi integration to target the new Kimi Code CLI
(MoonshotAI/kimi-code) layout:

- Change skills directory from .kimi/skills/ to .kimi-code/skills/
- Change context file from KIMI.md to AGENTS.md
- Extend --migrate-legacy to move old .kimi/skills/ installs and
  migrate KIMI.md user content to AGENTS.md
- Clean up leftover legacy .kimi/skills/ directories on teardown
- Update devcontainer installer to @moonshot-ai/kimi-code
- Update docs and tests

Relates to #1532

* fix(integration): align Kimi dispatch and harden legacy migration

- Override build_command_invocation to emit /skill:speckit-<stem>
  so dispatched commands match Kimi Code CLI's native slash syntax.
- Skip symlinked .kimi/skills directories during legacy migration
  and teardown to avoid operating on files outside the project.
- Remove kimi from the multi-install-safe integrations table.
- Add tests for command invocation and symlink safety.

* fix(integration): resolve custom context markers in Kimi legacy migration

Use IntegrationBase._resolve_context_markers() when migrating legacy
KIMI.md content so that projects with customized context_markers in
.specify/extensions/agent-context/agent-context-config.yml have the
managed section stripped with the correct markers instead of the
hard-coded defaults.

Adds a test verifying custom markers are respected during
--migrate-legacy.

* fix(integration): harden Kimi legacy migration against symlinked paths

* fix(kimi): guard symlinked SKILL.md during migration and teardown

* docs(kimi): mention KIMI.md→AGENTS.md migration in --migrate-legacy help

The --migrate-legacy help text listed only the skills directory move and
dotted→hyphenated renaming, but the flag also migrates KIMI.md user content
into AGENTS.md. Align the help with the actual behavior, docs, and tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): validate legacy migration destination; clarify docstrings

Address Copilot review feedback on PR #2979:

- setup(): gate skills migration on _is_safe_legacy_dir(new_skills_dir)
  as well as the source. base setup() already rejects a destination that
  escapes the project root, but an in-tree symlinked .kimi-code/skills
  (e.g. -> .) could still misdirect the move; this gives the destination
  the same symlink-component protection as the source.
- _migrate_legacy_kimi_dotted_skills: rewrite docstring as a compatibility
  shim describing same-path delegation to _migrate_legacy_kimi_skills_dir.
- test_presets: clarify that the dotted-skill test exercises legacy naming
  under the current .kimi-code/ base, not the legacy .kimi/ location.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): harden legacy KIMI.md→AGENTS.md context migration

- Skip context-file migration when the agent-context extension is
  disabled, matching upsert/remove_context_section opt-out behavior so
  an opted-out project's KIMI.md/AGENTS.md are left untouched.
- Safely skip (instead of raising) on filesystem edge cases: unreadable
  or non-UTF-8 KIMI.md, and AGENTS.md existing as a non-file/unwritable.
- Refuse to migrate a corrupted managed section (single marker, or end
  before start) so a partial managed block is never copied into
  AGENTS.md; KIMI.md is preserved for manual repair.

Add regression tests for all three cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Approve fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* chore(kimi): revert CHANGELOG.md edit (auto-generated)

The CHANGELOG is generated from merged PR titles, so a hand-written entry
is redundant; it was also placed under the already-released 0.10.2 section,
which would make those release notes historically inaccurate. Revert to
match main per maintainer feedback.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(kimi): skip symlink-safety tests when symlinks are unavailable

The Kimi legacy-migration safety tests create symlinks to assert that
migration/teardown never follow them out of the project. Symlink creation
fails on Windows without the create-symlink privilege and in some restricted
CI sandboxes, so these tests errored during setup instead of skipping.

Wrap every symlink_to() call in a shared _symlink_or_skip() helper that
pytest.skip()s on OSError/NotImplementedError, matching the guard pattern
already used by one of these tests. Verified on Windows: the 6 symlink tests
now skip cleanly (51 passed, 6 skipped) instead of erroring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): reject symlinked skills destination before install

Add a destination symlink pre-check in KimiIntegration.setup() before
super().setup() writes any SKILL.md. The base class only rejects a
destination that escapes project_root after resolve(), so an in-tree
symlinked .kimi-code/.kimi-code/skills (e.g. `-> .`) would still
misdirect writes into an unintended in-tree location (./skills/).

Extract the symlink-component walk into a shared _has_symlinked_component()
helper and reuse it from _is_safe_legacy_dir(). Add a regression test.

Also clarify that --migrate-legacy only migrates KIMI.md -> AGENTS.md when
the agent-context extension is enabled, in the CLI help text and the
integration docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Refactor formatting and simplify logic in Kimi integration

* fix(kimi): reject symlinked target dir during legacy skills migration

When the migration destination already exists, guard against a symlinked
(or non-directory) target_dir before comparing SKILL.md bytes, so the
comparison never follows a link outside the project root. Also skip a
missing/non-file target SKILL.md explicitly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-24 15:22:08 -05:00
Ali jawwad
fdaaf18371 fix(workflows): preserve commas inside quoted list-literal elements (#3134)
* fix(workflows): preserve commas inside quoted list-literal elements

The simple-expression evaluator parsed a list literal with a naive
`inner.split(",")`, which splits on commas inside quoted strings (and
nested brackets). So `{{ ["a, b", "c"] }}` evaluated to three items
(`["a", "b", "c"]`) instead of two, silently corrupting `fan-out` `items:`
and any list expression that contains a comma inside a quoted element.

Split list-literal elements on top-level commas only, ignoring commas
inside quotes or nested brackets, via a small `_split_top_level_commas`
helper. Plain and empty lists are unchanged.

Add tests covering quoted commas, nested lists, and the existing
plain/empty cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(workflows): cover single-quoted and nested list literals

Address review: extend the list-literal regression test to assert single-quoted elements with commas and nested lists parse correctly, alongside the existing double-quoted cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 15:10:02 -05:00
Zied Jlassi
b042d2a843 feat(extensions): verify catalog archive sha256 before install (#3080)
* feat(extensions): verify catalog archive sha256 before install

Extension and preset archives were downloaded over HTTPS and unpacked
(with Zip-Slip protection) but their bytes were never checked against a
known digest. Trust rested entirely on TLS and the integrity of the
release host, so a tampered or swapped archive from a compromised
third-party release would be installed silently. Maintainers do not audit
extension code, so consumer-side integrity is the only available defence.

Catalog entries may now pin an optional `sha256` digest. When present, the
downloaded archive is verified before it is written to disk and installed;
a mismatch aborts with a clear error. Entries without `sha256` keep
working unchanged (a DEBUG line records that the download was unverified),
so the change is backwards compatible. The check runs on both download
paths (extensions and presets) via a single shared helper so the two stay
in parity.

- Add `verify_archive_sha256` helper in shared_infra (digest match,
  `sha256:` prefix, case-insensitive; DEBUG log when no digest declared)
- Enforce it in ExtensionCatalog.download_extension and
  PresetCatalog.download_pack, before the archive is written to disk
- Document the optional `sha256` field in the publishing guides
- Tests: helper unit tests + matching/mismatch/no-digest on both paths

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Assisted-by: AI

* fix(extensions): harden sha256 parsing and tidy download test mocks

Follow-up to the review on #3080:

- shared_infra.verify_archive_sha256: strip only a literal `sha256:`
  algorithm prefix (case-insensitive) instead of `split(':', 1)[-1]`,
  which silently dropped any prefix — so `md5:<64-hex>` was accepted as
  if it were a valid SHA-256. Validate that the declared value is exactly
  64 hex characters and raise a clear error otherwise, and compare with
  `hmac.compare_digest` for a constant-time check. Add tests covering a
  malformed digest and a non-`sha256:` prefix (both previously accepted).
- Download test helpers: configure the context-manager mock via
  `__enter__.return_value`/`__exit__.return_value` rather than assigning a
  `lambda s: s`, which is clearer and independent of the invocation arity.

Assisted-by: AI
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>

* fix(extensions): reject a declared-but-empty sha256 instead of skipping verification

verify_archive_sha256 skipped on any falsy expected value, so a present-but-empty digest (e.g. sha256: "" reached via ...get("sha256")) silently disabled the integrity check instead of surfacing the authoring error. Guard on expected is None so only an absent digest skips; blank/whitespace/bare-prefix values fall through to the 64-hex validation and are rejected. Adds a regression test.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* docs(shared_infra): clarify _SHA256_HEX_RE accepts and normalizes uppercase

The comment described the regex as matching '64 lowercase' hex characters,
but verify_archive_sha256 lowercases the declared value (raw.lower()) before
matching, so an uppercase digest is accepted and normalized rather than
rejected. Clarify the comment to avoid misleading future readers.

Addresses Copilot review feedback on shared_infra.py.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* test(presets): cover the no-sha256 backwards-compatible path

Address Copilot review: download_pack's optional sha256 verification was
tested for match/mismatch but not the backwards-compatible path where a
catalog entry has no sha256 (pack_info.get("sha256") is None). Add a
no-sha256 test mirroring the extensions coverage so the helper never
silently becomes mandatory for presets.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

---------

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>
2026-06-24 14:52:24 -05:00
Zied Jlassi
f846d6526c fix(workflows): validate requires keys and reject phantom permissions gate (#3079)
* fix(workflows): validate requires keys and reject phantom permissions gate

A workflow's `requires` block was parsed but its keys were never
validated, so a typo or an unsupported key was silently ignored. Most
importantly, authors could write `requires.permissions.shell: true`
expecting a runtime capability gate — but no such gate exists: a `shell`
step always runs with the user's privileges. The declaration gave a
false sense of sandboxing.

`validate_workflow` now accepts only the recognised keys
(`speckit_version`, `integrations`, `tools`, `mcp`) and rejects anything
else, with an explicit error for `requires.permissions` pointing authors
to `gate` steps for approval. Docs and the model comment are updated to
state that `requires` is advisory, not a security boundary.

- Reject non-mapping `requires`, unknown keys, and `requires.permissions`
- Clarify workflows reference + PUBLISHING.md shell-step guidance
- Tests for valid keys, non-mapping, unknown key, and permissions

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Assisted-by: AI

* fix(workflows): address review feedback on requires validation

Follow-up to the review on #3079:

- Guard `requires` validation on `is not None` instead of truthiness so a
  falsy non-mapping value (e.g. `requires: []` or `requires: ''`) is
  reported as an error instead of being silently skipped; `requires:`
  (YAML null) is still treated as an omitted block. Add a regression test.
- Reword the workflows security note so `requires.permissions` is shown
  as rejected/unsupported rather than as a valid example of `requires`.
- Standardize on US spelling (`_RECOGNIZED_REQUIRES_KEYS`, "recognized")
  to match the surrounding code and ease searching.
- Tighten the permissions-rejection test to assert on specific message
  markers (`requires.permissions` and the `gate` guidance) so it fails if
  the validation path or wording drifts.

Assisted-by: AI
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>

* fix(workflows): scope requires validation to workflow keys (drop tools/mcp)

tools and mcp belong to the bundle manifest requires schema (bundler/models/manifest.py, resolved in bundler/services/resolver.py), not the workflow requires validated here. Drop them from _RECOGNIZED_REQUIRES_KEYS and revert the PUBLISHING.md claim that this PR had introduced, so workflow requires only recognizes speckit_version and integrations.

This keeps the existing docs accurate and resolves the inline doc-consistency review comments.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* refactor(workflows): type WorkflowDefinition.requires as Any pre-validation

self.requires holds the raw parsed value, which before validate_workflow()
runs may be a non-mapping (None for a bare 'requires:', a list for
'requires: []', etc.). Annotating it dict[str, Any] was misleading for
editors/type-checkers; use Any and document that validate_workflow() enforces
the mapping shape.

Addresses Copilot review feedback on engine.py.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* fix(workflows): reject YAML-null requires: as a non-mapping

Address Copilot review: validate requires the same way as inputs. A
bare requires: parses as YAML null and was previously treated as an
omitted block, which is inconsistent with inputs and lets a stray
requires: line be silently ignored.

Drop the is-not-None guard and check isinstance(..., dict) directly: an
omitted block still defaults to {} (valid), but a present-but-non-mapping
value -- YAML null, [] or '' -- is now an authoring error that surfaces.

Tests: add YAML-null rejection + an omitted-is-still-valid guard test.
Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

---------

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>
2026-06-24 14:49:43 -05:00
Omar
44ef11aa18 feat(integrations): add omp support (#3107)
* feat(integrations): add omp support

* Update updated_at timestamp

* refactor(integrations): delegate omp build_exec_args to base, register in issue templates

Inherit MarkdownIntegration.build_exec_args so omp picks up shared CLI
contract changes (requires_cli gating, extra-args ordering, --model
handling) automatically; only specialize the --mode json flag.

Also add Oh My Pi / omp to the issue-template agent lists so
test_issue_template_agent_lists_match_runtime_integrations passes.

* fix(integrations): use --print + positional prompt for omp argv

OMP's CLI parser treats `-p`/`--print` as a boolean (one-shot mode)
and consumes the prompt as a positional message; the previous
inherited `-p <prompt>` shape worked by accident only because `-p`
ignores its next token. Build the argv explicitly with flags first
and the prompt as a trailing positional, matching upstream args.ts.
2026-06-24 13:44:34 -05:00
Ali jawwad
034fbfcbb4 fix: render valid TOML when a command body contains backslashes (#3135)
render_toml_command() emitted the body inside a multiline *basic* TOML
string ("""..."""), which processes backslash escape sequences. A command
body containing a backslash — e.g. a Windows path like C:\Users\... whose
\U reads as an invalid unicode escape — therefore produced unparseable TOML
("Invalid hex value"), so the generated Gemini/Tabnine command file failed
to load. A body ending in a backslash also silently ate the closing newline
via TOML line-continuation.

Route bodies containing a backslash to the multiline *literal* form
('''...'''), which does not process escapes, or to the escaped basic string
when both triple-quote styles are present. Mirrors the escaping already done
by base.py's TomlIntegration.

Add tests covering a Windows path, a trailing backslash, and the
backslash + both-triple-quote-styles fallback.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 13:13:44 -05:00
Pascal THUET
8e76ff3d5c harden: reject shell=True in run_command (#3132)
run_command() forwarded shell= straight to subprocess.run, so a caller
passing shell=True would invoke a shell. Reject shell=True with ValueError
(keeping the parameter for signature compatibility) and drop shell= from
both subprocess.run calls.

Enable ruff S602/S604/S605 to flag any future shell=True reintroduction,
annotate the one intentional workflow shell sink with # noqa: S602, and
document the shell-step execution risk in workflows/PUBLISHING.md.
2026-06-24 13:05:21 -05:00
JinHyuk Sung
0c975bbef7 fix: write Codex dev skills as files (#2988)
* fix: write Codex dev skills as files

* fix: route codex dev symlink policy through metadata

* fix: replace codex dev symlinks on refresh

* fix: migrate codex dev skill symlinks

* fix: avoid inactive shared skill dev symlinks

* fix: preserve unrelated dev skill symlinks
2026-06-23 10:55:38 -05:00
Ali jawwad
0a126256e0 feat: add Firebender integration (Android Studio / IntelliJ) (#3077)
* feat: add Firebender integration (Android Studio / IntelliJ)

Firebender (https://firebender.com/) is an AI coding agent for Android
Studio and IntelliJ. It reads project-local custom slash commands from
.firebender/commands/*.mdc and project rules from .firebender/rules/*.mdc.

Add a FirebenderIntegration (MarkdownIntegration) that installs the
speckit command templates as .mdc command files and writes the managed
context section into .firebender/rules/specify-rules.mdc. command_filename
is overridden so init-time commands also use the .mdc extension Firebender
requires. Register it in the integration registry, add the catalog entry
and docs row, and add an integration test covering the .mdc command output.

Closes #1548

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: address review - bump catalog updated_at and list firebender as multi-install safe

Bump the catalog top-level updated_at to reflect the new entry, and add firebender (with its .firebender/commands + .firebender/rules/specify-rules.mdc isolation paths) to the 'currently declared multi-install safe integrations' table in the docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 08:01:00 -05:00
José Villaseñor Montfort
e5a03bffc8 fix(shared-infra): remove stale managed scripts the core no longer ships (#3076) (#3098)
* fix(shared-infra): remove stale managed scripts the core no longer ships (#3076)

install_shared_infra never removed shared scripts a prior (pre-refactor) install recorded but the current core no longer ships — e.g. the legacy scripts/<variant>/update-agent-context.sh, superseded by the bundled agent-context extension. On a legacy project the orphan lingers and crashes when it sources a refreshed common.sh (HAS_GIT unbound under set -u).

Apply the stale-removal that integration_upgrade already performs to install_shared_infra: manifest-tracked scripts the current bundle no longer produces are removed, but only managed copies (hash matches the manifest); user-customized files, symlinks, and recovered entries are preserved. Guarded so a missing/empty source can't trigger mass deletion, and the safe-destination check prevents unlinking through a symlinked ancestor.

Add IntegrationManifest.remove(); drop the stale update-agent-context.sh reference in CONTRIBUTING.md.

AI assistance: implemented with Claude Code (Anthropic); reviewed and validated locally (ruff clean, full suite 4176 passed, manual CLI repro).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(shared-infra): harden stale-cleanup per review (empty source + orphan manifest)

- Set scripts_scanned only after a real source file is seen, so an empty variant source can't trigger mass deletion of tracked scripts.
- Prune a stale manifest entry even when its file is already gone from disk, keeping the manifest consistent (previously left tracked forever).
- Add a test for each edge case.

Addresses the Copilot review comments on #3098. AI assistance: Claude Code (Anthropic), reviewed/validated locally (ruff clean, full suite 4178 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(shared-infra): guard unsafe manifest keys in stale-cleanup (review)

- Skip absolute / '..' manifest keys before any filesystem access in stale-cleanup, so a corrupted/hand-edited manifest can't make it touch paths outside the project root (mirrors IntegrationManifest.check_modified / uninstall).
- Clarify the scripts_scanned comment: the safety hinge is that flag, not seen_rels (which also holds template paths).
- Add a containment test: a traversal manifest key is skipped, its target untouched.

Addresses the second round of Copilot review on #3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4179 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(manifest): make remove() reject absolute/.. keys like its siblings (review)

IntegrationManifest.remove() now applies the same lexical validation and normalization as record_existing() / is_recovered(): absolute paths and '..' segments are rejected (return False) instead of being used verbatim as a key. Keeps the manifest API consistent. Adds tests (valid drop + no-op, absolute rejected, traversal rejected).

Addresses the third round of Copilot review on #3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4182 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(shared-infra): validate stale-cleanup keys for containment, not just lexically (review)

The stale-script cleanup guarded manifest keys with a lexical check only
(is_absolute() / ".." segments). On Windows a drive-relative key such as
"C:tmp\\file" is not is_absolute(), yet joining it onto the project path
discards the root — so cleanup could stat/unlink outside the project before
_ensure_safe_shared_destination raised, and a corrupted manifest key turned
into an install-time hard failure (ValueError) instead of being skipped.

Reuse the canonical containment helper (_validate_rel_path, the same one
IntegrationManifest.is_recovered / remove use): after the fast lexical reject,
resolve the join and confirm it stays within the project root; a key that
still escapes is skipped, never unlinked, never fatal.

Adds a regression test that forces _validate_rel_path to reject a managed key
(portably simulating the Windows drive-relative escape) and asserts the
install skips it without failing and still installs the real scripts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 07:36:54 -05:00
Pascal THUET
ce01877610 fix: register enabled extensions for agent on integration use/upgrade (#2949)
* fix: register enabled extensions for agent on integration install/upgrade

install and upgrade only set up the integration's own core commands; only
switch re-registered the enabled extensions' commands for the target agent.
A second integration added via install (or refreshed via upgrade) was
therefore silently missing the extension commands the existing agents
already had (e.g. the bundled agent-context extension).

Extract switch's registration into a shared _register_extensions_for_agent
helper and call it from install and upgrade too, so every installed agent
ends up with every enabled extension's commands — full parity with switch.

Closes #2886

* test: pin skills-mode secondary-agent registration; document #2948 limitation

Extension skill rendering is scoped to the active agent (init-options track a
single ai / ai_skills pair), so a skills-mode agent registered while not active
(e.g. Copilot --skills installed as a secondary integration) gets command files
rather than skills. install/upgrade match extension add here; only switch
renders skills, because it activates the target first.

Add a regression test pinning this behavior and document the limitation on the
shared helper. Per-agent skills parity is tracked separately in #2948.

* fix: don't re-render the active agent's skills when registering a non-active agent

register_enabled_extensions_for_agent runs an active-agent-scoped skills pass
(_register_extension_skills resolves the skills dir from init-options["ai"],
ignoring the passed agent). Routing install/upgrade of a secondary integration
through it re-rendered the *active* skills-mode agent's extension skills as a
side effect — resurrecting skill files the user had deliberately deleted. Gate
the skills pass on the target being the active agent; switch is unaffected
because it activates the target first.

Also harden the skills-mode install test (assert a core skill so --skills is
load-bearing, drop a vacuous registered_skills assertion) and add a regression
test. Surfaced by review of the PR; skills parity for non-active agents stays
tracked in #2948.

* refactor: share the extension-op scaffold and run (un)registration post-commit

Review cleanups, no behavior change on the success path:

- Extract the best-effort ExtensionManager scaffold (lazy import, instantiate,
  except -> _print_cli_warning) into _best_effort_extension_op. Both
  _register_extensions_for_agent and a new _unregister_extensions_for_agent
  delegate to it, removing the duplicate block left inline in switch.
- Invoke the best-effort extension registration AFTER the install/switch/upgrade
  try/except has committed, so a failure in it can never trigger the rollback
  (install and switch teardown on except).

* docs: clarify extension registration parity scope

* fix(integrations): defer extension registration until use

* fix(tests): remove redundant shutil import

* fix(integrations): backfill extensions for installed switch targets
2026-06-22 17:48:55 -05:00
darion-yaphet
826e193cee refactor: move extension command handlers to extensions/_commands.py (PR-7/8) (#3014)
* refactor: move extension command handlers to extensions/_commands.py (PR-7/8)

Convert the flat extensions.py module into an extensions/ package and
extract all extension_app and catalog_app command handlers plus their
private helpers (_resolve_installed_extension, _resolve_catalog_extension,
_print_extension_info) out of __init__.py into the new
extensions/_commands.py, mirroring the domain-dir layout used for
presets/_commands.py (PR-6) and integrations/_commands.py (PR-5).

- extensions.py -> extensions/__init__.py (pure rename, 99%); intra-module
  relative imports bumped from `.x` to `..x` since they reference root
  siblings.
- Root helpers (_require_specify_project, _locate_bundled_extension,
  load_init_options, _display_project_path) are reached through thin shims
  that re-fetch from the parent package at call time, so test
  monkeypatching of specify_cli.<helper> keeps working unchanged.
- __init__.py drops ~1444 lines (3511 -> 2067); CLI surface preserved via
  register(app).

No behavior change. Full suite failure set is identical before/after
(82 pre-existing env failures, 0 new).

* fix(extensions): preserve per-command path in update backup for skills agents

Skills agents (extension == "/SKILL.md") name every command file SKILL.md,
each in its own per-command subdir (e.g. speckit-plan/SKILL.md). The update
backup keyed the backup path on cmd_file.name alone, so all of an agent's
skill files collided onto a single backup path — each shutil.copy2 overwrote
the previous one, and rollback restored one skill's content over all the
others, corrupting or losing the rest.

Mirror the real on-disk layout by using cmd_file.relative_to(commands_dir),
keeping each backup path unique. This also makes backed_up_command_files
values unique so restore copies the correct content back to each command.

Add a regression test asserting two distinct skill files survive a
backup -> failed-update -> rollback cycle with their own content.

* style(extensions): use yaml.safe_dump when writing catalog config

The catalog add/remove handlers wrote the integration catalog config with
yaml.dump. Switch to yaml.safe_dump to align with the SafeDumper used by the
presets commands and to refuse emitting !!python/object tags if a non-basic
value ever reaches the config dict.

Output is unchanged for the current basic-type payload (str/int/bool/dict/
list) — this is a defensive/consistency change, not a behavioral fix.

* fix(extensions): correct _print_cli_warning import path in skill registration

register_enabled_extensions_for_agent imported _print_cli_warning from `.` (the extensions package), but the helper lives in the parent specify_cli package. The wrong level raised ImportError inside the error handlers, aborting extension/skill registration on the first failure instead of warning and continuing. Use `..` to match the other parent-package imports.

* fix(extensions): escape untrusted values in Rich markup output

User-provided arguments and extension/catalog metadata (names, descriptions, versions, IDs, paths) were interpolated into Rich markup strings without escaping. Values containing markup sequences (e.g. [red]...) would be parsed as markup, allowing output injection that could corrupt or mislead CLI messages.

Wrap all such interpolations with rich.markup.escape across the extension/catalog command handlers: list, search, info (_print_extension_info), add (including --dev paths), remove, enable, disable, set-priority, update, and the ambiguous-match resolvers (error strings and Table rows). Reuse the already-computed safe_extension where available.

Escaping is a no-op for benign strings, so normal output is unchanged.

* Prevent Rich markup injection in extension CLI output

User-controlled catalog URLs and extension IDs are rendered through Rich-enabled console paths, so every remaining output-only interpolation now escapes markup while leaving stored values and filesystem behavior unchanged. Regression tests cover catalog add, install hints, remove hints, and state command messages with bracketed markup-like values.

* Prevent markup injection from exception text

Rich markup remains enabled for styled CLI messages, so exception text and config path labels must be escaped before rendering. YAML parser errors, URL validation failures, download errors, and extension validation errors can include user-controlled catalog or manifest values.

Constraint: Preserve existing exception handling and user-facing error paths

Rejected: Disable Rich markup for these messages | existing output intentionally uses markup for labels and styling

Confidence: high

Scope-risk: narrow

Directive: Escape user-controlled exception text before interpolating into Rich-rendered strings

Tested: .venv/bin/python -m pytest tests/test_extensions.py -q

Co-authored-by: OmX <omx@oh-my-codex.dev>

* Prevent path and manifest review regressions

Catalog path labels are rendered through Rich markup and downloaded update manifests are trusted long enough to validate extension IDs. Escape displayed project paths before rendering, and reject non-mapping extension.yml payloads before ID validation so bad archives fail with a clear rollback reason.

---------

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-06-22 13:40:57 -05:00
meymchen
6a3ee9b64e feat: add ZCode (Z.AI) integration (#3063)
* feat: add ZCode (Z.AI) integration

Add a skills-based integration for ZCode, Z.AI's Claude-Code-style
agent. ZCode uses the same SKILL.md layout as Claude Code, so spec-kit
installs workflows into .zcode/skills/speckit-<name>/SKILL.md, invoked
in chat as $speckit-<name>.

- ZcodeIntegration(SkillsIntegration) with .zcode/ folder and --skills option
- Register in INTEGRATION_REGISTRY
- Catalog entry (tags: cli, skills, z-ai)
- Tests via SkillsIntegrationTests mixin
- Document in integrations reference and README

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: render $speckit-* invocations for ZCode skills

ZCode is documented as a skills agent invoked with $speckit-<command>,
but the central invocation rendering only special-cased codex, so
specify init Next Steps and extension hooks rendered the dotted
/speckit.<command> form instead.

Centralize the $speckit-* decision in a DOLLAR_SKILLS_AGENTS set with an
is_dollar_skills_agent() helper, and route both init Next Steps and
HookExecutor._render_hook_invocation through it. Add ZCode invocation
regression tests mirroring the existing Codex/Kimi coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 12:14:18 -05:00
Austin Z.
bbdf1b8f40 fix(agent-context): support multiple context files safely (#2969)
* fix(agent-context): support multiple context files safely

* fix(agent-context): harden context file validation

* fix(agent-context): preserve disabled context target

* fix(agent-context): address review follow-ups

* fix(agent-context): dedupe PowerShell context files

* fix(agent-context): align context file dedupe

* fix(agent-context): align bash context file dedupe

* fix(agent-context): preserve disabled display target

* fix(agent-context): require yaml-capable updater python

* fix(agent-context): preserve context files config

* fix(agent-context): align context file fallbacks

* fix(agent-context): share context file resolution

---------

Co-authored-by: AustinZ21 <AustinZ21@users.noreply.github.com>
2026-06-22 12:10:55 -05:00
Manfred Riem
79a34b892d fix(presets): use _repo_root() for bundled-core source-checkout fallback (#3086) (#3091)
* fix(presets): use _repo_root() for bundled-core source-checkout fallback

The tier-5 fallback in PresetResolver.resolve() and
_find_bundled_core() computed the repo root as
Path(__file__).parent.parent.parent. After presets.py was moved to
presets/__init__.py (#2826) that chain is one level short, resolving
to src/ and looking for src/templates/commands/<cmd>.md, which never
exists. As a result, wrap-strategy presets found no core base layer in
source/editable installs.

Use the shared _repo_root() helper so both fallbacks resolve against
the actual repo-root templates/ tree. Wheel installs were unaffected
(core_pack path), so this only impacts source/editable checkouts.

Refs #3086

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(presets): restore dropped def for oserror-manifest test

A prior edit accidentally removed the
def test_resolve_extension_command_via_manifest_skips_oserror_manifests
line, orphaning its body inside the new bundled-core test. Restore the
test definition so pytest collects it again.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(presets): move bundled-core tests into TestPresetResolver

The two tier-5 fallback regression tests exercise collect_all_layers()
and resolve(), not resolve_core(), so they belong in TestPresetResolver
rather than TestResolveCore. Relocate them for clearer suite navigation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 11:33:44 -05:00
Huy Do
1cb935997c fix: fail loudly on an unknown workflow expression filter (#3074)
* fix: fail loudly on an unknown workflow expression filter

The expression evaluator's filter dispatch fell through to `return value`
for any unregistered filter, so a typo'd or unsupported filter such as
`{{ items | length }}` rendered the value unchanged with no error and the
run completed — a silent wrong result.

Raise a clear ValueError instead, naming the offending filter and the valid
ones, mirroring the strict handling already used for `from_json`. The five
registered filters (default/join/map/contains/from_json) are unchanged; the
`name(arg)` form of an unknown filter is now caught too.

* fix: distinguish a misused registered filter from an unknown one; cover map

Address the review feedback on the unknown-filter fail-loud path:

- A *registered* filter used in an unsupported form (e.g. `| join` or
  `| map` with no argument) raised the misleading "unknown filter
  '<name>'" — the filter is registered, the syntax isn't. It now raises
  a message naming it as a known filter misused. A new
  `_REGISTERED_FILTERS` constant drives the distinction.
- `test_registered_filters_unaffected` now also exercises `map('attr')`,
  which it previously claimed to cover but didn't. Add
  `test_registered_filter_unsupported_form_raises` to pin the new path.

* fix: include the no-arg default form in the filter-error hint

Copilot review: the hint listed default('x') but omitted the valid
no-argument default form (| default), which this module supports.
2026-06-22 10:44:23 -05:00
Manfred Riem
902f5431f9 Harden command registration path handling (#3088)
* fix: validate command 'file' field against path traversal in registrar

CommandRegistrar.register_commands() read each command body from
source_dir / cmd_file without validating the manifest 'file' field,
unlike the parallel skill and preset readers which already reject
absolute paths and '..' traversal. A malicious extension/preset/bundle
manifest with file: ../../../etc/passwd (or an absolute path) could
read arbitrary host files verbatim into a generated agent command at a
predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22).

Add the same containment guard at the command read site and reject a
traversal/absolute 'file' at manifest-load time in
ExtensionManifest._validate() for defense-in-depth, plus regression
tests for both the read path and the manifest validator.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test/fix: address review — robust absolute-path test and tolerant reads

- register_commands(): use is_file() instead of exists() and skip the
  command if read_text() raises (directory or non-UTF8 file), aligning
  with the other command/skill readers.
- Traversal tests: point the absolute-path payload at the real temp
  secret.txt (guaranteed to exist on all platforms) instead of
  /etc/passwd, so the absolute-path guard is genuinely exercised and the
  test fails if it regresses, rather than passing because the target
  happens not to exist (e.g. on Windows runners).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: rename traversal fixtures to avoid CodeQL secret-storage false positive

The regression fixtures named an out-of-tree file secret.txt with
TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic
treated that read content as sensitive and followed the static path
into the pre-existing write_text sinks in _write_registered_output,
raising false 'clear-text storage of sensitive information' alerts on
PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER
and drop /etc/passwd payloads; the test semantics (a file outside
source_dir must never be read into a generated command) are unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: reject Windows drive-relative 'file' values in traversal guards

is_absolute() is False for Windows drive-relative paths like C:outside.txt,
which contain no '..' yet resolve against the process CWD on that drive —
bypassing the containment guard on Windows. Evaluate the 'file' value under
PureWindowsPath as well so both the registrar runtime guard and the
manifest-load validator reject drive letters (and backslash '..' segments)
cross-platform. Extend the regression tests with drive-relative cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows

On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False
(no drive), so the prior native-Path check let a leading-slash 'file' value
through and the manifest validator did not raise. Evaluate the value under
both PurePosixPath and PureWindowsPath and reject any non-empty anchor —
covering POSIX-absolute, Windows drive-relative, Windows absolute, and
rooted-without-drive — in both the registrar guard and the manifest
validator. The registrar join now uses the raw 'file' string so native
separators are handled by the resolve()/relative_to() containment check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: validate command 'file' field against path traversal in registrar

CommandRegistrar.register_commands() read each command body from
source_dir / cmd_file without validating the manifest 'file' field,
unlike the parallel skill and preset readers which already reject
absolute paths and '..' traversal. A malicious extension/preset/bundle
manifest with file: ../../../etc/passwd (or an absolute path) could
read arbitrary host files verbatim into a generated agent command at a
predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22).

Add the same containment guard at the command read site and reject a
traversal/absolute 'file' at manifest-load time in
ExtensionManifest._validate() for defense-in-depth, plus regression
tests for both the read path and the manifest validator.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test/fix: address review — robust absolute-path test and tolerant reads

- register_commands(): use is_file() instead of exists() and skip the
  command if read_text() raises (directory or non-UTF8 file), aligning
  with the other command/skill readers.
- Traversal tests: point the absolute-path payload at the real temp
  secret.txt (guaranteed to exist on all platforms) instead of
  /etc/passwd, so the absolute-path guard is genuinely exercised and the
  test fails if it regresses, rather than passing because the target
  happens not to exist (e.g. on Windows runners).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: rename traversal fixtures to avoid CodeQL secret-storage false positive

The regression fixtures named an out-of-tree file secret.txt with
TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic
treated that read content as sensitive and followed the static path
into the pre-existing write_text sinks in _write_registered_output,
raising false 'clear-text storage of sensitive information' alerts on
PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER
and drop /etc/passwd payloads; the test semantics (a file outside
source_dir must never be read into a generated command) are unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: reject Windows drive-relative 'file' values in traversal guards

is_absolute() is False for Windows drive-relative paths like C:outside.txt,
which contain no '..' yet resolve against the process CWD on that drive —
bypassing the containment guard on Windows. Evaluate the 'file' value under
PureWindowsPath as well so both the registrar runtime guard and the
manifest-load validator reject drive letters (and backslash '..' segments)
cross-platform. Extend the regression tests with drive-relative cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows

On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False
(no drive), so the prior native-Path check let a leading-slash 'file' value
through and the manifest validator did not raise. Evaluate the value under
both PurePosixPath and PureWindowsPath and reject any non-empty anchor —
covering POSIX-absolute, Windows drive-relative, Windows absolute, and
rooted-without-drive — in both the registrar guard and the manifest
validator. The registrar join now uses the raw 'file' string so native
separators are handled by the resolve()/relative_to() containment check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: harden register_commands inputs and tighten manifest 'file' validation

Address review feedback on #3088:
- register_commands(): skip non-string/empty 'file' values instead of
  raising TypeError, and hoist source_dir.resolve() out of the per-command
  loop.
- ExtensionManifest._validate(): reject 'file' values with leading/trailing
  whitespace with a clear ValidationError instead of a confusing
  missing-file failure later.
- tests: add non-string 'file' and whitespace cases; use yaml.safe_dump
  with explicit utf-8 encoding in the manifest validation test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: align runtime '..' policy, correct comment, dedupe test helper

Address review feedback on #3088:
- register_commands(): also reject '..' segments under both POSIX and
  Windows semantics, keeping runtime policy consistent with
  ExtensionManifest._validate() and the skill/preset readers (not just
  relying on the resolve()/relative_to() containment backstop).
- Replace the version-dependent is_absolute() claim in the extensions.py
  comment with the actual portability rationale (native Path is OS-
  dependent; C:foo is anchored but not absolute).
- Extract the duplicated leak-detection assertion into
  _assert_no_marker_leak() and add an in-bounds '..' payload that
  exercises the new runtime '..' rejection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Extract shared path-safety policy and warn on unreadable command files

Introduce relative_extension_path_violation() in _utils.py as the single
source of truth for the extension-relative `file` path-safety policy, and
use it from both the runtime registrar guard (agents.py) and the
manifest-load validator (extensions.py) so the two cannot drift.

Warn (instead of silently skipping) when an in-bounds command file exists
but cannot be read/decoded, surfacing misconfigured extensions.

Add unit tests for the shared helper, a read-skip warning test, and make
the in-bounds `..` test create its target file so the skip is attributable
to the `..` rejection rather than file absence.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retrigger CI

Empty commit to re-trigger code scanning / CodeQL analysis on the PR
merge ref.

Assisted-by: GitHub Copilot CLI (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 10:25:29 -05:00
Ali jawwad
f9c6cf83e5 fix(presets): preserve argument-hint in preset SKILL.md generation (#2978)
* fix(presets): preserve argument-hint in preset SKILL.md generation

Preset-provided and extension-override commands that declare
`argument-hint:` in their frontmatter had it dropped from the generated
Claude SKILL.md, and it was re-dropped when a preset was removed and its
overridden skill restored. This is the preset-side analog of the
extension fix in #2903 / #2916.

Factor the argument-hint carry-over into a shared
CommandRegistrar.apply_argument_hint() helper and apply it at the four
preset skill-generation sites (register, reconcile override-restore, and
the core/extension unregister-restore paths). The extension path from

The helper writes argument-hint into the frontmatter dict before
serialization (so a folded multi-line description cannot be split into
invalid YAML) and only for integrations that support it (those exposing
inject_argument_hint -- currently Claude), leaving build_skill_frontmatter's
shared shape unchanged for every other agent. Core templates carry no
argument-hint, so the core-restore path is a no-op. No behavior change for
non-Claude agents or the core path.

Add regression tests covering a folding description (Claude) and the
non-Claude gate (codex).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(presets): address review - guard skill_frontmatter type and tighten apply_argument_hint annotations

Add a symmetric isinstance(skill_frontmatter, dict) guard so the helper stays a safe no-op if a caller passes a non-dict, and annotate the parameters as Dict[str, Any] with an optional integration to match real call-site usage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:52:13 -05:00
Huy Do
f5f76160a3 feat: surface gate detail in the workflow run/resume --json payload (#2965)
* feat: surface gate detail in the workflow run/resume --json payload

A paused run was indistinguishable from any other pause in the
machine-readable outcome, and the gate's prompt/options/choice never
left the human-facing stream. Record each step's type in the run
state's step results (one engine line) and, when the run sits at a
gate, add a gate block (step_id/message/options/choice) to the payload
so orchestrators can drive review gates without parsing stdout.

Reference implementation for the proposal in #2964.

Addresses #2964

* fix(workflow): only surface gate detail in --json when the run is paused

Address review (#2965): _gate_outcome() emitted a gate block whenever current_step_id pointed at a gate step. Since RunState.current_step_id is never cleared on completion, a completed/failed run whose last step was a gate leaked stale gate detail in run/resume/status --json. Guard on status == paused. Also assert CLI success in the _run_json test helper before JSON-parsing, and add direct coverage for the suppression guard.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(workflows): surface gate block on aborted runs; stabilize message

Address Copilot review:
- `_gate_outcome` now also surfaces the gate block when a run is `aborted`
  by a gate rejection (`on_reject: abort`), not only when `paused`. Abort
  is the only path that sets ABORTED and it leaves current_step_id on the
  gate, so an orchestrator can read the recorded `choice` for the stop.
- Coerce `message` to a string (it may be a non-string YAML literal that
  GateStep only coerces for interpolation) so the JSON schema stays stable.
- Tests: add a CLI-level aborted-path test, a message-coercion test, and
  extend the suppression test to allow `aborted`; share the run helper via
  `_invoke_json` to avoid duplicating the invoke boilerplate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(workflows): assert clean exit in gate-abort JSON test

Address Copilot review: the gate-abort test parsed stdout without first
asserting the CLI exited cleanly, so an invoke failure would surface as an
opaque JSON decode error. Route it through `_run_json` (which asserts
exit_code == 0 before parsing) and drop the now-redundant `_invoke_json`
helper — a gate abort emits the payload and returns, so the run exits 0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: use result.output in run-helper assert; document step_data shape

Address Copilot review:
- `_run_json` asserted with `result.stdout` in the message, but under
  `--json` step output is redirected off stdout — the useful diagnostics
  live on `result.output`. Switch the assertion message to `result.output`
  (the JSON parse still reads stdout), matching the other CLI tests.
- `StepContext.steps` documented a 5-key entry shape; the engine now also
  persists `type` and `status`. Update the docstring to the canonical
  7-key shape so step authors/debuggers see the real record.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(workflows): align gate-abort JSON test with aborted→exit-1

After rebasing onto main, a gate abort now emits the --json payload and
then exits non-zero (`_run_outcome_exit_code` maps aborted → 1, from the
merged exit-code work). Give `_run_json` an `expected_exit` parameter
(default 0) so the abort case asserts exit 1 while the paused/completed
cases stay at 0 — keeping a single shared helper rather than duplicating
the invoke boilerplate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): backward-compat gate detection + normalize gate options

Address Copilot review:
- A run paused by an older version has no persisted step `type`, so
  `_gate_outcome` would never surface its gate block on resume. Add
  `_is_gate_step`: prefer the `type` field, but when it is absent fall back
  to the gate's unique output signature (`on_reject`, written only by
  GateStep). A record with a different known `type` is still not a gate.
- Normalize `options` to a list of strings (mirroring the `message`
  coercion) so an unvalidated workflow with non-string options can't
  destabilize the JSON schema.
- Tests: options coercion, type-less gate detection, and a type-less
  non-gate negative case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): normalize non-list gate options to a stable list[str]

Address Copilot review: the prior options normalization only mapped a
`list`, returning the raw value for any other shape (scalar/tuple), which
contradicted the "stable list[str]" intent. Extract `_normalize_gate_options`:
None stays None; list/tuple maps each element through str; any other scalar
becomes a single-element list (a bare string is one option, never iterated
character-by-character). The emitted schema is now always list[str] | None.
Extend the options test to cover list, tuple, bare string, numeric scalar,
and None.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): normalize gate choice to str; portable plain-gate test

Address Copilot review:
- `_gate_outcome` normalized `message` and `options` but passed `choice`
  through as-is; an unvalidated gate can record a non-string `choice`,
  which contradicts the stable-schema rationale. Coerce `choice` to
  `str | None` (None still means "no decision yet"), consistent with the
  other two fields. Adds a focused choice-coercion test.
- The plain (no-gate) test workflow used `run: "true"`, which fails under
  cmd.exe on Windows (ShellStep uses shell=True). Use the cross-platform
  `run: "exit 0"` (matching the exit-code suite's workflows).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-22 07:05:54 -05:00
Manfred Riem
487af97864 feat: add specify bundle command (#3070)
* docs: dogfood Spec Kit — bundler SDD artifacts + constitution

Scaffold Spec Kit (--integration copilot) and run the full SDD workflow
against the `specify bundle` subcommand feature:

- spec.md (4 user stories, 31 FRs, 8 success criteria) + clarifications
- plan.md, research.md, data-model.md, contracts/, quickstart.md
- tasks.md (43 dependency-ordered tasks, organized by user story)
- Spec Kit Constitution v1.0.0 (code quality, testing, UX, performance,
  dependency/security principles) derived from deep codebase analysis
- plan Constitution Check + tasks grounded against the ratified principles

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(bundler): add `specify bundle` subcommand for role-based setups

Implements the Spec Kit Bundler as a `specify bundle ...` subcommand group
that calls existing primitive machinery in-process with zero new dependencies,
per the v1.0.0 constitution (Principles I-V).

Adds the `specify_cli.bundler` package (models, services, lib helpers) and the
`commands/bundle` Typer group wiring search, info, list, install, update,
remove, validate, build, init, and catalog list/add/remove (with --json and
--offline). Includes manifest/catalog schemas, version + integration-clash
gating, discovery-only refusal, idempotent install with atomic rollback,
non-collateral removal, and offline-first catalog resolution.

Ships an 82-test suite (contract/unit/integration), four sample role bundles
(product-manager, business-analyst, security-researcher, developer), README
"Bundles" docs, and an AGENTS.md pitfall on the test-venv gotcha. Marks
tasks T001-T043 complete and records follow-ups T044 (live in-process
primitive dispatch) and T045 (install from a local artifact path).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(contributing): document running the full test suite via project .venv

Add a "Running the full test suite" subsection under Automated checks covering
`uv pip install -e ".[test]"` + `.venv/bin/python -m pytest`, with the
shared/global editable-install contamination caveat that mirrors the AGENTS.md
pitfall.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(bundler): wire real in-process primitive install + local-artifact install

Closes the two follow-ups left after the initial bundler landing.

T044 — DefaultPrimitiveInstaller now performs real installs through existing
machinery instead of raising "use the primitive command" errors:
- presets/extensions install via their reusable managers
  (install_from_directory / install_from_zip); bundled assets install fully
  offline, catalog assets are fetched only when the network is allowed.
- workflows/steps delegate to the existing `workflow add` / `workflow step add`
  command callables in-process (project root as cwd), avoiding any duplicated
  download/validation logic (Principle I).
- `--offline` is threaded through DefaultPrimitiveInstaller(allow_network=…) so
  network-only kinds refuse with an actionable message rather than silently
  reaching out.

T045 — `specify bundle install` now accepts a local path (a built .zip
artifact, a bundle directory, or a bundle.yml) and installs directly without
consulting the catalog stack; bundle-ids still resolve via the stack.

Adds 13 tests (routing, offline gating, local-source resolution, and an
end-to-end offline build → install → list → remove of the bundled
agent-context extension). Bundler suite: 95 passing; ruff clean. Marks T044
and T045 complete in tasks.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(bundler): append Phase 8 convergence tasks from converge assessment

Ran the converge command: assessed the codebase against spec.md, plan.md,
tasks.md, and the v1.0.0 constitution. Appended 7 traceable gap-closure tasks
(T046–T052) as a new "Phase 8: Convergence" section. Append-only — no existing
tasks were modified and no application code was changed.

Findings: 1 CRITICAL (Constitution III — bundle group undocumented under
docs/reference/), 3 HIGH (FR-005/SC-007 validate references; FR-009/SC-002 info
expansion; FR-012 install-time init), 3 MEDIUM (FR-013 integration precedence;
FR-020 surface overlaps; FR-028 update refresh).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Implement Phase 8 convergence tasks (T046–T052)

Close the gaps the converge command found between the bundler spec/plan/
constitution and the code:

- T046: add docs/reference/bundles.md documenting the full `specify bundle`
  command group; link it from docs/reference/overview.md (Constitution III).
- T047: wire a reference checker into `bundle validate` (services/references.py);
  online runs fail and name unresolved component references, offline runs warn.
- T048: expand `bundle info` to enumerate the full component set (versions,
  preset priority/strategy) plus the bundle integration — info == install.
- T049/T050: `bundle install`/`bundle init` now scaffold an uninitialized
  project via the existing `specify init` machinery, choosing the integration by
  precedence (override → bundle-declared → Copilot + OS default script type).
- T051: surface foreseeable component overlaps during info and install.
- T052: `bundle update` refreshes already-installed components via a new
  refresh path in install_bundle, preserving primitive-level overrides.

Adds unit/contract/integration coverage (107 tests pass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* converge: append Phase 9 (T053) — surface bundle trust indicator

Re-run of converge after Phase 8. The seven Phase 8 tasks are verified closed.
One residual partial gap remains: the `verified`/trust indicator (FR-010,
FR-027) is exposed only in `bundle info --json`, absent from `bundle search`
(the primary discovery surface) and `bundle info` text. Appended as a single
new task for implement to complete. Append-only; no code changed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Implement T053 — surface bundle trust indicator in discovery

`bundle search` (text + JSON) and `bundle info` (text + JSON) now expose each
catalog entry's verification/trust level (verified vs community), so users can
judge a bundle's trust before installing, per FR-010 / FR-027. Previously
`verified` was only present in `bundle info --json`.

Adds contract coverage; 108 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: dogfood Spec Kit — bundler SDD artifacts + constitution

Scaffold Spec Kit (--integration copilot) and run the full SDD workflow
against the `specify bundle` subcommand feature:

- spec.md (4 user stories, 31 FRs, 8 success criteria) + clarifications
- plan.md, research.md, data-model.md, contracts/, quickstart.md
- tasks.md (43 dependency-ordered tasks, organized by user story)
- Spec Kit Constitution v1.0.0 (code quality, testing, UX, performance,
  dependency/security principles) derived from deep codebase analysis
- plan Constitution Check + tasks grounded against the ratified principles

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(bundler): add `specify bundle` subcommand for role-based setups

Implements the Spec Kit Bundler as a `specify bundle ...` subcommand group
that calls existing primitive machinery in-process with zero new dependencies,
per the v1.0.0 constitution (Principles I-V).

Adds the `specify_cli.bundler` package (models, services, lib helpers) and the
`commands/bundle` Typer group wiring search, info, list, install, update,
remove, validate, build, init, and catalog list/add/remove (with --json and
--offline). Includes manifest/catalog schemas, version + integration-clash
gating, discovery-only refusal, idempotent install with atomic rollback,
non-collateral removal, and offline-first catalog resolution.

Ships an 82-test suite (contract/unit/integration), four sample role bundles
(product-manager, business-analyst, security-researcher, developer), README
"Bundles" docs, and an AGENTS.md pitfall on the test-venv gotcha. Marks
tasks T001-T043 complete and records follow-ups T044 (live in-process
primitive dispatch) and T045 (install from a local artifact path).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(contributing): document running the full test suite via project .venv

Add a "Running the full test suite" subsection under Automated checks covering
`uv pip install -e ".[test]"` + `.venv/bin/python -m pytest`, with the
shared/global editable-install contamination caveat that mirrors the AGENTS.md
pitfall.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(bundler): wire real in-process primitive install + local-artifact install

Closes the two follow-ups left after the initial bundler landing.

T044 — DefaultPrimitiveInstaller now performs real installs through existing
machinery instead of raising "use the primitive command" errors:
- presets/extensions install via their reusable managers
  (install_from_directory / install_from_zip); bundled assets install fully
  offline, catalog assets are fetched only when the network is allowed.
- workflows/steps delegate to the existing `workflow add` / `workflow step add`
  command callables in-process (project root as cwd), avoiding any duplicated
  download/validation logic (Principle I).
- `--offline` is threaded through DefaultPrimitiveInstaller(allow_network=…) so
  network-only kinds refuse with an actionable message rather than silently
  reaching out.

T045 — `specify bundle install` now accepts a local path (a built .zip
artifact, a bundle directory, or a bundle.yml) and installs directly without
consulting the catalog stack; bundle-ids still resolve via the stack.

Adds 13 tests (routing, offline gating, local-source resolution, and an
end-to-end offline build → install → list → remove of the bundled
agent-context extension). Bundler suite: 95 passing; ruff clean. Marks T044
and T045 complete in tasks.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(bundler): append Phase 8 convergence tasks from converge assessment

Ran the converge command: assessed the codebase against spec.md, plan.md,
tasks.md, and the v1.0.0 constitution. Appended 7 traceable gap-closure tasks
(T046–T052) as a new "Phase 8: Convergence" section. Append-only — no existing
tasks were modified and no application code was changed.

Findings: 1 CRITICAL (Constitution III — bundle group undocumented under
docs/reference/), 3 HIGH (FR-005/SC-007 validate references; FR-009/SC-002 info
expansion; FR-012 install-time init), 3 MEDIUM (FR-013 integration precedence;
FR-020 surface overlaps; FR-028 update refresh).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Implement Phase 8 convergence tasks (T046–T052)

Close the gaps the converge command found between the bundler spec/plan/
constitution and the code:

- T046: add docs/reference/bundles.md documenting the full `specify bundle`
  command group; link it from docs/reference/overview.md (Constitution III).
- T047: wire a reference checker into `bundle validate` (services/references.py);
  online runs fail and name unresolved component references, offline runs warn.
- T048: expand `bundle info` to enumerate the full component set (versions,
  preset priority/strategy) plus the bundle integration — info == install.
- T049/T050: `bundle install`/`bundle init` now scaffold an uninitialized
  project via the existing `specify init` machinery, choosing the integration by
  precedence (override → bundle-declared → Copilot + OS default script type).
- T051: surface foreseeable component overlaps during info and install.
- T052: `bundle update` refreshes already-installed components via a new
  refresh path in install_bundle, preserving primitive-level overrides.

Adds unit/contract/integration coverage (107 tests pass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* converge: append Phase 9 (T053) — surface bundle trust indicator

Re-run of converge after Phase 8. The seven Phase 8 tasks are verified closed.
One residual partial gap remains: the `verified`/trust indicator (FR-010,
FR-027) is exposed only in `bundle info --json`, absent from `bundle search`
(the primary discovery surface) and `bundle info` text. Appended as a single
new task for implement to complete. Append-only; no code changed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Implement T053 — surface bundle trust indicator in discovery

`bundle search` (text + JSON) and `bundle info` (text + JSON) now expose each
catalog entry's verification/trust level (verified vs community), so users can
judge a bundle's trust before installing, per FR-010 / FR-027. Previously
`verified` was only present in `bundle info --json`.

Adds contract coverage; 108 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address PR review — annotations, Windows paths, HTTPS, errors, reproducible builds

Resolves automated review feedback on github/spec-kit#3070:

- validator: drop redundant string-quoting on ReferenceChecker's
  `str | None` return so the annotation evaluates as a real union under
  `from __future__ import annotations`.
- adapters: normalize Windows drive-letter paths (e.g. C:\...) to the
  local-file branch so offline file catalogs resolve on Windows.
- adapters: enforce HTTPS (HTTP only for localhost) and require a host on
  remote catalog URLs before any network call, mirroring
  specify_cli.catalogs URL validation (MITM/downgrade protection).
- adapters: pass `origin` to loads_json for local files and HTTP payloads
  so JSON parse errors name the real source instead of <string>.
- manifest: parse component `priority` defensively, raising an actionable
  BundlerError on non-integer values instead of a raw ValueError.
- packager: write zip members with a fixed timestamp + permissions so
  identical inputs yield byte-for-byte identical artifacts (genuinely
  reproducible builds), and strengthen the determinism test accordingly.

Adds regression tests for priority validation, plain-HTTP/host rejection,
and byte-level artifact reproducibility (111 bundler tests pass; ruff clean).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address PR review round 2 — nested output dir + file:// URLs

- packager: when --output points inside the bundle directory, exclude the
  whole output subtree from collection so previously-built artifacts are
  never re-packaged (prevents broken reproducibility and unbounded growth).
- adapters: resolve file:// catalog URLs via url2pathname and preserve
  netloc, so Windows file URLs (file:///C:/...) and UNC shares
  (file://server/share) resolve correctly instead of dropping the host or
  producing /C:/x.

Adds regression tests for nested-output exclusion and file:// resolution
(113 bundler tests pass; ruff clean).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address PR review round 3 — discovery UX + hardening

- bundle search/info: fall back to the built-in/user catalog stack instead of
  requiring a Spec Kit project, so discovery works in a fresh directory (and
  the README/quickstart examples now match actual behavior). install still
  auto-initializes a project as before.
- packager: traverse with os.walk(followlinks=False) and prune symlinked
  directories before descending, so a symlink-to-dir can no longer pull in
  out-of-tree files (which previously turned "skip symlinks" into a hard
  ensure_within() failure and did extra filesystem work).
- records: parse contributed-component priority defensively, raising an
  actionable BundlerError on a corrupt records file instead of leaking a raw
  ValueError/traceback.
- installer: give install_bundle's manifest parameter an explicit
  BundleManifest | None type for a clearer, safer service API.

Adds regression tests for project-less search/info, symlinked-dir pruning,
and corrupt-priority records (117 bundler tests pass; ruff clean).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address PR review round 4 + markdownlint exclusions

Review fixes:
- bundle info: expand the manifest regardless of install policy so
  discovery-only bundles remain inspectable (only install is refused).
- _download_manifest: handle local .zip download_url by extracting bundle.yml
  (via _local_manifest_source), and add a real remote HTTPS fetch path using
  the shared authenticated, redirect-validated open_url client (HTTPS enforced
  on the initial URL and every redirect; offline still refuses).
- _run_init: thread the --offline flag through to the init callback so
  `bundle install/init --offline` never performs network init.
- conflict.ConflictReport: use field(default_factory=list) and drop the
  None + __post_init__ workaround.
- CatalogSource.from_dict: parse priority defensively, raising an actionable
  BundlerError naming the source + offending value instead of a raw ValueError.

markdownlint:
- Exclude .specify/, .github/, and specs/ (and their subdirectories) from
  markdownlint so the in-flight dogfooding scaffolding doesn't trip the linter.

Adds regression tests for discovery-only info, local-zip download_url, and
non-integer catalog priority (120 bundler tests pass; ruff clean; the PR's own
markdown lints clean).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address PR review round 5 + ignore generated files in whitespace check

Review fixes:
- packager: exclude any prior build artifact for this bundle (matching
  <id>-*.zip), not just the current output path, so older artifacts next to
  bundle.yml are never re-packaged.
- docs(bundles): correct the note — `search` and `info` work without a project
  (they fall back to the built-in/user catalog stack); only list/update/remove/
  catalog require an initialized project.

CI / generated files:
- .gitattributes: mark the generated dogfooding scaffolding (.specify/**, the
  speckit .github agent/prompt files, copilot-instructions.md, specs/**) with
  -whitespace so `git diff --check` (the Lint workflow's whitespace gate) stops
  flagging emitted trailing whitespace. These files are produced by
  `specify init` and are scrubbed before merge.

Adds a regression test for prior-artifact exclusion (121 bundler tests pass;
ruff clean).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): collision-resistant catalog ids, canonical local paths, explicit uninstalled result

Addresses review round 6 (PR #3070):
- catalog_config._derive_id now combines host label with the URL path stem so
  multiple catalogs from the same host get distinct, stable default ids.
- add_source canonicalizes local file paths to absolute before persisting, so
  project config no longer depends on the caller's cwd.
- InstallResult gains a dedicated `uninstalled` list; remove_bundle no longer
  overloads `installed` for removals, and the CLI prints from `uninstalled`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): confine config writes, guard indeterminate integration, fix validate docs

Addresses review round 7 (PR #3070):
- save_records and catalog_config._write now pass within=project_root to
  dump_json/dump_yaml, refusing symlinked .specify paths that escape the
  project (defense-in-depth, matching the rest of the codebase).
- resolve_install_plan now fails when a bundle pins an integration but the
  project's active integration cannot be determined and no explicit
  --integration override was given, instead of silently adopting the bundle's
  required integration (FR-019 guard). CLI passes integration_explicit.
- docs/reference/bundles.md: corrected the validate semantics to describe the
  actual best-effort online behavior (unreachable catalogs warn, not fail).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): Windows path handling + review round 8 hardening

Fix Windows CI failures:
- is_safe_relpath now rejects POSIX-absolute (/abs) and Windows drive-absolute
  (C:\x, UNC) paths on every OS, instead of passing them through on Windows
  where os.path.isabs('/abs') is False and Path('/abs').parts yields '\\'.
- _download_manifest treats a Windows drive-letter download_url (C:\bundle.yml,
  which urlparse reads as scheme 'c') as a local file, fixing the empty
  component set in `bundle info` on Windows.

Address review round 8 (PR #3070):
- Bundled workflows now install under --offline (locate via
  _locate_bundled_workflow) instead of being refused unconditionally.
- bundle update preserves the original installed_at timestamp on refresh
  (import find_record; reuse the existing record's timestamp).
- _derive_id lowercases the host label so 'Example.com' and 'example.com'
  produce the same deterministic id.
- CatalogEntry.from_dict validates 'tags' is a list and 'verified' is a real
  boolean, raising BundlerError on invalid untrusted shapes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): normalize SemVer prerelease spellings before version parsing

Addresses review round 9 (PR #3070): parse_version and is_semver now apply the
same prerelease normalization (mirroring specify_cli._version._normalize_tag)
so SemVer spellings like 1.2.3-rc1 / 1.2.3-alpha1 validate and compare
consistently across is_semver, parse_version, and satisfies. Leading 'v' is
also stripped. Keeps the manifest validator and constraint checks in agreement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): no collateral removal + enforce manifest-pinned versions

Addresses review round 10 (PR #3070):
- install_bundle records only the components this bundle actually contributed:
  freshly-installed components, plus pre-existing ones already owned by this
  bundle (refresh) or a sibling bundle (shared/refcounted). A component that is
  installed on disk but tracked by no bundle was installed independently and is
  no longer attributed, so `bundle remove` won't uninstall it (FR-022).
- preset/extension/workflow install paths now verify the active catalog's
  advertised version matches the manifest-pinned component.version before
  downloading/installing, raising BundlerError on mismatch so bundles stay
  reproducible. When a catalog advertises no version the pin can't be enforced
  and installation proceeds.

Added regression tests: independent pre-existing component survives removal;
version-mismatch refusal (helper + workflow path).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root (#2892)

* feat(scripts): add SPECIFY_INIT_DIR to target a member project from the repo root

Resolve an explicit SPECIFY_INIT_DIR project override once in the core
get_repo_root / Get-RepoRoot, so a non-interactive / CI caller can target a
member project (the directory containing .specify/) from a monorepo root
without cd. Strict by design: the path must exist and contain .specify/,
otherwise it hard-errors with no silent fallback.

- Single resolver in core; the git feature-branch script inherits it by
  sourcing core, with no per-extension copies.
- PS resolver verifies the resolved path is a directory (Resolve-Path also
  succeeds for files) so a file value errors as "not an existing directory".
- get_feature_paths splits decl/assignment so a SPECIFY_INIT_DIR failure
  propagates instead of being masked by `local`.
- create-new-feature-branch: when core is absent (only git-common loaded) and
  SPECIFY_INIT_DIR is set, hard-error rather than silently using the git root.
- Document SPECIFY_INIT_DIR and SPECIFY_FEATURE_DIRECTORY in the core reference.
- Tests for valid/relative/trailing-slash/file/missing/no-.specify targets,
  feature-axis composition, the no-core guard, and a PowerShell mirror.

* fix: guard SPECIFY_INIT_DIR with stale core scripts

* docs: clarify SPECIFY_FEATURE_DIRECTORY precedence wording

* fix: normalize trailing slash in PowerShell SPECIFY_INIT_DIR resolver

Resolve-Path preserves a trailing separator from its input, so a
SPECIFY_INIT_DIR ending in a slash returned a root that didn't match the
bash resolver (whose `cd && pwd` strips it). That broke
test_ps_trailing_slash_tolerated on the CI runners, which do have pwsh.
Trim it with TrimEndingDirectorySeparator (no-op on a bare root or a path
with no trailing separator).

Also fix the misleading test comment: the PowerShell mirror runs on the
CI ubuntu/windows runners (they ship pwsh), it is not skipped there.

* test: normalize bash path expectations on Windows

* docs: clarify SPECIFY_INIT_DIR root helpers

* chore: sync dogfooded .specify core scripts with SPECIFY_INIT_DIR

Mirror the SPECIFY_INIT_DIR resolver (resolve_specify_init_dir in
common.sh) into the committed dogfooding .specify/scripts/bash copies so
the git extension's create-new-feature-branch.sh finds an up-to-date
common.sh instead of failing with "requires updated Spec Kit core
scripts". Fixes the test_init_dir.py CI failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): harden remote catalog fetch and config parsing

- adapters: route catalog HTTP fetches through the shared authenticated
  client (authentication.http.open_url) so auth.json tokens apply and the
  Authorization header is stripped on cross-host/downgrade redirects.
  Reject any redirect that leaves HTTPS via a redirect_validator and
  re-validate the final URL after redirects, closing the urlopen
  auto-redirect MITM/downgrade gap.
- catalog_config._read: raise an actionable BundlerError when the config
  top level is not a mapping, 'catalogs' is not a list, or an entry is
  not a mapping, instead of letting list(<str>) produce a downstream
  AttributeError.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): tighten record read confinement, policy gate, and precedence

Addresses review 4534504799:

- records.load_records: confine the read via ensure_within(project_root,
  ...) so a symlinked/traversal-escaping .specify cannot read arbitrary
  files outside the project (matches the write path's within= guard).
- catalog_config._slug: lowercase so derived catalog ids are
  deterministic across platforms and case-variant duplicates can't slip
  past the case-sensitive dup check.
- installer.install_bundle: reword the docstring's misleading "atomic on
  failure" claim to describe the real scoped guarantee (record written
  only on full success; rollback limited to newly-installed components).
- bundle update: enforce the source install_policy like install, refusing
  to update from a discovery-only source (FR-025).
- catalog source precedence: the CLI now passes ~/.specify as the user
  config dir so project > user > built-in precedence is actually
  reachable (previously the user scope was silently ignored).
- .gitattributes: scope the specs whitespace exemption to the generated
  dogfooding feature dir (specs/001-spec-kit-bundler/**) instead of all
  of specs/**.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): no collateral refresh, catalog id integrity, loud info

Addresses review 4534571362:

- installer: in refresh mode (bundle update) only re-apply already-
  installed components that this bundle (or a sibling) owns. Components
  installed independently and tracked by no bundle are now skipped, never
  refreshed, so update cannot make collateral changes (FR-022).
- catalog.load_catalog_payload: validate each entry's own id is present
  and matches its enclosing bundles key, rejecting catalogs that would
  otherwise list a spoofed or unresolvable id.
- bundle info: stop swallowing manifest download failures. If the
  manifest can't be resolved (e.g. --offline against an https download_url
  or a download failure), surface the error and exit non-zero instead of
  silently degrading to catalog `provides` counts, preserving the "info
  == what install applies" guarantee.

Added regressions: refresh leaves independently-installed components
untouched, catalog id key/field mismatch + missing id rejection, and
info exits non-zero when the manifest is unresolvable offline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): confine catalog-config and integration-marker reads

Addresses review 4534716790: two more state reads bypassed the
symlink/path-escape confinement that records and the write paths already
enforce.

- catalog_config._read: validate the config path with
  ensure_within(project_root, ...) before exists()/read, so a symlinked
  .specify resolving outside project_root is rejected instead of read.
- lib.project.active_integration: confine the .specify/integration.json
  read the same way; an out-of-tree escape is treated as "not
  determinable" (returns None) rather than followed.

Added regressions covering both via a symlinked .specify pointing
outside the project root.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): validate manifest tags, disambiguate derived ids by full host

Addresses review 4534768419:

- manifest.from_dict: reject a non-list `tags` (e.g. a bare string) instead
  of splitting it character-by-character, matching the catalog parser and
  the schema contract (tags = list of strings).
- catalog_config._derive_id: derive ids from the full host (TLD included)
  so example.com and example.net no longer collide on the same id. Updated
  the affected id assertions.
- CHANGELOG: call out the new `specify bundle` command group in the
  unreleased section (the PR's headline user-facing feature).
- .gitattributes: clarify the specs whitespace exemption — the dogfooding
  feature dir is scrubbed before merge (not retained), so it doesn't weaken
  checks for kept docs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(gitattributes): retain whitespace exemption for constitution.md

The project constitution (.specify/memory/constitution.md) is the one
dogfooding artifact carried forward past the pre-merge scrub. Give it its
own standalone whitespace exemption so it survives removal of the broader
.specify/** generated-scaffolding exemption.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): accurate uninstall count, confine catalog read, safe bundle id

Addresses review 4534812056:

- installer.remove_bundle: only count a component as uninstalled when
  installer.remove() actually ran; components already absent on disk are
  reported as skipped, keeping the uninstalled count accurate.
- catalog.load_source_stack: confine the project-scoped .specify config read
  with ensure_within, so a symlinked .specify/ resolving outside the project
  root is refused (consistent with the bundler's other guarded reads).
- manifest: enforce a filesystem-safe slug for bundle.id in structural
  validation; packager.build_bundle adds an ensure_within defense-in-depth
  check so a crafted id can never push the artifact outside the output dir.

Also reverts the CHANGELOG entry (the changelog is updated separately).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): validate requires/provides shapes in manifest and catalog

Addresses review 4534855443:

- manifest: validate requires.tools and requires.mcp as list-of-strings via
  a shared _parse_str_list helper (also reused for tags), so a bare string
  like `tools: docker` is rejected with an actionable BundlerError instead of
  being split character-by-character.
- catalog.CatalogEntry.from_dict: validate that `requires` and `provides` are
  mappings before accessing them, so an untrusted catalog payload with
  `requires: "..."` raises a named BundlerError rather than escaping as a raw
  AttributeError traceback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): require README.md when building a bundle artifact

Addresses review 4534938014: build_bundle now fails early with an
actionable error when README.md is missing, matching the documented
artifact contract (manifest + README) instead of silently producing a
bundle with no human-facing description.

Also reverts CHANGELOG.md to the upstream/main copy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): validate record shapes; drop stale install --refresh claim

Addresses review 4534969692:

- records.InstalledBundleRecord.from_dict: hard-error when
  contributed_components is not a list, instead of iterating a corrupt
  bare string character-by-character.
- records.load_records: validate the top-level 'bundles' field is a list and
  fail with a clear BundlerError when a corrupt file makes it a mapping/string.
- PR description: remove the inaccurate "supports --refresh" note from
  `bundle install` (refresh is the `bundle update` path); docs already omit it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): refuse symlinked .specify, reject bad url schemes, IPv6 ids

Addresses review 4534997724:

- lib.project.find_project_root: a symlinked .specify is no longer accepted
  as a project root (is_dir() follows symlinks), matching the confinement the
  rest of the CLI applies and avoiding confusing downstream failures.
- catalog_config.add_source: reject unsupported url schemes (ssh://, ftp://,
  ...) up front instead of silently treating them as local paths; local paths
  containing ':' but not '://' are still allowed.
- catalog_config._derive_id: derive the host via urlparse().hostname so IPv6
  literals, credentials, and ports no longer corrupt the derived id.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): strict semver, narrow artifact skip, preserve priority 0

Addresses review 4535084048:

- versioning.is_semver: enforce a full MAJOR.MINOR.PATCH SemVer (with optional
  pre-release/build) via a dedicated regex, instead of accepting any
  packaging.version.Version-parseable string (e.g. "1", "1.0"). This makes
  BundleManifest.structural_errors() reject non-semver versions.
- packager: narrow the prior-artifact skip pattern to semver-named zips
  (<id>-<x.y.z>.zip) so legitimate assets like <id>-assets.zip are still
  packaged.
- primitives (preset + extension install): use an explicit `is None` check so
  an intentional priority of 0 is preserved instead of being replaced by the
  default.

Adds regressions: non-semver rejection ("1"/"1.0"/"1.2.3.4"), asset-not-
excluded vs semver-artifact-excluded, and priority-0 pass-through.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): artifact regex for prerelease+build; clarify integration/priority docs

Addresses review 4535132279:

- packager: the prior-artifact skip regex now matches semver names carrying
  both a prerelease and build-metadata segment (e.g. 1.0.0-rc1+build5), so such
  an existing artifact is excluded rather than re-packaged — keeping builds
  bounded/deterministic, consistent with is_semver().
- docs/reference/bundles.md: correct the install integration wording.
  --integration selects the integration when initializing a new project and
  confirms the target when a pinned bundle's active integration can't be
  determined; it does NOT override a bundle that targets a specific integration
  (a mismatch aborts with no changes).
- examples/security-researcher README: reword the preset priority note in terms
  of the numeric comparison (ascending priority order) to avoid inverting the
  meaning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): --integration can't bypass clash guard; honest rollback docs

Addresses review 4535159341:

- bundle install: for an already-initialized project, the project's recorded
  active integration is now authoritative. --integration no longer overrides it
  (which let a copilot project install a claude-pinned bundle via
  `--integration claude`, bypassing the FR-019 clash guard). The override still
  selects the integration at init time and confirms the target only when the
  active integration cannot be determined.
- docs/reference/bundles.md: reword the install guarantee to match the
  implementation — no provenance record is written unless the install fully
  succeeds, and rollback of this run's components is best-effort (removal errors
  are swallowed, so partial on-disk state may remain). Dropped the inaccurate
  "atomic / rolls back everything" claim.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): validate component kind/id when loading records

Addresses review 4535194606: _component_from_dict now rejects a contributed
component whose 'kind' is not a supported component kind or whose 'id' is
empty, raising a BundlerError that explicitly flags the records file as
corrupt. Previously such a record loaded successfully and only failed later
(e.g. in primitive_manager() during bundle remove/update) with a less
actionable error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): address review 4535234003 (7 findings)

- versioning: tolerate an uppercase `V` prefix in `_normalize_semver` and
  `is_semver`, mirroring specify_cli._version tag normalization (V -> v) so
  `V1.2.3` parses and validates consistently.
- validator: import BundlerError and narrow the speckit_version constraint
  except clause to `BundlerError` only, so programming errors are no longer
  masked behind an "invalid constraint" message.
- bundle update: accept `--integration` and thread it through
  resolve_install_plan the same way `bundle install` does (override used only
  when the active integration can't be auto-detected), so integration-pinned
  bundles can be updated where `.specify/integration.json` is missing/unreadable.
- bundle validate: fold reference warnings into `report.warnings` so the
  ValidationReport is the single warning channel at the CLI layer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(bundler): make update --integration help assertion ANSI-safe

Rich can split the "--integration" option label with ANSI escape codes
between the two leading dashes, so the literal substring check failed under
CI's terminal settings. Match the un-split option word instead, mirroring how
test_bundle_help_lists_all_commands checks bare command names.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): preserve exec bits in artifacts; document install-time pins

Addresses review 4535280786:

- packager.build_bundle: no longer forces every ZIP member to 0644, which
  stripped the executable bit from bundled scripts (e.g. extension hook
  scripts) and could break them after extraction. Permissions are now
  normalized reproducibly to 0755 when the source file has any execute bit
  set, otherwise 0644 — identical inputs still yield byte-for-byte identical
  artifacts.
- installer.install_bundle + docs/reference/bundles.md: document that version
  pins are enforced install-time only. Because primitive is_installed checks
  are id-based (not version-aware), an already-present component is skipped
  during install without comparing its on-disk version to the manifest pin;
  pins are guaranteed applied only on a real install or `bundle update` refresh.

Added a regression asserting executable sources map to 0755 and plain files to
0644 in the built artifact.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(bundler): skip exec-bit packager test on Windows

Windows filesystems do not carry Unix execute bits, so chmod(0o755) is a no-op
and the source file reports no execute bit — the packager then correctly stores
the member as 0644. The assertion that an executable source maps to 0755 is only
meaningful on POSIX, so skip it on nt rather than asserting platform-specific
behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): normalize prerelease spellings inside version constraints

Addresses review 4535327154: parse_version() normalized SemVer prerelease
spellings (e.g. 1.2.3-rc1 -> 1.2.3rc1) but parse_constraint() passed the
constraint to packaging.SpecifierSet unmodified, so ">=1.2.3-rc1" raised
InvalidSpecifier even though the same spelling is accepted for installed
versions. parse_constraint() now normalizes the version portion of each
comma-separated clause via the shared _normalize_semver helper, so prerelease
handling is consistent across versions and constraints.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(bundler): validate schema versions and required record identity fields

Addresses review 4535351596:

- records.load_records: validate the on-disk 'schema_version' (required;
  forward-compatible across same-major minor bumps) and fail fast with an
  actionable error on a missing/unknown version, rather than silently parsing a
  possibly-incompatible format and risking incorrect bundle attribution/removal.
- records.InstalledBundleRecord.from_dict: treat missing 'bundle_id' or
  'version' as corruption and raise BundlerError, instead of coercing them to
  empty strings that let later list/remove/update operations behave
  unpredictably.
- catalog_config._read: validate 'schema_version' when present (same-major
  compatibility) and fail fast on an unsupported version so an incompatible
  future config shape can't be mis-parsed into a wrong effective catalog stack.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(bundler): scrub generated dogfooding scaffold before merge

The bundler feature was developed by dogfooding Spec Kit on itself. Now that
the work is complete, remove all generated scaffolding so it does not land in
the repository on merge:

- specs/001-spec-kit-bundler/** (spec, plan, research, data-model, contracts,
  quickstart, tasks, checklists)
- .specify/** (extensions, integrations, scripts, templates, workflows,
  feature/init/integration metadata)
- .github/agents/speckit.*.agent.md, .github/prompts/speckit.*.prompt.md, and
  .github/copilot-instructions.md (Copilot integration scaffold)

Retained: .specify/memory/constitution.md — the single dogfooding artifact
carried forward — with its whitespace exemption in .gitattributes.

.gitattributes and .markdownlint-cli2.jsonc are reverted to the upstream
baseline (plus the constitution whitespace exemption), dropping the now-moot
exemptions for the removed scaffold.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Pascal THUET <pascal.thuet@arte.tv>
2026-06-19 17:07:20 -05:00
Pascal THUET
d9370d909d fix: isolate per-extension failures so one bad extension can't drop the rest (#2951)
* fix: isolate per-extension failures in register_enabled_extensions_for_agent

The per-extension loop had no error isolation: if registering one enabled
extension raised (e.g. an OSError writing a command file), the loop aborted and
the exception propagated, so every subsequent enabled extension was silently
skipped. Callers wrap the whole call in a single best-effort try/except, so the
wholesale abort surfaced as one warning while the command still exited 0 —
leaving the agent with only a prefix of its extensions.

Wrap the per-extension body in try/except: warn (naming the extension) and
continue, so one bad extension can no longer drop the others. Add a regression
test that forces the first-iterated extension to raise and asserts the rest
still register.

Closes #2950

* fix(extensions): preserve command registry when skills fail

* fix: clarify skill registration warning
2026-06-19 12:41:02 -05:00
Ed Harrod
98ee02a98b feat(claude): run /analyze in a forked subagent (#2511)
* claude: run /analyze in a forked subagent

/analyze is explicitly read-only and produces a compact analysis
report from heavy artefact reads (spec.md, plan.md, tasks.md). It
matches the canonical use case for context: fork — bulk inputs that
collapse to a short summary, no need for conversation history.

Forking keeps the artefact contents out of the main conversation
context, which is the concern raised in #752.

Done as a per-command opt-in via FORK_CONTEXT_COMMANDS so other
spec-kit commands (which are interactive or have side effects) are
unaffected.

Refs #752

* claude: apply per-command frontmatter on every skill-generation path

argument-hint and fork context were injected only in setup(), so skills
produced via post_process_skill_content() directly (presets, extensions)
lost them - e.g. a preset overriding speckit-analyze dropped context: fork.

Move the per-command injection into post_process_skill_content(), deriving
the command stem from the frontmatter name, so all generation paths stay
consistent. setup() now just calls post_process_skill_content().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* claude: drop redundant post-process loop from setup

SkillsIntegration.setup() already runs post_process_skill_content()
on every SKILL.md before writing it, and that method now applies the
argument-hint and fork-context injection. The per-file re-process loop
in ClaudeIntegration.setup() was therefore a no-op, so inherit the
base setup() directly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 10:28:45 -05:00
Ben Buttigieg
0c29d890ab feat: add /speckit.converge command (#3001)
* Add /speckit.converge SDD artifacts and project scaffolding

Dogfood the converge feature through Spec Kit's own workflow:

- spec.md, plan.md, tasks.md, research, data-model, contracts, quickstart
- requirements checklist for the feature
- ratified constitution v1.0.0 (.specify/memory)
- Specify project scaffolding (.specify/, .github agent + prompt files)

Defines a built-in /speckit.converge command that assesses spec/plan/tasks
against the codebase and appends remaining work as new tasks (no git, no
change tracking, append-only). Implementation not yet started.

Excludes unrelated working-tree changes to agents.py, extensions.py,
test_extensions.py, catalog.community.json, and README.md.

* Implement /speckit.converge command

Add the built-in converge command that assesses the codebase against a
feature's spec.md, plan.md, and tasks.md and appends remaining unbuilt work
as new traceable tasks to tasks.md (append-only; no git, no change tracking).

- templates/commands/converge.md: full command body (load artifacts, assess
  code, classify findings missing/partial/contradicts/unrequested, append
  '## Phase N — Convergence' tasks with source-ref + gap-type, read-only
  guardrails, converged branch, handoff, before/after_converge hooks)
- Register converge as a core command across all enumeration sites
  (SKILL_DESCRIPTIONS, _FALLBACK_CORE_COMMAND_NAMES, ARGUMENT_HINTS, and the
  integration test command lists incl. copilot/generic file inventories)
- init.py Next Steps panel + README Core Commands table
- tasks.md: T001-T024 complete (T025 manual quickstart pending)

Full suite green: 2343 passed.

* Record quickstart validation results for /speckit.converge (T025)

All six quickstart scenarios validated (GitHub Copilot agent, macOS/zsh):
S1 gap->appended traceable task, S2 implement+re-converge, S3 converged leaves
tasks.md unchanged, S4 read-only boundaries, S5 missing-prereq stop, S6 cross-
integration install (copilot + windsurf). Automated suite: 2343 passed.

* Record 2026-06-16 re-verification results for /speckit.converge (T025)

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Fix integration upgrade deleting settings.json and dropping script +x

Two upgrade-path bugs surfaced during converge E2E validation:

- copilot upgrade stale-deleted .vscode/settings.json because setup() only tracks the file when it creates it; on upgrade the pre-existing file is merged and left untracked, so Phase 2 stale cleanup removed it. Add an integration-level stale_cleanup_exclusions() hook (CopilotIntegration returns {.vscode/settings.json}) and subtract it from stale_keys.

- shared .specify/scripts/*.sh lost their execute bit because the managed refresh rewrites them with the bundled source mode (often 0o644) and nothing restored perms. Call ensure_executable_scripts() after the managed-refresh block (POSIX only).

Add regression tests in TestIntegrationUpgrade covering both fixes (validated to fail without the fixes).

* fix: resolve markdownlint errors in PR files

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: clean up runtime state files from PR

Remove .specify state files that are per-project runtime artifacts:
- feature.json, init-options.json, integration.json
- manifest files, extension registry, bug artifacts

These are generated by 'specify init' and should not be committed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: fold converge artifacts from #3003 and #3005

- Add speckit.converge Copilot agent and prompt files (#3003)
- Add regression test for Claude argument hints (#3005)
- Remove invalid converge entry from Claude argument hints
- Fix documentation removing branch-prefix fallback claims

Supersedes: #3003, #3005

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove non-converge specify scaffolding from PR

Remove .specify/ artifacts, non-converge .github/agents and prompts,
and copilot-instructions.md that were generated by 'specify init'
and are not part of the converge command feature.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove SDD spec artifacts from PR

Remove specs/001-converge-command/ — the spec/plan/tasks/research SDD
artifacts produced while building this feature. spec-kit does not track
a specs/ directory on main (those are outputs of running the workflow on
the repo, not part of the shipped tool).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: remove generated Copilot converge command files

Remove .github/agents/speckit.converge.agent.md and
.github/prompts/speckit.converge.prompt.md — these are generated by
'specify init --integration copilot' from templates/commands/converge.md
(all __SPECKIT_COMMAND_*__/{SCRIPT} tokens are resolved). main tracks no
.github/agents or .github/prompts files; the template is the source of truth.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: split out unrelated integration-upgrade fix

Move the stale_cleanup_exclusions / executable-bit upgrade fix
(base.py, copilot, _migrate_commands.py, test_integration_subcommand.py)
out of this PR into its own change. This PR is now scoped purely to the
/speckit.converge command.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add converge to core command template ordering

converge is a core command in SKILL_DESCRIPTIONS but was missing from
_CORE_COMMAND_TEMPLATE_ORDER, so it sorted with the fallback rank. Add it
after 'implement' to keep core-command ordering consistent across
integrations.

Addresses review feedback on #3001.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: make converge findings example neutral

Replace the self-referential sample evidence text in the Convergence
Findings table with a neutral placeholder so agents are less likely to copy
nonsensical template-specific findings into real output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* docs: clarify converge scope and hook outcome wording

- Remove FR-specific parenthetical from code-scope rule so it doesn't imply
  a hard FR-001 reference exists in every feature
- Replace unsupported 'pass outcome to hook context' instruction with explicit
  in-session outcome reporting before hook listing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: align converge task example with tasks format

Use  (no colon) in the convergence task example so it
matches tasks-template formatting and downstream expectations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarification of usage

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* docs: align converge phase/task-id format with tasks template

- Use  (colon) for consistency with tasks template
- Clarify appended task IDs must be zero-padded ( style)
- Update checklist example to a concrete zero-padded ID ()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: standardize converge phase heading format

Use  consistently in converge.md (including the
append-only contract section) to match Step 7 and tasks template style.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-17 14:47:00 -05:00
Ben Buttigieg
84db931f18 fix: preserve .vscode/settings.json and script +x bit on integration upgrade (#3020)
* fix: preserve .vscode/settings.json and script +x bit on integration upgrade

During 'specify integration upgrade', Phase 2 stale-cleanup removes files
present in the old manifest but absent from the new one. Copilot's setup()
merges into an existing .vscode/settings.json and stops tracking it, so the
file was being deleted on upgrade (destroying user settings). Add a
stale_cleanup_exclusions() hook that integrations use to protect such
conditionally-tracked merge targets. Also restore the executable bit on
shared .sh scripts after the managed-refresh step on POSIX.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address review on stale-cleanup fix

- Normalize stale_cleanup_exclusions() to POSIX before subtracting from
  manifest keys, so exclusions built with os.path.join / backslashes still
  match on Windows.
- Strengthen test_upgrade_preserves_existing_vscode_settings to add a
  user-defined key and assert it survives the upgrade (via --force, exercising
  the merge + stale-cleanup path) instead of the brittle after == before check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-17 14:22:04 -05:00
Huy Do
affbf5ead5 feat(workflows): add from_json expression filter (#2961)
* feat(workflows): add from_json expression filter

Step outputs captured as strings could never become typed values in
templates - the filter set was default/join/map/contains only, so e.g.
a fan-out items: could never consume a step's JSON stdout. Add an
arg-less from_json pipe filter with parse-or-raise semantics: invalid
JSON or non-string input raises a clear ValueError rather than passing
through silently.

Fixes #2960

* fix(expressions): make from_json strict — reject any arguments

Address review (#2961): from_json('x') and from_json() previously fell through to a silent passthrough of the unparsed value. Reject any parenthesized form with a clear error so mis-wired templates fail loudly. Rename test to ...parses_object (JSON under test is an object) and add coverage for the strict no-arguments behavior.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(workflows): document the from_json expression filter

Address Copilot review: the user-facing filter references omitted the
newly added `from_json` filter. Add it to the ARCHITECTURE.md filter table
(with the `{{ steps.emit.output.stdout | from_json }}` example) and to the
filter enumerations in workflows/README.md and docs/reference/workflows.md
so the docs match the evaluator's capabilities.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(workflows): make from_json strictness reject trailing tokens; fix docstring

Address Copilot review:
- Strictness only rejected parenthesized forms, so typos like
  `| from_json)` or `| from_json extra` still fell through to the
  unknown-filter path and silently returned the unparsed value. Match on
  the leading filter token and require the whole filter to be exactly
  `from_json`, so every mis-wired form raises. Extend the rejection test to
  cover the trailing-token cases.
- The module docstring claimed "no imports", which is misleading now that
  the module imports `json`. Reword to state the actual sandbox guarantee:
  templates cannot do file I/O, import modules, or run arbitrary code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-17 13:43:26 -05:00