39 Commits

Author SHA1 Message Date
Dyan Galih
6fb7e77b3e fix: allow prerelease spec-kit versions in compatibility checks (#2695)
* docs: generate integrations reference from catalog

* refactor: integrate table rendering into specify integration search --markdown

- Remove standalone scripts/generate_integrations_reference.py
- Strip doc injection machinery from catalog_docs.py; keep only table rendering
- Wire render_integrations_table() into existing --markdown flag of integration search
- Remove old simple markdown table block from integration_search (was Name|ID|Version|Description|Author)
- Simplify tests: drop subprocess/doc-path tests, keep table rendering and metadata tests
- Clean up docs/reference/integrations.md: remove generated markers, update note

* fix: address Copilot review feedback on catalog_docs and integration_search

- Warn when --markdown is combined with filters (query/--tag/--author) which are
  silently ignored; catch ValueError/FileNotFoundError and surface clean error
  via console instead of raw traceback (r3244821516)
- Add coverage enforcement in list_integrations_for_docs(): raises ValueError
  with actionable message if any registry key is missing from INTEGRATION_DOC_URLS,
  preventing silently incomplete doc tables (r3244821589)
- Rename test to accurately reflect sources: label derives from registry config,
  URL comes from INTEGRATION_DOC_URLS doc map — not solely from registry (r3244821607)
- Simplify test dict construction to idiomatic dict comprehension (r3244821619)

* fix: add sync test, INTEGRATIONS_REFERENCE_PATH constant, and fix naming

* revert: restore docs/reference/integrations.md to upstream/main; remove sync test (GH Actions job will handle)

* fix: remove dead INTEGRATIONS_REFERENCE_PATH, drop URL-length padding, fix docstring, drop FileNotFoundError

* fix: send --markdown warnings/errors to stderr, rename test for clarity

* fix: detect stale doc-map keys, test _render_cell escaping, strengthen header assertion

* refactor: promote _render_cell to public render_cell function

* test: mock registry and doc maps to avoid brittle live registry coupling

* refactor: flatten patches, remove unused imports, fix trailing whitespace, optimize missing calculation

* refactor: make validation non-fatal, fix context manager syntax, add CLI tests

* fix: improve docstring clarity, test robustness, and exception handling

* fix: improve test assertions, disable warnings by default, enhance exception handling

* fix: make CLI tests deterministic and improve config access resilience

* fix: remove extra blank line, add stale keys validation, add regression test for docs sync

* Fix 5 remaining feedback items:
- Rename _get_mocked_cli_runner() to _get_catalog_docs_patches() for clarity
- Use ExitStack context manager for guaranteed patch cleanup
- Add explicit UTF-8 encoding to file reads
- Skip doc sync test gracefully when docs aren't present
- Remove exception chaining from typer.Exit to avoid noisy tracebacks

* address all outstanding copilot review feedback on PR 2563

* Address Copilot feedback: escape URLs in markdown links, deduplicate cell rendering, fix table parser for escaped pipes

* Address 3 new Copilot feedback: add URL escaping test, fix parse_first_markdown_table for escaped pipes, guard community tests with skip

* Address 3 new Copilot feedback: escape id field, remove unused alias, escape integration URLs

* Address 3 new Copilot feedback: fix comment name, include all integrations in list

* Fix architectural issue: escape raw fields before composing Markdown to prevent double-escaping

* Deduplicate _escape_url_for_markdown_link and add URL escaping test

* Address 4 new Copilot feedback: add trailing newline, fix test helper ExitStack, update warning message

* Address 4 new Copilot feedback: make escape function public, fix error message, validate test rows, prevent double newline

* Update error message in test_missing_catalog_file for clarity

* Remove obsolete integrations sync test

* keep integrations docs in sync

* fix: allow prerelease spec-kit versions in compatibility checks

Allow prerelease/dev builds to satisfy extension and preset compatibility
checks when their version number falls within the required specifier range.
Also harden the integrations docs rendering helpers and add regression
coverage for the markdown table parsing and version gating paths.

Tests: pytest -q; python3 -m compileall -q .; black/flake8 unavailable
Reference: branch 002-generate-integrations-docs; source patch /tmp/spec-kit-changes.patch

* fix: isolate prerelease compatibility gate changes

Keep the prerelease/version compatibility fix on its own branch and remove
the unrelated integrations docs updates that belong with PR 2563.

Tests: full suite passed on the prerelease branch before splitting; docs branch covered by targeted docs tests
Reference: upstream/main; source patch /tmp/spec-kit-changes.patch

* Address PR 2695 feedback: Centralize prerelease policy and add boundary test

* Address remaining Copilot PR feedback: revert docs and add preset prerelease tests

* Remove unreachable raise CompatibilityError

* Fix PEP8 E302 and E303 formatting issues
2026-06-30 09:41:57 -05:00
Noor ul ain
cb7c36c95b fix: reject host-less catalog URLs in base and preset validators (#3209) (#3227)
`CatalogStackBase._validate_catalog_url` (inherited by `IntegrationCatalog`)
and `PresetCatalog._validate_catalog_url` checked `parsed.netloc`, which is
truthy for host-less URLs like `https://:8080` (port only) or `https://user@`
(userinfo only). Such URLs slipped past validation despite the error message
promising "a valid URL with a host", then failed later with a confusing fetch
error.

Switch both validators to `parsed.hostname` (None for those inputs), matching
the workflow, step, and bundler catalog validators that already do this.

Add regression tests covering port-only and userinfo-only URLs for both
validators.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 07:18:39 -05:00
Quratulain-bilal
5bdcb4ad14 fix(catalogs): reject host-less catalog URLs in base and preset validators (#3210)
the shared CatalogStackBase validator and PresetCatalog validator
checked parsed.netloc to enforce 'a valid URL with a host'. but netloc
is truthy for host-less URLs like https://:8080 or https://user@, so
those slipped through even though they have no host - contradicting the
error message. the workflow, step, and bundler validators already check
parsed.hostname (which is None in those cases); this aligns the two
stragglers with that. add regression tests covering port-only and
userinfo-only URLs.
2026-06-29 10:29:14 -05:00
Si Zengyu
1add20341d fix(extensions,presets,workflows): resolve private GHES release assets via /api/v3 (#3157)
* feat(auth): add github_provider_hosts() to enumerate GHES hosts from auth.json

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(extensions): resolve GHES release assets via /api/v3

Generalizes resolve_github_release_asset_api_url to GitHub Enterprise
Server hosts (gated by auth.json github hosts), fixing private GHES
extension/preset downloads. github/spec-kit#3147

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(extensions,presets): pass auth.json github hosts into release resolver

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* docs(auth): document GHES private catalog + release-asset auth

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* fix(presets,workflows): pass auth.json github hosts into remaining release resolvers

Wires preset add --from and workflow add through github_provider_hosts()
so private GHES release assets resolve via /api/v3 there too. github/spec-kit#3147

Assisted-by: Claude Code (model: claude-sonnet-4-6, autonomous)

* test(presets): use module-level io.BytesIO in GHES preset test

Addresses Copilot review on PR #3157: drop unnecessary __import__("io")
in test_preset_add_from_ghes_release_url_resolves_via_api_v3 since io is
already imported at module level.

* fix(github-http): pass through GHES asset API URLs by path shape

Addresses Copilot review on PR #3157. A direct GHES /api/v3 release asset
URL was only returned as already-resolved when its host was in the
allowlist; otherwise the resolver returned None and the caller downloaded
the same URL without 'Accept: application/octet-stream', fetching JSON
metadata instead of the binary.

Gate the passthrough on path shape alone, mirroring the github.com case.
This is safe: passthrough returns the input URL unchanged and the caller
fetches it either way, so no new request to an arbitrary host is induced;
the token stays independently gated by auth.json in open_url. The
allowlist remains the anti-SSRF gate on the tag-lookup resolving path.

Add test_passthrough_for_unlisted_ghes_api_asset_url.
2026-06-25 10:44:30 -05:00
meymchen
dc840f07d0 feat(integration): update Kimi integration for Kimi Code CLI (#2979)
* feat(integration): update Kimi integration for Kimi Code CLI

Update the Kimi integration to target the new Kimi Code CLI
(MoonshotAI/kimi-code) layout:

- Change skills directory from .kimi/skills/ to .kimi-code/skills/
- Change context file from KIMI.md to AGENTS.md
- Extend --migrate-legacy to move old .kimi/skills/ installs and
  migrate KIMI.md user content to AGENTS.md
- Clean up leftover legacy .kimi/skills/ directories on teardown
- Update devcontainer installer to @moonshot-ai/kimi-code
- Update docs and tests

Relates to #1532

* fix(integration): align Kimi dispatch and harden legacy migration

- Override build_command_invocation to emit /skill:speckit-<stem>
  so dispatched commands match Kimi Code CLI's native slash syntax.
- Skip symlinked .kimi/skills directories during legacy migration
  and teardown to avoid operating on files outside the project.
- Remove kimi from the multi-install-safe integrations table.
- Add tests for command invocation and symlink safety.

* fix(integration): resolve custom context markers in Kimi legacy migration

Use IntegrationBase._resolve_context_markers() when migrating legacy
KIMI.md content so that projects with customized context_markers in
.specify/extensions/agent-context/agent-context-config.yml have the
managed section stripped with the correct markers instead of the
hard-coded defaults.

Adds a test verifying custom markers are respected during
--migrate-legacy.

* fix(integration): harden Kimi legacy migration against symlinked paths

* fix(kimi): guard symlinked SKILL.md during migration and teardown

* docs(kimi): mention KIMI.md→AGENTS.md migration in --migrate-legacy help

The --migrate-legacy help text listed only the skills directory move and
dotted→hyphenated renaming, but the flag also migrates KIMI.md user content
into AGENTS.md. Align the help with the actual behavior, docs, and tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): validate legacy migration destination; clarify docstrings

Address Copilot review feedback on PR #2979:

- setup(): gate skills migration on _is_safe_legacy_dir(new_skills_dir)
  as well as the source. base setup() already rejects a destination that
  escapes the project root, but an in-tree symlinked .kimi-code/skills
  (e.g. -> .) could still misdirect the move; this gives the destination
  the same symlink-component protection as the source.
- _migrate_legacy_kimi_dotted_skills: rewrite docstring as a compatibility
  shim describing same-path delegation to _migrate_legacy_kimi_skills_dir.
- test_presets: clarify that the dotted-skill test exercises legacy naming
  under the current .kimi-code/ base, not the legacy .kimi/ location.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): harden legacy KIMI.md→AGENTS.md context migration

- Skip context-file migration when the agent-context extension is
  disabled, matching upsert/remove_context_section opt-out behavior so
  an opted-out project's KIMI.md/AGENTS.md are left untouched.
- Safely skip (instead of raising) on filesystem edge cases: unreadable
  or non-UTF-8 KIMI.md, and AGENTS.md existing as a non-file/unwritable.
- Refuse to migrate a corrupted managed section (single marker, or end
  before start) so a partial managed block is never copied into
  AGENTS.md; KIMI.md is preserved for manual repair.

Add regression tests for all three cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Approve fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* chore(kimi): revert CHANGELOG.md edit (auto-generated)

The CHANGELOG is generated from merged PR titles, so a hand-written entry
is redundant; it was also placed under the already-released 0.10.2 section,
which would make those release notes historically inaccurate. Revert to
match main per maintainer feedback.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(kimi): skip symlink-safety tests when symlinks are unavailable

The Kimi legacy-migration safety tests create symlinks to assert that
migration/teardown never follow them out of the project. Symlink creation
fails on Windows without the create-symlink privilege and in some restricted
CI sandboxes, so these tests errored during setup instead of skipping.

Wrap every symlink_to() call in a shared _symlink_or_skip() helper that
pytest.skip()s on OSError/NotImplementedError, matching the guard pattern
already used by one of these tests. Verified on Windows: the 6 symlink tests
now skip cleanly (51 passed, 6 skipped) instead of erroring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(kimi): reject symlinked skills destination before install

Add a destination symlink pre-check in KimiIntegration.setup() before
super().setup() writes any SKILL.md. The base class only rejects a
destination that escapes project_root after resolve(), so an in-tree
symlinked .kimi-code/.kimi-code/skills (e.g. `-> .`) would still
misdirect writes into an unintended in-tree location (./skills/).

Extract the symlink-component walk into a shared _has_symlinked_component()
helper and reuse it from _is_safe_legacy_dir(). Add a regression test.

Also clarify that --migrate-legacy only migrates KIMI.md -> AGENTS.md when
the agent-context extension is enabled, in the CLI help text and the
integration docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Refactor formatting and simplify logic in Kimi integration

* fix(kimi): reject symlinked target dir during legacy skills migration

When the migration destination already exists, guard against a symlinked
(or non-directory) target_dir before comparing SKILL.md bytes, so the
comparison never follows a link outside the project root. Also skip a
missing/non-file target SKILL.md explicitly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-24 15:22:08 -05:00
Zied Jlassi
b042d2a843 feat(extensions): verify catalog archive sha256 before install (#3080)
* feat(extensions): verify catalog archive sha256 before install

Extension and preset archives were downloaded over HTTPS and unpacked
(with Zip-Slip protection) but their bytes were never checked against a
known digest. Trust rested entirely on TLS and the integrity of the
release host, so a tampered or swapped archive from a compromised
third-party release would be installed silently. Maintainers do not audit
extension code, so consumer-side integrity is the only available defence.

Catalog entries may now pin an optional `sha256` digest. When present, the
downloaded archive is verified before it is written to disk and installed;
a mismatch aborts with a clear error. Entries without `sha256` keep
working unchanged (a DEBUG line records that the download was unverified),
so the change is backwards compatible. The check runs on both download
paths (extensions and presets) via a single shared helper so the two stay
in parity.

- Add `verify_archive_sha256` helper in shared_infra (digest match,
  `sha256:` prefix, case-insensitive; DEBUG log when no digest declared)
- Enforce it in ExtensionCatalog.download_extension and
  PresetCatalog.download_pack, before the archive is written to disk
- Document the optional `sha256` field in the publishing guides
- Tests: helper unit tests + matching/mismatch/no-digest on both paths

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Assisted-by: AI

* fix(extensions): harden sha256 parsing and tidy download test mocks

Follow-up to the review on #3080:

- shared_infra.verify_archive_sha256: strip only a literal `sha256:`
  algorithm prefix (case-insensitive) instead of `split(':', 1)[-1]`,
  which silently dropped any prefix — so `md5:<64-hex>` was accepted as
  if it were a valid SHA-256. Validate that the declared value is exactly
  64 hex characters and raise a clear error otherwise, and compare with
  `hmac.compare_digest` for a constant-time check. Add tests covering a
  malformed digest and a non-`sha256:` prefix (both previously accepted).
- Download test helpers: configure the context-manager mock via
  `__enter__.return_value`/`__exit__.return_value` rather than assigning a
  `lambda s: s`, which is clearer and independent of the invocation arity.

Assisted-by: AI
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>

* fix(extensions): reject a declared-but-empty sha256 instead of skipping verification

verify_archive_sha256 skipped on any falsy expected value, so a present-but-empty digest (e.g. sha256: "" reached via ...get("sha256")) silently disabled the integrity check instead of surfacing the authoring error. Guard on expected is None so only an absent digest skips; blank/whitespace/bare-prefix values fall through to the 64-hex validation and are rejected. Adds a regression test.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* docs(shared_infra): clarify _SHA256_HEX_RE accepts and normalizes uppercase

The comment described the regex as matching '64 lowercase' hex characters,
but verify_archive_sha256 lowercases the declared value (raw.lower()) before
matching, so an uppercase digest is accepted and normalized rather than
rejected. Clarify the comment to avoid misleading future readers.

Addresses Copilot review feedback on shared_infra.py.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

* test(presets): cover the no-sha256 backwards-compatible path

Address Copilot review: download_pack's optional sha256 verification was
tested for match/mismatch but not the backwards-compatible path where a
catalog entry has no sha256 (pack_info.get("sha256") is None). Add a
no-sha256 test mirroring the extensions coverage so the helper never
silently becomes mandatory for presets.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

---------

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
Signed-off-by: Zied Jlassi (Architect AI) <6190550+zied-jlassi@users.noreply.github.com>
2026-06-24 14:52:24 -05:00
Manfred Riem
79a34b892d fix(presets): use _repo_root() for bundled-core source-checkout fallback (#3086) (#3091)
* fix(presets): use _repo_root() for bundled-core source-checkout fallback

The tier-5 fallback in PresetResolver.resolve() and
_find_bundled_core() computed the repo root as
Path(__file__).parent.parent.parent. After presets.py was moved to
presets/__init__.py (#2826) that chain is one level short, resolving
to src/ and looking for src/templates/commands/<cmd>.md, which never
exists. As a result, wrap-strategy presets found no core base layer in
source/editable installs.

Use the shared _repo_root() helper so both fallbacks resolve against
the actual repo-root templates/ tree. Wheel installs were unaffected
(core_pack path), so this only impacts source/editable checkouts.

Refs #3086

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(presets): restore dropped def for oserror-manifest test

A prior edit accidentally removed the
def test_resolve_extension_command_via_manifest_skips_oserror_manifests
line, orphaning its body inside the new bundled-core test. Restore the
test definition so pytest collects it again.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(presets): move bundled-core tests into TestPresetResolver

The two tier-5 fallback regression tests exercise collect_all_layers()
and resolve(), not resolve_core(), so they belong in TestPresetResolver
rather than TestResolveCore. Relocate them for clearer suite navigation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 11:33:44 -05:00
Ali jawwad
f9c6cf83e5 fix(presets): preserve argument-hint in preset SKILL.md generation (#2978)
* fix(presets): preserve argument-hint in preset SKILL.md generation

Preset-provided and extension-override commands that declare
`argument-hint:` in their frontmatter had it dropped from the generated
Claude SKILL.md, and it was re-dropped when a preset was removed and its
overridden skill restored. This is the preset-side analog of the
extension fix in #2903 / #2916.

Factor the argument-hint carry-over into a shared
CommandRegistrar.apply_argument_hint() helper and apply it at the four
preset skill-generation sites (register, reconcile override-restore, and
the core/extension unregister-restore paths). The extension path from

The helper writes argument-hint into the frontmatter dict before
serialization (so a folded multi-line description cannot be split into
invalid YAML) and only for integrations that support it (those exposing
inject_argument_hint -- currently Claude), leaving build_skill_frontmatter's
shared shape unchanged for every other agent. Core templates carry no
argument-hint, so the core-restore path is a no-op. No behavior change for
non-Claude agents or the core path.

Add regression tests covering a folding description (Claude) and the
non-Claude gate (codex).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(presets): address review - guard skill_frontmatter type and tighten apply_argument_hint annotations

Add a symmetric isinstance(skill_frontmatter, dict) guard so the helper stays a safe no-op if a caller passes a non-dict, and annotate the parameters as Dict[str, Any] with an optional integration to match real call-site usage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:52:13 -05:00
darion-yaphet
f20e8ee6f7 refactor: move preset command handlers to presets/_commands.py (PR-6/8) (#2826)
* refactor(presets): convert presets.py module to presets/ package

Pure structural move to mirror integrations/. presets.py becomes
presets/__init__.py with relative imports rebased one level deeper.
No behavior change; public import surface (from .presets import ...)
preserved. Prepares for co-locating preset command handlers in PR-6/8.

* refactor: move preset command handlers to presets/_commands.py (PR-6/8)

Cut the preset_app / preset_catalog_app Typer groups and all 12 command
handlers out of __init__.py into presets/_commands.py, exposing register(app)
— mirrors the integration co-location from PR-5. __init__.py now registers
via _register_preset_cmds(app), dropping ~620 lines (3282 -> 2663).

Handlers lazy-import root helpers (_require_specify_project, get_speckit_version,
_locate_bundled_preset, _display_project_path) via 'from .. import' so test
monkeypatching of specify_cli.<helper> keeps working. _locate_bundled_preset
kept as an explicit re-export in __init__.py for that resolution path.

CLI surface and public imports unchanged. Full suite: 3162 passed, 40 skipped.
2026-06-16 14:52:12 -05:00
darion-yaphet
36ad3cde1b fix(presets): harden preset URL installs against unsafe redirects (#2911)
* Harden preset URL installs against unsafe redirects

Preset URL installs already rejected non-HTTPS source URLs, but the authenticated opener follows redirects. Validate the final response URL before writing the ZIP, preserve GitHub release asset URL resolution after the preset command module split, stream the response to disk, and keep catalog config serialization on safe YAML output.

Constraint: open_url follows redirects, so source URL validation alone does not constrain the downloaded target

Rejected: Keep response.read() for simplicity | large preset downloads should not be buffered entirely in memory

Confidence: high

Scope-risk: narrow

Directive: Keep preset URL policy aligned with workflow installer redirect validation

Tested: uvx ruff check src/specify_cli/__init__.py src/specify_cli/presets/__init__.py src/specify_cli/presets/_commands.py tests/test_presets.py

Tested: uv run pytest tests/test_presets.py -q

Not-tested: Real network redirect integration against a live HTTP server

Co-authored-by: OmX <omx@oh-my-codex.dev>

* Reject malformed preset download URLs

Preset downloads should fail early when a URL lacks a hostname, even if the scheme is HTTPS. The redirect error now describes any disallowed target instead of implying that only non-HTTPS redirects are blocked.

* Prevent credentialed preset redirects from downgrading transport

Preset URL downloads already checked the final URL after urllib followed redirects, but that was too late for authenticated requests because same-host redirects could preserve Authorization during the redirect itself. The authenticated HTTP helper now supports an opt-in redirect validator, and preset downloads use it to reject disallowed redirect targets before following them. The redirect auth handlers also stop preserving credentials across HTTPS to non-HTTPS downgrades as defense in depth.

* test(presets): 修复 URL 解析测试 mock 缺少 redirect_validator 参数

重定向安全加固为 open_url 新增 redirect_validator 参数,
两处 fake_open_url mock 签名未同步导致 TypeError。
补齐参数后全部 3717 个测试通过。

---------

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-06-11 07:21:50 -05:00
Quratulain-bilal
de88c23bb6 fix(catalogs): validate extension and preset catalog payload shape (#2621)
* fix(catalogs): validate extension and preset catalog payload shape

`ExtensionCatalog._fetch_single_catalog` and
`PresetCatalog._fetch_single_catalog` only check that the `extensions` /
`presets` key is *present* in the parsed catalog JSON. They don't check
that the value is a JSON object, and they don't check that the root is
a JSON object at all. A malformed (or compromised) upstream catalog
returning:

    {"schema_version": "1.0", "extensions": []}

passes both `"extensions" not in catalog_data` and the subsequent
`response.read()` JSON parse, gets cached on disk, and then crashes
deep inside `_get_merged_extensions` (resp. `_get_merged_packs`) with:

    AttributeError: 'list' object has no attribute 'items'

instead of the existing user-facing
`ExtensionError("Invalid catalog format from <url>")` /
`PresetError("Invalid preset catalog format")` that the surrounding
code is clearly trying to produce.

The sibling integration-catalog reader already validates this — see
`src/specify_cli/integrations/catalog.py` where the fetch path
explicitly checks both `isinstance(catalog_data, dict)` and
`isinstance(catalog_data.get("integrations"), dict)` before returning.
This change mirrors that pattern in the extension and preset readers so
the three catalog fetchers stay consistent and a malformed upstream
surfaces as the user-facing error instead of a raw Python traceback.

Adds parametrized regression tests covering:
- root payload is not a JSON object (list, str, int, null)
- root is a dict but `extensions` / `presets` value is the wrong type
  (list, str, null, int)

All eight bad-payload shapes now raise the expected catalog error.

* fix(catalogs): skip non-mapping entries during extension and preset merge

Addresses Copilot review feedback on this PR.

`_fetch_single_catalog` now validates that the ``extensions`` / ``presets``
value is a mapping, but it doesn't (and shouldn't) validate every entry
inside that mapping. A payload like:

    {"schema_version": "1.0", "extensions": {"good": {...}, "bad": []}}

passes the fetch-level guard, then later crashes inside
``_get_merged_extensions`` (resp. ``_get_merged_packs``) at
``{**ext_data, ...}`` with ``TypeError: 'list' object is not a mapping``.

The sibling integration-catalog reader at
``src/specify_cli/integrations/catalog.py:245`` handles this with a
per-entry ``isinstance(integ_data, dict)`` skip during merge, so one
malformed entry doesn't poison an otherwise valid catalog. This change
mirrors that pattern in the extension and preset mergers and adds
regression tests asserting that valid entries continue to merge while
malformed siblings are silently dropped.

* fix(catalogs): validate cached extension and preset payload shape

Addresses Copilot review feedback on this PR (round 2).

The earlier commits in this branch added payload-shape validation on the
network fetch path. The cache-hit path still returned
``json.loads(cache_file.read_text())`` directly without re-checking the
shape, so a cache poisoned by an older spec-kit version (or a manual
edit, or an upstream that briefly served a bad payload before the
network guards landed) would re-crash every invocation of
``_get_merged_extensions`` / ``_get_merged_packs`` with
``AttributeError: 'list' object has no attribute 'items'`` despite the
cache being "valid" by age.

Extracts the shape validation into ``_validate_catalog_payload`` on both
``ExtensionCatalog`` and ``PresetCatalog``, and calls it from both the
cache-load and network-fetch branches of ``_fetch_single_catalog``. If
the cached payload fails validation, the cache read is treated like a
``json.JSONDecodeError`` — the cached value is discarded and the
function falls through to the network fetch, which refreshes the cache
with a clean payload on success. Never propagates ``AttributeError`` to
the caller.

Regression tests parametrize the four root-bad-type variants plus three
``extensions``/``presets``-bad-type variants per file, asserting that a
poisoned cache silently recovers via network refetch and returns the
freshly-fetched payload.

* fix(catalogs): include URL in missing-keys error to match sibling branches

Addresses Copilot review feedback on this PR (round 3).

``_validate_catalog_payload`` advertises in its docstring that the
catalog URL is included in error messages "so the user can tell which
catalog in a multi-catalog stack is malformed" — but the missing-keys
branch raised ``PresetError("Invalid preset catalog format")`` without
the URL, breaking that contract and making multi-catalog debugging
harder. The root-bad-type and nested-bad-type branches in the same
helper already include the URL; this commit brings the middle branch
in line.

For consistency, the same fix is applied to the legacy single-catalog
fetch paths in ``ExtensionCatalog.fetch_catalog`` and
``PresetCatalog.fetch_catalog`` (where the URL was likewise dropped
from the missing-keys error).

The existing regex matchers in the regression tests target the
``"Invalid (preset )?catalog format"`` prefix, which is preserved
verbatim before the ``from <url>`` suffix — no test changes needed.

* fix(catalogs): broaden cache except tuples and reuse validator in fetch_catalog

Addresses Copilot review feedback on this PR (round 4):

1. ``ExtensionCatalog.fetch_catalog`` and ``PresetCatalog.fetch_catalog``
   — the legacy single-catalog methods — still only checked key
   presence. A payload like ``42`` (root non-object) crashed with
   ``TypeError: argument of type 'int' is not iterable`` during the
   ``"schema_version" in catalog_data`` check, and an entry mapping of
   the wrong type crashed downstream. Both now reuse
   ``_validate_catalog_payload`` so the network-side behaviour of the
   legacy methods stays consistent with the multi-catalog
   ``_fetch_single_catalog`` path. (Copilot #3335623482, #3335623556.)

2. The cache-read ``except`` tuples in ``_fetch_single_catalog`` and
   ``fetch_catalog`` were too narrow. ``read_text`` can raise
   ``OSError`` (permissions / disk / handle limit) or ``UnicodeError``
   (cache file written by an older client in a different encoding)
   in addition to ``json.JSONDecodeError``. Without those in the
   tuple, an unreadable cache crashed the caller instead of falling
   through to the network refetch the cache contract documents. Both
   sites now catch ``(json.JSONDecodeError, OSError, UnicodeError,
   <DomainError>)``. (Copilot #3335623588, #3335623608.)

3. While here, pinned ``encoding="utf-8"`` on every cache ``read_text``
   call so cache files written by an older Windows client (with a
   non-UTF-8 default locale) decode the same way on a newer client.

Regression tests:

  - ``test_fetch_catalog_rejects_malformed_payload`` — 7 parametrized
    payloads per file covering root-non-object + nested-bad-type
    variants asserting ``fetch_catalog`` raises the named domain error.
  - ``test_fetch_catalog_recovers_from_unreadable_cache`` — writes
    ``b"\xff\xfe\x00not-utf-8"`` to the cache file and asserts
    ``fetch_catalog`` silently falls through to the mocked network and
    returns the freshly-fetched payload.

* fix(catalogs): harden cache-validity checks and pin UTF-8 on writes

The cache-best-effort contract added in 7f44b25 was incomplete on two
points raised by Copilot:

1. The cache-validity helpers (is_cache_valid /
   _is_url_cache_valid, plus the inline metadata-age check inside
   _fetch_single_catalog for per-URL caches) read the metadata file
   without specifying an encoding and only caught
   json.JSONDecodeError / ValueError / KeyError /
   TypeError. A metadata file written by a tool using the system
   locale codec, or one whose handle is briefly unavailable, would
   raise UnicodeDecodeError / OSError and propagate past the
   read-side try/except in fetch_catalog — the very crash the
   read-side guard was meant to prevent. The validity checks now read
   with encoding="utf-8" and treat OSError / UnicodeError
   as cache-invalid, matching the documented contract.

2. The network-fetch path wrote the cache and metadata files with bare
   write_text(...), picking up the platform default encoding. The
   read path was already pinned to UTF-8 (and the
   integrations/catalog.py:193-203 sibling writes UTF-8 too), so
   on hosts whose default codec isn't UTF-8 the write/read pair could
   disagree and force an unnecessary refetch on every invocation. All
   four write_text calls now pass encoding="utf-8" so the
   cache survives a round trip on any platform.

Also rewords the misleading # Fetch from network comment in
extensions.fetch_catalog — it sat above the cache-check block,
which read as if the cache step had been skipped.

Tests
-----

Adds two parametrized regression tests per catalog:

* test_fetch_catalog_recovers_from_unreadable_metadata plants
  non-UTF-8 bytes in the metadata file, asserts is_cache_valid()
  returns False (rather than raising), and confirms
  fetch_catalog falls through to the network instead of crashing.

* test_fetch_catalog_writes_cache_as_utf8 round-trips a payload
  containing a non-ASCII identifier (café) through the public
  fetch path and reads the cache back with
  read_text(encoding="utf-8"), catching encoding drift at the
  byte level rather than relying on the system codec to happen to be
  UTF-8.

Both pairs follow the established sibling-file symmetry — the
extension and preset suites stay in lock-step.

* test(catalogs): assert UTF-8 write encoding by recording write_text kwargs

Copilot's review on this PR caught that test_fetch_catalog_writes_cache_as_utf8
claimed to validate UTF-8 at the byte level but actually only round-tripped a
non-ASCII string through json.dumps/read_text. Because json.dumps defaults to
ensure_ascii=True, 'café' was serialized as the all-ASCII escape 'caf\u00e9'
before reaching write_text — the bytes on disk were identical regardless of the
encoding kwarg, so a locale-encoded write would have round-tripped just fine.
The drift guard the test name advertised was not actually being enforced.

Rewriting these tests to observe the production code's argument directly:
each test now monkey-patches pathlib.Path.write_text with a recorder that
captures the encoding kwarg for every call, runs the production fetch, and
asserts every write into the cache directory passed encoding='utf-8'. That is
the substantive thing the regression guard cares about — non-ASCII payload
tricks were the wrong lever to pull, because json.dumps was masking the
encoding choice before write_text ever ran.

Both tests verified locally against the current production code (492 passed in
the extensions+presets suites) and confirmed to fail against a synthetic
no-encoding write (the recorder records None instead of 'utf-8', the assertion
catches it). Same change applied symmetrically to test_extensions.py and
test_presets.py to keep the sibling files in lockstep with the production
code paths in extensions.py and presets.py.

* fix(catalogs): catch AttributeError on non-mapping cache metadata; drop stale line refs

Copilot's review on the previous push pointed out that the
cache-validity helpers still had a gap: metadata.get("cached_at", "")
assumes metadata is a dict, but json.loads happily parses a
file containing [] / "oops" / 42 / true / null into
a non-mapping. The except tuple covered json.JSONDecodeError,
OSError, UnicodeError, ValueError, KeyError and
TypeError but not AttributeError, so a valid-JSON-but-non-dict
metadata payload would still crash the caller instead of degrading to
"cache invalid" as the docstring promised.

This affected four cache-validity sites — symmetric across the two
catalog modules:

* extensions.py — inline per-URL metadata-age check in
  _fetch_single_catalog
* extensions.py — is_cache_valid (legacy default-URL path)
* presets.py — _is_url_cache_valid
* presets.py — is_cache_valid

All four except tuples now include AttributeError with a comment
naming the exact failure (metadata.get(...) on a non-mapping) so
the next reader doesn't have to reconstruct the reasoning.

Separately, Copilot flagged that several comments hard-coded a line
range from a sibling file
(integrations/catalog.py:193-203) — those references will go stale
the moment that file changes. Replaced the hard-coded ranges with
file-only references (integrations/catalog.py) so the pointer
stays accurate as that file evolves. Same change applied to both
modules.

Tests
-----

test_is_cache_valid_handles_non_mapping_metadata is added to both
test_extensions.py and test_presets.py, parametrized over the
five JSON non-mapping root types ([], "oops", 42,
true, null). Each variant plants the metadata file with that
exact content and asserts is_cache_valid() returns False
without raising. The parametrize covers every JSON type the public
spec allows at the root, so a regression that drops AttributeError
from any except tuple is caught against every observable shape rather
than relying on the next reviewer to remember the .get /
non-mapping interaction.

pytest tests/test_extensions.py tests/test_presets.py — 502
passed (was 492 before; the parametrize adds five vectors per file).

* fix(catalogs): make cache writes best-effort to match read-side contract
2026-06-09 07:22:49 -05:00
Copilot
7106858c4e feat!: remove legacy --ai, --ai-commands-dir, and --ai-skills flags (0.10.0) (#2872)
* Initial plan

* feat!: remove legacy --ai, --ai-commands-dir, and --ai-skills flags at 0.10.0

* refactor(tests): rename stale test_ai_help_* methods to test_agent_config_*

* fix: address review — derive agent folder for generic integration and remove redundant test

- Security notice now falls back to integration_parsed_options['commands_dir']
  when AGENT_CONFIG folder is None (generic integration).
- Remove test_agent_config_includes_kiro_cli which duplicates the assertion
  in test_runtime_config_uses_kiro_cli_and_removes_q.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub all remaining --ai flag references from source and tests

- Remove dead AI_ASSISTANT_ALIASES, AI_ASSISTANT_HELP, and
  _build_ai_assistant_help() from _agent_config.py
- Update comments/docstrings in extensions.py, presets.py, and
  integration subpackages to reference 'skills mode' or
  '--integration' instead of the removed flags
- Fix catalog.json generic integration description
- Update test docstrings/comments in test_extension_skills.py,
  test_extensions.py, and test_presets.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: remove legacy --ai flag rejection tests

The flags are fully removed from the CLI; typer handles unknown options
generically. No custom rejection logic exists to test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* revert: remove manual CHANGELOG.md entry

CHANGELOG is generated automatically; manual edits should not be made.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make generic catalog description self-explanatory

Include the required --commands-dir sub-option in the description so
readers don't need to look up integration docs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(tests): rename duplicate test classes to avoid shadowing

The rename from Test*AutoPromote to Test*Integration collided with the
existing Test*Integration(SkillsIntegrationTests) base classes, causing
the shared test suites to be silently overwritten. Rename the CLI init
flow classes to Test*InitFlow instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-05 14:56:28 -05:00
lselvar
f512b8b0d1 fix: resolve GitHub release asset API URL for private repo preset and workflow downloads (#2855)
* fix: resolve GitHub release asset API URL for private repo preset and workflow downloads

- Add shared `resolve_github_release_asset_api_url` utility to `_github_http.py` for
  reuse across preset and workflow download paths
- Apply the same private-repo fix from PR #2792 (extensions) to:
  - `PresetCatalog.download_pack` — ZIP downloads via catalog `download_url`
  - `preset add --from <url>` — ZIP downloads from a direct URL
  - `workflow add <url>` — workflow YAML downloads from a direct URL
  - `workflow add <id>` (catalog) — workflow YAML downloads via catalog `url`
- For browser release URLs (`github.com/…/releases/download/…`), the asset is
  resolved via the GitHub REST API and downloaded with `Accept: application/octet-stream`
- Direct REST API asset URLs (`api.github.com/…/releases/assets/<id>`) are
  downloaded directly with `Accept: application/octet-stream`
- Auth is preserved end-to-end through the existing `open_url` infrastructure
- Update `test_download_pack_sends_auth_header` and add
  `test_download_pack_accepts_direct_github_rest_asset_url` to cover both paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: URL-encode tag in release API URL to handle special characters

Encode the tag as a path segment (using quote with safe='') when
building the releases/tags/<tag> API URL. This prevents malformed
URLs when tags contain reserved characters like '/' or '#'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: add CLI-level tests for preset add --from GitHub release URL resolution

Adds regression tests covering:
- resolve_github_release_asset_api_url unit tests (passthrough, resolution,
  network error, URL encoding of special chars in tags)
- CLI-level 'preset add --from <github-release-url>' end-to-end flow
- CLI-level 'preset add --from <api-asset-url>' direct passthrough

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: deduplicate release URL resolution; fix test issues

- ExtensionCatalog._resolve_github_release_asset_api_url now delegates
  to the shared helper in _github_http.py (also gains URL-encoding fix)
- Remove unused 'io' import from test_github_http.py
- Remove duplicate 'provides' dict keys accidentally added to test_presets.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: align resolver timeout with download timeout; add workflow CLI tests

- Pass timeout=30 to resolve_github_release_asset_api_url in both
  workflow add paths so worst-case latency matches the download timeout
- Add CLI-level regression tests for 'workflow add <url>' covering
  browser URL resolution and direct API asset URL passthrough

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove unused urllib.request import; add catalog workflow test

- Remove unused 'import urllib.request' in preset add --from path
- Add CLI test for catalog-based 'workflow add <id>' with GitHub
  release URL resolution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* style: remove unused MagicMock imports from tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Manfred Riem <mnriem@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-05 10:41:40 -05:00
minbang
a9a759450d fix: recover active skills registration for extensions (#2803)
Extension command registration now resolves the active skills directory before writing command artifacts. This lets initialized skills-backed agents recover a missing active skills directory while preserving the existing preset registration behavior.

Add regression coverage for missing active skills directories, shared skills directories, and symlinked parent guards.

Fixes #2769.

Co-authored-by: OpenAI Codex <codex@openai.com>
2026-06-04 09:53:31 -05:00
Quratulain-bilal
442a581358 fix(cli): pin UTF-8 encoding on init-options and .extensionignore I/O (#2686)
* fix(cli): pin UTF-8 encoding on init-options and .extensionignore I/O

``Path.read_text`` / ``Path.write_text`` default to the system locale
codec, which is cp1252 / gb2312 / cp932 on Windows. Two user-facing
file paths in spec-kit were calling them without an explicit
``encoding=`` argument:

  - ``src/specify_cli/__init__.py:400,412`` —
    ``save_init_options`` / ``load_init_options`` for
    ``.specify/init-options.json``. A peer machine with a different
    default locale (or a UTF-8 Unix CI runner reading a file written on
    a cp1252 Windows host) cannot decode the file, raising
    ``UnicodeDecodeError``. ``UnicodeDecodeError`` is a subclass of
    ``ValueError`` — not ``OSError`` / ``json.JSONDecodeError`` — so
    the existing fall-back ``except`` tuple in ``load_init_options``
    also misses it and the error propagates raw to the CLI.

  - ``src/specify_cli/extensions.py:764`` — ``.extensionignore``
    pattern reader. The very next line already normalises
    backslashes "so Windows-authored files work", proving the codebase
    expects Windows authors to write this file. Multibyte UTF-8
    patterns (Chinese filenames, accented directory names) silently
    mojibake when the host locale is not UTF-8, so the patterns fail
    to match and unintended files are shipped with the extension.

The sibling integration-catalog reader at
``src/specify_cli/integrations/catalog.py:150,156,193,202,374``
already pins ``encoding="utf-8"`` everywhere. PR #2280 fixed the
symmetric PowerShell-template BOM bug. This change brings the two
remaining drifted paths in line with that precedent.

Regression tests:

  - ``tests/test_presets.py::TestInitOptions`` — parametrized non-ASCII
    round-trip (CJK, Latin-1, Greek, emoji) plus a corrupted-file case
    that asserts the existing "fall back to {}" contract still holds
    when a peer file contains bytes invalid as UTF-8.
  - ``tests/test_extensions.py::TestExtensionIgnore`` — Japanese
    (``ドキュメント/``) and Latin-1 (``café/``) ignore patterns
    correctly exclude their directories during install.

* fix(cli): wrap .extensionignore decode error and tighten UTF-8 contract

Addresses Copilot review feedback on this PR.

Three issues, three fixes:

1. ``save_init_options`` now writes JSON with ``ensure_ascii=False``.
   Without that flag, ``json.dumps`` emits ASCII-only ``\uXXXX``
   escapes, which means the ``encoding="utf-8"`` pin on the
   surrounding ``Path.write_text`` makes no observable difference for
   any value we currently write. Flipping ``ensure_ascii`` makes the
   non-ASCII bytes hit the file directly, so the encoding pin becomes
   the thing that decides between cp1252 garbage and clean UTF-8 on
   Windows. The comment above the call now describes the real reason
   instead of the previously-misleading rationale Copilot flagged.

2. ``test_save_load_round_trip_preserves_non_ascii`` was a no-op under
   the old ``ensure_ascii=True`` writer (Copilot's second comment).
   Added ``test_save_writes_real_utf8_bytes`` that asserts the on-disk
   bytes contain the UTF-8 encoding of ``café`` (``0xC3 0xA9``), not
   its JSON escape form ``é``. Removing either
   ``ensure_ascii=False`` or ``encoding="utf-8"`` from the writer now
   breaks this test — the contract is pinned.

3. ``.extensionignore`` reader wraps ``UnicodeDecodeError`` as
   ``ValidationError`` with a pointer to the offending byte
   (Copilot's third comment). Mirrors
   ``ExtensionManifest._load_yaml``'s existing handler for
   ``extension.yml``. Adds
   ``test_extensionignore_invalid_utf8_raises_validation_error``
   asserting installation aborts with the wrapped error instead of a
   raw Python traceback.
2026-06-02 07:19:11 -05:00
Huy Bui Minh
66884db85b fix: resolve __SPECKIT_COMMAND_*__ refs in preset skill rendering (#2717) (#2718)
* fix: resolve __SPECKIT_COMMAND_*__ refs in preset skill rendering (#2717)

The preset skill layer mirrors command templates into SKILL.md files but
only ran resolve_skill_placeholders(), leaving command cross-references as
raw __SPECKIT_COMMAND_<NAME>__ placeholders instead of rendering them as
/speckit-<cmd> the way CommandRegistrar.register_commands() does. As a
result, presets that override core commands under the agent skill layer
(e.g. Claude --ai-skills) leaked the raw tokens into SKILL.md.

Add a shared PresetManager._resolve_skill_command_refs() helper that maps
the agent's invoke separator to IntegrationBase.resolve_command_refs(), and
call it right after resolve_skill_placeholders() in every preset
skill-rendering path: _register_skills() (install), the _reconcile_skills()
override-restoration block, and both _unregister_skills() restore paths.
This mirrors register_commands() and addresses the path divergence flagged
in #1976.

Add regression tests covering the install and restore paths.

AI assistance: authored with Claude Code (Anthropic) — analysis, patch, and
tests. Verified via the existing pytest suite plus a manual CLI install and
remove cycle on a Claude --ai-skills project.

* test: cover reconcile-override and extension restore command-ref paths (#2718 review)

Copilot review flagged that the install and core-template restore paths
gained regression tests, but the reconcile project-override branch and the
extension-backed restore branch were uncovered. Add focused tests for both:

- test_reconcile_override_skill_resolves_command_refs: a project override
  wins after preset removal; _reconcile_skills must render command refs.
- test_extension_restore_resolves_command_refs: a skill restored from an
  extension command body must also render command refs.

Both fail on main and pass with the fix in 8dd93c0.
2026-05-27 12:49:54 -05:00
Quratulain-bilal
0dee2faf11 fix(catalogs): reject boolean priority in extension and preset catalog readers (#2589)
`bool` is a subclass of `int` in Python, so `int(True)` silently returns
`1`. The extension- and preset-catalog config readers coerced priority
with a bare `int(item.get("priority", idx + 1))`, which meant a YAML
config like:

    catalogs:
      - name: mine
        url: https://example.com/catalog.json
        priority: yes     # parses to True

was silently accepted as a valid priority of 1, quietly reordering the
catalog stack instead of raising the same `Invalid priority` error a
typo of `priority: not-a-number` already raises.

The sibling integration-catalog reader in `src/specify_cli/catalogs.py`
already guards this case (see `catalogs.py:137`). This change mirrors
that pattern in `extensions.py` and `presets.py` so the three catalog
validators stay consistent, and adds regression tests for both readers
matching the existing `test_load_catalog_config_rejects_boolean_priority`
template in `tests/integrations/test_integration_catalog.py`.
2026-05-21 08:21:13 -05:00
WOLIKIMCHENG
be382804c7 Fix preset skill description precedence (#2538)
* Fix preset skill description precedence

* Fix skill description precedence during restore

---------

Co-authored-by: root <1647273252@qq.com>
2026-05-14 14:59:38 -05:00
Quratulain-bilal
bf47e89249 test(presets): silence expected UserWarnings in self-test composition… (#2373)
* test(presets): silence expected UserWarnings in self-test composition tests

The self-test preset that ships with the repo provides a wrap-strategy
command (speckit.wrap-test) intentionally without a corresponding core
base layer, exercising the 'no base layer' branch of
_reconcile_composed_commands().

Eighteen tests across TestSelfTestPreset and TestPresetSkills install
this preset and trigger an expected UserWarning. Running the suite with
-W error::UserWarning surfaces them as test noise that could obscure
unrelated warnings.

Add class-level pytest.mark.filterwarnings filters to acknowledge the
two known messages ('Cannot compose command speckit.wrap-test' and
'Post-install reconciliation failed for self-test') so other UserWarning
sources still propagate normally.

Fixes #2363

* test(presets): scope filterwarnings to UserWarning category

Address Copilot review on #2373: the previous filterwarnings entries
omitted the warning category, so any warning class with a matching
message would have been silenced. Append :UserWarning to the four
filters so only the deliberately-emitted UserWarnings from
_reconcile_composed_commands() are ignored.

* test(presets): narrow self-test warning filter to install helper only

Address Copilot feedback: the class-level @pytest.mark.filterwarnings on
TestPresetSkills was too broad. The 'Post-install reconciliation failed'
filter could mask real reconciliation regressions, since that warning is
only emitted when _reconcile_composed_commands/_reconcile_skills raises.

Tests in TestPresetSkills already call install_self_test_preset(), which
scopes a narrow filter to the expected wrap-strategy 'Cannot compose'
warning. The class-level filters are redundant for those calls and unsafe
elsewhere, so they are removed.

* test(presets): align TestSelfTestPreset docstring with helper-based filtering

Address Copilot feedback: docstring referred to 'filters above', but the
fix uses warnings.filterwarnings inside install_self_test_preset rather
than class-level decorators. Updated the docstring to describe the actual
mechanism.

* test(presets): remove extra blank line between helper and class (PEP 8)

Address Copilot feedback: PEP 8 expects two blank lines between top-level
definitions; reduce the three blank lines between install_self_test_preset
and TestSelfTestPreset to two.
2026-05-11 15:16:55 -05:00
Copilot
f0998348be feat: Config-driven opt-in authentication registry with multi-platform support (#2393)
* Initial plan

* feat: add authentication provider registry (GitHub + Azure DevOps)

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/da7ecfd0-e1c9-48dc-b692-27be0879e976

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* feat: add try-each-provider HTTP helper and wire all catalog fetches through auth registry

- Add authentication/http.py with open_url() that tries each configured
  provider in registry order, falling through on 401/403 to the next,
  and finally to unauthenticated
- Add build_request() for one-shot request construction
- Add configured_providers() to registry __init__
- Remove api_base_url() from AuthProvider ABC (unused)
- Remove hosts attribute from providers (no host matching)
- Replace _github_http.py usage in ExtensionCatalog and PresetCatalog
- Wire IntegrationCatalog and WorkflowCatalog through open_url (were unauthenticated)
- Wire _fetch_latest_release_tag() through open_url
- Wire all inline --from-url downloads through open_url
- Fix unused stub variable flagged by code-quality bot
- 49 auth tests (positive + negative), 1805 total tests passing

* fix: address review — fix stale docstrings, restore Accept header, add extra_headers to open_url

- Fix _open_url() docstrings in extensions.py and presets.py that
  incorrectly claimed redirect stripping behavior
- Add extra_headers parameter to open_url() so callers can pass
  additional headers (e.g. Accept) that persist across retries
- Restore Accept: application/vnd.github+json header in
  _fetch_latest_release_tag() via extra_headers

* feat: config-driven opt-in auth via ~/.specify/auth.json

Security-first redesign: no credentials are sent unless the user
explicitly creates ~/.specify/auth.json mapping hosts to providers.

- Add authentication/config.py: loads and validates auth.json with
  host-to-provider mappings, supports token/token_env/azure-ad/azure-cli
- Refactor AuthProvider ABC: auth_headers(token, scheme) + resolve_token(entry)
- Refactor GitHubAuth: bearer scheme only, token from config entry
- Refactor AzureDevOpsAuth: 4 schemes (basic-pat, bearer, azure-cli, azure-ad)
  with dynamic token acquisition for azure-cli and azure-ad
- Rewrite authentication/http.py: host matching, redirect stripping,
  provider fallthrough on 401/403, unauthenticated fallback
- Add docs/reference/authentication.md with full reference and template
- 1823 tests passing (67 auth-specific)

* fix: address review — unused imports, host normalization, provider+scheme validation, security hardening

- Remove unused imports (os, field, Any) in config.py
- Normalize hosts during load (strip + lowercase)
- Validate token/token_env are non-empty strings during load
- Validate provider+scheme compatibility during load
- Fix extra_headers order: auth headers applied last, cannot be overridden
- Remove unused 'tried' variable in http.py
- Warn (once) on malformed auth.json instead of silent fallback
- URL-encode OAuth2 client credentials body in azure_devops.py
- Update 403 message to mention auth.json configuration
- Fix registry leak in test_register_duplicate (try/finally)
- Fix import style consistency in test_authentication.py
- Add azure-cli and azure-ad token acquisition tests (mock subprocess/urlopen)
- Add autouse fixture to isolate upgrade tests from real auth.json
- 1829 tests passing

* fix: reject unknown providers, validate azure-ad fields, strip Authorization from extra_headers

- Reject unknown provider keys during auth.json load with clear error message
- Validate azure-ad tenant_id/client_id/client_secret_env as non-empty strings
- Strip Authorization from extra_headers in both build_request and open_url
  to prevent accidental or intentional bypass of provider-configured auth
- Add tests for unknown provider and incompatible scheme validation
- 1831 tests passing

* fix: extract shared auth test helpers, global config isolation, align docstring

- Move _inject_github_config / make_github_auth_entry to tests/auth_helpers.py
  to eliminate duplication across test_extensions, test_presets, test_upgrade
- Move auth config isolation fixture to global conftest.py (autouse) so ALL
  tests are isolated from ~/.specify/auth.json, not just test_upgrade
- Align load_auth_config docstring with actual behavior: ValueError may be
  caught by higher-level HTTP helpers that warn and continue unauthenticated
- 1831 tests passing

* fix: preserve auth header across multi-hop redirect chains

- Read Authorization from both headers and unredirected_hdrs in
  _StripAuthOnRedirect to survive multi-hop chains within allowed hosts
- Add test_multi_hop_redirect_within_hosts_preserves_auth
- 1832 tests passing

* fix: use resolved config path in warning/error messages and patch build_opener in no-network test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: assert full resolved config path in rate-limit output test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: close HTTPError on 401/403, remove _VALID_AUTH_SCHEMES, catch TimeoutExpired, skip POSIX test on Windows, remove unused import

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/a1e29737-dd6e-4287-96c1-509e0c96fb21

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: use stable ~/.specify/auth.json in rate-limit message, skip POSIX permission check on Windows

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/4636bcdb-87ae-45d6-9545-a40e4effd617

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: validate host patterns, cache auth config per-process

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: clarify _is_valid_host_pattern docstring, clean up test sentinel type

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: improve _is_valid_host_pattern docstring and test observability

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-07 12:51:20 -05:00
Pascal THUET
38fd1f6cc2 Support controlled multi-install for safe AI agent integrations (#2389)
* support controlled multi-install integrations

* fix: harden multi-install integration state

* refactor: isolate integration runtime helpers

* fix: address copilot review feedback

* fix: address follow-up copilot feedback

* fix: tighten integration switch semantics

* fix: address final copilot review feedback

* fix: harden integration manifest read errors

* fix: refuse symlinked shared infra paths

* test: filter expected self-test preset warning

* test: address copilot review nits

* refactor: centralize safe shared infra writes

* fix: use no-follow writes for shared infra

* fix: keep default integration atomic on template refresh

* fix: harden shared infra error paths

* fix: preflight shared infra and future state schemas

* fix: support nested shared scripts during preflight

* test: tolerate wrapped schema error output

* fix: use safe default mode for shared text writes

* fix: use posix paths in shared skip output

* fix: share project guard for integration use

* fix: centralize spec-kit project guards

* fix: use posix project paths in cli output

* fix: harden shared manifest and upgrade refresh
2026-05-01 11:54:41 -05:00
Quratulain-bilal
3a7f64c8a5 fix(extensions): use explicit UTF-8 encoding when reading manifest YAML (#2370)
* fix(extensions): use explicit UTF-8 encoding when reading manifest YAML

On Windows, Python's open() defaults to the system locale encoding
(e.g., GBK on Chinese Windows), which causes UnicodeDecodeError when
extension.yml or preset.yml contains non-ASCII content such as Chinese
characters in description fields.

Add encoding='utf-8' to ExtensionManifest._load_yaml and
PresetManifest._load_yaml so manifests are read consistently across
platforms.

Fixes #2325

* test(extensions,presets): add UTF-8 manifest regression tests for #2325

Positive: extension.yml/preset.yml with non-ASCII (Chinese + emoji)
descriptions load correctly when written as UTF-8 bytes — fails on
Windows without explicit encoding='utf-8'.

Negative: files containing invalid UTF-8 bytes raise a clean error
(ValidationError or UnicodeDecodeError), not a silent crash.

* fix(extensions,presets): wrap I/O and decode errors as ValidationError

Address remaining Copilot concerns on #2370:

- Catch UnicodeDecodeError and OSError in both manifest loaders and
  re-raise as ValidationError / PresetValidationError so callers see a
  consistent error type, not a bare decode/IO traceback.
- Validate that PresetManifest YAML root is a mapping (extensions.py
  already had this; presets.py was missing it). Treat None as {} for
  empty-file compatibility.
- Tighten the negative regression tests to assert the specific message,
  and add a non-mapping-root test for PresetManifest matching the
  existing one for ExtensionManifest.
2026-04-28 08:47:22 -05:00
Taylor Mulder
232c19cb04 feat(extensions,presets): authenticate GitHub-hosted catalog and download requests with GITHUB_TOKEN/GH_TOKEN (#2331)
* feat(extensions,presets): authenticate GitHub-hosted catalog and download requests with GITHUB_TOKEN/GH_TOKEN

Squashed from #2087 (original author: @anasseth).

Adds GitHub-token authentication to extension and preset catalog fetching
and ZIP downloads so private GitHub repos work when GITHUB_TOKEN/GH_TOKEN
is set, while preventing credential leakage to non-GitHub hosts.

- Introduces shared _github_http module with build_github_request() and
  open_github_url() helpers
- Routes ExtensionCatalog and PresetCatalog network calls through
  GitHub-auth-aware opener
- Adds comprehensive unit/integration tests for auth header behavior
- Updates user docs for both extensions and presets

Co-authored-by: anasseth <16745089+anasseth@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(auth): address review feedback from #2087

- Fix redirect handler to preserve Authorization on GitHub-to-GitHub
  redirects (e.g. github.com → codeload.github.com). The previous
  implementation relied on super().redirect_request() which strips
  auth on cross-host redirects, breaking private repo archive downloads.
- Add codeload.github.com to documented host lists in both
  EXTENSION-USER-GUIDE.md and presets/README.md
- Add redirect auth-preservation and auth-stripping tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(auth): use Bearer scheme instead of token for consistency

Aligns with the rest of the codebase (e.g. __init__.py:1721) and
GitHub's current API guidance. Updates all test assertions accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address second round of Copilot review feedback

- Fix docstring to say Bearer instead of token (matches implementation)
- Remove unused imports/fixtures from redirect tests (GITHUB_HOSTS,
  MagicMock, temp_dir, monkeypatch)
- Replace __import__('io').BytesIO() with normal import io pattern
  in test_presets.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: anasseth <16745089+anasseth@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 14:17:40 -05:00
Copilot
a067d4c2e3 feat(presets): Composition strategies (prepend, append, wrap) for templates, commands, and scripts (#2133)
* fix: rebase onto upstream/main, resolve conflicts with PR #2189

upstream/main merged PR #2189 (wrap-only strategy) which overlaps with
our comprehensive composition strategies (prepend/append/wrap). Resolved
conflicts keeping our implementation as source of truth:

- README: keep our future considerations (composition is now fully
  implemented, not a future item)
- presets.py: keep our composition architecture (_reconcile_composed_commands,
  collect_all_layers, resolve_content) while preserving #2189's
  _substitute_core_template which is used by agents.py for skill
  generation
- tests: keep both test sets (our composition tests + #2189's wrap
  tests), removed TestReplayWrapsForCommand and
  TestInstallRemoveWrapLifecycle which test the superseded
  _replay_wraps_for_command API; our composition tests cover equivalent
  scenarios
- Restored missing _unregister_commands call in remove() that was lost
  during #2189 merge

* fix: re-create skill directory in _reconcile_skills after removal

After _unregister_skills removes a skill directory, _register_skills
skips writing because the dir no longer passes the is_dir() check.
Fix by ensuring the skill subdirectory exists before calling
_register_skills so the next winning preset's content gets registered.

Fixes the Claude E2E failure where removing a top-priority override
preset left skill-based agents without any SKILL.md file.

* fix: address twenty-third round of Copilot PR review feedback

- Protect reconciliation in remove(): wrap _reconcile_composed_commands
  and _reconcile_skills in try/except so failures emit a warning instead
  of leaving the project in an inconsistent state
- Protect reconciliation in install(): same pattern for post-install
  reconciliation so partial installs don't lack cleanup
- Inherit scripts/agent_scripts from base frontmatter: when composing
  commands, merge scripts and agent_scripts keys from the base command's
  frontmatter into the top layer's frontmatter if missing, preventing
  composed commands from losing required script references
- Add tier-5 bundled core fallback to collect_all_layers(): check the
  bundled core_pack (wheel) or repo-root templates (source checkout) when
  .specify/templates/ doesn't contain the core file, matching resolve()'s
  tier-5 fallback so composition can always find a base layer

* fix: address twenty-fourth round of Copilot PR review feedback

- Use yaml.safe_load for frontmatter parsing in resolve_content instead
  of CommandRegistrar.parse_frontmatter which uses naive find('---',3);
  strip strategy key from final frontmatter to prevent leaking internal
  composition directives into rendered agent command files
- Filter _reconcile_skills to specific commands: use _FilteredManifest
  wrapper so only the commands being reconciled get their skills updated,
  preventing accidental overwrites of other commands' skills that may be
  owned by higher-priority presets

* fix: address twenty-fifth round of Copilot PR review feedback

- Support legacy command-frontmatter strategy: when preset.yml doesn't
  declare a strategy, check the command file's YAML frontmatter for
  strategy: wrap as a fallback so legacy wrap presets participate in
  composition and multi-preset chaining
- Guard skill dir creation in _reconcile_skills: only re-create the
  skill directory if the skill was previously managed (listed in some
  preset's registered_skills), avoiding creation of new skill dirs
  that _register_skills would normally skip

* fix: add explanatory comment to empty except in legacy frontmatter parsing

* fix: address twenty-sixth round of Copilot PR review feedback

- Unregister stale commands when composition fails: when resolve_content
  returns None during reconciliation (base layer removed), unregister
  the command from non-skill agents and emit a warning
- Load extension aliases during reconciliation: _register_command_from_path
  now checks extension.yml for aliases when the winning layer is an
  extension, so alias files are restored after preset removal
- Use line-based fence detection for legacy frontmatter strategy fallback:
  scan for --- on its own line instead of split('---',2) to avoid
  mis-parsing YAML values containing ---

* fix: address twenty-seventh round of Copilot PR review feedback

- Handle non-preset winners in _reconcile_skills: when the winning
  layer is core/extension/project-override, restore skills via
  _unregister_skills so skill-based agents stay consistent with the
  priority stack
- Update base_frontmatter_text on replace layers: when a higher-priority
  replace layer occurs during composition, update both top and base
  frontmatter so scripts/agent_scripts inheritance reflects the
  effective base beneath the top composed layer

* fix: address twenty-eighth round of Copilot PR review feedback

- Parse only interior lines in _parse_fm_yaml: use lines[1:-1] instead
  of filtering all --- lines, preventing corruption when YAML values
  contain a line that is exactly ---
- Omit empty frontmatter: skip re-rendering when top_fm is empty dict
  to avoid emitting ---/{}/--- for intentionally empty frontmatter
- Update scaffold wrap comment: mention both {CORE_TEMPLATE} and
  $CORE_SCRIPT placeholders for templates/commands vs scripts
- Clarify shell composition scope in ARCHITECTURE.md: note that bash/PS1
  resolve_template_content only handles templates; command/script
  composition is handled by the Python resolver

* fix: address twenty-ninth round of Copilot PR review feedback

- Fix TestCollectAllLayers docstring: reference collect_all_layers()
- Add default/unknown strategy handling in bash/PS1 composition: error
  on unrecognized strategy values instead of silently skipping
- Fix comment: .composed/ is a persistent dir, not temporary
- Fix comment: legacy fallback checks all valid strategies, not just wrap
- Cache PresetRegistry in _reconcile_skills: build presets_by_priority
  once instead of constructing registry per-command

* fix: address thirtieth round of Copilot PR review feedback

- Guard legacy frontmatter fallback: only check command file frontmatter
  for strategy when the manifest entry doesn't explicitly include the
  strategy key, preventing override of manifest-declared strategies
- Document rollback limitation: note that mid-registration failures may
  leave orphaned agent command files since partial progress isn't
  captured by the local vars

* fix: handle project override skills and extension context in reconciliation

* fix: add comment to empty except in extension registration fallback

* fix: filter extension commands in reconciliation and fix type annotation

* fix: filter extension commands from post-install reconciliation

Apply the same extension-installed check used in _register_commands to
the reconciliation command list, preventing reconciliation from
registering commands for extensions that are not installed.

* fix: skip convention fallback for explicit file paths and add stem fallback to tier-5

When a preset manifest provides an explicit file path that does not
exist, skip the convention-based fallback to avoid masking typos.
Also add speckit.<stem> to <stem>.md fallback in tier-5 bundled/source
core lookup for consistency with tier-4.

* fix: scan past non-replace layers to find base in resolve_content

The base-finding scan now skips non-replace layers below a replace
layer instead of stopping at the first non-replace. This fixes the
case where a low-priority append/prepend layer sits below a replace
that should serve as the base for composition.

* fix: add context_note to non-skill agent registration for extensions

Add context_note parameter to register_commands_for_non_skill_agents
and pass extension name/id during reconciliation so rendered command
files preserve the extension-specific context markers.

* fix: Optional type, rollback safety, and override skill restoration

- Fix context_note type to Optional[str]
- Wrap shutil.rmtree in try/except during install rollback
- Separate override-backed skills from core/extension in _reconcile_skills

* fix: align bash/PS1 base-finding with Python resolver

Rewrite bash and PowerShell composition loops to find the effective
base replace layer first (scanning bottom-up, skipping non-replace
layers below it), then compose only from the base upward. This
prevents evaluation of irrelevant lower layers (e.g. a wrap with
no placeholder below a replace) and matches resolve_content behavior.

* fix: PS1 no-python warning, integration hook for override skills, alias cleanup

- Warn when no Python 3 found in PS1 and presets use composition strategies
- Apply post_process_skill_content integration hook when restoring
  override-backed skills so agent-specific flags are preserved
- Unregister command aliases alongside primary name when composition
  fails to prevent orphaned alias files

* fix: include aliases in removed_cmd_names during preset removal

Read aliases from preset manifest before deleting pack_dir so alias
command files are included in unregistration and reconciliation.

* fix: add comment to empty except in alias extraction during removal

* fix: scan top-down for effective base in all resolvers

Change base-finding to scan from highest priority downward to find the
nearest replace layer, then compose only layers above it. Prevents
evaluation of irrelevant lower layers (e.g. a wrap without placeholder
below a higher-priority replace) across Python, bash, and PowerShell.

* fix: align CLI composition chain display with top-down base-finding

Show only contributing layers (base and above) in preset resolve
output, matching resolve_content top-down semantics. Layers below
the effective base are omitted since they do not contribute.

* fix: guard corrupted registry entries and make manifest authoritative

- Add isinstance(meta, dict) guard in bash registry parsing so corrupted
  entries are skipped instead of breaking priority ordering
- Only use convention-based file lookup when the manifest does not list
  the requested template, making preset.yml authoritative and preventing
  stray on-disk files from creating unintended layers

* fix: align resolve() with manifest file paths and match extension context_note

- Update resolve() preset tier to consult manifest file paths before
  convention-based lookup, matching collect_all_layers behavior
- Use exact extension context_note format matching extensions.CommandRegistrar
- Update test to declare template in manifest (authoritative manifest)

* revert: restore resolve() convention-based behavior for backwards compatibility

resolve() is the existing public API used by shell scripts and other
callers. Changing it to manifest-authoritative breaks backward compat
for presets that rely on convention-based file lookup. Only the new
collect_all_layers/resolve_content path uses manifest-authoritative
logic.

* fix: only pre-compose when this preset is the top composing layer

Skip composition in _register_commands when a higher-priority replace
layer already wins for the command. Register the raw file instead and
let reconciliation write the correct final content.

* fix: deduplicate PyYAML warnings and use self.registry in reconciliation

- Emit PyYAML-missing warning once per function call in bash/PS1 instead
  of per-preset to avoid spamming stderr
- Use self.registry.list_by_priority() in reconciliation methods instead
  of constructing new PresetRegistry instances to avoid redundant I/O
  and potential consistency issues

* fix: document strategy handling consistency between layers and registrar

Composed output already strips strategy from frontmatter (resolve_content
pops it). Raw file registration preserves legacy frontmatter strategy
for backward compat; reconciliation corrects the final state.

* fix: correct stale comments for alias tracking and base-finding algorithm

* security: validate manifest file paths in bash/PowerShell resolvers

Reject absolute paths and parent directory traversal (..) in the
manifest-declared file field before joining with the preset directory.
Matches the Python-side validation in PresetManifest._validate().

---------

Co-authored-by: Manfred Riem <15701806+mnriem@users.noreply.github.com>
2026-04-23 10:07:52 -05:00
Kennedy
22e76995c7 feat: implement preset wrap strategy (#2189)
* feat: implement strategy: wrap

* fix: resolve merge conflict for strategy wrap correctness

* feat: multi-preset composable wrapping with priority ordering

Implements comment #4 from PR review: multiple installed wrap presets
now compose in priority order rather than overwriting each other.

Key changes:
- PresetResolver.resolve() gains skip_presets flag; resolve_core() wraps
  it to skip tier 2, preventing accidental nesting during replay
- _replay_wraps_for_command() recomposed all enabled wrap presets for a
  command in ascending priority order (innermost-first) after any
  install or remove
- _replay_skill_override() keeps SKILL.md in sync with the recomposed
  command body for ai-skills-enabled projects
- install_from_directory() detects strategy: wrap commands, stores
  wrap_commands in the registry entry, and calls replay after install
- remove() reads wrap_commands before deletion, removes registry entry
  before rmtree so replay sees post-removal state, then replays
  remaining wraps or unregisters when none remain

Tests: TestResolveCore (5), TestReplayWrapsForCommand (5),
TestInstallRemoveWrapLifecycle (5), plus 2 skill/alias regression tests

* fix: resolve extension commands via manifest file mapping

PresetResolver.resolve_extension_command_via_manifest() consults each
installed extension.yml to find the actual file declared for a command
name, rather than assuming the file is named <cmd_name>.md.  This fixes
_substitute_core_template for extensions like selftest where the manifest
maps speckit.selftest.extension → commands/selftest.md.

Resolution order in _substitute_core_template is now:
  1. resolve_core(cmd_name) — project overrides win, then name-based lookup
  2. resolve_extension_command_via_manifest(cmd_name) — manifest fallback
  3. resolve_core(short_name) — core template short-name fallback

Path traversal guard mirrors the containment check already present in
ExtensionManager to reject absolute paths or paths escaping the extension
root.

* fix: add bundled core_pack as Priority 5 in PresetResolver.resolve()

resolve_core() was returning None for built-in commands (implement,
specify, etc.) because PresetResolver only checked .specify/templates/
commands/ (Priority 4), which is never populated for commands in a
normal project. strategy:wrap presets rely on resolve_core() to fetch
the {CORE_TEMPLATE} body, so the wrap was silently skipped and SKILL.md
was never updated.

Priority 5 now checks core_pack/commands/ (wheel install) or
repo_root/templates/commands/ (source checkout), mirroring the pattern
used by _locate_core_pack() elsewhere.

Updated two tests whose assertions assumed resolve_core() always
returned None when .specify/templates/commands/ was absent.

* fix: harden preset wrap replay removal

* fix: stabilize existing directory error output

* fix: track outermost_pack_id from contributing preset; use Path.parts in tests

- outermost_pack_id now updates alongside outermost_frontmatter inside
  the wrap loop, so it reflects the actual last contributing preset
  rather than always taking wrap_presets[0] (which may have been skipped)
- Replace str(path) substring checks in TestResolveCore with Path.parts
  tuple comparisons for correct behaviour on Windows (CI runs windows-latest)

* fix: guard against non-mapping YAML manifests; apply integration post-processing in replay

- ExtensionManifest._load raises ValidationError for non-dict YAML roots instead of TypeError
- PresetManager._replay_wraps_for_command calls integration.post_process_skill_content,
  matching _register_skills behaviour
- PresetResolver skips extensions that raise OSError/TypeError/AttributeError on manifest load
- Tests: non-mapping YAML, OSError manifest skip, and replay integration post-processing
2026-04-21 15:02:31 -05:00
BachVQ
2568422085 fix(integrations): migrate Antigravity (agy) layout to .agents/ and deprecate --skills (#2276)
* refactor(agy): update storage directory from .agent to .agents

* feat: update Antigravity integration to use .agents/ directory layout and add version compatibility warnings

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: remove deprecated --skills flag from AgyIntegration and update related test assertions

* Update src/specify_cli/integrations/agy/__init__.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: update Antigravity integration requirement to v1.20.5 and remove obsolete tests

* test: update skills directory path from .agent to .agents in preset restoration test

* Update tests/integrations/test_integration_agy.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/integrations/test_integration_agy.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-20 11:57:45 -05:00
Manfred Riem
fc3d1244c0 fix: replace shell-based context updates with marker-based upsert (#2259)
* Replace shell-based context updates with marker-based upsert

Replace ~3500 lines of bash/PowerShell agent context update scripts
with a Python-based approach using <!-- SPECKIT START/END --> markers.

IntegrationBase now manages the agent context file directly:
- upsert_context_section(): creates or updates the marked section at
  init/install/switch time with a directive to read the current plan
- remove_context_section(): removes the section at uninstall, deleting
  the file only if it becomes empty
- __CONTEXT_FILE__ placeholder in command templates is resolved per
  integration so the plan command references the correct agent file
- context_file is persisted in init-options.json for extension access

The plan command template instructs the LLM to update the plan
reference between the markers in the agent context file.

Removed:
- scripts/bash/update-agent-context.sh (857 lines)
- scripts/powershell/update-agent-context.ps1 (515 lines)
- 56 integration wrapper scripts (update-context.sh/.ps1)
- templates/agent-file-template.md
- agent_scripts frontmatter key and {AGENT_SCRIPT} replacement logic
- update-context reference from integration.json
- tests/test_cursor_frontmatter.py (tested deleted scripts)

Added:
- upsert/remove context section methods on IntegrationBase
- __CONTEXT_FILE__ placeholder support in process_template()
- context_file field in init-options.json (init/switch/uninstall)
- Per-integration tests: context file correctness, plan reference,
  init-options persistence (78 new context_file tests)
- End-to-end CLI validation across all 28 integrations

* fix: search for end marker after start marker in context section methods

Address Copilot review: content.find(CONTEXT_MARKER_END) searched from
the start of the file rather than after the located start marker. If
the file contained a stray end marker before the start marker, the
wrong slice could be replaced.

Now both upsert_context_section() and remove_context_section() pass
start_idx as the second argument to find() and validate end_idx >
start_idx before performing the replacement.

* fix: address Copilot review feedback on context section handling

1. Fix grammar in _build_context_section() directive text — add commas
   for a complete sentence.

2. Resolve __CONTEXT_FILE__ in resolve_skill_placeholders() — skills
   generated via extensions/presets for codex/kimi now replace the
   placeholder using the context_file value from init-options.json.

3. Handle Cursor .mdc frontmatter — when creating a new .mdc context
   file, prepend alwaysApply: true YAML frontmatter so Cursor
   auto-loads the rules.

4. Fix empty-file leading newline — when the context file exists but
   is empty, write the section directly instead of prepending a blank
   line.

* fix: address second round of Copilot review feedback

1. Ensure .mdc frontmatter on existing files — upsert_context_section()
   now checks for missing YAML frontmatter on .mdc files during updates
   (not just creation), so pre-existing Cursor files get alwaysApply.

2. Guard against context_file=None — use 'or ""' instead of a default
   arg so explicit null values in init-options.json don't cause a
   TypeError in str.replace().

3. Clean up .mdc files on removal — remove_context_section() treats
   files containing only the Speckit-generated frontmatter block as
   empty, deleting them rather than leaving orphaned frontmatter.

* fix: address third round of Copilot review feedback

1. CRLF-safe .mdc frontmatter check — use lstrip().startswith('---')
   instead of startswith('---\n') so CRLF files don't get duplicate
   frontmatter.

2. CRLF-safe .mdc removal check — normalize line endings before
   comparing against the sentinel frontmatter string.

3. Call remove_context_section() during integration_uninstall() — the
   manifest-only uninstall was leaving the managed SPECKIT markers
   behind in the agent context file.

4. Fix stale docstring — remove 'agent_scripts' mention from
   test_lean_commands_have_no_scripts().

* fix: address fourth round of Copilot review feedback

1. Remove unused script_type parameter from _write_integration_json()
   and all 3 call sites — the parameter was no longer referenced after
   the update-context script removal.

2. Fix _build_context_section() docstring — correct example path from
   '.specify/plans/plan.md' to 'specs/<feature>/plan.md'.

3. Improve .mdc frontmatter-only detection in remove_context_section()
   — use regex to match any YAML frontmatter block (not just the exact
   Speckit-generated one), so .mdc files with additional frontmatter
   keys are also cleaned up when no body content remains.

* fix: handle corrupted markers and parse .mdc frontmatter robustly

1. Handle partial/corrupted markers in upsert_context_section() —
   if only the START marker exists (no END), replace from START
   through EOF. If only the END marker exists, replace from BOF
   through END. This keeps upsert idempotent even when a user
   accidentally deletes one marker.

2. Parse .mdc YAML frontmatter properly — new _ensure_mdc_frontmatter()
   helper parses existing frontmatter and ensures alwaysApply: true is
   set, rather than just checking for the --- delimiter. Handles
   missing frontmatter, existing frontmatter without alwaysApply, and
   already-correct frontmatter.

* fix: preserve .mdc frontmatter, add tests, clean up on switch

1. Rewrite _ensure_mdc_frontmatter() with regex — preserves comments,
   formatting, and custom keys in existing frontmatter instead of
   destructively re-serializing via yaml.safe_dump(). Inserts or
   fixes alwaysApply: true in place.

2. Add 6 focused .mdc frontmatter tests to cursor-agent test file:
   new file creation, missing frontmatter, preserved custom keys,
   wrong alwaysApply value, idempotent upserts, removal cleanup.

3. Call remove_context_section() during integration switch Phase 1 —
   prevents stale SPECKIT markers from being left in the old
   integration's context file. Also clear context_file from
   init-options during the metadata reset.

* fix: remove unused MDC_FRONTMATTER, preserve inline comments, normalize bare CR

1. Remove unused MDC_FRONTMATTER class variable — dead code after
   _ensure_mdc_frontmatter() was rewritten with regex.

2. Preserve inline comments when fixing alwaysApply — the regex
   substitution now captures trailing '# comment' text and keeps it.

3. Normalize bare CR in upsert_context_section() — match the
   behavior of remove_context_section() which already normalizes
   both CRLF and bare CR.

4. Clarify .mdc removal comment — 'treat frontmatter-only as empty'
   instead of misleading 'strip frontmatter'.

* fix: handle corrupted markers in remove, CRLF-safe end-marker consumption

1. Handle corrupted markers in remove_context_section() — mirror
   upsert's behavior: start-only removes start→EOF, end-only removes
   BOF→end. Previously bailed out leaving partial markers behind.

2. CRLF-safe end-marker consumption — both upsert and remove now
   handle \r\n after the end marker, not just \n. Prevents extra
   blank lines at replacement boundaries in CRLF files.

3. Clarify path rule in plan template — distinguish filesystem
   operations (absolute paths) from documentation/agent context
   references (project-relative paths).

* fix: only remove context section when both markers are well-ordered

remove_context_section() previously treated mismatched markers as
corruption and aggressively removed from BOF→end-marker or
start-marker→EOF, which could delete user-authored content if only
one marker remained. Now it only removes when both START and END
markers exist and are properly ordered, returning False otherwise.
2026-04-17 13:57:51 -05:00
Manfred Riem
8fc2bd3489 fix: allow Claude to chain skills for hook execution (#2227)
* fix: allow Claude to chain skills for hook execution (#2178)

- Set disable-model-invocation to false so Claude can invoke extension
  skills (e.g. speckit-git-feature) from within workflow skills
- Inject dot-to-hyphen normalization note into Claude SKILL.md hook
  sections so the model maps extension.yml command names to skill names
- Replace Unicode checkmark with ASCII [OK] in auto-commit scripts to
  fix PowerShell encoding errors on Windows
- Move Claude-specific frontmatter injection to ClaudeIntegration via
  post_process_skill_content() hook on SkillsIntegration, wired through
  presets and extensions managers
- Add positive and negative tests for all changes

Fixes #2178

* refactor: address PR review feedback

- Preserve line-ending style (CRLF/LF) in _inject_hook_command_note
  instead of always inserting \n, matching the convention used by other
  injection helpers in the same module.

- Extract duplicated _post_process_skill() from extensions.py and
  presets.py into a shared post_process_skill() function in agents.py.
  Both modules now import and call the shared helper.

* fix: match full hook instruction line in regex

The regex in _inject_hook_command_note only matched lines ending
immediately after 'output the following', but the actual template
lines continue with 'based on its `optional` flag:'. Use [^\r\n]*
to capture the rest of the line before the EOL.

* refactor: use integration object directly for post_process_skill_content

Instead of a free function in agents.py that re-resolves the
integration by key, callers in extensions.py and presets.py now
resolve the integration once via get_integration() and call
integration.post_process_skill_content() directly. The base
identity method lives on SkillsIntegration.
2026-04-15 14:35:05 -05:00
Michal Bachorik
33a28ec8f7 fix: unofficial PyPI warning (#1982) and legacy extension command name auto-correction (#2017) (#2027)
* docs: warn about unofficial PyPI packages and recommend version verification (#1982)

Clarify that only packages from github/spec-kit are official, and add
`specify version` as a post-install verification step to help users
catch accidental installation of an unrelated package with the same name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): auto-correct legacy command names instead of hard-failing (#2017)

Community extensions that predate the strict naming requirement use two
common legacy formats ('speckit.command' and 'extension.command').
Instead of rejecting them outright, auto-correct to the required
'speckit.{extension}.{command}' pattern and emit a compatibility warning
so authors know they need to update their manifest. Names that cannot be
safely corrected (e.g. single-segment names) still raise ValidationError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): isolate preset catalog search test from community catalog network calls

test_search_with_cached_data asserted exactly 2 results but was getting 4
because _get_merged_packs() queries the full built-in catalog stack
(default + community). The community catalog had no local cache and hit
the network, returning real presets. Writing a project-level
preset-catalogs.yml that pins the test to the default URL only makes
the count assertions deterministic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): extend auto-correction to aliases (#2017)

The upstream #1994 added alias validation in _collect_manifest_command_names,
which also rejected legacy 2-part alias names (e.g. 'speckit.verify').
Extend the same auto-correction logic from _validate() to cover aliases,
so both 'speckit.command' and 'extension.command' alias formats are
corrected to 'speckit.{ext_id}.{command}' with a compatibility warning
instead of hard-failing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): address PR review feedback (#2017)

- _try_correct_command_name: only correct 'X.Y' to 'speckit.ext_id.Y'
  when X matches ext_id, preventing misleading warnings followed by
  install failure due to namespace mismatch
- _validate: add aliases type/string guards matching _collect_manifest
  _command_names defensive checks
- _validate: track command renames and rewrite any hook.*.command
  references that pointed at a renamed command, emitting a warning
- test: fix test_command_name_autocorrect_no_speckit_prefix to use
  ext_id matching the legacy namespace; add namespace-mismatch test
- test: replace redundant preset-catalogs.yml isolation with
  monkeypatch.delenv("SPECKIT_PRESET_CATALOG_URL") so the env var
  cannot bypass catalog restriction in CI environments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update docs/installation.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(extensions): warn when hook command refs are silently canonicalized; fix grammar

- Hook rewrites (alias-form or rename-map) now always emit a warning so
  extension authors know to update their manifests. Previously only
  rename-map rewrites produced a warning; pure alias-form lifts were
  silent.
- Pluralize "command/commands" in the uninstall confirmation message so
  single-command extensions no longer print "1 commands".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): raise ValidationError for non-dict hook entries

Silently skipping non-dict hook entries left them in manifest.hooks,
causing HookExecutor.register_hooks() to crash with AttributeError
when it called hook_config.get() on a non-mapping value.

Also updates PR description to accurately reflect the implementation
(no separate _try_correct_alias_name helper; aliases use the same
_try_correct_command_name path).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): derive remove cmd_count from registry, fix wording

Previously cmd_count used len(ext_manifest.commands) which only counted
primary commands and missed aliases. The registry's registered_commands
already tracks every command name (primaries + aliases) per agent, so
max(len(v) for v in registered_commands.values()) gives the correct
total.

Also changes "from AI agent" → "across AI agents" since remove()
unregisters commands from all detected agents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): distinguish missing vs empty registered_commands in remove prompt

Using get() without a default lets us tell apart:
- key missing (legacy registry entry) → fall back to manifest count
- key present but empty dict (installed with no agent dirs) → show 0

Previously the truthiness check `if registered_commands and ...` treated
both cases the same, so an empty dict fell back to len(manifest.commands)
and overcounted commands that would actually be removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): clarify removal prompt wording to 'per agent'

'across AI agents' implied a total count, but cmd_count uses max()
across agents (per-agent count). Using sum() would double-count since
users think in logical commands, not per-agent files. 'per agent'
accurately describes what the number represents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): clarify cmd_count comment — per-agent max, not total

The comment said 'covers all agents' implying a total, but cmd_count uses
max() across agents (per-agent count). Updated comment to explain the
max() choice and why sum() would double-count.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(extensions): add CLI tests for remove confirmation pluralization

Adds TestExtensionRemoveCLI with two CliRunner tests:
- singular: 1 registered command → '1 command per agent'
- plural:   2 registered commands → '2 commands per agent'

These prevent regressions on the cmd_count pluralization logic
and the 'per agent' wording introduced in this PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agents): remove orphaned SKILL.md parent dirs on unregister

For SKILL.md-based agents (codex, kimi), each command lives in its own
subdirectory (e.g. .agents/skills/speckit-ext-cmd/SKILL.md). The previous
unregister_commands() only unlinked the file, leaving an empty parent dir.

Now attempts rmdir() on the parent when it differs from the agent commands
dir. OSError is silenced so non-empty dirs (e.g. user files) are safely left.

Adds test_unregister_skill_removes_parent_directory to cover this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): drop alias pattern enforcement from _validate()

Aliases are intentionally free-form to preserve community extension
compatibility (e.g. 'speckit.verify' short aliases used by spec-kit-verify
and other existing extensions). This aligns _validate() with the intent of
upstream commit 4deb90f (fix: restore alias compatibility, #2110/#2125).

Only type and None-normalization checks remain for aliases. Pattern
enforcement continues for primary command names only.

Updated tests to verify free-form aliases pass through unchanged with
no warnings instead of being auto-corrected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(extensions): guard against non-dict command entries in _validate()

If provides.commands contains a non-mapping entry (e.g. an int or string),
'name' not in cmd raises TypeError instead of a user-facing ValidationError.
Added isinstance(cmd, dict) check at the top of the loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: iamaeroplane <michal.bachorik@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-15 07:35:49 -05:00
Manfred Riem
43cb0fa7ab feat: add bundled lean preset with minimal workflow commands (#2161)
* feat: add bundled lean preset with minimal workflow commands

Add a lean preset that overrides the 5 core workflow commands (specify,
plan, tasks, implement, constitution) with minimal prompts that produce
exactly one artifact each — no extension hooks, no scripts, no git
branching, no templates.

Bundled preset infrastructure:
- Add _locate_bundled_preset() mirroring _locate_bundled_extension()
- Update 'specify init --preset' to try bundled -> catalog fallback
- Update 'specify preset add' to try bundled -> catalog fallback
- Add bundled guard in download_pack() for presets without download URLs
- Add lean to presets/catalog.json with 'bundled: true' marker
- Add lean to pyproject.toml force-include for wheel packaging
- Align error messages with bundled extension error pattern

Tests: 15 new tests (TestLeanPreset + TestBundledPresetLocator)

* refactor: address review — clean up unused imports, strengthen test assertions

- Remove unused MagicMock import and cache_dir setup in download test
- Assert 'bundled' and 'reinstall' in CLI error output (not just exit code)
- Mock catalog in missing-locally test for deterministic bundled error path
- Fix test versions to satisfy updated speckit_version >=0.6.0 requirement

* refactor: address review — fix constitution paths, add REINSTALL_COMMAND to presets.py

- Fix constitution path to .specify/memory/constitution.md in plan, tasks,
  implement commands (matching core command convention)
- Include REINSTALL_COMMAND in download_pack() bundled guard for consistent
  recovery instructions across bundled extensions and presets

* refactor: address review — explicit feature_directory paths, ZIP cleanup in finally

- Prefix spec.md/plan.md/tasks.md with <feature_directory>/ in plan, tasks,
  and implement commands so the agent doesn't operate on repo root by mistake
- Move ZIP unlink into finally block in init --preset path so cleanup runs
  even when install_from_zip raises (matching preset_add pattern)

* refactor: address review — replace Unicode em dashes with ASCII, fix grammar

- Replace all Unicode em dashes with ASCII hyphens in preset.yml and
  catalog.json to avoid decode errors on non-UTF-8 environments
- Fix grammar: 'store it in tasks.md' -> 'store them in tasks.md'

* refactor: address review - align task format between tasks and implement

- Remove undefined [P] marker from implement (lean uses sequential execution)
- Clarify checkbox update: 'change - [ ] to - [x]' instead of ambiguous '[X]'
- Simplify implement to execute tasks in order without parallel complexity

* refactor: address review - parse frontmatter instead of raw substring search

- Use CommandRegistrar.parse_frontmatter() to check for scripts/agent_scripts
  keys in YAML frontmatter instead of brittle 'scripts:' substring search
2026-04-10 16:18:06 -05:00
Manfred Riem
b1832c9477 Stage 6: Complete migration — remove legacy scaffold path (#1924) (#2063)
* Stage 6: Complete migration — remove legacy scaffold path (#1924)

Remove the legacy GitHub download and offline scaffold code paths.
All 26 agents now use the integration system exclusively.

Code removal (~1073 lines from __init__.py):
- download_template_from_github(), download_and_extract_template()
- scaffold_from_core_pack(), _locate_release_script()
- install_ai_skills(), _get_skills_dir (restored slim version for presets)
- _has_bundled_skills(), _migrate_legacy_kimi_dotted_skills()
- AGENT_SKILLS_MIGRATIONS, _handle_agent_skills_migration()
- _parse_rate_limit_headers(), _format_rate_limit_error()
- Three-way branch in init() collapsed to integration-only

Config derivation (single source of truth):
- AGENT_CONFIG derived from INTEGRATION_REGISTRY (replaced 180-line dict)
- CommandRegistrar.AGENT_CONFIGS derived from INTEGRATION_REGISTRY (replaced 160-line dict)
- Backward-compat constants kept for presets/extensions: SKILL_DESCRIPTIONS,
  NATIVE_SKILLS_AGENTS, DEFAULT_SKILLS_DIR

Release pipeline cleanup:
- Deleted create-release-packages.sh/.ps1 (948 lines of ZIP packaging)
- Deleted create-github-release.sh, generate-release-notes.sh
- Deleted simulate-release.sh, get-next-version.sh, update-version.sh
- Removed .github/workflows/scripts/ directory entirely
- release.yml is now self-contained: check, notes, release all inlined
- Install instructions use uv tool install with version tag

Test cleanup:
- Deleted test_ai_skills.py (tested removed functions)
- Deleted test_core_pack_scaffold.py (tested removed scaffold)
- Cleaned test_agent_config_consistency.py (removed 19 release-script tests)
- Fixed test_branch_numbering.py (removed dead monkeypatches)
- Updated auto-promote tests (verify files created, not tip messages)

1089 tests pass, 0 failures, ruff clean.

* fix: resolve merge conflicts with #2051 (claude as skills)

- Fix circular import: move CommandRegistrar import in claude
  integration to inside method bodies (was at module level)
- Lazy-populate AGENT_CONFIGS via _ensure_configs() to avoid
  circular import at class definition time
- Set claude registrar_config to .claude/commands (extension/preset
  target) since the integration handles .claude/skills in setup()
- Update tests from #2051 to match: registrar_config assertions,
  remove --integration tip assertions, remove install_ai_skills mocks

1086 tests pass.

* fix: properly preserve claude skills migration from #2051

Restore ClaudeIntegration.registrar_config to .claude/skills (not
.claude/commands) so extension/preset registrations write to the
correct skills directory.

Update tests that simulate claude setup to use .claude/skills and
check for SKILL.md layout. Some tests still need updating for the
full skills path — 10 remaining failures from the #2051 test
expectations around the extension/preset skill registration flow.

WIP: 1076/1086 pass.

* fix: properly handle SKILL.md paths in extension update rollback and tests

Fix extension update rollback using _compute_output_name() for SKILL.md
agents (converts dots to hyphens in skill directory names). Previously
the backup and cleanup code constructed paths with raw command names
(e.g. speckit.test-ext.hello/SKILL.md) instead of the correct computed
names (speckit-test-ext-hello/SKILL.md).

Test fixes for claude skills migration:
- Update claude tests to use .claude/skills paths and SKILL.md layout
- Use qwen (not claude) for skills-guard tests since claude's agent dir
  IS the skills dir — creating it triggers command registration
- Fix test_extension_command_registered_when_extension_present to check
  skills path format

1086 tests pass, 0 failures, ruff clean.

* fix: address PR review — lazy init, assertions, deprecated flags

- _ensure_configs(): catch ImportError (not Exception), don't set
  _configs_loaded on failure so retries work
- Move _ensure_configs() before unregister loop (not inside it)
- Module-level try/except catches ImportError specifically
- Remove tautology assertion (or True) in test_extensions.py
- Strengthen preset provenance assertion to check source: field
- Mark --offline, --skip-tls, --debug, --github-token as hidden
  deprecated no-ops in init()

1086 tests pass.

* fix: remove deleted release scripts from pyproject.toml force-include

Removes force-include entries for create-release-packages.sh/.ps1
which were deleted but still referenced in [tool.hatch.build].
2026-04-02 12:34:34 -05:00
Andrii Furmanets
a858c1d6da Install Claude Code as native skills and align preset/integration flows (#2051)
* Use Claude skills for generated commands

* Fix Claude integration and preset skill flows

* Group Claude tests in integration suite

* Align Claude skill frontmatter across generators

* Fix native skill preset cleanup

* Keep legacy AI skills test on legacy path

* Move Claude here-mode test to CLI suite
2026-04-02 09:44:48 -05:00
Manfred Riem
4f9d966beb Stage 5: Skills, Generic & Option-Driven Integrations (#1924) (#2052)
* Stage 5: Skills, Generic & Option-Driven Integrations (#1924)

Add SkillsIntegration base class and migrate codex, kimi, agy, and
generic to the integration system.

Integrations:
- SkillsIntegration(IntegrationBase) in base.py — creates
  speckit-<name>/SKILL.md layout matching release ZIP output byte-for-byte
- CodexIntegration — .agents/skills/, --skills default=True
- KimiIntegration — .kimi/skills/, --skills + --migrate-legacy options,
  dotted→hyphenated skill directory migration
- AgyIntegration — .agent/skills/, skills-only (commands deprecated v1.20.5)
- GenericIntegration — user-specified --commands-dir, MarkdownIntegration
- All four have update-context.sh/.ps1 scripts
- All four registered in INTEGRATION_REGISTRY

CLI changes:
- --ai <agent> auto-promotes to integration path for all registered agents
- Interactive agent selection also auto-promotes (bug fix)
- --ai-skills and --ai-commands-dir show deprecation notices on integration path
- Next-steps display shows correct skill invocation syntax for skills integrations
- agy added to CommandRegistrar.AGENT_CONFIGS

Tests:
- test_integration_base_skills.py — reusable mixin with setup, frontmatter,
  directory structure, scripts, CLI auto-promote, and complete file inventory
  (sh+ps) tests
- Per-agent test files: test_integration_{codex,kimi,agy,generic}.py
- Kimi legacy migration tests, generic --commands-dir validation
- Registry updated with Stage 5 keys
- Removed 9 dead-mock tests, moved 4 integration tests to proper locations
- Fixed all bare project-name tests to use tmp_path
- Fixed 6 pre-existing ANSI escape code test failures in test_extensions.py
  and test_presets.py

1524 tests pass, 0 failures.

* fix: remove unused variable flagged by ruff (F841)

* fix: address PR review — integration-type-aware deprecation messages and early generic validation

- --ai-skills deprecation message now distinguishes SkillsIntegration
  ("skills are the default") from command-based integrations ("has no effect")
- --ai-commands-dir validation for generic runs even when auto-promoted,
  giving clear CLI error instead of late ValueError from setup()
- Resolves review comments from #2052

* fix: address PR review round 2

- Remove unused SKILL_DESCRIPTIONS dict from base.py (dead code after
  switching to template descriptions for ZIP parity)
- Narrow YAML parse catch from Exception to yaml.YAMLError
- Remove unused shutil import from test_integration_kimi.py
- Remove unused _REGISTRAR_EXEMPT class attr from test_registry.py
- Reword --ai-commands-dir deprecation to be actionable
- Update generic validation error to mention both --ai and --integration

* fix: address PR review round 3

- Clarify parsed_options forwarding is intentional (all options passed,
  integrations decide what to use)
- Extract _strip_ansi() helper in test_extensions.py and test_presets.py
- Remove unused pytest import (test_cli.py), unused locals (test_integration_base_skills.py)
- Reword --ai-commands-dir deprecation to be actionable without referencing
  the not-yet-implemented --integration-options

* fix: address PR review round 4

- Reorder kimi migration: run super().setup() first so hyphenated
  targets exist, then migrate dotted dirs (prevents user content loss)
- Move _strip_ansi() to shared tests/conftest.py, import from there
  in test_extensions.py, test_presets.py, test_ai_skills.py
- Remove now-unused re imports from all three test files

* fix: address PR review round 5

- Use write_bytes() for LF-only newlines (no CRLF on Windows)
- Add --integration-options CLI parameter — raw string passed through
  to the integration via opts['raw_options']; the integration owns
  parsing of its own options
- GenericIntegration.setup() reads --commands-dir from raw_options
  when not in parsed_options (supports --integration-options="...")
- Skip early --ai-commands-dir validation when --integration-options
  is provided (integration validates in its own setup())
- Remove parse_integration_options from core — integrations parse
  their own options

* fix: address PR review round 6

- GenericIntegration is now stateless: removed self._commands_dir
  instance state, overrides setup() directly to compute destination
  from parsed_options/raw_options on the stack
- commands_dest() raises by design (stateless singleton)
- _quote() in SkillsIntegration now escapes backslashes and double
  quotes to produce valid YAML even with special characters

* fix: address PR review round 7

- Support --commands-dir=value form in raw_options parsing (not just
  --commands-dir value with space separator)
- Normalize CRLF to LF in write_file_and_record() before encoding
- Persist ai_skills=True in init-options.json when using a
  SkillsIntegration, so extensions/presets emit SKILL.md overrides
  correctly even without explicit --ai-skills flag
2026-04-02 08:00:12 -05:00
Hamilton Snow
ccc44dd00a Unify Kimi/Codex skill naming and migrate legacy dotted Kimi dirs (#1971)
* fix: unify hyphenated skills and migrate legacy kimi dotted dirs

* fix: preserve legacy kimi dotted preset skill overrides

* fix: migrate kimi legacy dotted skills without ai-skills flag

* fix: harden kimi migration and cache hook init options

* fix: apply kimi preset skill overrides without ai-skills flag

* fix: keep sequential branch numbering beyond 999

* test: align kimi scaffold skill path with hyphen naming

* chore: align hook typing and preset skill comment

* fix: restore AGENT_SKILLS_DIR_OVERRIDES compatibility export

* refactor: remove AGENT_SKILLS_DIR_OVERRIDES and update callers

* fix(ps1): support sequential branch numbers above 999

* fix: resolve preset skill placeholders for skills agents

* Fix legacy kimi migration safety and preset skill dir checks

* Harden TOML rendering and consolidate preset skill restore parsing

* Fix PowerShell overflow and hook message fallback for empty invocations

* Restore preset skills from extensions

* Refine preset skill restore helpers

* Harden skill path and preset checks

* Guard non-dict init options

* Avoid deleting unmanaged preset skill dirs

* Unify extension skill naming with hooks

* Harden extension native skill registration

* Normalize preset skill titles
2026-03-26 10:53:30 -05:00
Dhilip
36019ebf1b feat: Auto-register ai-skills for extensions whenever applicable (#1840)
* feat: Auto-register ai-skills for extensions whenever applicable

* fix: failing test

* fix: address copilot review comments – path traversal guard and use short_name in title

* fix: address remaining copilot review comments – is_file guard, skills type-validation, and exact extension ownership check on fallback rmtree

* fix: address copilot round-3 comments – align skill naming with presets.py convention, safe rmdir on fail, require SKILL.md for fallback rmtree, normalize skill_count in CLI

* fix: is_dir() guard in fast-path rmtree and fix ghost-skill assertion naming

* fix: path-traversal guard on skill_name in both rmtree paths of _unregister_extension_skills

* fix: add SKILL.md ownership check to fast-path rmtree and alias shadowed _get_skills_dir import
2026-03-25 07:48:36 -05:00
Manfred Riem
00e5dc1f91 Add AIDE, Extensify, and Presetify to community extensions (#1961)
* Add AIDE, Extensify, and Presetify to community extensions

Add three extensions from the mnriem/spec-kit-extensions repository:

- AI-Driven Engineering (AIDE): structured 7-step workflow for building
  new projects from scratch with AI assistants
- Extensify: create and validate extensions and extension catalogs
- Presetify: create and validate presets and preset catalogs

Updates both the README community extensions table and
catalog.community.json with entries in alphabetical order.

* fix(tests): isolate preset search test from community catalog growth

Mock get_active_catalogs to return only the default catalog entry so
the test uses only its own cached data and won't break as the
community preset catalog grows.
2026-03-24 13:18:30 -05:00
Michal Bachorik
2bf655e261 feat(presets): add enable/disable toggle and update semantics (#1891)
* feat(presets): add enable/disable toggle and update semantics

Add preset enable/disable CLI commands and update semantics to match
the extension system capabilities.

Changes:
- Add `preset enable` and `preset disable` CLI commands
- Add `restore()` method to PresetRegistry for rollback scenarios
- Update `get()` and `list()` to return deep copies (prevents mutation)
- Update `list_by_priority()` to filter disabled presets by default
- Add input validation to `restore()` for defensive programming
- Add 16 new tests covering all functionality and edge cases

Closes #1851
Closes #1852

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address PR review - deep copy and error message accuracy

- Fix error message in restore() to match actual validation ("dict" not "non-empty dict")
- Use copy.deepcopy() in restore() to prevent caller mutation
- Apply same fixes to ExtensionRegistry for parity
- Add /defensive-check command for pre-PR validation
- Add tests for restore() validation and deep copy behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* revert: remove defensive-check command from PR

* fix: address PR review - clarify messaging and add parity

- Add note to enable/disable output clarifying commands/skills remain active
- Add include_disabled parameter to ExtensionRegistry.list_by_priority for parity
- Add tests for extension disabled filtering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address PR review - disabled extension resolution and corrupted entries

- Fix _get_all_extensions_by_priority to use include_disabled=True for tracking
  registered IDs, preventing disabled extensions from being picked up as
  unregistered directories
- Add corrupted entry handling to get() - returns None for non-dict entries
- Add integration tests for disabled extension template resolution
- Add tests for get() corrupted entry handling in both registries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: handle corrupted registry in list() methods

- Add defensive handling to list() when presets/extensions is not a dict
- Return empty dict instead of crashing on corrupted registry
- Apply same fix to both PresetRegistry and ExtensionRegistry for parity
- Add tests for corrupted registry handling

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: validate top-level registry structure in get() and restore()

- get() now validates self.data["presets/extensions"] is a dict before accessing
- restore() ensures presets/extensions dict exists before writing
- Prevents crashes when registry JSON is parseable but has corrupted structure
- Applied same fixes to both PresetRegistry and ExtensionRegistry for parity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: validate root-level JSON structure in _load() and is_installed()

- _load() now validates json.load() result is a dict before returning
- is_installed() validates presets/extensions is a dict before checking membership
- Prevents crashes when registry file is valid JSON but wrong type (e.g., array)
- Applied same fixes to both registries for parity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: normalize presets/extensions field in _load()

- _load() now normalizes the presets/extensions field to {} if not a dict
- Makes corrupted registries recoverable for add/update/remove operations
- Applied same fix to both registries for parity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use raw registry keys to track corrupted extensions

- Use registry.list().keys() instead of list_by_priority() for tracking
- Corrupted entries are now treated as tracked, not picked up as unregistered
- Tighten test assertion for disabled preset resolution
- Update test to match new expected behavior for corrupted entries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: handle None metadata in ExtensionManager.remove()

- Add defensive check for corrupted metadata in remove()
- Match existing pattern in PresetManager.remove()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add keys() method and filter corrupted entries in list()

- Add lightweight keys() method that returns IDs without deep copy
- Update list() to filter out non-dict entries (match type contract)
- Use keys() instead of list().keys() for performance
- Fix comment to reflect actual behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address defensive-check findings - deep copy, corruption guards, parity

- Extension enable/disable: use delta pattern matching presets
- add(): use copy.deepcopy(metadata) in both registries
- remove(): guard outer field for corruption in both registries
- update(): guard outer field for corruption in both registries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: deep copy updates in update() to prevent caller mutation

Both PresetRegistry.update() and ExtensionRegistry.update() now deep
copy the input updates/metadata dict to prevent callers from mutating
nested objects after the call.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: iamaeroplane <michal.bachorik@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-19 07:48:48 -05:00
Michal Bachorik
d2ecf6560d feat(extensions,presets): add priority-based resolution ordering (#1855)
* feat(extensions,presets): add priority-based resolution ordering

Add priority field to extension and preset registries for deterministic
template resolution when multiple sources provide the same template.

Extensions:
- Add `list_by_priority()` method to ExtensionRegistry
- Add `--priority` option to `extension add` command
- Add `extension set-priority` command
- Show priority in `extension list` and `extension info`
- Preserve priority during `extension update`
- Update RFC documentation

Presets:
- Add `preset set-priority` command
- Show priority in `preset info` output
- Use priority ordering in PresetResolver for extensions

Both systems:
- Lower priority number = higher precedence (default: 10)
- Backwards compatible with legacy entries (missing priority defaults to 10)
- Comprehensive test coverage including backwards compatibility

Closes #1845
Closes #1854

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review feedback

- list_by_priority(): add secondary sort by ID for deterministic ordering,
  return deep copies to prevent mutation
- install_from_directory/zip: validate priority >= 1 early
- extension add CLI: validate --priority >= 1 before install
- PresetRegistry.update(): preserve installed_at timestamp
- Test assertions: use exact source string instead of substring match

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address additional review feedback

- PresetResolver: add fallback to directory scanning when registry is
  empty/corrupted for robustness and backwards compatibility
- PresetRegistry.update(): add guard to prevent injecting installed_at
  when absent in existing entry (mirrors ExtensionRegistry behavior)
- RFC: update extension list example to match actual CLI output format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: restore defensive code and RFC descriptions lost in rebase

- Restore defensive code in list_by_priority() with .get() and isinstance check
- Restore detailed --from URL and --dev option descriptions in RFC

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add defensive code to presets list_by_priority()

- Add .get() and isinstance check for corrupted/empty registry
- Move copy import to module level (remove local import)
- Matches defensive pattern used in extensions.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address reviewer feedback on priority resolution

- Rename _normalize_priority to normalize_priority (public API)
- Add comprehensive tests for normalize_priority function (9 tests)
- Filter non-dict metadata entries in list_by_priority() methods
- Fix extension priority resolution to merge registered and unregistered
  extensions into unified sorted list (unregistered get implicit priority 10)
- Add tests for extension priority resolution ordering (4 tests)

The key fix ensures unregistered extensions with implicit priority 10
correctly beat registered extensions with priority > 10, and vice versa.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: DRY refactor and strengthen test assertions

- Extract _get_all_extensions_by_priority() helper in PresetResolver
  to eliminate duplicated extension list construction
- Add priority=10 assertion to test_legacy_extension_without_priority_field
- Add priority=10 assertion to test_legacy_preset_without_priority_field

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add isinstance(dict) checks for corrupted registry entries

Add defensive checks throughout CLI commands and manager methods
to handle cases where registry entries may be corrupted (non-dict
values). This prevents AttributeError when calling .get() on
non-dict metadata.

Locations fixed:
- __init__.py: preset/extension info, set-priority, enable/disable,
  upgrade commands
- extensions.py: list_installed()
- presets.py: list_installed()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: normalize priority display to match resolution behavior

Use normalize_priority() for all priority display in CLI commands
to ensure displayed values match actual resolution behavior when
registry data is corrupted/hand-edited.

Locations fixed:
- extensions.py: list_installed()
- presets.py: list_installed(), PresetResolver
- __init__.py: preset info, extension info, set-priority commands

Also added GraphQL query for unresolved PR comments to CLAUDE.md.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: repair corrupted priority values in set-priority commands

Changed set-priority commands to check if the raw stored value is
already a valid int equal to the requested priority before skipping.
This ensures corrupted values (e.g., "high") get repaired even when
setting to the default priority (10).

Also removed CLAUDE.md that was accidentally added to the repo.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: harden registry update methods against corrupted entries

- Normalize priority when restoring during extension update to prevent
  propagating corrupted values (e.g., "high", 0, negative)
- Add isinstance(dict) checks in ExtensionRegistry.update() and
  PresetRegistry.update() to handle corrupted entries (string/list)
  that would cause TypeError on merge

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use safe fallback for version in list_installed()

When registry entry is corrupted (non-dict), metadata becomes {} after
the isinstance check. Use metadata.get("version", manifest.version)
instead of metadata["version"] to avoid KeyError.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: iamaeroplane <michal.bachorik@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-17 09:44:34 -05:00
Copilot
69ee7a836e feat(presets): Pluggable preset system with catalog, resolver, and skills propagation (#1787)
* Initial plan

* feat(templates): add pluggable template system with packs, catalog, resolver, and CLI commands

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* test(templates): add comprehensive unit tests for template pack system

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* feat(presets): pluggable preset system with template/command overrides, catalog, and resolver

- Rename 'template packs' to 'presets' to avoid naming collision with core templates
- PresetManifest, PresetRegistry, PresetManager, PresetCatalog, PresetResolver in presets.py
- Extract CommandRegistrar to agents.py as shared infrastructure
- CLI: specify preset list/add/remove/search/resolve/info
- CLI: specify preset catalog list/add/remove
- --preset option on specify init
- Priority-based preset stacking (--priority, lower = higher precedence)
- Command overrides registered into all detected agent directories (17+ agents)
- Extension command safety: skip registration if target extension not installed
- Multi-catalog support: env var, project config, user config, built-in defaults
- resolve_template() / Resolve-Template in bash/PowerShell scripts
- Self-test preset: overrides all 6 core templates + 1 command
- Scaffold with 4 examples: core/extension template and command overrides
- Preset catalog (catalog.json, catalog.community.json)
- Documentation: README.md, ARCHITECTURE.md, PUBLISHING.md
- 110 preset tests, 253 total tests passing

* feat(presets): propagate command overrides to skills via init-options

- Add save_init_options() / load_init_options() helpers that persist
  CLI flags from 'specify init' to .specify/init-options.json
- PresetManager._register_skills() overwrites SKILL.md files when
  --ai-skills was used during init and corresponding skill dirs exist
- PresetManager._unregister_skills() restores core template content
  on preset removal
- registered_skills stored in preset registry metadata
- 8 new tests covering skill override, skip conditions, and restore

* fix: address PR check failures (ruff F541, CodeQL URL substring)

- Remove extraneous f-prefix from two f-strings without placeholders
- Replace substring URL check in test with startswith/endswith assertions
  to satisfy CodeQL incomplete URL substring sanitization rule

* fix: address Copilot PR review comments

- Move save_init_options() before preset install so skills propagation
  works during 'specify init --preset --ai-skills'
- Clean up downloaded ZIP after successful preset install during init
- Validate --from URL scheme (require HTTPS, HTTP only for localhost)
- Expose unregister_commands() on extensions.py CommandRegistrar wrapper
  instead of reaching into private _registrar field
- Use _get_merged_packs() for search() and get_pack_info() so all
  active catalogs are searched, not just the highest-priority one
- Fix fetch_catalog() cache to verify cached URL matches current URL
- Fix PresetResolver: script resolution uses .sh extension, consistent
  file extensions throughout resolve(), and resolve_with_source()
  delegates to resolve() to honor template_type parameter
- Fix bash common.sh: fall through to directory scan when python3
  returns empty preset list
- Fix PowerShell Resolve-Template: filter out dot-folders and sort
  extensions deterministically

* fix: narrow empty except blocks and add explanatory comments

* fix: address Copilot PR review comments (round 2)

- Fix init --preset error masking: distinguish "not found" from real errors
- Fix bash resolve_template: skip hidden dirs in extensions (match Python/PS)
- Fix temp dir leaks in tests: use temp_dir fixture instead of mkdtemp
- Fix self-test catalog entry: add note that it's local-only (no download_url)
- Fix Windows path issue in resolve_with_source: use Path.relative_to()
- Fix skill restore path: use project's .specify/templates/commands/ not source tree
- Add encoding="utf-8" to all file read/write in agents.py
- Update test to set up core command templates for skill restoration

* fix: remove self-test from catalog.json (local-only preset)

* fix: address Copilot PR review comments (round 3)

- Fix PS Resolve-Template fallback to skip dot-prefixed dirs (.cache)
- Rename _catalog to _catalog_name for consistency with extension system
- Enforce install_allowed policy in CLI preset add and download_pack()
- Fix shell injection: pass registry path via env var instead of string interpolation

* fix: correct PresetError docstring from template to preset

* Removed CHANGELOG requirement

* Applying review recommendations

* Applying review recommendations

* Applying review recommendations

* Applying review recommendations

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
2026-03-13 15:09:14 -05:00