Files
github-spec-kit/tests/test_extension_update_hardening.py
darion-yaphet 826e193cee refactor: move extension command handlers to extensions/_commands.py (PR-7/8) (#3014)
* refactor: move extension command handlers to extensions/_commands.py (PR-7/8)

Convert the flat extensions.py module into an extensions/ package and
extract all extension_app and catalog_app command handlers plus their
private helpers (_resolve_installed_extension, _resolve_catalog_extension,
_print_extension_info) out of __init__.py into the new
extensions/_commands.py, mirroring the domain-dir layout used for
presets/_commands.py (PR-6) and integrations/_commands.py (PR-5).

- extensions.py -> extensions/__init__.py (pure rename, 99%); intra-module
  relative imports bumped from `.x` to `..x` since they reference root
  siblings.
- Root helpers (_require_specify_project, _locate_bundled_extension,
  load_init_options, _display_project_path) are reached through thin shims
  that re-fetch from the parent package at call time, so test
  monkeypatching of specify_cli.<helper> keeps working unchanged.
- __init__.py drops ~1444 lines (3511 -> 2067); CLI surface preserved via
  register(app).

No behavior change. Full suite failure set is identical before/after
(82 pre-existing env failures, 0 new).

* fix(extensions): preserve per-command path in update backup for skills agents

Skills agents (extension == "/SKILL.md") name every command file SKILL.md,
each in its own per-command subdir (e.g. speckit-plan/SKILL.md). The update
backup keyed the backup path on cmd_file.name alone, so all of an agent's
skill files collided onto a single backup path — each shutil.copy2 overwrote
the previous one, and rollback restored one skill's content over all the
others, corrupting or losing the rest.

Mirror the real on-disk layout by using cmd_file.relative_to(commands_dir),
keeping each backup path unique. This also makes backed_up_command_files
values unique so restore copies the correct content back to each command.

Add a regression test asserting two distinct skill files survive a
backup -> failed-update -> rollback cycle with their own content.

* style(extensions): use yaml.safe_dump when writing catalog config

The catalog add/remove handlers wrote the integration catalog config with
yaml.dump. Switch to yaml.safe_dump to align with the SafeDumper used by the
presets commands and to refuse emitting !!python/object tags if a non-basic
value ever reaches the config dict.

Output is unchanged for the current basic-type payload (str/int/bool/dict/
list) — this is a defensive/consistency change, not a behavioral fix.

* fix(extensions): correct _print_cli_warning import path in skill registration

register_enabled_extensions_for_agent imported _print_cli_warning from `.` (the extensions package), but the helper lives in the parent specify_cli package. The wrong level raised ImportError inside the error handlers, aborting extension/skill registration on the first failure instead of warning and continuing. Use `..` to match the other parent-package imports.

* fix(extensions): escape untrusted values in Rich markup output

User-provided arguments and extension/catalog metadata (names, descriptions, versions, IDs, paths) were interpolated into Rich markup strings without escaping. Values containing markup sequences (e.g. [red]...) would be parsed as markup, allowing output injection that could corrupt or mislead CLI messages.

Wrap all such interpolations with rich.markup.escape across the extension/catalog command handlers: list, search, info (_print_extension_info), add (including --dev paths), remove, enable, disable, set-priority, update, and the ambiguous-match resolvers (error strings and Table rows). Reuse the already-computed safe_extension where available.

Escaping is a no-op for benign strings, so normal output is unchanged.

* Prevent Rich markup injection in extension CLI output

User-controlled catalog URLs and extension IDs are rendered through Rich-enabled console paths, so every remaining output-only interpolation now escapes markup while leaving stored values and filesystem behavior unchanged. Regression tests cover catalog add, install hints, remove hints, and state command messages with bracketed markup-like values.

* Prevent markup injection from exception text

Rich markup remains enabled for styled CLI messages, so exception text and config path labels must be escaped before rendering. YAML parser errors, URL validation failures, download errors, and extension validation errors can include user-controlled catalog or manifest values.

Constraint: Preserve existing exception handling and user-facing error paths

Rejected: Disable Rich markup for these messages | existing output intentionally uses markup for labels and styling

Confidence: high

Scope-risk: narrow

Directive: Escape user-controlled exception text before interpolating into Rich-rendered strings

Tested: .venv/bin/python -m pytest tests/test_extensions.py -q

Co-authored-by: OmX <omx@oh-my-codex.dev>

* Prevent path and manifest review regressions

Catalog path labels are rendered through Rich markup and downloaded update manifests are trusted long enough to validate extension IDs. Escape displayed project paths before rendering, and reject non-mapping extension.yml payloads before ID validation so bad archives fail with a clear rollback reason.

---------

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-06-22 13:40:57 -05:00

158 lines
7.9 KiB
Python

from specify_cli.extensions import ExtensionManager, ExtensionRegistry, ExtensionCatalog
import pytest
import yaml
from typer.testing import CliRunner
from specify_cli import app
runner = CliRunner()
@pytest.fixture
def project_dir(tmp_path):
"""Create a mock spec-kit project directory."""
proj_dir = tmp_path / "project"
proj_dir.mkdir()
(proj_dir / ".specify").mkdir()
# Create required files for a project
(proj_dir / ".specify" / "config.toml").write_text("ai = 'claude'")
return proj_dir
def test_extension_update_corrupted_config_root(project_dir, monkeypatch):
"""Regression: extension update must handle corrupted extensions.yml (root is scalar)."""
# chdir into project_dir so _require_specify_project() succeeds
monkeypatch.chdir(project_dir)
# Corrupt extensions.yml
config_path = project_dir / ".specify" / "extensions.yml"
config_path.write_text(yaml.dump(123))
# Mock ExtensionManager to return an installed extension for resolution
monkeypatch.setattr(ExtensionManager, "list_installed", lambda self: [{"id": "test-ext", "name": "Test Ext", "version": "1.0.0"}])
monkeypatch.setattr(ExtensionRegistry, "get", lambda self, ext_id: {"version": "1.0.0", "enabled": True})
monkeypatch.setattr(ExtensionCatalog, "get_extension_info", lambda self, ext_id: {"id": "test-ext", "name": "Test Ext", "version": "1.1.0", "download_url": "https://example.com/ext.zip"})
# Mock download_extension to avoid network calls; use tmp_path so the test is hermetic
# and returns a Path so zip_path.exists() / zip_path.unlink() work without AttributeError
mock_zip = project_dir / "mock.zip"
monkeypatch.setattr(ExtensionCatalog, "download_extension", lambda self, ext_id: mock_zip)
# Mock confirmation to true
monkeypatch.setattr("typer.confirm", lambda _: True)
# Run update
result = runner.invoke(app, ["extension", "update", "test-ext"], obj={"project_root": project_dir})
# extension_update() catches exceptions internally and exits with code 1 on failure.
assert result.exit_code == 1
assert "AttributeError" not in result.output
assert not isinstance(result.exception, AttributeError)
def test_extension_update_corrupted_hooks_value(project_dir, monkeypatch):
"""Regression: extension update must handle non-dict 'hooks' in extensions.yml."""
monkeypatch.chdir(project_dir)
config_path = project_dir / ".specify" / "extensions.yml"
config_path.write_text(yaml.dump({
"installed": ["test-ext"],
"hooks": ["not", "a", "dict"]
}))
monkeypatch.setattr(ExtensionManager, "list_installed", lambda self: [{"id": "test-ext", "name": "Test Ext", "version": "1.0.0"}])
monkeypatch.setattr(ExtensionRegistry, "get", lambda self, ext_id: {"version": "1.0.0", "enabled": True})
monkeypatch.setattr(ExtensionCatalog, "get_extension_info", lambda self, ext_id: {"id": "test-ext", "name": "Test Ext", "version": "1.1.0", "download_url": "https://example.com/ext.zip"})
# Use tmp_path-scoped zip so the test is hermetic and returns a Path for zip_path.exists()
mock_zip = project_dir / "mock.zip"
monkeypatch.setattr(ExtensionCatalog, "download_extension", lambda self, ext_id: mock_zip)
monkeypatch.setattr("typer.confirm", lambda _: True)
result = runner.invoke(app, ["extension", "update", "test-ext"], obj={"project_root": project_dir})
# extension_update() catches exceptions internally and exits with code 1 on failure.
assert result.exit_code == 1
assert "AttributeError" not in result.output
assert not isinstance(result.exception, AttributeError)
def test_extension_update_rollback_corrupted_config(project_dir, monkeypatch):
"""Regression: extension update rollback must handle corrupted extensions.yml."""
monkeypatch.chdir(project_dir)
config_path = project_dir / ".specify" / "extensions.yml"
# Write config with hooks: null; get_project_config() normalizes this to {}
# so the backup captures {} and the restored config will have hooks: {}.
config_path.write_text(yaml.dump({"installed": ["test-ext"], "hooks": None}))
# Mock update process to fail after backup
monkeypatch.setattr(ExtensionManager, "list_installed", lambda self: [{"id": "test-ext", "name": "Test Ext", "version": "1.0.0"}])
monkeypatch.setattr(ExtensionRegistry, "get", lambda self, ext_id: {"version": "1.0.0", "enabled": True})
# Force failure in download_extension to trigger rollback
def mock_download_fail(*args, **kwargs):
# Corrupt the config BEFORE rollback is triggered
config_path.write_text(yaml.dump("CORRUPTED"))
raise Exception("Download failed")
monkeypatch.setattr(ExtensionCatalog, "get_extension_info", lambda self, ext_id: {"id": "test-ext", "name": "Test Ext", "version": "1.1.0", "download_url": "https://example.com/ext.zip"})
monkeypatch.setattr(ExtensionCatalog, "download_extension", mock_download_fail)
monkeypatch.setattr("typer.confirm", lambda _: True)
result = runner.invoke(app, ["extension", "update", "test-ext"], obj={"project_root": project_dir})
# Should handle Exception and NOT crash with AttributeError during rollback
assert result.exit_code == 1
assert "Download failed" in result.output
assert not isinstance(result.exception, AttributeError)
# Verify hooks key was preserved (normalized to {} if it was null/corrupted)
restored_config = yaml.safe_load(config_path.read_text())
assert isinstance(restored_config, dict)
assert "hooks" in restored_config
assert restored_config["hooks"] == {}
def test_extension_update_skills_backup_no_collision(project_dir, monkeypatch):
"""Regression: skills agents name every command file SKILL.md (one per
command subdirectory). Backup must keep the per-command path so rollback
restores each skill's own content instead of overwriting them onto a
single backup path."""
monkeypatch.chdir(project_dir)
config_path = project_dir / ".specify" / "extensions.yml"
config_path.write_text(yaml.dump({"installed": ["test-ext"], "hooks": {}}))
# Two skill command files with DISTINCT content, mirroring the claude
# skills layout (.claude/skills/<name>/SKILL.md).
skills_root = project_dir / ".claude" / "skills"
plan_file = skills_root / "speckit-plan" / "SKILL.md"
tasks_file = skills_root / "speckit-tasks" / "SKILL.md"
plan_file.parent.mkdir(parents=True)
tasks_file.parent.mkdir(parents=True)
plan_file.write_text("PLAN CONTENT")
tasks_file.write_text("TASKS CONTENT")
monkeypatch.setattr(ExtensionManager, "list_installed", lambda self: [{"id": "test-ext", "name": "Test Ext", "version": "1.0.0"}])
monkeypatch.setattr(ExtensionRegistry, "get", lambda self, ext_id: {
"version": "1.0.0",
"enabled": True,
"registered_commands": {"claude": ["speckit.plan", "speckit.tasks"]},
})
monkeypatch.setattr(ExtensionCatalog, "get_extension_info", lambda self, ext_id: {"id": "test-ext", "name": "Test Ext", "version": "1.1.0", "download_url": "https://example.com/ext.zip"})
# Fail at download (step 5, after the command backup in step 3). Delete the
# originals first to simulate an install clobbering them, forcing rollback
# to rely entirely on the backups.
def mock_download_fail(self, ext_id):
plan_file.unlink()
tasks_file.unlink()
raise Exception("Download failed")
monkeypatch.setattr(ExtensionCatalog, "download_extension", mock_download_fail)
monkeypatch.setattr("typer.confirm", lambda _: True)
result = runner.invoke(app, ["extension", "update", "test-ext"], obj={"project_root": project_dir})
assert result.exit_code == 1
# Rollback must restore EACH skill's own content, not a single collided copy.
assert plan_file.exists() and tasks_file.exists()
assert plan_file.read_text() == "PLAN CONTENT"
assert tasks_file.read_text() == "TASKS CONTENT"