Harden command registration path handling (#3088)

* fix: validate command 'file' field against path traversal in registrar CommandRegistrar.register_commands() read each command body from source_dir / cmd_file without validating the manifest 'file' field, unlike the parallel skill and preset readers which already reject absolute paths and '..' traversal. A malicious extension/preset/bundle manifest with file: ../../../etc/passwd (or an absolute path) could read arbitrary host files verbatim into a generated agent command at a predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22). Add the same containment guard at the command read site and reject a traversal/absolute 'file' at manifest-load time in ExtensionManifest._validate() for defense-in-depth, plus regression tests for both the read path and the manifest validator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test/fix: address review — robust absolute-path test and tolerant reads - register_commands(): use is_file() instead of exists() and skip the command if read_text() raises (directory or non-UTF8 file), aligning with the other command/skill readers. - Traversal tests: point the absolute-path payload at the real temp secret.txt (guaranteed to exist on all platforms) instead of /etc/passwd, so the absolute-path guard is genuinely exercised and the test fails if it regresses, rather than passing because the target happens not to exist (e.g. on Windows runners). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: rename traversal fixtures to avoid CodeQL secret-storage false positive The regression fixtures named an out-of-tree file secret.txt with TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic treated that read content as sensitive and followed the static path into the pre-existing write_text sinks in _write_registered_output, raising false 'clear-text storage of sensitive information' alerts on PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER and drop /etc/passwd payloads; the test semantics (a file outside source_dir must never be read into a generated command) are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: reject Windows drive-relative 'file' values in traversal guards is_absolute() is False for Windows drive-relative paths like C:outside.txt, which contain no '..' yet resolve against the process CWD on that drive — bypassing the containment guard on Windows. Evaluate the 'file' value under PureWindowsPath as well so both the registrar runtime guard and the manifest-load validator reject drive letters (and backslash '..' segments) cross-platform. Extend the regression tests with drive-relative cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False (no drive), so the prior native-Path check let a leading-slash 'file' value through and the manifest validator did not raise. Evaluate the value under both PurePosixPath and PureWindowsPath and reject any non-empty anchor — covering POSIX-absolute, Windows drive-relative, Windows absolute, and rooted-without-drive — in both the registrar guard and the manifest validator. The registrar join now uses the raw 'file' string so native separators are handled by the resolve()/relative_to() containment check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: validate command 'file' field against path traversal in registrar CommandRegistrar.register_commands() read each command body from source_dir / cmd_file without validating the manifest 'file' field, unlike the parallel skill and preset readers which already reject absolute paths and '..' traversal. A malicious extension/preset/bundle manifest with file: ../../../etc/passwd (or an absolute path) could read arbitrary host files verbatim into a generated agent command at a predictable path (GHSA-w5fv-7w9x-7fc5, CWE-22). Add the same containment guard at the command read site and reject a traversal/absolute 'file' at manifest-load time in ExtensionManifest._validate() for defense-in-depth, plus regression tests for both the read path and the manifest validator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test/fix: address review — robust absolute-path test and tolerant reads - register_commands(): use is_file() instead of exists() and skip the command if read_text() raises (directory or non-UTF8 file), aligning with the other command/skill readers. - Traversal tests: point the absolute-path payload at the real temp secret.txt (guaranteed to exist on all platforms) instead of /etc/passwd, so the absolute-path guard is genuinely exercised and the test fails if it regresses, rather than passing because the target happens not to exist (e.g. on Windows runners). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: rename traversal fixtures to avoid CodeQL secret-storage false positive The regression fixtures named an out-of-tree file secret.txt with TOP-SECRET-CREDENTIAL content. CodeQL's clear-text-storage heuristic treated that read content as sensitive and followed the static path into the pre-existing write_text sinks in _write_registered_output, raising false 'clear-text storage of sensitive information' alerts on PR 3088. Rename the fixtures to neutral outside.txt / OUTSIDE-FILE-MARKER and drop /etc/passwd payloads; the test semantics (a file outside source_dir must never be read into a generated command) are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: reject Windows drive-relative 'file' values in traversal guards is_absolute() is False for Windows drive-relative paths like C:outside.txt, which contain no '..' yet resolve against the process CWD on that drive — bypassing the containment guard on Windows. Evaluate the 'file' value under PureWindowsPath as well so both the registrar runtime guard and the manifest-load validator reject drive letters (and backslash '..' segments) cross-platform. Extend the regression tests with drive-relative cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use anchor under both path flavors so POSIX-absolute is rejected on Windows On a Windows runner WindowsPath('/abs/outside.md').is_absolute() is False (no drive), so the prior native-Path check let a leading-slash 'file' value through and the manifest validator did not raise. Evaluate the value under both PurePosixPath and PureWindowsPath and reject any non-empty anchor — covering POSIX-absolute, Windows drive-relative, Windows absolute, and rooted-without-drive — in both the registrar guard and the manifest validator. The registrar join now uses the raw 'file' string so native separators are handled by the resolve()/relative_to() containment check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: harden register_commands inputs and tighten manifest 'file' validation Address review feedback on #3088: - register_commands(): skip non-string/empty 'file' values instead of raising TypeError, and hoist source_dir.resolve() out of the per-command loop. - ExtensionManifest._validate(): reject 'file' values with leading/trailing whitespace with a clear ValidationError instead of a confusing missing-file failure later. - tests: add non-string 'file' and whitespace cases; use yaml.safe_dump with explicit utf-8 encoding in the manifest validation test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: align runtime '..' policy, correct comment, dedupe test helper Address review feedback on #3088: - register_commands(): also reject '..' segments under both POSIX and Windows semantics, keeping runtime policy consistent with ExtensionManifest._validate() and the skill/preset readers (not just relying on the resolve()/relative_to() containment backstop). - Replace the version-dependent is_absolute() claim in the extensions.py comment with the actual portability rationale (native Path is OS- dependent; C:foo is anchored but not absolute). - Extract the duplicated leak-detection assertion into _assert_no_marker_leak() and add an in-bounds '..' payload that exercises the new runtime '..' rejection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Extract shared path-safety policy and warn on unreadable command files Introduce relative_extension_path_violation() in _utils.py as the single source of truth for the extension-relative `file` path-safety policy, and use it from both the runtime registrar guard (agents.py) and the manifest-load validator (extensions.py) so the two cannot drift. Warn (instead of silently skipping) when an in-bounds command file exists but cannot be read/decoded, surfacing misconfigured extensions. Add unit tests for the shared helper, a read-skip warning test, and make the in-bounds `..` test create its target file so the skip is attributable to the `..` rejection rather than file absence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Retrigger CI Empty commit to re-trigger code scanning / CodeQL analysis on the PR merge ref. Assisted-by: GitHub Copilot CLI (model: Claude Opus 4.8, autonomous) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-07-03 12:28:06 +08:00 · 2026-06-22 10:25:29 -05:00
parent f9c6cf83e5
commit 902f5431f9
5 changed files with 295 additions and 7 deletions
--- a/src/specify_cli/_utils.py
+++ b/src/specify_cli/_utils.py
@@ -9,7 +9,7 @@ import stat
 import subprocess
 import tempfile
 import yaml
-from pathlib import Path
+from pathlib import Path, PurePosixPath, PureWindowsPath
 from typing import Any
 from ._console import console
@@ -17,6 +17,44 @@ CLAUDE_LOCAL_PATH = Path.home() / ".claude" / "local" / "claude"
 CLAUDE_NPM_LOCAL_PATH = Path.home() / ".claude" / "local" / "node_modules" / ".bin" / "claude"
 def relative_extension_path_violation(value: Any) -> str | None:
    """Return why ``value`` is unsafe as an extension-relative ``file`` path.
    Single source of truth for the path-safety policy shared by
    ``ExtensionManifest._validate()`` (manifest-load validation) and
    ``CommandRegistrar.register_commands()`` (runtime guard), so the two cannot
    drift. Returns a human-readable reason string when ``value`` is unsafe, or
    ``None`` when it is an acceptable relative path within the extension
    directory.
    Policy: the value must be a non-empty string with no leading/trailing
    whitespace, no absolute/anchored form, and no ``..`` traversal. The value is
    evaluated under both POSIX and Windows path semantics because a native
    ``Path`` is OS-dependent (a ``PurePosixPath`` on POSIX does not interpret
    Windows drive/UNC forms, and ``C:foo`` is anchored but not ``is_absolute()``
    yet resolves against the CWD on its drive). Rejecting any non-empty anchor
    covers POSIX-absolute (``/abs``), Windows drive-relative (``C:foo``), Windows
    absolute (``C:\\foo``), and UNC/rooted forms.
    """
    if not isinstance(value, str) or not value:
        return "must be a non-empty string"
    if value.strip() != value:
        return "must not have leading or trailing whitespace"
    posix_path = PurePosixPath(value)
    win_path = PureWindowsPath(value)
    if (
        posix_path.anchor
        or win_path.anchor
        or ".." in posix_path.parts
        or ".." in win_path.parts
    ):
        return (
            "must be a relative path within the extension directory "
            "(no absolute paths, drive letters, or '..' segments)"
        )
    return None
 def dump_frontmatter(data: dict[str, Any]) -> str:
    """Serialize skill/command frontmatter to a YAML string.
--- a/src/specify_cli/agents.py
+++ b/src/specify_cli/agents.py
@@ -16,6 +16,7 @@ from typing import Any, Dict, List, Optional
 import yaml
 from ._init_options import is_ai_skills_enabled, load_init_options
 from ._utils import relative_extension_path_violation
 def _build_agent_configs() -> dict[str, Any]:
@@ -567,17 +568,42 @@ class CommandRegistrar:
        registered = []
        is_cline_ext = agent_name == "cline" and source_id != "core"
        source_root = source_dir.resolve()
        for cmd_info in commands:
            cmd_name = cmd_info["name"]
            aliases = cmd_info.get("aliases", [])
            cmd_file = cmd_info["file"]
-            source_file = source_dir / cmd_file
+            # Guard against path traversal using the single shared policy in
-            if not source_file.exists():
+            # relative_extension_path_violation(), so the runtime guard stays
            # aligned with ExtensionManifest._validate() and the skill/preset
            # readers. Skip a malformed/unsafe ``file`` (non-string, empty,
            # whitespace, absolute/anchored, or ``..`` traversal); the
            # resolve()/relative_to() check below is the final containment
            # backstop.
            if relative_extension_path_violation(cmd_file):
                continue
            try:
                source_file = (source_root / cmd_file).resolve()
                source_file.relative_to(source_root)  # raises ValueError if outside
            except (OSError, ValueError):
                continue
-            content = source_file.read_text(encoding="utf-8")
+            if not source_file.is_file():
                continue
            try:
                content = source_file.read_text(encoding="utf-8")
            except (OSError, UnicodeDecodeError) as exc:
                import warnings
                warnings.warn(
                    f"Skipping command '{cmd_name}': could not read source file "
                    f"'{cmd_file}' ({exc.__class__.__name__}: {exc}).",
                    stacklevel=2,
                )
                continue
            frontmatter, body = self.parse_frontmatter(content)
            if frontmatter.get("strategy") == "wrap":
--- a/src/specify_cli/extensions.py
+++ b/src/specify_cli/extensions.py
@@ -28,7 +28,7 @@ from packaging.specifiers import InvalidSpecifier, SpecifierSet
 from ._init_options import is_ai_skills_enabled
 from ._invocation_style import is_slash_skills_agent
-from ._utils import dump_frontmatter
+from ._utils import dump_frontmatter, relative_extension_path_violation
 from .catalogs import CatalogEntry as BaseCatalogEntry
 from .catalogs import CatalogStackBase
@@ -290,6 +290,18 @@ class ExtensionManifest:
            if "name" not in cmd or "file" not in cmd:
                raise ValidationError("Command missing 'name' or 'file'")
            # Validate the 'file' field at manifest-load time using the single
            # shared policy in relative_extension_path_violation(), so manifest
            # validation cannot drift from the runtime registrar guard. This is
            # defense-in-depth: the command/skill/preset readers also contain
            # the resolved path, but rejecting an unsafe value here surfaces a
            # clear error instead of silently skipping the command.
            cmd_file = cmd["file"]
            reason = relative_extension_path_violation(cmd_file)
            if reason:
                label = repr(cmd_file) if isinstance(cmd_file, str) else f"for command '{cmd.get('name')}'"
                raise ValidationError(f"Invalid command 'file' {label}: {reason}")
            # Validate command name format
            if not EXTENSION_COMMAND_NAME_PATTERN.match(cmd["name"]):
                corrected = self._try_correct_command_name(cmd["name"], ext["id"])
--- a/tests/test_extensions.py
+++ b/tests/test_extensions.py
@@ -377,6 +377,40 @@ class TestExtensionManifest:
        with pytest.raises(ValidationError, match="Invalid command name"):
            ExtensionManifest(manifest_path)
    @pytest.mark.parametrize(
        "bad_file",
        ["../../../outside.md", "../escape.md", "a/../../escape.md", "/abs/outside.md", "C:escape.md", "C:\\Windows\\x.md", "..\\..\\escape.md"],
    )
    def test_command_file_traversal_rejected(self, temp_dir, valid_manifest_data, bad_file):
        """Manifest 'file' field with traversal/absolute path raises ValidationError.
        Defense-in-depth for GHSA-w5fv-7w9x-7fc5.
        """
        import yaml
        valid_manifest_data["provides"]["commands"][0]["file"] = bad_file
        manifest_path = temp_dir / "extension.yml"
        with open(manifest_path, 'w', encoding='utf-8') as f:
            yaml.safe_dump(valid_manifest_data, f)
        with pytest.raises(ValidationError, match="Invalid command 'file'"):
            ExtensionManifest(manifest_path)
    @pytest.mark.parametrize("bad_file", [" commands/hello.md", "commands/hello.md ", "\tcommands/hello.md"])
    def test_command_file_whitespace_rejected(self, temp_dir, valid_manifest_data, bad_file):
        """Manifest 'file' with leading/trailing whitespace raises ValidationError."""
        import yaml
        valid_manifest_data["provides"]["commands"][0]["file"] = bad_file
        manifest_path = temp_dir / "extension.yml"
        with open(manifest_path, 'w', encoding='utf-8') as f:
            yaml.safe_dump(valid_manifest_data, f)
        with pytest.raises(ValidationError, match="leading or trailing whitespace"):
            ExtensionManifest(manifest_path)
    def test_command_name_autocorrect_speckit_prefix(self, temp_dir, valid_manifest_data):
        """Test that 'speckit.command' is auto-corrected to 'speckit.{ext_id}.command'."""
        import yaml
--- a/tests/test_registrar_path_traversal.py
+++ b/tests/test_registrar_path_traversal.py
@@ -6,6 +6,7 @@ from pathlib import Path
 import pytest
 from specify_cli.agents import CommandRegistrar
 from specify_cli._utils import relative_extension_path_violation
 TRAVERSAL_PAYLOADS = [
@@ -135,8 +136,185 @@ class TestCopilotPromptTraversal:
        _assert_no_stray_files(tmp_path, Path(bad_name).name.replace("/", ""))
-class TestSafeRegistration:
+ABS_OUTSIDE = "__ABS_OUTSIDE__"
-    """Positive regression — well-formed names continue to register."""
+
 FILE_FIELD_PAYLOADS = [
    "../outside.txt",
    "../../outside.txt",
    "commands/../../outside.txt",
    "C:outside.txt",
    ABS_OUTSIDE,
 ]
 def _resolve_payload(bad_file: str, outside_file: Path) -> str:
    """Map the absolute-path sentinel to the real, existing outside file.
    Using the temp file's own absolute path (instead of ``/etc/passwd``)
    guarantees the file exists on every platform — so the test fails if the
    absolute-path guard regresses, rather than passing because the target
    happens not to exist (e.g. on Windows runners).
    """
    return str(outside_file) if bad_file == ABS_OUTSIDE else bad_file
 def _assert_no_marker_leak(project: Path, marker: str) -> None:
    """Fail if ``marker`` content was written into any file under ``project``."""
    leaked = [
        p for p in project.rglob("*")
        if p.is_file() and marker in p.read_text(encoding="utf-8", errors="ignore")
    ]
    assert leaked == [], f"Outside file leaked into generated command: {leaked}"
 class TestCommandFileTraversal:
    """The manifest ``file`` field must not read files outside source_dir.
    Regression for GHSA-w5fv-7w9x-7fc5: ``register_commands`` read
    ``source_dir / cmd_file`` with no containment check, so a manifest with
    a traversal (``file: ../../../outside.txt``) or an absolute path read an
    arbitrary host file verbatim into the generated agent command.
    """
    @pytest.mark.parametrize("bad_file", FILE_FIELD_PAYLOADS)
    def test_claude_skips_traversal_in_file_field(self, tmp_path, bad_file):
        project, ext_dir = _project_and_source(tmp_path)
        (project / ".claude" / "skills").mkdir(parents=True)
        outside_file = tmp_path / "outside.txt"
        outside_file.write_text("OUTSIDE-FILE-MARKER", encoding="utf-8")
        registrar = CommandRegistrar()
        registered = registrar.register_commands(
            "claude",
            [{"name": "speckit.myext.hello", "file": _resolve_payload(bad_file, outside_file), "aliases": []}],
            "myext",
            ext_dir,
            project,
        )
        assert registered == []
        _assert_no_marker_leak(project, "OUTSIDE-FILE-MARKER")
    @pytest.mark.parametrize("bad_file", FILE_FIELD_PAYLOADS)
    def test_gemini_skips_traversal_in_file_field(self, tmp_path, bad_file):
        project, ext_dir = _project_and_source(tmp_path)
        (project / ".gemini" / "commands").mkdir(parents=True)
        outside_file = tmp_path / "outside.txt"
        outside_file.write_text("OUTSIDE-FILE-MARKER", encoding="utf-8")
        registrar = CommandRegistrar()
        registered = registrar.register_commands(
            "gemini",
            [{"name": "speckit.myext.hello", "file": _resolve_payload(bad_file, outside_file), "aliases": []}],
            "myext",
            ext_dir,
            project,
        )
        assert registered == []
        _assert_no_marker_leak(project, "OUTSIDE-FILE-MARKER")
    @pytest.mark.parametrize("bad_value", [None, 123, "", ["x"]])
    def test_non_string_file_is_skipped(self, tmp_path, bad_value):
        """A non-string/empty ``file`` must be skipped, not raise TypeError."""
        project, ext_dir = _project_and_source(tmp_path)
        (project / ".gemini" / "commands").mkdir(parents=True)
        registrar = CommandRegistrar()
        registered = registrar.register_commands(
            "gemini",
            [{"name": "speckit.myext.hello", "file": bad_value, "aliases": []}],
            "myext",
            ext_dir,
            project,
        )
        assert registered == []
    def test_dotdot_rejected_even_when_target_is_in_bounds(self, tmp_path):
        """An in-bounds ``..`` payload is rejected by the ``..`` check itself.
        ``commands/../cmd.md`` resolves to ``ext_dir/cmd.md`` — inside
        source_dir — so the resolve()/relative_to() containment backstop would
        allow it. Creating that target file ensures the command is skipped
        because of the ``..`` rejection, not merely because the file is absent.
        """
        project, ext_dir = _project_and_source(tmp_path)
        (project / ".gemini" / "commands").mkdir(parents=True)
        (ext_dir / "cmd.md").write_text(
            "---\ndescription: test\n---\n\nbody\n", encoding="utf-8"
        )
        registrar = CommandRegistrar()
        registered = registrar.register_commands(
            "gemini",
            [{"name": "speckit.myext.hello", "file": "commands/../cmd.md", "aliases": []}],
            "myext",
            ext_dir,
            project,
        )
        assert registered == []
 class TestRelativeExtensionPathPolicy:
    """Unit tests for the shared ``relative_extension_path_violation`` policy."""
    @pytest.mark.parametrize(
        "value",
        [
            "commands/hello.md",
            "hello.md",
            "a/b/c/hello.md",
        ],
    )
    def test_safe_relative_paths_have_no_violation(self, value):
        assert relative_extension_path_violation(value) is None
    @pytest.mark.parametrize(
        "value",
        [
            None,
            123,
            ["x"],
            "",
            "   ",
            " hello.md",
            "hello.md ",
            "/abs/outside.md",
            "/etc/passwd",
            "C:foo.md",
            "C:\\Windows\\system32",
            "\\\\server\\share\\x.md",
            "../escape.md",
            "commands/../../escape.md",
        ],
    )
    def test_unsafe_values_report_violation(self, value):
        assert relative_extension_path_violation(value) is not None
 class TestReadSkipWarning:
    """Unregisterable but in-bounds files warn instead of failing silently."""
    def test_unreadable_target_warns_and_skips(self, tmp_path):
        project, ext_dir = _project_and_source(tmp_path)
        (project / ".gemini" / "commands").mkdir(parents=True)
        (ext_dir / "cmd.md").write_bytes(b"\xff\xfe\x00\x80bad")
        registrar = CommandRegistrar()
        with pytest.warns(UserWarning):
            registered = registrar.register_commands(
                "gemini",
                [{"name": "speckit.myext.hello", "file": "cmd.md", "aliases": []}],
                "myext",
                ext_dir,
                project,
            )
        assert registered == []
    def test_symlinked_subdir_under_commands_dir_is_preserved(self, tmp_path):
        """Lexical check must not block legitimately symlinked sub-directories.