Mirror of https://github.com/github/spec-kit.git (synced 2026-07-03 12:28:06 +08:00)
fix(shared-infra): record skipped files in speckit.manifest.json (#2483)
* fix(shared-infra): record skipped files in speckit.manifest.json
`install_shared_infra` skipped files that already existed on disk
when `force=False`, but the skip branches in both the scripts loop
and the templates loop only appended to `skipped_files` without
calling `manifest.record_existing`. So when the function ran with a
fresh manifest against an already-populated `.specify/` tree (e.g.
after the manifest was deleted, corrupted, or extracted out of band),
every file went down the skip path, `planned_copies` /
`planned_templates` stayed empty, and `manifest.save()` wrote an
empty `files` field — leaving the integration believing nothing was
installed.
Record every skipped file in the manifest, but only when it is not
already tracked. This preserves the original hash for files that
were previously recorded so `check_modified()` (used by
`integration use` to decide whether a user has customized a
template) keeps working correctly.
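
The "record only when not already tracked" rule can be sketched as follows. This is a minimal illustration, not the project's code: `record_skipped`, `sha256_of`, and the plain `files` dict are hypothetical stand-ins for the manifest and its hashing helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file's bytes (hypothetical stand-in for the manifest's helper)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_skipped(files: dict[str, str], root: Path, rel: str) -> None:
    """Record a skipped file unless the mapping already tracks it.

    ``files`` plays the role of the manifest's ``{rel_path: sha256}`` mapping.
    """
    if rel in files:
        # Keep the originally recorded hash so a later comparison can
        # still detect that the user edited the file.
        return
    files[rel] = sha256_of(root / rel)
```

Leaving a tracked entry untouched is what lets a later hash comparison still flag a customized template.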
Add `TestSpeckitManifestRecordsSkippedFiles` in
`tests/integrations/test_integration_claude.py` covering both the
fresh-skip path and the recover-after-lost-manifest path.
Fixes #2107
* fix(shared-infra): guard manifest.record_existing against non-file dst
Address Copilot review feedback on PR #2483. The previous fix called
``manifest.record_existing(rel_skip)`` from the skip branch of both
loops in ``install_shared_infra``, which would crash with
``IsADirectoryError`` (or another ``OSError``) if a directory or other
non-regular-file happened to exist at the expected destination path —
since ``record_existing`` opens the file to compute its SHA-256.
Three coordinated fixes:
1. ``IntegrationManifest.record_existing`` now validates its
precondition: it raises ``ValueError`` if the path is a symlink or
is not a regular file. The docstring already promised "an
already-existing file"; this enforces it. The symlink check runs on
the un-resolved path because ``_validate_rel_path`` calls
``resolve()``, which would silently follow the symlink. Mirrors the
existing ``_ensure_safe_manifest_destination`` precedent in the
same module.
2. In ``install_shared_infra``'s scripts and templates skip branches,
guard the ``record_existing`` call with ``dst.is_file()`` and wrap
it in ``try/except (OSError, ValueError)``. A directory collision,
permission error, or TOCTOU race no longer aborts the whole
install — the user gets a per-path warning, the path still
surfaces in ``skipped_files``, and the rest of the install
continues.
3. ``_read_manifest_files`` in the regression test no longer falls
back to ``data.get("_files")`` (Copilot's low-confidence finding):
the silent fallback could mask a schema regression where the
public ``files`` key is renamed. It now asserts ``"files" in data``
and that the value is a dict.
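
The guard described in item 2 might look roughly like the sketch below. `try_record`, the `record` callback, and the `warnings` list are assumptions made for illustration; the real code prints through a console object instead of collecting strings.

```python
from pathlib import Path
from typing import Callable


def try_record(
    dst: Path,
    rel: str,
    record: Callable[[str], None],
    warnings: list[str],
) -> bool:
    """Attempt to record ``rel``; collect a warning instead of raising."""
    if not dst.is_file():
        # Directory, missing path, or other non-regular file: nothing to hash.
        return False
    try:
        record(rel)
    except (OSError, ValueError) as exc:
        # The path can change between the is_file() check and the read
        # (TOCTOU), so the call itself is still guarded.
        warnings.append(f"could not record {rel} in manifest: {exc}")
        return False
    return True
```

The `is_file()` check filters the common collision cheaply, while the `try/except` covers whatever slips past it.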
Add two regression tests in ``TestSpeckitManifestRecordsSkippedFiles``
covering the directory-at-destination edge case for both the scripts
loop and the templates loop. Both verify (a) install does not crash,
(b) the non-file path is not recorded in the manifest, and (c) the
path still surfaces in the user-visible warning.
The "shared infrastructure file(s)" warning text is changed to
"path(s)" so it remains accurate when non-file entries appear in the
list.
Refs #2107
* fix(manifest): lexical pre-check for record_existing + add error-case tests
Address Copilot review (2026-05-11, review id 4266902103):
1. `record_existing` was calling `(self.project_root / rel).is_symlink()`
BEFORE validating containment. For absolute paths or paths containing
`..`, this performed a filesystem stat outside the project root before
`_validate_rel_path()` raised. Add a cheap lexical pre-check that
delegates to `_validate_rel_path()` for the canonical error messages,
so the symlink stat only ever runs on paths that are already lexically
inside the project root.
2. Add focused unit tests in `tests/integrations/test_manifest.py` for
the symlink and non-regular-file error paths, including:
- symlink target rejection
- dangling symlink rejection (caught by the symlink guard before
the is_file check)
- directory path rejection (is_file == False)
- missing-path rejection (is_file == False)
- absolute-path lexical pre-check
The Copilot reviewer noted these guards had no focused coverage in
`test_manifest.py`, only via the `test_integration_claude.py`
regression test.
3. The third Copilot finding (repeated `dict(self._files)` copies via
`manifest.files` in the skip branches) is already resolved on this
branch by using `prior_hashes` — the function-scope snapshot taken at
the top of `install_shared_infra` — for the membership check, instead
of `manifest.files`.
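
The lexical pre-check from item 1 needs no filesystem access at all. A minimal sketch, assuming POSIX-style relative paths; `is_lexically_safe` is a hypothetical name:

```python
from pathlib import PurePosixPath


def is_lexically_safe(rel: str) -> bool:
    """Return True when ``rel`` is relative and has no '..' segment.

    Purely lexical: pure paths never touch the filesystem.
    """
    path = PurePosixPath(rel)
    return not path.is_absolute() and ".." not in path.parts
```

Because `pathlib` does not collapse `..` segments, `dir/../file.txt` is reported as unsafe even though it would normalize to a path inside the root.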
AI disclosure: drafted with assistance from Claude (Opus 4.7).
* fix(manifest): track recovered files separately + symlink-ancestor + canonical-path guards
Address Copilot review id 4309888722 (2026-05-18) on PR #2483:
1. Recovery semantics (shared_infra.py:371, 412) — install_shared_infra
now passes ``recovered=True`` when re-recording a skipped existing
file. This flag funnels into a new ``recovered_files`` array in the
manifest JSON, so a future ``refresh_managed`` run can distinguish
"hash I produced" from "hash I observed on a file that may be a user
customization" and avoid silent overwrite without ``--refresh-shared-infra``.
Schema is purely additive: ``files: dict[str, str]`` is unchanged; the
new ``recovered_files: list[str]`` is omitted when empty.
2. Symlinked ancestor (manifest.py:172) — ``record_existing`` now walks
every component of the rel path and rejects any symlinked ancestor,
not just a symlinked leaf. Catches ``linked_dir/file.txt`` where
``linked_dir`` is a symlink, which previously slipped past the leaf-only
``is_symlink()`` check and was resolved through by ``_validate_rel_path``.
Mirrors the component-walk pattern in ``_ensure_safe_manifest_directory``.
3. Misleading "escapes project root" message (manifest.py:168) — paths
like ``dir/../file.txt`` normalize inside the project, so the old
message lied about what was wrong. New message: "Manifest paths must
be canonical; '..' segments are not allowed". Still rejects (canonical
keys are required so ``check_modified``/``uninstall`` cannot key the
same file under two paths).
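
The component walk from item 2 can be illustrated like this. `first_symlinked_component` is a hypothetical helper that returns the offending component instead of raising, which keeps the sketch easy to test:

```python
from pathlib import Path
from typing import Optional


def first_symlinked_component(root: Path, rel: str) -> Optional[Path]:
    """Return the first symlinked component of ``root / rel``, or None.

    Each prefix is checked before any ``resolve()`` call could follow it.
    """
    current = root
    for part in Path(rel).parts:
        current = current / part
        if current.is_symlink():
            return current
    return None
```

A leaf-only `is_symlink()` check would miss `linked_dir/file.txt`, because the leaf itself is a regular file reached through a symlinked directory.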
Tests: 7 new test methods across TestManifestRecoveredFiles and
TestRecordExistingNewGuards covering all 4 Copilot findings. Full suite
passes locally.
🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).
* fix(manifest): normalize is_recovered input through _validate_rel_path
Address Copilot review comment id 4309888722 round-5 (2026-05-21) on PR #2483:
``is_recovered()`` previously checked ``self._recovered_files`` membership
with bare ``Path(rel).as_posix()``, while ``record_existing()`` stores keys
via ``_validate_rel_path(rel, root).relative_to(root).as_posix()``. The two
normalizations disagreed on absolute paths and paths that escape the
project root — ``is_recovered`` would silently return False for inputs that
``record_existing`` would have refused entirely.
The fix routes ``is_recovered`` through the same ``_validate_rel_path``
pipeline; ``ValueError`` from the validator is caught and converted to
False so query semantics stay exception-free (Python ``__contains__``
convention).
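
A sketch of the exception-free query described above, under stated assumptions: `validate_rel` is a simplified, hypothetical stand-in for `_validate_rel_path`, and the recovered set is passed in explicitly rather than read from a manifest object.

```python
from pathlib import Path


def validate_rel(rel: str, root: Path) -> Path:
    """Hypothetical validator: resolve ``rel`` under ``root`` or raise."""
    candidate = (root / rel).resolve()
    candidate.relative_to(root.resolve())  # raises ValueError when outside root
    return candidate


def is_recovered(recovered: set[str], rel: str, root: Path) -> bool:
    """Membership query that reports False instead of raising."""
    try:
        abs_path = validate_rel(rel, root)
    except ValueError:
        return False
    return abs_path.relative_to(root.resolve()).as_posix() in recovered
```

Routing the query through the same normalization as the writer means both sides agree on what a stored key looks like.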
Tests: 2 new methods in ``TestManifestRecoveredFiles``:
- ``test_is_recovered_absolute_path_returns_false``
- ``test_is_recovered_escaping_path_returns_false``
🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).
* fix(manifest): clear recovered marker on managed re-record + reject '..' in is_recovered
Address Copilot Round-7 review comments on PR #2483:
1. record_existing(recovered=False) and record_file now BOTH discard the
path from _recovered_files. The marker is meant to flag "we observed
this file but cannot vouch it's a managed baseline" — once the same
path is re-recorded as managed (either explicitly or by writing fresh
bytes), the marker is stale and must clear so refresh_managed and
future is_recovered queries return the truthful answer.
2. is_recovered now applies the same canonical-key guard as record_existing
(rejects absolute paths and '..' segments lexically before delegating
to _validate_rel_path). Such paths can never be stored keys, so the
query correctly returns False without depending on _validate_rel_path
semantics that diverged from record_existing's stricter contract.
record_file docstring updated to mention the side-effect on recovered
markers.
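
The marker life cycle from item 1 can be modelled with a toy class. `RecoveredMarkers` and its method names are illustrative only and do not mirror the real `IntegrationManifest` API:

```python
class RecoveredMarkers:
    """Toy model of the ``files`` / ``recovered_files`` bookkeeping."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self.recovered: set[str] = set()

    def record_observed(self, rel: str, digest: str) -> None:
        """Hash was read from a pre-existing file: mark it as recovered."""
        self.files[rel] = digest
        self.recovered.add(rel)

    def record_managed(self, rel: str, digest: str) -> None:
        """Hash belongs to bytes the installer vouches for: clear the marker."""
        self.files[rel] = digest
        self.recovered.discard(rel)  # no-op when the marker is absent
```

Using `discard` rather than `remove` keeps the managed path safe to call for files that were never marked.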
Tests: 3 new methods in TestManifestRecoveredFiles covering
record_existing(recovered=False) clearing, record_file clearing, and
is_recovered '..' rejection.
* test(manifest): update is_recovered comments to reflect Round-7 lexical guard
Round 8 — addresses Copilot review comment on tests/integrations/test_manifest.py:362.
After Round-7 (1dbf0c2), is_recovered() rejects absolute paths and '..' segments
up front via a lexical guard, returning False without calling _validate_rel_path
at all. The test comments still described the prior "_validate_rel_path raises;
we catch" code path, which is misleading for readers.
Updated comments in both:
- test_is_recovered_absolute_path_returns_false (Copilot's exact target)
- test_is_recovered_escaping_path_returns_false (same comment-class issue;
fixed preemptively to avoid a Round-9 finding on the same drift)
Pure documentation change. Test assertions and behavior unchanged; all manifest
tests still green.
* fix(manifest): document OS errors on record_existing + filter orphan recovered_files on load
Round 9 — addresses Copilot review on PR #2483:
1. record_existing's docstring now documents OSError/PermissionError as
possible raises (in addition to ValueError) — the implementation has
always been able to raise them from is_symlink, is_file, or the
file-read used to hash, but the contract did not reflect that.
Callers should be prepared for both surfaces.
2. load() now filters recovered_files entries that don't correspond to
keys in files. An externally-edited or partially-corrupted manifest
can deserialize with orphan recovered paths; rather than reject the
whole manifest (too strict on the upgrade path), we drop the orphans
and let the inconsistency self-correct on the next save(). is_recovered
then returns the truthful False for the orphan.
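
The additive schema and the orphan filter from item 2 can be sketched as a save/load pair. `dump_manifest` and `load_recovered` are hypothetical helpers that operate on plain dicts rather than on a manifest object:

```python
import json


def dump_manifest(files: dict[str, str], recovered: set[str]) -> str:
    """Serialize; ``recovered_files`` is emitted only when non-empty."""
    data: dict[str, object] = {"files": files}
    if recovered:
        data["recovered_files"] = sorted(recovered)
    return json.dumps(data, indent=2) + "\n"


def load_recovered(text: str) -> set[str]:
    """Parse ``recovered_files`` and drop entries missing from ``files``."""
    data = json.loads(text)
    recovered = data.get("recovered_files", [])
    if not isinstance(recovered, list) or not all(
        isinstance(p, str) for p in recovered
    ):
        raise ValueError("'recovered_files' must be a list of string paths")
    # Set intersection silently discards orphan entries.
    return set(recovered) & set(data["files"])
```

Dropping orphans on load, instead of rejecting the file, means a hand-edited manifest still loads and is rewritten consistently on the next save.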
Tests: new test_load_filters_recovered_files_not_in_files asserting an
orphan recovered entry is dropped on load.
@@ -115,6 +115,7 @@ class IntegrationManifest:
         self.project_root = project_root.resolve()
         self.version = version
         self._files: dict[str, str] = {}  # rel_path → sha256 hex
+        self._recovered_files: set[str] = set()
         self._installed_at: str = ""

     # -- Manifest file location -------------------------------------------
@@ -131,6 +132,9 @@ class IntegrationManifest:

         Creates parent directories as needed. Returns the absolute path
         of the written file.
+        If the path was previously marked as recovered via
+        ``record_existing(recovered=True)``, the recovered marker is
+        cleared because the bytes are now produced, not merely observed.

         Raises ``ValueError`` if *rel_path* resolves outside the project root.
         """
@@ -144,17 +148,77 @@ class IntegrationManifest:

         normalized = abs_path.relative_to(self.project_root).as_posix()
         self._files[normalized] = hashlib.sha256(content).hexdigest()
+        # ``record_file`` writes *produced* content, so any prior
+        # recovered marker for this path is no longer accurate.
+        self._recovered_files.discard(normalized)
         return abs_path

-    def record_existing(self, rel_path: str | Path) -> None:
-        """Record the hash of an already-existing file at *rel_path*.
+    def record_existing(self, rel_path: str | Path, *, recovered: bool = False) -> None:
+        """Record the hash of an already-existing regular file at *rel_path*.

-        Raises ``ValueError`` if *rel_path* resolves outside the project root.
+        When ``recovered=True``, the path is also marked in the manifest's
+        ``recovered_files`` list to signal that the file's on-disk hash was
+        *observed* during install (because the file already existed and was not
+        overwritten), not *produced* by the install. Future ``refresh_managed``
+        runs should consult ``is_recovered`` before treating the recorded hash
+        as a managed baseline.
+
+        Raises:
+            ValueError: if *rel_path* resolves outside the project root, is
+                a symlink, or is not a regular file. A directory or other
+                non-file path cannot be silently recorded — its hash would
+                be meaningless and ``check_modified``/``uninstall`` would
+                treat the entry as permanently broken.
+            OSError: if the underlying filesystem call (``is_symlink``,
+                ``is_file``, or the file-read used to compute the hash)
+                fails — for example a ``PermissionError`` on the path.
+                Callers should be prepared to handle ``OSError`` (and its
+                subclasses such as ``PermissionError``) in addition to
+                ``ValueError``.
         """
         rel = Path(rel_path)
+        # Cheap lexical pre-check first so absolute / parent-traversal paths
+        # don't trigger a filesystem stat outside the project root before
+        # ``_validate_rel_path`` raises. ``_validate_rel_path`` produces the
+        # canonical error messages used elsewhere.
+        if rel.is_absolute() or ".." in rel.parts:
+            _validate_rel_path(rel, self.project_root)
+            # _validate_rel_path raised for any actually-escaping path. If we reach
+            # here the path normalizes inside root (e.g. ``dir/../file.txt``).
+            # Reject anyway: manifest keys must be canonical so ``check_modified``
+            # and ``uninstall`` cannot key the same file under two paths.
+            raise ValueError(
+                f"Manifest paths must be canonical; '..' segments are not "
+                f"allowed (got {rel})"
+            )
+        # Walk each path component before resolution so a symlinked ancestor
+        # (e.g. ``linked_dir/file.txt`` where ``linked_dir`` is a symlink)
+        # cannot be silently followed by ``_validate_rel_path().resolve()``
+        # down to a target outside the project root. ``_ensure_safe_manifest_directory``
+        # uses the same pattern.
+        _walk = self.project_root
+        for part in rel.parts:
+            _walk = _walk / part
+            if _walk.is_symlink():
+                raise ValueError(
+                    f"Refusing to record symlinked manifest path: {rel} "
+                    f"(symlinked at {_walk.relative_to(self.project_root).as_posix()})"
+                )
         abs_path = _validate_rel_path(rel, self.project_root)
+        if not abs_path.is_file():
+            raise ValueError(
+                f"Manifest path is not a regular file: {rel}"
+            )
         normalized = abs_path.relative_to(self.project_root).as_posix()
         self._files[normalized] = _sha256(abs_path)
+        if recovered:
+            self._recovered_files.add(normalized)
+        else:
+            # ``recovered=False`` means the caller is asserting this path is
+            # managed-baseline now, not merely observed; drop any stale
+            # recovered marker so future is_recovered() queries reflect the
+            # transition. ``discard`` is a no-op when the key is absent.
+            self._recovered_files.discard(normalized)

     # -- Querying ---------------------------------------------------------

@@ -163,6 +227,37 @@ class IntegrationManifest:
         """Return a copy of the ``{rel_path: sha256}`` mapping."""
         return dict(self._files)

+    @property
+    def recovered_files(self) -> set[str]:
+        """Return a copy of the set of paths recorded with ``recovered=True``.
+
+        These entries had their hashes observed (not produced) during install
+        because the file already existed on disk and the install skipped it.
+        Their on-disk bytes may be user customizations — callers that would
+        overwrite based on hash equality (e.g. ``refresh_managed``) MUST check
+        ``is_recovered`` first.
+        """
+        return set(self._recovered_files)
+
+    def is_recovered(self, rel_path: str | Path) -> bool:
+        """Return True if *rel_path* was recorded via ``record_existing(recovered=True)``.
+
+        Input is normalized through the same pipeline as ``record_existing``:
+        absolute paths, paths escaping the project root, AND paths containing
+        ``'..'`` segments are rejected (returned as ``False``). This mirrors
+        ``record_existing``'s canonicalization guard — such paths can never
+        appear as stored keys, so the answer is always ``False``.
+        """
+        rel = Path(rel_path)
+        if rel.is_absolute() or ".." in rel.parts:
+            return False
+        try:
+            abs_path = _validate_rel_path(rel, self.project_root)
+            normalized = abs_path.relative_to(self.project_root).as_posix()
+        except ValueError:
+            return False
+        return normalized in self._recovered_files
+
     def check_modified(self) -> list[str]:
         """Return relative paths of tracked files whose content changed on disk."""
         modified: list[str] = []
@@ -269,6 +364,11 @@ class IntegrationManifest:
             "version": self.version,
             "installed_at": self._installed_at,
             "files": self._files,
+            **(
+                {"recovered_files": sorted(self._recovered_files)}
+                if self._recovered_files
+                else {}
+            ),
         }
         path = self.manifest_path
         content = json.dumps(data, indent=2) + "\n"
@@ -320,6 +420,20 @@ class IntegrationManifest:
         inst._installed_at = data.get("installed_at", "")
         inst._files = files

+        recovered = data.get("recovered_files", [])
+        if not isinstance(recovered, list) or not all(
+            isinstance(p, str) for p in recovered
+        ):
+            raise ValueError(
+                f"Integration manifest 'recovered_files' at {path} must be a "
+                "list of string paths"
+            )
+        inst._recovered_files = set(recovered)
+        # Drop any recovered_files entries that don't correspond to tracked
+        # files — defensive against externally-edited or partially-corrupted
+        # manifests. Inconsistent state self-corrects on next save().
+        inst._recovered_files &= set(inst._files.keys())
+
         stored_key = data.get("integration", "")
         if stored_key and stored_key != key:
             raise ValueError(
@@ -365,6 +365,23 @@ def install_shared_infra(
                         preserved_user_files.append(rel)
                     else:
                         skipped_files.append(rel)
+                        # Record the existing-on-disk file in the manifest so a
+                        # fresh manifest run against an already-populated
+                        # ``.specify/`` tree does not silently drop it (#2107).
+                        # ``prior_hashes`` is the function-scope snapshot taken
+                        # at entry, so this membership check is O(1) and avoids
+                        # the repeated ``dict(self._files)`` copy that
+                        # ``manifest.files`` performs on every access.
+                        if dst_path.is_file() and rel not in prior_hashes:
+                            try:
+                                manifest.record_existing(rel, recovered=True)
+                            except (OSError, ValueError) as exc:
+                                # Tolerate races / permission issues / non-file
+                                # collisions so one weird path does not abort
+                                # the whole install.
+                                console.print(
+                                    f"[yellow]⚠[/yellow] could not record {rel} in manifest: {exc}"
+                                )
                     continue

                 if not _ensure_or_bucket_dir(dst_path.parent):
@@ -398,6 +415,23 @@ def install_shared_infra(
                 preserved_user_files.append(rel)
             else:
                 skipped_files.append(rel)
+                # Record the existing-on-disk template in the manifest so a
+                # fresh manifest run against an already-populated
+                # ``.specify/`` tree does not silently drop it (#2107).
+                # ``prior_hashes`` is the function-scope snapshot taken at
+                # entry, so this membership check is O(1) and avoids the
+                # repeated ``dict(self._files)`` copy that ``manifest.files``
+                # performs on every access.
+                if dst.is_file() and rel not in prior_hashes:
+                    try:
+                        manifest.record_existing(rel, recovered=True)
+                    except (OSError, ValueError) as exc:
+                        # Tolerate races / permission issues / non-file
+                        # collisions so one weird path does not abort
+                        # the whole install.
+                        console.print(
+                            f"[yellow]⚠[/yellow] could not record {rel} in manifest: {exc}"
+                        )
             continue

         content = src.read_text(encoding="utf-8")
@@ -416,7 +450,7 @@ def install_shared_infra(

     if skipped_files:
         console.print(
-            f"[yellow]⚠[/yellow] {len(skipped_files)} shared infrastructure file(s) already exist and were not updated:"
+            f"[yellow]⚠[/yellow] {len(skipped_files)} shared infrastructure path(s) already exist and were not updated:"
         )
         for path in skipped_files:
             console.print(f" {path}")
@@ -3,6 +3,7 @@
import codecs
import json
import os
from pathlib import Path
from unittest.mock import patch

import yaml
@@ -577,3 +578,204 @@ class TestClaudeHookCommandNote:
         assert "user-invocable: true" in result
         assert "disable-model-invocation: false" in result
         assert "replace dots" in result
+
+
+class TestSpeckitManifestRecordsSkippedFiles:
+    """Regression test for issue #2107.
+
+    ``install_shared_infra`` must record every shared-infrastructure file
+    under ``.specify/`` in ``speckit.manifest.json``, including files that
+    were *skipped* because they already existed on disk and ``force=False``.
+
+    Before the fix, the skip branches in the scripts and templates loops
+    appended to ``skipped_files`` without calling ``manifest.record_existing``.
+    So when ``install_shared_infra`` ran with a fresh (or lost) manifest
+    against an already-populated ``.specify/`` tree, every file went down the
+    skip path, ``planned_copies`` and ``planned_templates`` stayed empty, and
+    ``manifest.save()`` wrote an empty ``files`` field — leaving the
+    integration believing nothing was installed.
+
+    Reproduction (without the fix) using ``install_shared_infra`` directly:
+
+        install_shared_infra(p, "sh", ..., force=False)  # 1st run → 10 files
+        (p / ".specify/integrations/speckit.manifest.json").unlink()
+        install_shared_infra(p, "sh", ..., force=False)  # 2nd run → 0 files
+                                                         # ^^ BUG: empty
+    """
+
+    def _read_manifest_files(self, project_path: Path) -> dict:
+        manifest_path = (
+            project_path / ".specify" / "integrations" / "speckit.manifest.json"
+        )
+        assert manifest_path.exists(), (
+            f"speckit.manifest.json not written at {manifest_path}"
+        )
+        data = json.loads(manifest_path.read_text(encoding="utf-8"))
+        # ``IntegrationManifest.save`` serialises a ``files`` dict — assert
+        # the schema explicitly so a regression to a different key (e.g.
+        # the internal ``_files`` attribute name) fails loudly instead of
+        # being masked by a silent fallback.
+        assert isinstance(data, dict), (
+            f"manifest root is not a dict, got {type(data).__name__}"
+        )
+        assert "files" in data, (
+            f"manifest missing 'files' key, got keys: {sorted(data.keys())}"
+        )
+        files = data["files"]
+        assert isinstance(files, dict), (
+            f"manifest 'files' is not a dict, got {type(files).__name__}"
+        )
+        return files
+
+    def test_install_shared_infra_records_skipped_files(self, tmp_path):
+        """With ``force=False`` and ``.specify/`` already populated, the
+        manifest must still record every file — the skip branches are not
+        allowed to drop files from the manifest."""
+        from rich.console import Console
+        from specify_cli.shared_infra import install_shared_infra
+
+        # Resolve the project's own packaged sources by walking up from this
+        # test file to the repo root (which contains ``scripts/`` and
+        # ``templates/`` that ``shared_scripts_source`` looks for).
+        repo_root = Path(__file__).resolve().parents[2]
+        console = Console(quiet=True)
+
+        # First run — fresh project, manifest gets populated normally.
+        install_shared_infra(
+            tmp_path,
+            "sh",
+            version="0.0.0",
+            core_pack=None,
+            repo_root=repo_root,
+            console=console,
+            force=False,
+        )
+        first_files = self._read_manifest_files(tmp_path)
+        assert first_files, "first install produced an empty manifest"
+
+        # Simulate a lost manifest while ``.specify/`` is still on disk
+        # (e.g. the manifest was deleted, corrupted, or the layout was
+        # extracted out-of-band).
+        manifest_path = (
+            tmp_path / ".specify" / "integrations" / "speckit.manifest.json"
+        )
+        manifest_path.unlink()
+
+        # Second run — every file already exists, so every iteration takes
+        # the skip branch. With the fix, those files are still recorded.
+        install_shared_infra(
+            tmp_path,
+            "sh",
+            version="0.0.0",
+            core_pack=None,
+            repo_root=repo_root,
+            console=console,
+            force=False,
+        )
+        second_files = self._read_manifest_files(tmp_path)
+        assert second_files, (
+            "speckit.manifest.json files dict is empty after install with "
+            "skipped files (issue #2107) — every file went down the skip "
+            "branch but none were recorded"
+        )
+
+        # The recovered manifest must cover everything the first run tracked.
+        missing = set(first_files) - set(second_files)
+        assert not missing, (
+            f"these files were tracked on the first install but missing after "
+            f"the skipped-files re-install: {sorted(missing)[:5]}"
+        )
+
+    def test_install_shared_infra_handles_directory_at_script_destination(
+        self, tmp_path
+    ):
+        """A non-file (directory) at a script's destination must NOT crash
+        ``install_shared_infra`` and must NOT be recorded in the manifest —
+        the path still appears in the user-visible skipped-paths warning.
+        """
+        from io import StringIO
+        from rich.console import Console
+        from specify_cli.shared_infra import install_shared_infra
+
+        repo_root = Path(__file__).resolve().parents[2]
+        output = StringIO()
+        console = Console(file=output, force_terminal=False, width=200)
+
+        # Pre-create the .specify/scripts/bash tree, then plant a directory
+        # where a script file is expected so the skip branch hits a
+        # non-regular-file path.
+        bash_dir = tmp_path / ".specify" / "scripts" / "bash"
+        bash_dir.mkdir(parents=True)
+        (bash_dir / "common.sh").mkdir()  # collision: dir where file expected
+
+        # Must not crash.
+        install_shared_infra(
+            tmp_path,
+            "sh",
+            version="0.0.0",
+            core_pack=None,
+            repo_root=repo_root,
+            console=console,
+            force=False,
+        )
+
+        files = self._read_manifest_files(tmp_path)
+        assert ".specify/scripts/bash/common.sh" not in files, (
+            "directory at script dst must not be recorded in the manifest"
+        )
+        text = output.getvalue()
+        assert "common.sh" in text, (
+            "directory-at-script-dst path must surface in the skipped warning"
+        )
+
+    def test_install_shared_infra_handles_directory_at_template_destination(
+        self, tmp_path
+    ):
+        """Symmetric coverage for the templates loop: a directory at a
+        template's destination must NOT crash install nor be recorded."""
+        from io import StringIO
+        from rich.console import Console
+        from specify_cli.shared_infra import install_shared_infra
+
+        repo_root = Path(__file__).resolve().parents[2]
+        output = StringIO()
+        console = Console(file=output, force_terminal=False, width=200)
+
+        templates_dir = tmp_path / ".specify" / "templates"
+        templates_dir.mkdir(parents=True)
+
+        src_templates = repo_root / "templates"
+        real_template = next(
+            (
+                p.name
+                for p in src_templates.iterdir()
+                if p.is_file()
+                and not p.name.startswith(".")
+                and p.name != "vscode-settings.json"
+            ),
+            None,
+        )
+        assert real_template, (
+            "no real template found in repo to collide against"
+        )
+        (templates_dir / real_template).mkdir()  # collision
+
+        install_shared_infra(
+            tmp_path,
+            "sh",
+            version="0.0.0",
+            core_pack=None,
+            repo_root=repo_root,
+            console=console,
+            force=False,
+        )
+
+        files = self._read_manifest_files(tmp_path)
+        template_rel = f".specify/templates/{real_template}"
+        assert template_rel not in files, (
+            "directory at template dst must not be recorded in manifest"
+        )
+        text = output.getvalue()
+        assert real_template in text, (
+            "directory-at-template-dst path must surface in the skipped warning"
+        )
@@ -34,6 +34,57 @@ class TestManifestRecordFile:
|
||||
assert m.files["existing.txt"] == _sha256(f)
|
||||
|
||||
|
||||
class TestManifestRecordExistingErrors:
|
||||
"""Error-case coverage for ``record_existing`` symlink + non-file guards.
|
||||
|
||||
Added in #2483 — Copilot review flagged these as un-tested regressions
|
||||
after the ``is_symlink``/``is_file`` guards were introduced.
|
||||
"""
|
||||
|
||||
def test_rejects_symlink_target(self, tmp_path):
|
||||
target = tmp_path / "target.txt"
|
||||
target.write_text("target content", encoding="utf-8")
|
||||
link = tmp_path / "link.txt"
|
||||
link.symlink_to(target)
|
||||
m = IntegrationManifest("test", tmp_path)
|
||||
with pytest.raises(ValueError, match="symlinked"):
|
            m.record_existing("link.txt")

    def test_rejects_dangling_symlink(self, tmp_path):
        # A symlink pointing nowhere should still be rejected before the
        # ``is_file()`` check (which would itself be False on a dangler).
        link = tmp_path / "dangler.txt"
        link.symlink_to(tmp_path / "no-such-target.txt")
        m = IntegrationManifest("test", tmp_path)
        with pytest.raises(ValueError, match="symlinked"):
            m.record_existing("dangler.txt")

    def test_rejects_directory_path(self, tmp_path):
        (tmp_path / "a_dir").mkdir()
        m = IntegrationManifest("test", tmp_path)
        with pytest.raises(ValueError, match="not a regular file"):
            m.record_existing("a_dir")

    def test_rejects_missing_path(self, tmp_path):
        # ``is_file()`` is False for non-existent paths too; the same error
        # surface keeps callers from having to distinguish "missing" from
        # "wrong kind" — both mean "cannot hash this".
        m = IntegrationManifest("test", tmp_path)
        with pytest.raises(ValueError, match="not a regular file"):
            m.record_existing("never-existed.txt")

    def test_lexical_prevalidation_for_absolute_path(self, tmp_path):
        # ``record_existing`` must reject absolute paths via the lexical
        # pre-check, NOT via the filesystem-touching ``is_symlink()`` call.
        # Verified by passing an absolute path that points to a directory
        # outside the project root — the canonical "Absolute paths" error
        # must surface before any stat on the absolute path.
        m = IntegrationManifest("test", tmp_path)
        abs_path = "C:\\tmp\\escape.txt" if sys.platform == "win32" else "/tmp/escape.txt"
        with pytest.raises(ValueError, match="Absolute paths"):
            m.record_existing(abs_path)


class TestManifestPathTraversal:
    def test_record_file_rejects_parent_traversal(self, tmp_path):
        m = IntegrationManifest("test", tmp_path)
@@ -245,3 +296,160 @@ class TestManifestLoadValidation:
        path.write_text("{not valid json", encoding="utf-8")
        with pytest.raises(ValueError, match="invalid JSON"):
            IntegrationManifest.load("bad", tmp_path)

    def test_load_filters_recovered_files_not_in_files(self, tmp_path):
        # Finding B (Round-9): a recovered_files entry referencing a path
        # not present in files indicates an internally-inconsistent manifest
        # (e.g. external edit). load() filters those entries silently so the
        # manifest self-heals on next save(); is_recovered then returns the
        # truthful False for the orphan.
        path = tmp_path / ".specify" / "integrations" / "test.manifest.json"
        path.parent.mkdir(parents=True)
        path.write_text(json.dumps({
            "integration": "test",
            "files": {"kept.txt": "abc123"},
            "recovered_files": ["kept.txt", "orphan.txt"],
        }), encoding="utf-8")
        m = IntegrationManifest.load("test", tmp_path)
        assert m.recovered_files == {"kept.txt"}
        assert m.is_recovered("kept.txt") is True
        assert m.is_recovered("orphan.txt") is False


class TestManifestRecoveredFiles:
    """Coverage for the ``recovered_files`` channel added in #2483.

    When ``shared_infra`` skips an existing file (because the user already has
    it on disk) it now records the file with ``recovered=True``. The path
    appears in ``manifest.recovered_files`` and ``is_recovered(path)`` returns
    True. ``refresh_managed`` (out of scope for this PR) consults this list
    before treating the recorded hash as a managed baseline, defending against
    silent overwrite of user customizations after manifest loss.
    """

    def test_record_existing_default_is_not_recovered(self, tmp_path):
        (tmp_path / "f.txt").write_text("x", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("f.txt")
        assert m.is_recovered("f.txt") is False
        assert m.recovered_files == set()

    def test_record_existing_with_recovered_flag(self, tmp_path):
        (tmp_path / "f.txt").write_text("x", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("f.txt", recovered=True)
        assert m.is_recovered("f.txt") is True
        assert m.recovered_files == {"f.txt"}
        # File still hashed normally so check_modified/uninstall keep working
        assert m.files["f.txt"] == _sha256(tmp_path / "f.txt")

    def test_recovered_files_round_trips_through_save_load(self, tmp_path):
        (tmp_path / "a.txt").write_text("aaa", encoding="utf-8")
        (tmp_path / "b.txt").write_text("bbb", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path, version="9.9")
        m.record_existing("a.txt", recovered=True)
        m.record_existing("b.txt")  # not recovered
        m.save()
        loaded = IntegrationManifest.load("test", tmp_path)
        assert loaded.is_recovered("a.txt") is True
        assert loaded.is_recovered("b.txt") is False
        assert loaded.recovered_files == {"a.txt"}

    def test_save_omits_empty_recovered_files(self, tmp_path):
        m = IntegrationManifest("test", tmp_path)
        m.record_file("f.txt", "x")
        path = m.save()
        data = json.loads(path.read_text(encoding="utf-8"))
        assert "recovered_files" not in data

    def test_load_rejects_non_list_recovered_files(self, tmp_path):
        path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
        path.parent.mkdir(parents=True)
        path.write_text(
            json.dumps({"files": {}, "recovered_files": "not-a-list"}),
            encoding="utf-8",
        )
        with pytest.raises(ValueError, match="recovered_files"):
            IntegrationManifest.load("bad", tmp_path)

    def test_is_recovered_absolute_path_returns_false(self, tmp_path):
        # Copilot round-5 finding: passing an absolute path silently returned
        # False because the stored keys are relative POSIX strings. Round-7
        # made this explicit: ``is_recovered`` now rejects absolute paths
        # up front via a lexical ``rel.is_absolute()`` guard and returns
        # False without calling ``_validate_rel_path`` at all — matching
        # ``record_existing``'s canonical-key guard so the two methods
        # agree on which inputs can ever be stored keys.
        (tmp_path / "f.txt").write_text("x", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("f.txt", recovered=True)
        import sys
        abs_input = "C:\\tmp\\f.txt" if sys.platform == "win32" else "/tmp/f.txt"
        assert m.is_recovered(abs_input) is False

    def test_is_recovered_escaping_path_returns_false(self, tmp_path):
        # A relative path containing ``..`` segments cannot be a stored key:
        # Round-7 added the same lexical ``".." in rel.parts`` guard to
        # ``is_recovered`` that ``record_existing`` already enforces, so the
        # method returns False immediately without reaching
        # ``_validate_rel_path``. The try/except around ``_validate_rel_path``
        # remains as defense-in-depth for paths that pass the lexical guard
        # but still resolve outside the project root via a symlinked
        # ancestor.
        m = IntegrationManifest("test", tmp_path)
        # Don't record anything — the path is impossible to record anyway.
        assert m.is_recovered("../escape.txt") is False

    def test_record_existing_clears_recovered_when_false(self, tmp_path):
        # Finding A: re-recording the same path with recovered=False must
        # drop the prior recovered marker (transition to managed baseline).
        f = tmp_path / "x.txt"
        f.write_text("v1", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("x.txt", recovered=True)
        assert m.is_recovered("x.txt") is True
        m.record_existing("x.txt", recovered=False)
        assert m.is_recovered("x.txt") is False

    def test_record_file_clears_recovered(self, tmp_path):
        # Finding A: record_file writes produced content; the path can no
        # longer be considered "merely observed" once we wrote bytes.
        (tmp_path / "y.txt").write_text("observed", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("y.txt", recovered=True)
        assert m.is_recovered("y.txt") is True
        m.record_file("y.txt", "produced")
        assert m.is_recovered("y.txt") is False

    def test_is_recovered_rejects_dotdot_segment(self, tmp_path):
        # Finding B: record_existing rejects ``..`` segments via the lexical
        # pre-check; is_recovered must match that behavior and return False
        # without raising, mirroring the canonicalization guard.
        (tmp_path / "z.txt").write_text("v1", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        m.record_existing("z.txt", recovered=True)
        # Same file via dotdot-normalizing path — must be False, not raise.
        assert m.is_recovered("subdir/../z.txt") is False


class TestRecordExistingNewGuards:
    """Coverage for the two new guards added by Copilot's 2026-05-18 review."""

    def test_rejects_symlinked_ancestor(self, tmp_path):
        real_dir = tmp_path / "real_dir"
        real_dir.mkdir()
        (real_dir / "file.txt").write_text("payload", encoding="utf-8")
        (tmp_path / "linked_dir").symlink_to(real_dir, target_is_directory=True)
        m = IntegrationManifest("test", tmp_path)
        with pytest.raises(ValueError, match="symlinked"):
            m.record_existing("linked_dir/file.txt")

    def test_rejects_inside_root_dotdot_with_explicit_message(self, tmp_path):
        # ``dir/../file.txt`` normalizes inside root, so the old "escapes
        # project root" message was misleading. The new message names the
        # actual reason: canonicalization.
        (tmp_path / "dir").mkdir()
        (tmp_path / "file.txt").write_text("x", encoding="utf-8")
        m = IntegrationManifest("test", tmp_path)
        with pytest.raises(ValueError, match=r"canonical|'\.\.' segments"):
            m.record_existing("dir/../file.txt")