Files
github-spec-kit/tests/integrations/test_manifest.py
Eldar Shlomi 39921ddd3b fix(shared-infra): record skipped files in speckit.manifest.json (#2483)
* fix(shared-infra): record skipped files in speckit.manifest.json

`install_shared_infra` skipped files that already existed on disk
when `force=False`, but the skip branches in both the scripts loop
and the templates loop only appended to `skipped_files` without
calling `manifest.record_existing`. So when the function ran with a
fresh manifest against an already-populated `.specify/` tree (e.g.
after the manifest was deleted, corrupted, or extracted out of band),
every file went down the skip path, `planned_copies` /
`planned_templates` stayed empty, and `manifest.save()` wrote an
empty `files` field — leaving the integration believing nothing was
installed.

Record every skipped file in the manifest, but only when it is not
already tracked. This preserves the original hash for files that
were previously recorded so `check_modified()` (used by
`integration use` to decide whether a user has customized a
template) keeps working correctly.

Add `TestSpeckitManifestRecordsSkippedFiles` in
`tests/integrations/test_integration_claude.py` covering both the
fresh-skip path and the recover-after-lost-manifest path.

Fixes #2107

* fix(shared-infra): guard manifest.record_existing against non-file dst

Address Copilot review feedback on PR #2483. The previous fix called
``manifest.record_existing(rel_skip)`` from the skip branch of both
loops in ``install_shared_infra``, which would crash with
``IsADirectoryError`` (or another ``OSError``) if a directory or other
non-regular-file happened to exist at the expected destination path —
since ``record_existing`` opens the file to compute its SHA-256.

Three coordinated fixes:

1. ``IntegrationManifest.record_existing`` now validates its
   precondition: it raises ``ValueError`` if the path is a symlink or
   is not a regular file. The docstring already promised "an
   already-existing file"; this enforces it. The symlink check runs on
   the un-resolved path because ``_validate_rel_path`` calls
   ``resolve()``, which would silently follow the symlink. Mirrors the
   existing ``_ensure_safe_manifest_destination`` precedent in the
   same module.

2. In ``install_shared_infra``'s scripts and templates skip branches,
   guard the ``record_existing`` call with ``dst.is_file()`` and wrap
   it in ``try/except (OSError, ValueError)``. A directory collision,
   permission error, or TOCTOU race no longer aborts the whole
   install — the user gets a per-path warning, the path still
   surfaces in ``skipped_files``, and the rest of the install
   continues.

3. ``_read_manifest_files`` in the regression test no longer falls
   back to ``data.get("_files")`` (Copilot's low-confidence finding):
   the silent fallback could mask a schema regression where the
   public ``files`` key is renamed. It now asserts ``"files" in data``
   and that the value is a dict.

Add two regression tests in ``TestSpeckitManifestRecordsSkippedFiles``
covering the directory-at-destination edge case for both the scripts
loop and the templates loop. Both verify (a) install does not crash,
(b) the non-file path is not recorded in the manifest, and (c) the
path still surfaces in the user-visible warning.

The "shared infrastructure file(s)" warning text is changed to
"path(s)" so it remains accurate when non-file entries appear in the
list.

Refs #2107

* fix(manifest): lexical pre-check for record_existing + add error-case tests

Address Copilot review (2026-05-11, review id 4266902103):

1. `record_existing` was calling `(self.project_root / rel).is_symlink()`
   BEFORE validating containment. For absolute paths or paths containing
   `..`, this performed a filesystem stat outside the project root before
   `_validate_rel_path()` raised. Add a cheap lexical pre-check that
   delegates to `_validate_rel_path()` for the canonical error messages,
   so the symlink stat only ever runs on paths that are already lexically
   inside the project root.

2. Add focused unit tests in `tests/integrations/test_manifest.py` for
   the symlink and non-regular-file error paths, including:
     - symlink target rejection
     - dangling symlink rejection (caught by the symlink guard before
       the is_file check)
     - directory path rejection (is_file == False)
     - missing-path rejection (is_file == False)
     - absolute-path lexical pre-check
   The Copilot reviewer noted these guards had no focused coverage in
   `test_manifest.py`, only via the `test_integration_claude.py`
   regression test.

3. The third Copilot finding (repeated `dict(self._files)` copies via
   `manifest.files` in the skip branches) is already resolved on this
   branch by using `prior_hashes` — the function-scope snapshot taken at
   the top of `install_shared_infra` — for the membership check, instead
   of `manifest.files`.

AI disclosure: drafted with assistance from Claude (Opus 4.7).

* fix(manifest): track recovered files separately + symlink-ancestor + canonical-path guards

Address Copilot review id 4309888722 (2026-05-18) on PR #2483:

1. Recovery semantics (shared_infra.py:371, 412) — install_shared_infra
   now passes ``recovered=True`` when re-recording a skipped existing
   file. This flag funnels into a new ``recovered_files`` array in the
   manifest JSON, so a future ``refresh_managed`` run can distinguish
   "hash I produced" from "hash I observed on a file that may be a user
   customization" and avoid silent overwrite without ``--refresh-shared-infra``.
   Schema is purely additive: ``files: dict[str, str]`` is unchanged; the
   new ``recovered_files: list[str]`` is omitted when empty.

2. Symlinked ancestor (manifest.py:172) — ``record_existing`` now walks
   every component of the rel path and rejects any symlinked ancestor,
   not just a symlinked leaf. Catches ``linked_dir/file.txt`` where
   ``linked_dir`` is a symlink, which previously slipped past the leaf-only
   ``is_symlink()`` check and was resolved through by ``_validate_rel_path``.
   Mirrors the component-walk pattern in ``_ensure_safe_manifest_directory``.

3. Misleading "escapes project root" message (manifest.py:168) — paths
   like ``dir/../file.txt`` normalize inside the project, so the old
   message lied about what was wrong. New message: "Manifest paths must
   be canonical; '..' segments are not allowed". Still rejects (canonical
   keys are required so ``check_modified``/``uninstall`` cannot key the
   same file under two paths).

Tests: 7 new test methods across TestManifestRecoveredFiles and
TestRecordExistingNewGuards covering all 4 Copilot findings. Full suite
passes locally.

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).

* fix(manifest): normalize is_recovered input through _validate_rel_path

Address Copilot review comment id 4309888722 round-5 (2026-05-21) on PR #2483:

``is_recovered()`` previously checked ``self._recovered_files`` membership
with bare ``Path(rel).as_posix()``, while ``record_existing()`` stores keys
via ``_validate_rel_path(rel, root).relative_to(root).as_posix()``. The two
normalizations disagreed on absolute paths and paths that escape the
project root — ``is_recovered`` would silently return False for inputs that
``record_existing`` would have refused entirely.

The fix routes ``is_recovered`` through the same ``_validate_rel_path``
pipeline; ``ValueError`` from the validator is caught and converted to
False so query semantics stay exception-free (Python ``__contains__``
convention).

Tests: 2 new methods in ``TestManifestRecoveredFiles``:
- ``test_is_recovered_absolute_path_returns_false``
- ``test_is_recovered_escaping_path_returns_false``

🤖 AI disclosure: drafted with assistance from Claude (Opus 4.7).

* fix(manifest): clear recovered marker on managed re-record + reject '..' in is_recovered

Address Copilot Round-7 review comments on PR #2483:

1. record_existing(recovered=False) and record_file now BOTH discard the
   path from _recovered_files. The marker is meant to flag "we observed
   this file but cannot vouch it's a managed baseline" — once the same
   path is re-recorded as managed (either explicitly or by writing fresh
   bytes), the marker is stale and must clear so refresh_managed and
   future is_recovered queries return the truthful answer.

2. is_recovered now applies the same canonical-key guard as record_existing
   (rejects absolute paths and '..' segments lexically before delegating
   to _validate_rel_path). Such paths can never be stored keys, so the
   query correctly returns False without depending on _validate_rel_path
   semantics that diverged from record_existing's stricter contract.

record_file docstring updated to mention the side-effect on recovered
markers.

Tests: 3 new methods in TestManifestRecoveredFiles covering
record_existing(false) clearing, record_file clearing, and is_recovered
dotdot rejection.

* test(manifest): update is_recovered comments to reflect Round-7 lexical guard

Round 8 — addresses Copilot review comment on tests/integrations/test_manifest.py:362.

After Round-7 (1dbf0c2), is_recovered() rejects absolute paths and '..' segments
up front via a lexical guard, returning False without calling _validate_rel_path
at all. The test comments still described the prior "_validate_rel_path raises;
we catch" code path, which is misleading for readers.

Updated comments in both:
  - test_is_recovered_absolute_path_returns_false (Copilot's exact target)
  - test_is_recovered_escaping_path_returns_false (same comment-class issue;
    fixed preemptively to avoid a Round-9 finding on the same drift)

Pure documentation change. Test assertions and behavior unchanged; all manifest
tests still green.

* fix(manifest): document OS errors on record_existing + filter orphan recovered_files on load

Round 9 — addresses Copilot review on PR #2483:

1. record_existing's docstring now documents OSError/PermissionError as
   possible raises (in addition to ValueError) — the implementation has
   always been able to raise them from is_symlink, is_file, or the
   file-read used to hash, but the contract did not reflect that.
   Callers should be prepared for both surfaces.

2. load() now filters recovered_files entries that don't correspond to
   keys in files. An externally-edited or partially-corrupted manifest
   can deserialize with orphan recovered paths; rather than reject the
   whole manifest (too strict on the upgrade path), we drop the orphans
   and let the inconsistency self-correct on the next save(). is_recovered
   then returns the truthful False for the orphan.

Tests: new test_load_filters_recovered_files_not_in_files asserting an
orphan recovered entry is dropped on load.
2026-06-02 08:06:31 -05:00

456 lines
20 KiB
Python

"""Tests for IntegrationManifest — record, hash, save, load, uninstall, modified detection."""
import hashlib
import json
import sys
import pytest
from specify_cli.integrations.manifest import IntegrationManifest, _sha256
class TestManifestRecordFile:
def test_record_file_writes_and_hashes(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
content = "hello world"
abs_path = m.record_file("a/b.txt", content)
assert abs_path == tmp_path / "a" / "b.txt"
assert abs_path.read_text(encoding="utf-8") == content
expected_hash = hashlib.sha256(content.encode()).hexdigest()
assert m.files["a/b.txt"] == expected_hash
def test_record_file_bytes(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
data = b"\x00\x01\x02"
abs_path = m.record_file("bin.dat", data)
assert abs_path.read_bytes() == data
assert m.files["bin.dat"] == hashlib.sha256(data).hexdigest()
def test_record_existing(self, tmp_path):
f = tmp_path / "existing.txt"
f.write_text("content", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("existing.txt")
assert m.files["existing.txt"] == _sha256(f)
class TestManifestRecordExistingErrors:
"""Error-case coverage for ``record_existing`` symlink + non-file guards.
Added in #2483 — Copilot review flagged these as un-tested regressions
after the ``is_symlink``/``is_file`` guards were introduced.
"""
def test_rejects_symlink_target(self, tmp_path):
target = tmp_path / "target.txt"
target.write_text("target content", encoding="utf-8")
link = tmp_path / "link.txt"
link.symlink_to(target)
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="symlinked"):
m.record_existing("link.txt")
def test_rejects_dangling_symlink(self, tmp_path):
# A symlink pointing nowhere should still be rejected before the
# ``is_file()`` check (which would itself be False on a dangler).
link = tmp_path / "dangler.txt"
link.symlink_to(tmp_path / "no-such-target.txt")
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="symlinked"):
m.record_existing("dangler.txt")
def test_rejects_directory_path(self, tmp_path):
(tmp_path / "a_dir").mkdir()
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="not a regular file"):
m.record_existing("a_dir")
def test_rejects_missing_path(self, tmp_path):
# ``is_file()`` is False for non-existent paths too; the same error
# surface keeps callers from having to distinguish "missing" from
# "wrong kind" — both mean "cannot hash this".
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="not a regular file"):
m.record_existing("never-existed.txt")
def test_lexical_prevalidation_for_absolute_path(self, tmp_path):
# ``record_existing`` must reject absolute paths via the lexical
# pre-check, NOT via the filesystem-touching ``is_symlink()`` call.
# Verified by passing an absolute path that points to a directory
# outside the project root — the canonical "Absolute paths" error
# must surface before any stat on the absolute path.
m = IntegrationManifest("test", tmp_path)
abs_path = "C:\\tmp\\escape.txt" if sys.platform == "win32" else "/tmp/escape.txt"
with pytest.raises(ValueError, match="Absolute paths"):
m.record_existing(abs_path)
class TestManifestPathTraversal:
def test_record_file_rejects_parent_traversal(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="outside"):
m.record_file("../escape.txt", "bad")
def test_record_file_rejects_absolute_path(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
abs_path = "C:\\tmp\\escape.txt" if sys.platform == "win32" else "/tmp/escape.txt"
with pytest.raises(ValueError, match="Absolute paths"):
m.record_file(abs_path, "bad")
def test_record_existing_rejects_parent_traversal(self, tmp_path):
escape = tmp_path.parent / "escape.txt"
escape.write_text("evil", encoding="utf-8")
try:
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="outside"):
m.record_existing("../escape.txt")
finally:
escape.unlink(missing_ok=True)
def test_uninstall_skips_traversal_paths(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("safe.txt", "good")
m._files["../outside.txt"] = "fakehash"
m.save()
removed, skipped = m.uninstall()
assert len(removed) == 1
assert removed[0].name == "safe.txt"
class TestManifestCheckModified:
def test_unmodified_file(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
assert m.check_modified() == []
def test_modified_file(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
(tmp_path / "f.txt").write_text("changed", encoding="utf-8")
assert m.check_modified() == ["f.txt"]
def test_deleted_file_not_reported(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
(tmp_path / "f.txt").unlink()
assert m.check_modified() == []
def test_symlink_treated_as_modified(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
target = tmp_path / "target.txt"
target.write_text("target", encoding="utf-8")
(tmp_path / "f.txt").unlink()
(tmp_path / "f.txt").symlink_to(target)
assert m.check_modified() == ["f.txt"]
class TestManifestUninstall:
def test_removes_unmodified(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("d/f.txt", "content")
m.save()
removed, skipped = m.uninstall()
assert len(removed) == 1
assert not (tmp_path / "d" / "f.txt").exists()
assert not (tmp_path / "d").exists()
assert skipped == []
def test_skips_modified(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
m.save()
(tmp_path / "f.txt").write_text("modified", encoding="utf-8")
removed, skipped = m.uninstall()
assert removed == []
assert len(skipped) == 1
assert (tmp_path / "f.txt").exists()
def test_force_removes_modified(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
m.save()
(tmp_path / "f.txt").write_text("modified", encoding="utf-8")
removed, skipped = m.uninstall(force=True)
assert len(removed) == 1
assert skipped == []
def test_already_deleted_file(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "content")
m.save()
(tmp_path / "f.txt").unlink()
removed, skipped = m.uninstall()
assert removed == []
assert skipped == []
def test_removes_manifest_file(self, tmp_path):
m = IntegrationManifest("test", tmp_path, version="1.0")
m.record_file("f.txt", "content")
m.save()
assert m.manifest_path.exists()
m.uninstall()
assert not m.manifest_path.exists()
def test_cleans_empty_parent_dirs(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("a/b/c/f.txt", "content")
m.save()
m.uninstall()
assert not (tmp_path / "a").exists()
def test_preserves_nonempty_parent_dirs(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("a/b/tracked.txt", "content")
(tmp_path / "a" / "b" / "other.txt").write_text("keep", encoding="utf-8")
m.save()
m.uninstall()
assert not (tmp_path / "a" / "b" / "tracked.txt").exists()
assert (tmp_path / "a" / "b" / "other.txt").exists()
def test_symlink_skipped_without_force(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
m.save()
target = tmp_path / "target.txt"
target.write_text("target", encoding="utf-8")
(tmp_path / "f.txt").unlink()
(tmp_path / "f.txt").symlink_to(target)
removed, skipped = m.uninstall()
assert removed == []
assert len(skipped) == 1
def test_symlink_removed_with_force(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "original")
m.save()
target = tmp_path / "target.txt"
target.write_text("target", encoding="utf-8")
(tmp_path / "f.txt").unlink()
(tmp_path / "f.txt").symlink_to(target)
removed, skipped = m.uninstall(force=True)
assert len(removed) == 1
assert target.exists()
class TestManifestPersistence:
def test_save_and_load_roundtrip(self, tmp_path):
m = IntegrationManifest("myagent", tmp_path, version="2.0.1")
m.record_file("dir/file.md", "# Hello")
m.save()
loaded = IntegrationManifest.load("myagent", tmp_path)
assert loaded.key == "myagent"
assert loaded.version == "2.0.1"
assert loaded.files == m.files
def test_manifest_path(self, tmp_path):
m = IntegrationManifest("copilot", tmp_path)
assert m.manifest_path == tmp_path / ".specify" / "integrations" / "copilot.manifest.json"
def test_load_missing_raises(self, tmp_path):
with pytest.raises(FileNotFoundError):
IntegrationManifest.load("nonexistent", tmp_path)
def test_save_creates_directories(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "content")
path = m.save()
assert path.exists()
data = json.loads(path.read_text(encoding="utf-8"))
assert data["integration"] == "test"
def test_save_preserves_installed_at(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "content")
m.save()
first_ts = m._installed_at
m.save()
assert m._installed_at == first_ts
class TestManifestLoadValidation:
def test_load_non_dict_raises(self, tmp_path):
path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
path.parent.mkdir(parents=True)
path.write_text('"just a string"', encoding="utf-8")
with pytest.raises(ValueError, match="JSON object"):
IntegrationManifest.load("bad", tmp_path)
def test_load_bad_files_type_raises(self, tmp_path):
path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
path.parent.mkdir(parents=True)
path.write_text(json.dumps({"files": ["not", "a", "dict"]}), encoding="utf-8")
with pytest.raises(ValueError, match="mapping"):
IntegrationManifest.load("bad", tmp_path)
def test_load_bad_files_values_raises(self, tmp_path):
path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
path.parent.mkdir(parents=True)
path.write_text(json.dumps({"files": {"a.txt": 123}}), encoding="utf-8")
with pytest.raises(ValueError, match="mapping"):
IntegrationManifest.load("bad", tmp_path)
def test_load_invalid_json_raises(self, tmp_path):
path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
path.parent.mkdir(parents=True)
path.write_text("{not valid json", encoding="utf-8")
with pytest.raises(ValueError, match="invalid JSON"):
IntegrationManifest.load("bad", tmp_path)
def test_load_filters_recovered_files_not_in_files(self, tmp_path):
# Finding B (Round-9): a recovered_files entry referencing a path
# not present in files indicates an internally-inconsistent manifest
# (e.g. external edit). load() filters those entries silently so the
# manifest self-heals on next save(); is_recovered then returns the
# truthful False for the orphan.
path = tmp_path / ".specify" / "integrations" / "test.manifest.json"
path.parent.mkdir(parents=True)
path.write_text(json.dumps({
"integration": "test",
"files": {"kept.txt": "abc123"},
"recovered_files": ["kept.txt", "orphan.txt"],
}), encoding="utf-8")
m = IntegrationManifest.load("test", tmp_path)
assert m.recovered_files == {"kept.txt"}
assert m.is_recovered("kept.txt") is True
assert m.is_recovered("orphan.txt") is False
class TestManifestRecoveredFiles:
"""Coverage for the ``recovered_files`` channel added in #2483.
When ``shared_infra`` skips an existing file (because the user already has
it on disk) it now records the file with ``recovered=True``. The path
appears in ``manifest.recovered_files`` and ``is_recovered(path)`` returns
True. ``refresh_managed`` (out of scope for this PR) consults this list
before treating the recorded hash as a managed baseline, defending against
silent overwrite of user customizations after manifest loss.
"""
def test_record_existing_default_is_not_recovered(self, tmp_path):
(tmp_path / "f.txt").write_text("x", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("f.txt")
assert m.is_recovered("f.txt") is False
assert m.recovered_files == set()
def test_record_existing_with_recovered_flag(self, tmp_path):
(tmp_path / "f.txt").write_text("x", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("f.txt", recovered=True)
assert m.is_recovered("f.txt") is True
assert m.recovered_files == {"f.txt"}
# File still hashed normally so check_modified/uninstall keep working
assert m.files["f.txt"] == _sha256(tmp_path / "f.txt")
def test_recovered_files_round_trips_through_save_load(self, tmp_path):
(tmp_path / "a.txt").write_text("aaa", encoding="utf-8")
(tmp_path / "b.txt").write_text("bbb", encoding="utf-8")
m = IntegrationManifest("test", tmp_path, version="9.9")
m.record_existing("a.txt", recovered=True)
m.record_existing("b.txt") # not recovered
m.save()
loaded = IntegrationManifest.load("test", tmp_path)
assert loaded.is_recovered("a.txt") is True
assert loaded.is_recovered("b.txt") is False
assert loaded.recovered_files == {"a.txt"}
def test_save_omits_empty_recovered_files(self, tmp_path):
m = IntegrationManifest("test", tmp_path)
m.record_file("f.txt", "x")
path = m.save()
data = json.loads(path.read_text(encoding="utf-8"))
assert "recovered_files" not in data
def test_load_rejects_non_list_recovered_files(self, tmp_path):
path = tmp_path / ".specify" / "integrations" / "bad.manifest.json"
path.parent.mkdir(parents=True)
path.write_text(
json.dumps({"files": {}, "recovered_files": "not-a-list"}),
encoding="utf-8",
)
with pytest.raises(ValueError, match="recovered_files"):
IntegrationManifest.load("bad", tmp_path)
def test_is_recovered_absolute_path_returns_false(self, tmp_path):
# Copilot round-5 finding: passing an absolute path silently returned
# False because the stored keys are relative POSIX strings. Round-7
# made this explicit: ``is_recovered`` now rejects absolute paths
# up front via a lexical ``rel.is_absolute()`` guard and returns
# False without calling ``_validate_rel_path`` at all — matching
# ``record_existing``'s canonical-key guard so the two methods
# agree on which inputs can ever be stored keys.
(tmp_path / "f.txt").write_text("x", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("f.txt", recovered=True)
import sys
abs_input = "C:\\tmp\\f.txt" if sys.platform == "win32" else "/tmp/f.txt"
assert m.is_recovered(abs_input) is False
def test_is_recovered_escaping_path_returns_false(self, tmp_path):
# A relative path containing ``..`` segments cannot be a stored key:
# Round-7 added the same lexical ``".." in rel.parts`` guard to
# ``is_recovered`` that ``record_existing`` already enforces, so the
# method returns False immediately without reaching
# ``_validate_rel_path``. The try/except around ``_validate_rel_path``
# remains as defense-in-depth for paths that pass the lexical guard
# but still resolve outside the project root via a symlinked
# ancestor.
m = IntegrationManifest("test", tmp_path)
# Don't record anything — the path is impossible to record anyway.
assert m.is_recovered("../escape.txt") is False
def test_record_existing_clears_recovered_when_false(self, tmp_path):
# Finding A: re-recording the same path with recovered=False must
# drop the prior recovered marker (transition to managed baseline).
f = tmp_path / "x.txt"
f.write_text("v1", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("x.txt", recovered=True)
assert m.is_recovered("x.txt") is True
m.record_existing("x.txt", recovered=False)
assert m.is_recovered("x.txt") is False
def test_record_file_clears_recovered(self, tmp_path):
# Finding A: record_file writes produced content; the path can no
# longer be considered "merely observed" once we wrote bytes.
(tmp_path / "y.txt").write_text("observed", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("y.txt", recovered=True)
assert m.is_recovered("y.txt") is True
m.record_file("y.txt", "produced")
assert m.is_recovered("y.txt") is False
def test_is_recovered_rejects_dotdot_segment(self, tmp_path):
# Finding B: record_existing rejects ``..`` segments via the lexical
# pre-check; is_recovered must match that behavior and return False
# without raising, mirroring the canonicalization guard.
(tmp_path / "z.txt").write_text("v1", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
m.record_existing("z.txt", recovered=True)
# Same file via dotdot-normalizing path — must be False, not raise.
assert m.is_recovered("subdir/../z.txt") is False
class TestRecordExistingNewGuards:
"""Coverage for the two new guards added by Copilot's 2026-05-18 review."""
def test_rejects_symlinked_ancestor(self, tmp_path):
real_dir = tmp_path / "real_dir"
real_dir.mkdir()
(real_dir / "file.txt").write_text("payload", encoding="utf-8")
(tmp_path / "linked_dir").symlink_to(real_dir, target_is_directory=True)
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match="symlinked"):
m.record_existing("linked_dir/file.txt")
def test_rejects_inside_root_dotdot_with_explicit_message(self, tmp_path):
# ``dir/../file.txt`` normalizes inside root, so the old "escapes
# project root" message was misleading. The new message names the
# actual reason: canonicalization.
(tmp_path / "dir").mkdir()
(tmp_path / "file.txt").write_text("x", encoding="utf-8")
m = IntegrationManifest("test", tmp_path)
with pytest.raises(ValueError, match=r"canonical|'\.\.' segments"):
m.record_existing("dir/../file.txt")