Files
CherryHQ-cherry-studio/v2-refactor-temp/docs/file-manager/handler-mapping.md
Phantom d2c568e349 feat(file): Add schema and foundation for new file module (#13451)
### What this PR does

Adds the **Phase 1a contract surface** for the file module — types, DB
schema, DataApi + File IPC contracts, FileManager skeleton, and
architecture docs.

**Phase 1b.1 (Read Path & Repository), 1b.2 (Write Path & Lifecycle),
1b.3 (Watcher & DanglingCache), and 1b.4 (OrphanSweep &
FileRefCheckerRegistry) are now all landed on top of 1a in this same
PR.** This is the complete Phase 1b runtime — reviewers see the full
read + write + watcher + orphan-sweep picture in one place.

Design, contracts, and decision rationale live in the architecture docs:

-
[`docs/references/file/architecture.md`](docs/references/file/architecture.md)
— module boundaries, type system, IPC/DataApi contracts, layered
architecture, service lifecycle, mutation propagation
-
[`docs/references/file/file-manager-architecture.md`](docs/references/file/file-manager-architecture.md)
— FileManager internals (storage, version detection, atomic writes,
reference cleanup, DirectoryWatcher, orphan sweep, DanglingCache state
machine, key design decisions)

#### Phase 1a deliverables

- Types (`FileEntry` / `FileInfo` / `FileHandle` /
`CanonicalExternalPath` brand)
- DB schema (`file_entry` + `file_ref`) with per-origin CHECK
constraints
- DataApi schemas + stub handlers
- File IPC contract (polymorphic `FileHandle` dispatch;
`batchGetMetadata` included)
- FileManager skeleton + `internal/*` + `ops/*` + `watcher/` +
`DanglingCache` + `versionCache`
- Mutation propagation design (three typed events + prefix-based
queryKey invalidation)

#### Phase 1b.1 deliverables (read-path runtime)

- Shared utilities: `getFileTypeByExt`, `sanitizeFilename`,
`validateFileName` extracted to `@shared/file/types`
- Path utilities: `canonicalizeExternalPath` (NFC + null-byte guard +
trailing-sep strip), `isPathInside`, `isUnderInternalStorage`,
`canWrite`
- FS read primitives: `stat`, `exists`, `read` (text/base64/binary
overloads), `hash` (initial MD5; swapped to xxhash-h64 in 1b.2)
- Metadata utilities: `getFileType(path)`, `isTextFile`, `mimeToExt`
- Repositories: `FileEntryService` + `FileRefService` read methods
(Drizzle-backed, Zod-branded outputs)
- Pure-function modules: `internal/content/{read,hash}`,
`internal/dispatch.ts` (FileHandle dispatcher)
- `toFileInfo(entry)` projection
- `FileManager` class as `BaseService` (`@Injectable('FileManager')`
`@ServicePhase(WhenReady)`); read methods only
- `DanglingCache` + `VersionCache` minimal viable singletons (full impls
in 1b.3 / 1b.2)
- DataApi `/files/*` read handlers fully implemented (entries / single /
ref-counts / refs-by-source)
- 60+ TDD tests (unit + boundary + setupTestDatabase integration)

#### Phase 1b.2 deliverables (write-path runtime)

- **FS atomic primitives** (open to non-file-module consumers per
architecture §5.3): `atomicWriteFile` (tmp + fsync + rename +
fsync(dir)), `atomicWriteIfUnchanged` (re-stat OCC + content-hash
fallback for second-precision mtime), `createAtomicWriteStream`
(Writable wrapper, abort/destroy unlinks tmp)
- **FS general primitives**: `write` (delegates to atomicWriteFile),
`copy` (atomic dest), `move` (rename + EXDEV → copy+unlink fallback),
`remove` (idempotent ENOENT), `mkdir` / `ensureDir` / `removeDir`,
`download` (fetch → atomic stream)
- **Hash swap**: `hash()` migrated MD5 → `xxhash-wasm` h64 streaming;
legacy `md5` dep retained for KnowledgeService loaders
- **VersionCache LRU**: capacity-bounded (default 2000) with
re-insert-on-touch recency
- **Repository mutations**: `FileEntryService.create/update/delete`
(auto UUIDv7 default id, raw DB CHECK errors propagate);
`FileRefService.create/createMany/cleanupBySource/cleanupBySourceBatch`
(`onConflictDoNothing` for batch upsert)
- **internal/entry/**: `create.createInternal` (4 source variants: bytes
/ base64 / path / url) + `create.ensureExternal` (canonicalize + stat +
idempotent upsert + duplicate-suspect peer warn);
`lifecycle.trash/restore/permanentDelete` + batch variants (DB+FS
decoupled, internal best-effort unlink); `rename` (internal DB-only,
external fs.move + canonical externalPath); `copy` (pipes through
createInternal with rollback)
- **internal/content/write.ts**: `write` / `writeIfUnchanged`
(cache-not-trusted re-stat OCC, `StaleVersionError` rewrap from
`PathStaleVersionError`) / `createWriteStream` / `*ByPath` variants
- **internal/system/**: `shell.open` / `shell.showInFolder` (electron
`shell` wrappers); `tempCopy.withTempCopy` (isolated tmp dir; cleanup on
throw)
- **FileManager facade**: every IFileManager mutation method now
delegates to its `internal/*` counterpart (`createInternalEntry` /
`ensureExternalEntry` / `batchCreate*` / `batchEnsure*` / `write` /
`writeIfUnchanged` / `createWriteStream` / `createReadStream` / `trash`
/ `restore` / `permanentDelete` + batch / `rename` / `copy` /
`withTempCopy` / `open` / `showInFolder`); no method throws
notImplemented anymore
- ~60 new TDD tests (each behavior unit = one RED→GREEN→REFACTOR
commit); end-to-end integration scenarios via `setupTestDatabase` cover
atomic-rollback zero-residue, OCC second-precision-mtime
no-false-positive, trash-external CHECK enforcement, full
create→write→read→trash→restore→permanentDelete round-trip, and external
permanentDelete-leaves-user-file-untouched

#### Phase 1b.3 deliverables (watcher + DanglingCache observability)

- **DanglingCache class** (replaces 1a const-literal skeleton):
`byEntryId: Map<entryId, CachedState>` + `pathToEntryIds:
Map<canonicalPath, Set<entryId>>` reverse index, lazy TTL expiration
(default 30 min per architecture §11.2), `forceRecheck` escape hatch,
`Emitter<DanglingStateChangedEvent>` firing only on genuine state
transitions (same-state observations are silent). Injectable `now` /
`statProbe` / `ttlMs` / `fileEntryService` seams for deterministic
tests.
- **createDirectoryWatcher** chokidar v4 wrapper: `add` / `unlink` /
`change` / `ready` / `error` events; built-in OS-junk basename ignores
(`.DS_Store` / `.localized` / `Thumbs.db` / `desktop.ini`); idempotent
`close()`. Factory auto-wires `add` →
`danglingCache.onFsEvent(path,'present')` and `unlink` → `'missing'`.
(Architecture §8.2's richer `onAddDir`/`onUnlinkDir`/`onRename` events
deferred — no consumer needs them in scope.)
- **Reverse-index maintenance from mutation flows**: `ensureExternal`
calls `addEntry` + `onFsEvent('present','ops')` on insert (no-op on
reuse); `permanentDelete(external)` calls `removeEntry`;
`rename(external)` swaps `removeEntry(oldPath) + addEntry(newPath) +
onFsEvent(newPath,'present','ops')`.
- **FileManager surface**: `getDanglingState({id})` (internal →
'present', external → cache check, unknown id → 'unknown');
`batchGetDanglingStates({ids})` (parallel fan-out, unknown ids mapped to
'unknown'); `subscribeDangling({id}, listener)` (in-process per-entry
filter; renderer fan-out via `file-manager-event` IPC channel deferred
to Phase 2).
- **FileManager.onInit**: awaits `danglingCache.initFromDb()` (populates
reverse index from non-trashed external entries; no startup stat probe
per architecture §10.6); registers `File_GetDanglingState` /
`File_BatchGetDanglingStates` IPC handlers via `this.ipcHandle`
(auto-disposed on stop).
- New `IpcChannel` constants: `File_GetDanglingState`,
`File_BatchGetDanglingStates`.
- ~30 new TDD tests across DanglingCache (18 unit) + watcher (6 real-FS)
+ FileManager integration (INT-7..INT-10).

#### Phase 1b.4 deliverables (orphan sweep + FileRefCheckerRegistry)

- **FileRefCheckerRegistry**: `Record<FileRefSourceType,
SourceTypeChecker<...>>` typed registry forces exhaustive coverage at
compile time — adding a new variant to `FileRefSourceType` without a
checker triggers a TS build error. Phase 1 ships `FileRefSourceType =
'temp_session' | 'knowledge_item'`: real DB-backed checker for
`knowledge_item` (Drizzle `inArray` against `knowledge_item`);
`temp_session` checker treats every sourceId as gone (sessions are
in-memory only). `chat_message` / `painting` / `note` are **deliberately
not in the union yet** — each will be added in lockstep (tuple entry in
`allSourceTypes` + `createRefSchema` variant + `SourceTypeChecker`) by
the PR that migrates the owning domain's DB tables to v2. Stray writes
during the migration window fail fast at `FileRefSchema.parse` rather
than being silently persisted under a no-op stub.
- **OrphanRefScanner** (RFC §6.4): `scanOneType(sourceType)` enumerates
distinct `file_ref.sourceId` per type, asks the checker which are alive,
deletes the rest via `cleanupBySourceBatch`. `scanAll()` aggregates
across every registered sourceType. Backed by new
`FileRefService.listDistinctSourceIds` to keep all SQL inside the repo.
- **Report-only orphan-entry pass** (architecture §7.1 default policy is
"preserve"): `scanOrphanEntries` groups active entries with zero
`file_ref` rows by origin. **No deletion** — surfaced via
`getOrphanReport()` for the cleanup-UI consumer. Backed by new
`FileEntryService.findUnreferenced` LEFT JOIN-based query.
- **Startup file sweep** (architecture §10): `runStartupFileSweep`
snapshots `file_entry.id` (active + trashed) into a `Set` via new
`FileEntryService.listAllIds`, walks `{userData}/files/`, plans unlink
for (a) UUID-named files whose id is not in the snapshot and (b)
`*.tmp-<UUID>` atomic-write residue. Applies the `mtime > 5min`
freshness gate (§10.3) — files newer than that are presumed in-flight
and preserved. Plan-then-execute with the `50% / 20-count-floor /
10MB-floor` safety threshold (§10.4); aborts emit
`abortReason='count-fraction'|'byte-fraction'`. Single structured
`orphan-file-sweep` log per run (info / warn / error per outcome,
§10.5).
- **DB-sweep umbrella + observability** (`runDbSweep`): runs scanAll +
scanOrphanEntries, emits one `orphan-sweep` structured record
summarising both passes; failure path returns `outcome='failed'` +
`errorMessage` so callers don't throw on background fire-and-forget.
- **FileManager integration**: `onInit` schedules a fire-and-forget
`runStartupSweeps` that runs the FS-level + DB-level sweeps in parallel;
failures of either are logged but never block ready. `getOrphanReport()`
exposes the most recent `DbSweepReport` (orphan-ref counts already
cleaned + orphan-entry counts preserved) + `lastRunAt` for the cleanup
UI surface.
- ~30 new TDD tests across registry (14 unit) + orphan sweep (16 unit +
integration) + FileManager integration (INT-11/INT-12) + repo
(`findUnreferenced`, `listAllIds`, `listDistinctSourceIds`).

**Out of scope (deferred to Phase 2)**:
- Architecture §7.2 dangling-external auto-cleanup (external + missing +
0-ref + >30d retention) — narrow extension shipping with the cleanup UI.
- Adding `chat_message` / `painting` / `note` as `FileRefSourceType`
variants (tuple entry + schema + checker added together) — gated on each
domain's v2 batch migration.
- Cleanup-UI surface that consumes `getOrphanReport()` — Phase 2
renderer work.

Renderer-side File IPC bridge for write/dangling methods stays deferred
to Phase 2 alongside the consumer-batch migrations. The Phase 1b runtime
is consumable from main-side business services through
`application.get('FileManager')`.

### Why we need it and why it was done in this way

Contract-first concentrates design review in one place; Phase 1b.x then
becomes pure "honor the contracts". Each 1b.x phase keeps strict TDD
(RED → GREEN → REFACTOR per behavior, ~one commit per cycle); each phase
ends with a verification gate (push → CI green) before the next phase
begins.

Core decisions (origin two-state, `FileEntry`/`FileInfo` split, DataApi
SQL-only, external `permanentDelete` DB-only, TTL `DanglingCache`, OCC
trust boundary, atomic write fsync default, etc.) and their rationale
are recorded in `file-manager-architecture.md §12 Key Design Decisions`
— not duplicated here.

### Breaking changes

None — purely additive (read+write paths are new, no existing callers
replaced yet).

### Special notes for your reviewer

- Review focus: contracts (1a) + read-path runtime (1b.1) + write-path
runtime (1b.2) + watcher/DanglingCache (1b.3) + orphan sweep / registry
(1b.4). Phase 1b is now complete on this branch.
- **Phase 1a contract stability policy** (architecture.md top) is
binding — any 1b.x PR that finds a contract mismatch PRs the doc
revision first.
- Deferred (Phase 2): renderer-side File IPC bridge for
write/dangling/orphan methods (alongside consumer migration); cleanup-UI
surface consuming `getOrphanReport()`; architecture §7.2
dangling-external auto-cleanup (>30d retention); adding `chat_message` /
`painting` / `note` as `FileRefSourceType` variants (each adds tuple
entry + schema + checker in lockstep, gated on the owning domain's v2
batch migration); DanglingCache periodic snapshot logger (architecture
§11.8); `listDirectory` ripgrep wrapper; `compressImage`
(KnowledgeService consumer); FileUploadService + `file_upload` table
(Vercel AI SDK Files API).

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [ ] Upgrade: N/A — purely additive
- [ ] Documentation: Internal architecture docs included
(`docs/references/file/`); no user-facing docs change
- [x] Self-review: I have reviewed my own code before requesting review
from others

### Release note

```release-note
NONE
```

---------

Signed-off-by: icarus <eurfelux@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-05-13 16:06:10 +08:00

9.4 KiB
Raw Permalink Blame History

FileManager Handler Mapping

⚠️ OUTDATED / SUPERSEDED2026-04-21

本文档的"新 IPC"列捕获的是早期设计(createEntry({origin}) 未拆分),已被后续评审推翻。关键调整:

  • createEntry({origin:'internal',...})createInternalEntry(...)
  • createEntry({origin:'external',...})ensureExternalEntry(...)(纯 upsert by path无 restore 分支)
  • External entry 不进入 trash 生命周期;permanentDelete 对 external 只动 DB 行

实现准绳docs/references/file/file-manager-architecture.mdrfc-file-manager.mdfile-arch-problems-response.md

本文档保留用于 v1→v2 handler 迁移阶段的对照参考;不要据此固化新 IPC 命名。


v1 IPC → v2 IPC 的映射关系,以及 FileManager handler 层的分派逻辑。

相关文档:

架构

FileManager 统管所有 IPC handler 注册handler 内部按 target 类型分派:

Renderer
  → FileManager.registerIpcHandlers() (统一入口)
    ├── target: FileEntryId → FileManager 方法 (entry 协调: resolve → DB + FS)
    │     ├── ops.ts
    │     ├── FileTreeService (DB)
    │     └── FileRefService (DB)
    └── target: FilePath    → ops.ts (直接 FS/路径操作)

FileManager 的 public 方法只认 FileEntryId纯 path 操作在 handler 层直接委托 ops.ts不污染 public API。

Main process 其他 service 可根据实际需求直接调 ops.ts 或 FileManager不经过 IPC。

方法分类

一、纯 Entry 操作 → FileManager

这些方法的参数永远是 FileEntryId,必须经过 FileManager 协调 DB + FS。

v2 IPC 方法 替代的 v1 IPC 备注
createEntry(params) File_Upload, File_SaveBase64Image, File_SavePastedImage, File_Download, File_Mkdir 按 content 类型分派不同 ops 函数
batchCreateEntries(params) File_BatchUploadMarkdown 事务包裹
trash({ id }) File_Delete, File_DeleteDir
restore({ id }) (v1 无) 需 ensureAncestors
permanentDelete({ id }) File_DeleteExternalFile, File_DeleteExternalDir FS 删除 + DB 级联
batchTrash/Restore/PermanentDelete (v1 无) 事务包裹
move(params) File_Move, File_MoveDir, File_Rename, File_RenameDir FS move + DB update
batchMove(params) (v1 无) 事务包裹
copy(params) File_Copy 树内复制创建新条目;导出到外部不创建

二、纯 Path 操作 → ops.ts

这些方法的参数永远是 FilePath,直接委托 ops.ts不涉及 entry 系统。

v2 IPC 方法 替代的 v1 IPC 备注
select(options) File_Select, File_SelectFolder Electron dialog
save(options) File_Save, File_SaveImage Electron dialog + ops.write
listDirectory(dirPath, options?) File_ListDirectory
validateNotesPath(dirPath) File_ValidateNotesDirectory
canWrite(dirPath) App_HasWritePermission ops.canWrite
resolvePath(filePath) App_ResolvePath ops.resolvePath (path.resolve + untildify)
isPathInside(child, parent) App_IsPathInside ops.isPathInside
isNotEmptyDir(dirPath) App_IsNotEmptyDir ops.isNotEmptyDir

三、双态方法 → handler 按 target 类型分派

这些方法接受 FileEntryId | FilePathhandler 判断 target 类型后分派:

v2 IPC 方法 替代的 v1 IPC FileEntryId → FilePath →
read(target, options?) File_Read, File_ReadExternal, Fs_Read, Fs_ReadText, File_Base64Image, File_BinaryImage, File_Base64File FileManager.read ops.read
getMetadata(target) File_Get, File_GetPdfInfo, File_IsTextFile, File_IsDirectory FileManager.getMetadata ops.stat + getFileType
write(target, data) File_Write, File_WriteWithId FileManager.write ops.write
open(target) File_OpenPath, File_OpenWithRelativePath, Open_Path FileManager.open ops.open
showInFolder(target) File_ShowInFolder FileManager.showInFolder ops.showInFolder

已移除的 v1 IPC

v1 IPC 原因
File_Open renderer 自行组合 select + read
File_Clear 危险操作
File_CreateTempFile createEntry({ parentId: 'mount_temp', ... })
File_CheckFileName sanitize → shared 纯函数,冲突 → service 内部
File_GetDirectoryStructure → DataApi GET /files/entries/:id/children
File_StartWatcher/Stop/Pause/Resume FileManager 内部管理,不暴露 IPC

不属于 File Module 的 IPC

v1 IPC v2 归属 说明
getPathForFile preload utils 同步方法,不经过 IPC
Open_Website App 层 shell.openExternal(url)
Pdf_ExtractText 保持独立 纯内容处理(传 buffer不依赖文件系统
App_Copy 数据迁移模块 userData 递归复制,专用场景
FileService_* Provider 模块 AI Provider 远程文件 API
Gemini_*File Provider 模块 Gemini 专用
Export_Word Export 模块
Zip_Compress/Decompress Backup 模块
Webview_PrintToPDF/SaveAsHTML Webview 模块
Skill_ReadFile/ListFiles Skill 模块

汇总

v1 v2
文件相关 IPC 总数 52 22
其中纯 Entry 操作 9 → FileManager
其中纯 Path 操作 8 → ops.ts
其中双态handler 分派) 5 → FileManager or ops.ts
已移除 10
归属其他模块 10
新增v1 无) 7 (trash/restore/batch + 路径工具)