62 Commits

Author SHA1 Message Date
槑囿脑袋
2fbc7bda1c feat(knowledge): optional embedding model with BM25-only fallback (#16553)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-07-02 20:19:21 +08:00
fullex
9b9570116a refactor(db): replace libsql with better-sqlite3 + sqlite-vec (#16626) 2026-07-02 13:21:13 +08:00
Phantom
4ef2889fd3 refactor(file-ref): split persistent ref ownership (#16532)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eurfelux <eurfelux@gmail.com>
2026-06-30 18:44:44 +08:00
jd
99337fe585 fix(agent-session): align auto naming with topics (#16497)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: jd <59188306+zhangjiadi225@users.noreply.github.com>
2026-06-29 15:54:37 +08:00
槑囿脑袋
f938422c7d feat(knowledge): add custom chunk strategy and separator (#16298)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-24 13:54:10 +08:00
Zihao
0ae8ea4758 fix(mcp): cascade-remove agent MCP references on MCP server deletion (#16040)
### What this PR does

Before this PR:

Deleting a global MCP server via the UI removed the row from
`mcp_server` and (via FK cascade) the `assistant_mcp_server` junction
table, but each agent stored its MCP servers as an `agent.mcps` JSON
`string[]` column with no foreign key. The deleted MCP's ID lingered in
that JSON array, so agents kept trying to connect to a non-existent MCP
during startup/heartbeat — generating thousands of "Failed to connect to
MCP" error logs daily.

After this PR:

The agent↔MCP relation is **normalized**. The `agent.mcps` JSON column
is dropped and replaced by an `agent_mcp_server` junction table with `ON
DELETE CASCADE` on both sides (mirroring the existing
`assistant_mcp_server` table). Deleting an MCP server — or an agent —
now cascades structurally instead of leaving dangling JSON references.

`McpServerService.delete()` removes the server inside a single
`withWriteTx` transaction; `AgentService.removeMcpFromAllAgentsTx()`
first captures the affected agent IDs and explicitly deletes their
junction rows (the FK cascade is a safety net), so that after the
transaction commits `AgentService.emitAgentUpdatedForIds()` fires
`onAgentUpdated` for each affected agent and lets warm sessions refresh
their tool policy live.

The v1→v2 migration backfills the new table: `migrateAgentMcps()` reads
the legacy `agents.mcps` JSON arrays, remaps the old MCP IDs to their
new UUIDs (via `McpServerMigrator`'s `mcpServerIdMapping`), drops
dangling references, and writes `agent_mcp_server` rows.
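
The remap-and-drop step of the backfill can be sketched roughly as below. This is an illustration only, not the PR's actual `migrateAgentMcps()`: the `LegacyAgent` and `AgentMcpServerRow` shapes and the `backfillAgentMcps` helper name are assumptions made for the example, and the real migrator reads from the attached legacy database rather than from an in-memory array.

```typescript
// Hypothetical shapes for illustration; the real schema may differ.
interface LegacyAgent {
  id: string;
  mcps: string | null; // legacy JSON-encoded string[] of old MCP IDs
}

interface AgentMcpServerRow {
  agentId: string;
  mcpServerId: string;
}

// Remap legacy MCP IDs to their new UUIDs and drop references with no mapping.
function backfillAgentMcps(
  agents: LegacyAgent[],
  mcpServerIdMapping: Map<string, string>
): AgentMcpServerRow[] {
  const rows: AgentMcpServerRow[] = [];
  for (const agent of agents) {
    let legacyIds: unknown = [];
    try {
      legacyIds = agent.mcps ? JSON.parse(agent.mcps) : [];
    } catch {
      legacyIds = []; // malformed JSON: treat as "no associations" (assumption)
    }
    if (!Array.isArray(legacyIds)) continue;
    const seen = new Set<string>();
    for (const legacyId of legacyIds) {
      if (typeof legacyId !== 'string') continue;
      const newId = mcpServerIdMapping.get(legacyId);
      // Skip dangling references and duplicates within one agent.
      if (newId === undefined || seen.has(newId)) continue;
      seen.add(newId);
      rows.push({ agentId: agent.id, mcpServerId: newId });
    }
  }
  return rows;
}
```

Under these assumptions, an agent whose legacy array mentions a deleted MCP ID simply produces no junction row for that ID, which is the "dangling-drop" behavior described above.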

Fixes #15986

### Why we need it and why it was done in this way

`agent.mcps` stored MCP server IDs as an unconstrained JSON `string[]`,
so SQLite could not cascade-delete them and the data model allowed
agents to reference MCP servers that no longer existed. Normalizing the
relation into a junction table with FK `ON DELETE CASCADE` is the
root-cause fix: referential integrity is now enforced by the database,
deletes cascade structurally, and no application-side JSON rewriting is
needed on every MCP delete.

All agent↔MCP writes (create / update / delete) go through
`AgentService` inside `withWriteTx` and roll back atomically;
`onAgentUpdated` events fire only after commit so sessions never refresh
against an uncommitted state. Reads project the junction rows back into
the agent DTO's `mcps` field, so the public API shape is unchanged for
consumers.
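
The "capture IDs inside the transaction, emit after commit" ordering can be illustrated with an in-memory stand-in. Everything here is hypothetical scaffolding (`InMemoryStore`, its copy-and-swap `withWriteTx`, and `deleteMcpServer`); it only models the ordering described above, not the real `McpServerService` or `AgentService` code.

```typescript
// In-memory stand-in for the agent_mcp_server junction table (hypothetical).
type JunctionRow = { agentId: string; mcpServerId: string };

class InMemoryStore {
  junction: JunctionRow[] = [];
  servers = new Set<string>();

  // Stand-in for withWriteTx: mutate a copy and swap it in only on success.
  withWriteTx<T>(fn: (tx: InMemoryStore) => T): T {
    const tx = new InMemoryStore();
    tx.junction = this.junction.map((r) => ({ ...r }));
    tx.servers = new Set(this.servers);
    const result = fn(tx); // a throw here leaves `this` untouched (rollback)
    this.junction = tx.junction;
    this.servers = tx.servers;
    return result;
  }
}

function deleteMcpServer(
  store: InMemoryStore,
  mcpServerId: string,
  onAgentUpdated: (agentId: string) => void
): void {
  const affected = store.withWriteTx((tx) => {
    // 1. Capture affected agent IDs before their rows disappear.
    const ids = Array.from(
      new Set(tx.junction.filter((r) => r.mcpServerId === mcpServerId).map((r) => r.agentId))
    );
    // 2. Delete the junction rows explicitly, then the server itself.
    tx.junction = tx.junction.filter((r) => r.mcpServerId !== mcpServerId);
    tx.servers.delete(mcpServerId);
    return ids;
  });
  // 3. Emit only after commit; a listener failure must not fail the delete.
  for (const agentId of affected) {
    try {
      onAgentUpdated(agentId);
    } catch {
      // Swallowed in this sketch; real code would presumably log it.
    }
  }
}
```

Because the event loop runs after `withWriteTx` returns, a listener can never observe uncommitted state, and a throwing listener cannot turn a successful delete into a reported failure.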

The following alternatives were considered:

- **Keep the JSON column and filter it in JS / `json_remove()` on
delete**: leaves the unconstrained column (and the dangling-reference
class of bugs) in place, and requires rewriting every agent row on each
MCP delete. Rejected in favor of normalizing the relation, which fixes
the invariant at the schema level.

### Breaking changes

None for users. Internally, the `agent.mcps` column is dropped in favor
of the `agent_mcp_server` junction table; the v1→v2 migrator
(`migrateAgentMcps`) carries existing associations forward, remapping
legacy MCP IDs to their new UUIDs.

### Special notes for your reviewer

- `agent_mcp_server` mirrors `assistant_mcp_server` exactly (schema,
naming, FK cascade); the `assistant` side was already covered by FK
cascade and is unchanged.
- `AgentService.emitAgentUpdatedForIds()` uses a batch `inArray` query
(not N+1 `getAgent()` calls); events fire only after the write
transaction commits, and the emission is wrapped so a post-commit
refresh failure cannot misreport a successful delete.
- `removeMcpFromAllAgentsTx()` captures affected agent IDs before
deleting the junction rows so events can be emitted post-commit; the
MCP-server delete's FK cascade is a redundant safety net.
- v1→v2: `migrateAgentMcps()` runs while `agents_legacy` is attached and
before `remapAgentPrefixIds()`, following the same `mcpServerIdMapping`
remap + dangling-drop pattern as `AssistantMigrator`.

### Checklist

- [x] Branch: This PR targets the correct branch — `main` for active
development
- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: A user-guide update was considered and is present
(link) or not required. Check this only when the PR introduces or
changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code (e.g., via gh-pr-review,
gh pr diff, or GitHub UI) before requesting review from others

### Release note

```release-note
Bug fix: Deleting a global MCP server now cascade-removes its references from all agents, preventing stale "Failed to connect to MCP" errors.
```

---------

Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-06-18 23:16:45 +08:00
fullex
34de6a209b fix(data-db): key chat FTS5 on a stable fts_rowid column, not the implicit rowid
message_fts and agent_session_message_fts keyed their FTS5 external-content index on SQLite's implicit rowid. A drizzle table rebuild (INSERT...SELECT drops the rowid) or VACUUM reshuffles it, silently desyncing the index (wrong/missing hits, no error; the default integrity-check does not detect it).

Add a real fts_rowid integer-unique column to each table, set content_rowid to it, and assign it in the AFTER INSERT trigger (MAX+1, O(log N) via the unique index, race-free under withWriteTx). A real column is carried verbatim through rebuilds and untouched by VACUUM, so the index stays aligned by construction. Update both search joins to base.fts_rowid = fts.rowid, convert the agent-session triggers to DROP+CREATE, and add a regression test reproducing a rowid-reshuffling rebuild that asserts integrity-check,1 stays clean (with a NULL fts_rowid negative control). Note the same latent hazard on knowledge search_text_fts (deferred; no rebuild/VACUUM path today).
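
The desync can be modeled without SQLite. The sketch below is a deliberately simplified in-memory model (the `MessageRow` shape and the helper names are hypothetical): a "rebuild" renumbers the implicit rowid from 1, while a real column is copied verbatim.

```typescript
interface MessageRow {
  rowid: number; // implicit rowid: may be renumbered by a rebuild or VACUUM
  ftsRowid: number; // real column: carried verbatim through rebuilds
  id: string;
  text: string;
}

// Model of an external-content index: integer key -> indexed text.
type FtsIndex = Map<number, string>;

function buildIndex(rows: MessageRow[], key: 'rowid' | 'ftsRowid'): FtsIndex {
  const index: FtsIndex = new Map();
  for (const row of rows) index.set(row[key], row.text);
  return index;
}

// Model of a table rebuild: rows are re-inserted and rowids renumbered from 1.
function rebuildTable(rows: MessageRow[]): MessageRow[] {
  return rows.map((row, i) => ({ ...row, rowid: i + 1 }));
}

// Count rows whose index entry no longer matches their content.
function countMismatches(
  rows: MessageRow[],
  index: FtsIndex,
  key: 'rowid' | 'ftsRowid'
): number {
  return rows.filter((row) => index.get(row[key]) !== row.text).length;
}
```

In this model, a table with a gap in its rowids (say 1, 3, 4 after a delete) comes out of a rebuild with rowids 1, 2, 3, so an index keyed on `rowid` now points at the wrong rows, while an index keyed on `ftsRowid` still matches every row.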
2026-06-17 05:51:35 -07:00
SuYao
6c61f33c18 feat(data-api): per-topic virtual-root message tree (structural single-root invariant) (#15951)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: jdzhang <625013594@qq.com>
Co-authored-by: jd <59188306+zhangjiadi225@users.noreply.github.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>
Signed-off-by: jdzhang <625013594@qq.com>
2026-06-17 18:20:35 +08:00
SuYao
fdbc0c31e7 feat(chat-page): misc backend tail (home-relative paths, chat-context, seeder, schemas) (#16068)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-06-16 18:07:25 +08:00
槑囿脑袋
20035a83ff feat(knowledge): engine-portable per-base index store and retrieval cutover (#15973)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-15 19:43:10 +08:00
SuYao
eae9716ffa feat(agent-data): agent resource API + agent workspace DataApi workflow (#15941)
Co-authored-by: jd <59188306+zhangjiadi225@users.noreply.github.com>
Co-authored-by: jdzhang <625013594@qq.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>
Signed-off-by: jdzhang <625013594@qq.com>
2026-06-11 23:52:31 +08:00
SuYao
a181517a77 feat(ai-trace): trace capture + container trace DataApi + chat trace pane (#15942)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>
2026-06-11 18:10:50 +08:00
Gu JiaMing
8191c13c8e feat(provider-settings): auto-enable providers when models are available (#15686)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Asurada <43401755+ousugo@users.noreply.github.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: gujiaming <52187003+AtomsH4@users.noreply.github.com>
2026-06-10 12:11:11 +08:00
jd
29286cad38 refactor(agent-session): make workspace binding explicit (#15736)
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: jdzhang <625013594@qq.com>
Signed-off-by: zhangjiadi225 <625013594@qq.com>
2026-06-09 17:50:31 +08:00
槑囿脑袋
e818c374e6 refactor(knowledge): remove sitemap source from v2 (#15682)
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-06 16:26:15 +08:00
fullex
6ca0615e1d perf(message): index status to back boot pending-reconcile lookup
The boot reconcile of crash-orphaned pending assistant turns (findPendingAssistantMessageIds) full-scanned the message and agent_session_message tables. Add a plain status index on each so the lookup is a SEARCH, not a SCAN.

Plain, not partial: Drizzle binds status = ?, which SQLite cannot match against a partial (literal-predicate) index. Also select only id, since the reconcile just flips matched rows to error.
2026-06-05 22:52:33 -07:00
SuYao
5706307451 refactor(ai-service): consolidate AI runtime to main process (#14911)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>
2026-06-05 00:06:51 +08:00
Yiran
61c013bd5b feat(knowledge-base): redesign knowledge workspace (#15518)
Co-authored-by: eeee0717 <chentao020717Work@outlook.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: 槑囿脑袋 <70054568+eeee0717@users.noreply.github.com>
Signed-off-by: akazaakari950718-dev <akazaakari950718@gmail.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-02 16:03:37 +08:00
亢奋猫
26508591f8 refactor(paintings): migrate to v2 data layer and UI (#15154)
Co-authored-by: jidan745le <420511176@qq.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: jidan745le <420511176@qq.com>
Signed-off-by: suyao <sy20010504@gmail.com>
2026-06-02 15:18:53 +08:00
Phantom
05111349f3 fix(db-migrations): resolve snapshot chain collision (#15438) and gate against recurrence (#15440)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
2026-05-30 15:30:49 +08:00
槑囿脑袋
adefbb7efb refactor(knowledge): knowledge v2 rfc and implement round 1 (#15309)
Co-authored-by: Phantom <eurfelux@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
Signed-off-by: icarus <eurfelux@gmail.com>
2026-05-27 11:54:25 +08:00
亢奋猫
e145298d7c refactor(notes): migrate notes UI and data to v2 (#15186)
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: kangfenmao <kangfenmao@qq.com>
Signed-off-by: gujiaming <52187003+AtomsH4@users.noreply.github.com>
Signed-off-by: akazaakari950718-dev <akazaakari950718@gmail.com>
2026-05-25 19:52:52 +08:00
Phantom
6ec914cf0f refactor(file-entry): rename trashedAt to deletedAt (#15246)
### What this PR does

Before this PR:

- `file_entry` table used `trashed_at` for the soft-delete timestamp,
diverging from every other soft-deletable table in the schema (`agent`,
`assistant`, `message`, `topic`), which all use `deleted_at`.

After this PR:

- `file_entry.deleted_at` (and BO field `deletedAt`) — naming is
consistent across the entire schema.
- Renamed identifiers:
  - Schema field: `trashedAt` → `deletedAt`
  - SQL column: `trashed_at` → `deleted_at`
  - Index: `fe_trashed_at_idx` → `fe_deleted_at_idx`
  - CHECK constraint: `fe_external_no_trash` → `fe_external_no_delete`
- Updated all source files, tests, and architecture docs (including
`v2-refactor-temp/docs/file-manager/`).
- **Intentionally NOT renamed** (out of scope — these are API surface /
concept names, not the column name): `moveToTrash`, `restoreFromTrash`,
`inTrash` (query flag), `isTrashed`, `batchTrash`, `internalTrash`, and
"Trash" as a concept in comments/docs.

Fixes #

### Why we need it and why it was done in this way

The following tradeoffs were made:

- **Scope discipline**: kept the rename strictly at the
column-identifier layer (4 identifiers). Did not change API names or
concept words — switching the "Trash" concept to "Delete" is a larger
semantic change that deserves its own PR.
- **Migration 0026 contains a manual SQL patch.**
drizzle-orm/drizzle-kit issue
[#3653](https://github.com/drizzle-team/drizzle-orm/issues/3653) causes
the SQLite rebuild-table path to drop the leading `ALTER TABLE … RENAME
COLUMN` statement. The generated `INSERT … SELECT "deleted_at" FROM
file_entry` would fail because the source table still has `trashed_at`.
The migration manually prepends an explicit `ALTER TABLE file_entry
RENAME COLUMN trashed_at TO deleted_at;` before the rebuild. Upstream
fix landed in `drizzle-kit@1.0.0-beta`/`rc` but is not backported to the
`0.31.x` stable line we depend on.
- **Why keeping the manual patch is acceptable**: per `CLAUDE.md` § v2
Refactoring, `migrations/sqlite-drizzle/` is throwaway during v2 — it
will be wiped and regenerated as a single clean initial migration from
the final schemas before release. Mid-development DB drift is explicitly
acceptable, and the manual SQL only needs to survive until that
regeneration.

The following alternatives were considered:

- Selecting `create column` in `drizzle-kit generate` instead of `rename
column`: also produces invalid SQL (same root cause — the rebuild path
puts the new column name in the `SELECT` list regardless of the rename
mapping). Rejected.
- Skipping the `0026` migration entirely and relying on `db:push` / DB
reset during dev: pollutes `_journal.json` divergence and makes the next
schema change confusing. Rejected.
- Upgrading to `drizzle-kit@1.0.0-beta`/`rc` to get the fix: v1 is a
major rewrite with significant breaking changes (alternation engine
rewrite, ORM type system rewrite, migration folder layout change). Out
of scope for this PR. Rejected.

Links to places where the discussion took place: N/A

### Breaking changes

None. Dev-only DB column rename during v2 refactor. No user-visible
behavior change. No public API surface change. v1 data never reaches
this branch except through migrators in `src/main/data/migration/v2/`.

### Special notes for your reviewer

- The single manual edit to drizzle-generated SQL is in
`migrations/sqlite-drizzle/0026_sturdy_aqueduct.sql` — look for the
`MANUAL PATCH` comment block at the top. Without it the migration will
fail to apply.
- "Trash" concept words still appear throughout the file-manager
codebase by design (function names, comments, docs section headings). If
we later want to migrate the whole concept to "Delete", that should be a
follow-up PR.

### Checklist

This checklist is not enforcing, but it's a reminder of items that could
be relevant to every PR.
Approvers are expected to review this list.

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: [Write code that humans can
understand](https://en.wikiquote.org/wiki/Martin_Fowler#code-for-humans)
and [Keep it simple](https://en.wikipedia.org/wiki/KISS_principle)
- [x] Refactor: You have [left the code cleaner than you found it (Boy
Scout
Rule)](https://learning.oreilly.com/library/view/97-things-every/9780596809515/ch08.html)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: A [user-guide update](https://docs.cherry-ai.com)
was considered and is present (link) or not required. Check this only
when the PR introduces or changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code (e.g., via
[`/gh-pr-review`](/.claude/skills/gh-pr-review/SKILL.md), `gh pr diff`,
or GitHub UI) before requesting review from others

### Release note

```release-note
NONE
```

---------

Signed-off-by: icarus <eurfelux@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:04:36 +08:00
fullex
5b929d2f87 feat(job): rework schedule lifecycle and startup recovery
Four backbone changes that close known gaps in the schedule subsystem before the next round of business handler work:

* DB-enforced singleton: drop the app-layer Mutex in JobScheduleService and let UNIQUE(type, name) enforce "one singleton per type" via a '' sentinel. rowToSnapshot maps '' back to null so the external snapshot contract stays `string | null`.
* updateJobSchedule public API: writes the DB row and re-arms the in-process cron entry when trigger or enabled changes. Field-presence check avoids JSON.stringify brittleness; the one-turn race is an accepted last-writer-wins limitation.
* Startup recovery gating: move runStartupRecovery + armSchedule from onReady to a new onAllReady behind a 60s delay so business services have a window to register handlers before recovery runs. onStop flips a stop flag for clean teardown; per-step try/catch so one failure does not zero the session.
* Schedule control API tests: cover pause/resume/triggerNow/unregister × by-id/by-name plus updateJobSchedule branch matrix; JobScheduleService gets its own unit suite for singleton/sentinel behavior.

Side effects: add JOB_SCHEDULE_NAME_INVALID code; fix pre-existing arg order on DataApiErrorFactory.conflict at unique handlers (message, then resource); tighten getByTypeAndName signature to (type, name: string); listNamesForType filters the singleton sentinel and returns string[]; sync existing integration/smoke fixtures to await _doAllReady() with fake timers; add "Schedule identity: (type, name) model" section in handler-authoring.md.
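
A minimal sketch of the sentinel mapping follows. `rowToSnapshot` and `listNamesForType` are named in the message above, but their bodies here are guesses; `nameToColumn`, the row and snapshot shapes, and the choice to reject a caller-supplied empty name with `JOB_SCHEDULE_NAME_INVALID` are assumptions. A sentinel is presumably needed because SQLite treats NULLs as distinct in a UNIQUE constraint, so a NULL name could not enforce one singleton per type.

```typescript
// Sentinel stored in the name column for the singleton schedule of a type.
const SINGLETON_NAME = '';

interface JobScheduleRow {
  type: string;
  name: string; // '' means "singleton"; never NULL, so UNIQUE(type, name) applies
}

interface JobScheduleSnapshot {
  type: string;
  name: string | null; // external contract stays `string | null`
}

// Hypothetical write-side mapping: null becomes the sentinel.
function nameToColumn(name: string | null): string {
  // Assumption: callers may not use the sentinel value themselves.
  if (name === SINGLETON_NAME) throw new Error('JOB_SCHEDULE_NAME_INVALID');
  return name === null ? SINGLETON_NAME : name;
}

// Read-side mapping: the sentinel is translated back to null.
function rowToSnapshot(row: JobScheduleRow): JobScheduleSnapshot {
  return { type: row.type, name: row.name === SINGLETON_NAME ? null : row.name };
}

// Named schedules only: the singleton sentinel is filtered out.
function listNamesForType(rows: JobScheduleRow[], type: string): string[] {
  return rows
    .filter((row) => row.type === type && row.name !== SINGLETON_NAME)
    .map((row) => row.name);
}
```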
2026-05-19 22:49:24 -07:00
fullex
344c58eb8e feat(job): introduce JobManager + SchedulerService backbone (Phase 1)
cherry-studio v2 had 6+ ad-hoc queue/scheduler implementations (Knowledge,
FileProcessing, agent SchedulerService, TopicQueue, NotificationQueue,
protocol heartbeats) with no shared registry, inconsistent cancel and
progress semantics, and no cross-restart recovery outside agent_task.
This commit lands the unified replacement: JobManager owns Job lifecycle
and SchedulerService owns time-only scheduling, both reusable independently.

Phase 1 ships the backbone only: jobTable + jobScheduleTable, entity
services, 6-state machine with per-handler recovery (abandon/retry/
singleton), catch-up policy (skip-missed/after-startup), retry backoff,
GC, idempotencyKey dedup, useJob renderer hook, and 4 reference docs.
Four-layer lock model addresses libsql client-ts issue #288 (Layer 0
global dispatch mutex serializes all transactions).

croner@^10 is introduced for cron expressions (zero-dep, Electron-friendly).
Application.SHUTDOWN_TIMEOUT_MS is promoted to public for JobManager reuse.
cron-parser stays until Phase 2 agent task migration.

No business migrations in this commit — Knowledge/FileProcessing/agent
SchedulerService remain untouched and migrate in Phase 2-4 per
docs/references/job-and-scheduler/migration-checklist.md.

Follow-ups for Phase 1 completion: DataApi GET /jobs handler, dummy.echo
smoke test, integration tests, migration feasibility report.
2026-05-19 03:57:18 -07:00
jidan745le
e59c7aa7f7 refactor(provider-settings): migrate to v2 data layer and UI (#14631)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: jidan745le <420511176@qq.com>
Signed-off-by: suyao <sy20010504@gmail.com>
2026-05-14 17:01:33 +08:00
Phantom
d2c568e349 feat(file): Add schema and foundation for new file module (#13451)
### What this PR does

Adds the **Phase 1a contract surface** for the file module — types, DB
schema, DataApi + File IPC contracts, FileManager skeleton, and
architecture docs.

**Phase 1b.1 (Read Path & Repository), 1b.2 (Write Path & Lifecycle),
1b.3 (Watcher & DanglingCache), and 1b.4 (OrphanSweep &
FileRefCheckerRegistry) are now all landed on top of 1a in this same
PR.** This is the complete Phase 1b runtime — reviewers see the full
read + write + watcher + orphan-sweep picture in one place.

Design, contracts, and decision rationale live in the architecture docs:

-
[`docs/references/file/architecture.md`](docs/references/file/architecture.md)
— module boundaries, type system, IPC/DataApi contracts, layered
architecture, service lifecycle, mutation propagation
-
[`docs/references/file/file-manager-architecture.md`](docs/references/file/file-manager-architecture.md)
— FileManager internals (storage, version detection, atomic writes,
reference cleanup, DirectoryWatcher, orphan sweep, DanglingCache state
machine, key design decisions)

#### Phase 1a deliverables

- Types (`FileEntry` / `FileInfo` / `FileHandle` /
`CanonicalExternalPath` brand)
- DB schema (`file_entry` + `file_ref`) with per-origin CHECK
constraints
- DataApi schemas + stub handlers
- File IPC contract (polymorphic `FileHandle` dispatch;
`batchGetMetadata` included)
- FileManager skeleton + `internal/*` + `ops/*` + `watcher/` +
`DanglingCache` + `versionCache`
- Mutation propagation design (three typed events + prefix-based
queryKey invalidation)

#### Phase 1b.1 deliverables (read-path runtime)

- Shared utilities: `getFileTypeByExt`, `sanitizeFilename`,
`validateFileName` extracted to `@shared/file/types`
- Path utilities: `canonicalizeExternalPath` (NFC + null-byte guard +
trailing-sep strip), `isPathInside`, `isUnderInternalStorage`,
`canWrite`
- FS read primitives: `stat`, `exists`, `read` (text/base64/binary
overloads), `hash` (initial MD5; swapped to xxhash-h64 in 1b.2)
- Metadata utilities: `getFileType(path)`, `isTextFile`, `mimeToExt`
- Repositories: `FileEntryService` + `FileRefService` read methods
(Drizzle-backed, Zod-branded outputs)
- Pure-function modules: `internal/content/{read,hash}`,
`internal/dispatch.ts` (FileHandle dispatcher)
- `toFileInfo(entry)` projection
- `FileManager` class as `BaseService` (`@Injectable('FileManager')`
`@ServicePhase(WhenReady)`); read methods only
- `DanglingCache` + `VersionCache` minimal viable singletons (full impls
in 1b.3 / 1b.2)
- DataApi `/files/*` read handlers fully implemented (entries / single /
ref-counts / refs-by-source)
- 60+ TDD tests (unit + boundary + setupTestDatabase integration)
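
For orientation, the three steps listed for `canonicalizeExternalPath` (NFC normalization, null-byte guard, trailing-separator strip) could look roughly like this. It is a sketch under assumptions, not the module's implementation: the function name carries a `Sketch` suffix, both `/` and `\` are treated as separators, and root handling is a guess.

```typescript
function canonicalizeExternalPathSketch(input: string): string {
  // Null-byte guard: reject paths that could be truncated by native APIs.
  if (input.indexOf('\0') !== -1) {
    throw new Error('path contains a null byte');
  }
  // NFC: composed and decomposed spellings of a name compare equal afterwards.
  const nfc = input.normalize('NFC');
  // Strip trailing separators (assumption: both "/" and "\" count).
  const stripped = nfc.replace(/[\\/]+$/, '');
  // Keep a bare root intact, e.g. "/" or "C:\".
  if (stripped === '') return nfc.slice(0, 1);
  if (/^[A-Za-z]:$/.test(stripped)) return nfc.slice(0, 3);
  return stripped;
}
```

With this sketch, `'/tmp/cafe\u0301/'` (decomposed accent, trailing slash) and `'/tmp/caf\u00e9'` canonicalize to the same string, which is what makes the result usable as a lookup key.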

#### Phase 1b.2 deliverables (write-path runtime)

- **FS atomic primitives** (open to non-file-module consumers per
architecture §5.3): `atomicWriteFile` (tmp + fsync + rename +
fsync(dir)), `atomicWriteIfUnchanged` (re-stat OCC + content-hash
fallback for second-precision mtime), `createAtomicWriteStream`
(Writable wrapper, abort/destroy unlinks tmp)
- **FS general primitives**: `write` (delegates to atomicWriteFile),
`copy` (atomic dest), `move` (rename + EXDEV → copy+unlink fallback),
`remove` (idempotent ENOENT), `mkdir` / `ensureDir` / `removeDir`,
`download` (fetch → atomic stream)
- **Hash swap**: `hash()` migrated MD5 → `xxhash-wasm` h64 streaming;
legacy `md5` dep retained for KnowledgeService loaders
- **VersionCache LRU**: capacity-bounded (default 2000) with
re-insert-on-touch recency
- **Repository mutations**: `FileEntryService.create/update/delete`
(auto UUIDv7 default id, raw DB CHECK errors propagate);
`FileRefService.create/createMany/cleanupBySource/cleanupBySourceBatch`
(`onConflictDoNothing` for batch upsert)
- **internal/entry/**: `create.createInternal` (4 source variants: bytes
/ base64 / path / url) + `create.ensureExternal` (canonicalize + stat +
idempotent upsert + duplicate-suspect peer warn);
`lifecycle.trash/restore/permanentDelete` + batch variants (DB+FS
decoupled, internal best-effort unlink); `rename` (internal DB-only,
external fs.move + canonical externalPath); `copy` (pipes through
createInternal with rollback)
- **internal/content/write.ts**: `write` / `writeIfUnchanged`
(cache-not-trusted re-stat OCC, `StaleVersionError` rewrap from
`PathStaleVersionError`) / `createWriteStream` / `*ByPath` variants
- **internal/system/**: `shell.open` / `shell.showInFolder` (electron
`shell` wrappers); `tempCopy.withTempCopy` (isolated tmp dir; cleanup on
throw)
- **FileManager facade**: every IFileManager mutation method now
delegates to its `internal/*` counterpart (`createInternalEntry` /
`ensureExternalEntry` / `batchCreate*` / `batchEnsure*` / `write` /
`writeIfUnchanged` / `createWriteStream` / `createReadStream` / `trash`
/ `restore` / `permanentDelete` + batch / `rename` / `copy` /
`withTempCopy` / `open` / `showInFolder`); no method throws
notImplemented anymore
- ~60 new TDD tests (each behavior unit = one RED→GREEN→REFACTOR
commit); end-to-end integration scenarios via `setupTestDatabase` cover
atomic-rollback zero-residue, OCC second-precision-mtime
no-false-positive, trash-external CHECK enforcement, full
create→write→read→trash→restore→permanentDelete round-trip, and external
permanentDelete-leaves-user-file-untouched
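
The "capacity-bounded with re-insert-on-touch recency" behavior of the VersionCache can be sketched with a plain `Map`, whose iteration order is insertion order. The class name and method set below are illustrative assumptions; only the default capacity of 2000 comes from the description above.

```typescript
class LruCacheSketch<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number = 2000) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert on touch: moves the key to the "most recent" end of the Map.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key); // overwriting also refreshes recency
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b` rather than `a`, because the read refreshed `a`.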

#### Phase 1b.3 deliverables (watcher + DanglingCache observability)

- **DanglingCache class** (replaces 1a const-literal skeleton):
`byEntryId: Map<entryId, CachedState>` + `pathToEntryIds:
Map<canonicalPath, Set<entryId>>` reverse index, lazy TTL expiration
(default 30 min per architecture §11.2), `forceRecheck` escape hatch,
`Emitter<DanglingStateChangedEvent>` firing only on genuine state
transitions (same-state observations are silent). Injectable `now` /
`statProbe` / `ttlMs` / `fileEntryService` seams for deterministic
tests.
- **createDirectoryWatcher** chokidar v4 wrapper: `add` / `unlink` /
`change` / `ready` / `error` events; built-in OS-junk basename ignores
(`.DS_Store` / `.localized` / `Thumbs.db` / `desktop.ini`); idempotent
`close()`. Factory auto-wires `add` →
`danglingCache.onFsEvent(path,'present')` and `unlink` → `'missing'`.
(Architecture §8.2's richer `onAddDir`/`onUnlinkDir`/`onRename` events
deferred — no consumer needs them in scope.)
- **Reverse-index maintenance from mutation flows**: `ensureExternal`
calls `addEntry` + `onFsEvent('present','ops')` on insert (no-op on
reuse); `permanentDelete(external)` calls `removeEntry`;
`rename(external)` swaps `removeEntry(oldPath) + addEntry(newPath) +
onFsEvent(newPath,'present','ops')`.
- **FileManager surface**: `getDanglingState({id})` (internal →
'present', external → cache check, unknown id → 'unknown');
`batchGetDanglingStates({ids})` (parallel fan-out, unknown ids mapped to
'unknown'); `subscribeDangling({id}, listener)` (in-process per-entry
filter; renderer fan-out via `file-manager-event` IPC channel deferred
to Phase 2).
- **FileManager.onInit**: awaits `danglingCache.initFromDb()` (populates
reverse index from non-trashed external entries; no startup stat probe
per architecture §10.6); registers `File_GetDanglingState` /
`File_BatchGetDanglingStates` IPC handlers via `this.ipcHandle`
(auto-disposed on stop).
- New `IpcChannel` constants: `File_GetDanglingState`,
`File_BatchGetDanglingStates`.
- ~30 new TDD tests across DanglingCache (18 unit) + watcher (6 real-FS)
+ FileManager integration (INT-7..INT-10).
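
A reduced sketch of the DanglingCache mechanics described above (reverse index, lazy TTL expiration, events only on real transitions) is given here. It is not the shipped class: the `Sketch` name, the method signatures, and two behaviors are assumptions, namely that an expired entry reads as `'unknown'` instead of triggering a stat probe, and that a first observation counts as a transition.

```typescript
type ObservedState = 'present' | 'missing';
type DanglingState = ObservedState | 'unknown';

interface CachedState {
  state: ObservedState;
  observedAt: number;
}

interface DanglingStateChangedEvent {
  entryId: string;
  state: ObservedState;
}

class DanglingCacheSketch {
  private readonly byEntryId = new Map<string, CachedState>();
  private readonly pathToEntryIds = new Map<string, Set<string>>();
  private readonly listeners: Array<(e: DanglingStateChangedEvent) => void> = [];
  private readonly now: () => number;
  private readonly ttlMs: number;

  // `now` and `ttlMs` are injectable seams for deterministic tests.
  constructor(now: () => number = Date.now, ttlMs: number = 30 * 60 * 1000) {
    this.now = now;
    this.ttlMs = ttlMs;
  }

  onChange(listener: (e: DanglingStateChangedEvent) => void): void {
    this.listeners.push(listener);
  }

  addEntry(entryId: string, canonicalPath: string): void {
    let ids = this.pathToEntryIds.get(canonicalPath);
    if (!ids) {
      ids = new Set<string>();
      this.pathToEntryIds.set(canonicalPath, ids);
    }
    ids.add(entryId);
  }

  // A watcher event for a path fans out to every entry registered for it.
  onFsEvent(canonicalPath: string, state: ObservedState): void {
    const ids = this.pathToEntryIds.get(canonicalPath);
    if (!ids) return;
    ids.forEach((entryId) => {
      const previous = this.byEntryId.get(entryId);
      this.byEntryId.set(entryId, { state, observedAt: this.now() });
      // Same-state observations are silent.
      if (previous?.state !== state) {
        this.listeners.forEach((listener) => listener({ entryId, state }));
      }
    });
  }

  getState(entryId: string): DanglingState {
    const cached = this.byEntryId.get(entryId);
    if (!cached) return 'unknown';
    // Lazy expiration: stale entries are dropped when they are read.
    if (this.now() - cached.observedAt > this.ttlMs) {
      this.byEntryId.delete(entryId);
      return 'unknown';
    }
    return cached.state;
  }
}
```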

#### Phase 1b.4 deliverables (orphan sweep + FileRefCheckerRegistry)

- **FileRefCheckerRegistry**: `Record<FileRefSourceType,
SourceTypeChecker<...>>` typed registry forces exhaustive coverage at
compile time — adding a new variant to `FileRefSourceType` without a
checker triggers a TS build error. Phase 1 ships `FileRefSourceType =
'temp_session' | 'knowledge_item'`: real DB-backed checker for
`knowledge_item` (Drizzle `inArray` against `knowledge_item`);
`temp_session` checker treats every sourceId as gone (sessions are
in-memory only). `chat_message` / `painting` / `note` are **deliberately
not in the union yet** — each will be added in lockstep (tuple entry in
`allSourceTypes` + `createRefSchema` variant + `SourceTypeChecker`) by
the PR that migrates the owning domain's DB tables to v2. Stray writes
during the migration window fail fast at `FileRefSchema.parse` rather
than being silently persisted under a no-op stub.
- **OrphanRefScanner** (RFC §6.4): `scanOneType(sourceType)` enumerates
distinct `file_ref.sourceId` per type, asks the checker which are alive,
deletes the rest via `cleanupBySourceBatch`. `scanAll()` aggregates
across every registered sourceType. Backed by new
`FileRefService.listDistinctSourceIds` to keep all SQL inside the repo.
- **Report-only orphan-entry pass** (architecture §7.1 default policy is
"preserve"): `scanOrphanEntries` groups active entries with zero
`file_ref` rows by origin. **No deletion** — surfaced via
`getOrphanReport()` for the cleanup-UI consumer. Backed by new
`FileEntryService.findUnreferenced` LEFT JOIN-based query.
- **Startup file sweep** (architecture §10): `runStartupFileSweep`
snapshots `file_entry.id` (active + trashed) into a `Set` via new
`FileEntryService.listAllIds`, walks `{userData}/files/`, plans unlink
for (a) UUID-named files whose id is not in the snapshot and (b)
`*.tmp-<UUID>` atomic-write residue. Applies the `mtime > 5min`
freshness gate (§10.3) — files newer than that are presumed in-flight
and preserved. Plan-then-execute with the `50% / 20-count-floor /
10MB-floor` safety threshold (§10.4); aborts emit
`abortReason='count-fraction'|'byte-fraction'`. Single structured
`orphan-file-sweep` log per run (info / warn / error per outcome,
§10.5).
- **DB-sweep umbrella + observability** (`runDbSweep`): runs scanAll +
scanOrphanEntries, emits one `orphan-sweep` structured record
summarising both passes; failure path returns `outcome='failed'` +
`errorMessage` so callers don't throw on background fire-and-forget.
- **FileManager integration**: `onInit` schedules a fire-and-forget
`runStartupSweeps` that runs the FS-level + DB-level sweeps in parallel;
failures of either are logged but never block ready. `getOrphanReport()`
exposes the most recent `DbSweepReport` (orphan-ref counts already
cleaned + orphan-entry counts preserved) + `lastRunAt` for the cleanup
UI surface.
- ~30 new TDD tests across registry (14 unit) + orphan sweep (16 unit +
integration) + FileManager integration (INT-11/INT-12) + repo
(`findUnreferenced`, `listAllIds`, `listDistinctSourceIds`).
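
A minimal sketch of how the freshness gate and the plan-then-execute safety threshold from the startup file sweep bullet could fit together. Only the numeric thresholds and the two `abortReason` values come from the description; `planSweep`, the candidate shape, and the exact way the floors combine with the 50% fraction are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types live in the file-manager module.
interface SweepCandidate {
  path: string
  sizeBytes: number
  mtimeMs: number
}

type AbortReason = 'count-fraction' | 'byte-fraction'

interface SweepPlan {
  toUnlink: SweepCandidate[]
  abortReason?: AbortReason
}

const FRESHNESS_MS = 5 * 60 * 1000 // younger files are presumed in-flight
const MAX_FRACTION = 0.5 // a plan covering more than half the store is suspicious
const COUNT_FLOOR = 20 // assumed: the count check only applies from 20 planned files
const BYTE_FLOOR = 10 * 1024 * 1024 // assumed: the byte check only applies from 10 MB

function planSweep(
  orphans: SweepCandidate[],
  totalCount: number,
  totalBytes: number,
  nowMs: number
): SweepPlan {
  // Freshness gate: keep anything modified within the last 5 minutes.
  const toUnlink = orphans.filter((f) => nowMs - f.mtimeMs > FRESHNESS_MS)
  const plannedBytes = toUnlink.reduce((sum, f) => sum + f.sizeBytes, 0)

  // Plan first, then refuse to execute when the plan looks implausibly large.
  if (toUnlink.length >= COUNT_FLOOR && toUnlink.length > totalCount * MAX_FRACTION) {
    return { toUnlink: [], abortReason: 'count-fraction' }
  }
  if (plannedBytes >= BYTE_FLOOR && plannedBytes > totalBytes * MAX_FRACTION) {
    return { toUnlink: [], abortReason: 'byte-fraction' }
  }
  return { toUnlink }
}
```

Separating planning from unlinking means an aborted run deletes nothing, so a corrupted or empty `file_entry` snapshot cannot wipe the files directory.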

**Out of scope (deferred to Phase 2)**:
- Architecture §7.2 dangling-external auto-cleanup (external + missing +
0-ref + >30d retention) — narrow extension shipping with the cleanup UI.
- Adding `chat_message` / `painting` / `note` as `FileRefSourceType`
variants (tuple entry + schema + checker added together) — gated on each
domain's v2 batch migration.
- Cleanup-UI surface that consumes `getOrphanReport()` — Phase 2
renderer work.

Renderer-side File IPC bridge for write/dangling methods stays deferred
to Phase 2 alongside the consumer-batch migrations. The Phase 1b runtime
is consumable from main-side business services through
`application.get('FileManager')`.

### Why we need it and why it was done in this way

Contract-first concentrates design review in one place; Phase 1b.x then
becomes pure "honor the contracts". Each 1b.x phase keeps strict TDD
(RED → GREEN → REFACTOR per behavior, ~one commit per cycle); each phase
ends with a verification gate (push → CI green) before the next phase
begins.

Core decisions (origin two-state, `FileEntry`/`FileInfo` split, DataApi
SQL-only, external `permanentDelete` DB-only, TTL `DanglingCache`, OCC
trust boundary, atomic write fsync default, etc.) and their rationale
are recorded in `file-manager-architecture.md §12 Key Design Decisions`
— not duplicated here.

### Breaking changes

None — purely additive (read+write paths are new, no existing callers
replaced yet).

### Special notes for your reviewer

- Review focus: contracts (1a) + read-path runtime (1b.1) + write-path
runtime (1b.2) + watcher/DanglingCache (1b.3) + orphan sweep / registry
(1b.4). Phase 1b is now complete on this branch.
- **Phase 1a contract stability policy** (architecture.md top) is
binding: any 1b.x PR that finds a contract mismatch must land the doc
revision in a PR first.
- Deferred (Phase 2): renderer-side File IPC bridge for
write/dangling/orphan methods (alongside consumer migration); cleanup-UI
surface consuming `getOrphanReport()`; architecture §7.2
dangling-external auto-cleanup (>30d retention); adding `chat_message` /
`painting` / `note` as `FileRefSourceType` variants (each adds tuple
entry + schema + checker in lockstep, gated on the owning domain's v2
batch migration); DanglingCache periodic snapshot logger (architecture
§11.8); `listDirectory` ripgrep wrapper; `compressImage`
(KnowledgeService consumer); FileUploadService + `file_upload` table
(Vercel AI SDK Files API).

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [ ] Upgrade: N/A — purely additive
- [ ] Documentation: Internal architecture docs included
(`docs/references/file/`); no user-facing docs change
- [x] Self-review: I have reviewed my own code before requesting review
from others

### Release note

```release-note
NONE
```

---------

Signed-off-by: icarus <eurfelux@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-05-13 16:06:10 +08:00
fullex
a3498dc4a0 fix(data-api): enforce NOT NULL on boolean columns and drop dead fallbacks
Boolean columns without .notNull() infer as boolean | null, forcing every
reader into the row.x ?? default fabricated-fallback pattern that R3
forbids. Sweep all schemas and pair .notNull() with the existing default
on columns whose NULL carries no domain meaning:

- user_provider.isEnabled (NOT NULL DEFAULT true)
- user_model.{isEnabled, isHidden, isDeprecated} (NOT NULL with existing
  defaults)
- mini_app.bordered (NOT NULL DEFAULT true)

Booleans kept nullable on purpose, because NULL carries a real third-state
meaning: mcp_server.{longRunning, shouldConfig, isTrusted} and
user_model.supportsStreaming.

Drop the now-dead fallbacks:

- ProviderService.rowToRuntimeProvider: row.isEnabled ?? true
- ModelService.rowToRuntimeModel: row.isEnabled ?? true, row.isHidden ?? false
- modelMerger.mergeModelWithUser: userModel.isEnabled ?? !(catalogOverride?.disabled ?? false)
  and userModel.isHidden ?? false

The merger change is a design adjustment, not just a refactor:
user_model.isEnabled no longer falls back to catalogOverride.disabled.
Each model row now carries its own authoritative isEnabled, and the
create path (mergePresetModel + mergedModelToNewUserModel) writes the
catalog-derived value at INSERT time.

Tighten modelMerger.UserModelRowSchema to require these fields; use
satisfies NewUserModel in dtoToNewUserModel so the constructed object
type narrows to a concrete shape without a manual intersection patch.

Regenerate drizzle migration; tables rebuilt in-place under v2 throwaway
policy.
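
A small type-level sketch of the effect described above, assuming simplified row shapes. The interfaces, function names, and the `'undecided'` label are hypothetical; only the column names are taken from this message.

```typescript
// Hypothetical row shapes; only the column names come from the commit message.
interface UserProviderRowBefore {
  isEnabled: boolean | null // column without .notNull()
}

interface UserProviderRowAfter {
  isEnabled: boolean // NOT NULL DEFAULT true
}

interface McpServerRow {
  isTrusted: boolean | null // intentionally nullable
}

// Before: the reader has to fabricate a value for NULL.
function isEnabledBefore(row: UserProviderRowBefore): boolean {
  return row.isEnabled ?? true
}

// After: the column is authoritative, so the fallback is dead code.
function isEnabledAfter(row: UserProviderRowAfter): boolean {
  return row.isEnabled
}

// Where NULL is a real third state, the reader handles it explicitly
// instead of collapsing it into true or false.
type TrustState = 'trusted' | 'untrusted' | 'undecided'

function trustState(row: McpServerRow): TrustState {
  if (row.isTrusted === null) return 'undecided'
  return row.isTrusted ? 'trusted' : 'untrusted'
}
```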
2026-05-10 22:20:19 -07:00
hello_world
3a508c987b refactor(miniapp): migrate to v2 data layer (#14049)
### What this PR does

Migrates the MiniApp feature from v1 (Redux + sidecar
`custom-minapps.json`) to the v2 data architecture (DataApi + Preference
+ Cache), and integrates it into the v2 AppShell tab system.

**Before this PR**
- App lists lived in three Redux arrays (`enabled` / `disabled` /
`pinned`); custom-app logos were stripped before persistence and
recovered at runtime from `{userData}/Data/Files/custom-minapps.json`.
- Settings (`region`, `max_keep_alive`, `open_link_external`,
`show_opened_in_sidebar`) lived in legacy redux/electron-store.
- Runtime keep-alive used a module-level `lru-cache` singleton, mirrored
into v2 cache via `onInsert` / `disposeAfter` (two sources of truth —
already a known race).
- Routes were `/app/minapp/*`; sidebar icon literal was `'minapp'`.
- Sidebar mode used the legacy popup container; top-navbar mode was
non-functional.

**After this PR**
- A single `mini_app` SQLite table owns every row (preset + custom).
Preset rows are seeded by `MiniAppSeeder` from `PRESETS_MINI_APPS` on
every boot; custom rows come in via `POST /mini-apps`. The seeder uses
`setWhere isNotNull(presetMiniappId)` so refreshing preset display
fields can never overwrite a custom row whose `appId` happens to collide
with a preset.
- `MiniAppMigrator` imports v1 Redux state and reads
`custom-minapps.json` (path resolved through
`MigrationPaths.customMiniAppsFile`) to recover stripped logos.
- Settings live under typed Preference keys
(`feature.mini_app.{region,max_keep_alive,open_link_external}`); sidebar
icon literal renamed `'minapp'` → `'mini_app'` with a complex preference
transform that rewrites existing user arrays in-place.
- API: `GET/POST/PATCH/DELETE /mini-apps` + `POST
/mini-apps/order:batch`, Zod-validated, fractional-indexing ordering
scoped by `status` (cross-status batches are rejected with
`VALIDATION_ERROR` per the data-ordering-guide contract). Status
transitions reassign `orderKey` to the tail of the target partition
inside a transaction.
- Renderer hook `useMiniApps` exposes **command-style** writes only:
`updateAppStatus(id, status)` and `setAppStatusBulk([{id, status}])`.
The legacy declarative `updateMiniApps(list)` /
`updateDisabledMiniApps(list)` / `updatePinnedMiniApps(list)` are gone —
they took region-filtered subsets and silently disabled rows the caller
never saw.
- Keep-alive list is stored solely in
`useCache('mini_app.opened_keep_alive')`. Cap eviction respects AppShell
pin status: `useMiniAppPopup` reads pinned mini-app routes from
`useTabs` and skips them in eviction. `MiniAppTabsPool` renders webviews
in a stable `appId`-sorted order so LRU reorders never move `<webview>`
DOM nodes (Electron `<webview>` loses its guest WebContents on
detach/reattach).
- **Unified launch path**: clicking any miniapp (from the launcher grid
or a top tab bar entry) calls `openTab('/app/mini-app/<id>', { title,
icon: app.logo })`. A globally-mounted `<MiniAppTabsPool>` in `AppShell`
keeps a `<webview>` alive per opened app, regardless of sidebar vs
top-navbar layout.
- Settings UI rewritten as a `PageSidePanel` drawer composed of
`MiniAppListPair` (visible / hidden columns with drag-drop) and
`MiniAppDisplaySettings` (region / cache slider). New custom-app form is
a separate `NewMiniAppPanel` drawer.
- Sidebar's running-mini-apps strip removed — opened apps live
exclusively in the top tab bar (per #3198804265). Companion preference
`feature.mini_app.show_opened_in_sidebar` deleted from the schema.
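
To illustrate the status-scoped ordering rule above, here is a hedged sketch of the scope check that could run before any `orderKey` is recomputed. `resolveBatchScope`, the row shape, and the status literals are assumptions; the actual check lives in `applyScopedMoves` and may look different.

```typescript
// Assumed status values, mirroring the three legacy Redux arrays.
type MiniAppStatus = 'enabled' | 'disabled' | 'pinned'

interface MiniAppRow {
  appId: string
  status: MiniAppStatus
}

// Hypothetical error factory carrying the VALIDATION_ERROR code.
function validationError(message: string): Error & { code: 'VALIDATION_ERROR' } {
  return Object.assign(new Error(message), { code: 'VALIDATION_ERROR' as const })
}

// Returns the single status partition a reorder batch belongs to,
// or throws when the batch spans more than one partition.
function resolveBatchScope(
  rows: ReadonlyMap<string, MiniAppRow>,
  movedIds: readonly string[]
): MiniAppStatus {
  let scope: MiniAppStatus | undefined
  for (const id of movedIds) {
    const row = rows.get(id)
    if (!row) throw validationError(`unknown mini app: ${id}`)
    if (scope !== undefined && row.status !== scope) {
      throw validationError(`batch mixes statuses: ${scope} and ${row.status}`)
    }
    scope = row.status
  }
  if (scope === undefined) throw validationError('empty batch')
  return scope
}
```

Rejecting the whole batch, rather than splitting it per status, keeps each partition's fractional-index sequence independent and makes a malformed client request visible.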

### Why we need it and why it was done in this way

Part of the broader v2 data-layer migration (Redux/Dexie/ElectronStore →
DataApi + Preference + Cache).

**Architecture**
- DataApi for entity rows (preserves user content); Preference for
atomic settings; Cache (Memory tier) for runtime ephemera.
- Layered preset pattern (`best-practice-layered-preset-pattern.md`):
preset and custom rows share the same table, discriminated by
`presetMiniappId`. Seeder refreshes preset display fields on re-run;
custom rows are immutable to the seeder.
- Region filtering is a **view-only** concern (read path); the write
path is command-style and never references region. This eliminated a
class of bugs where editing the visible (filtered) list caused
region-hidden rows to drift.
- AppShell tab pinning is the canonical "keep this loaded" signal. The
keep-alive cap respects it; pinned mini-app tabs never get evicted
regardless of cap. Render-order independence in `MiniAppTabsPool`
ensures LRU touches don't move `<webview>` nodes around.
- Per-app icon resolution: `app.logo` is a `CompoundIcon` id (e.g.
`"Moonshot"`) for presets and a URL for custom apps. UI consumers (tab
bar, sidebar entry, settings list) call `getMiniAppsLogo` to resolve the
id to a `CompoundIcon` before rendering, with `<img>` fallback for URL
strings.
- Per-entity tab icons are cleared on internal navigation, sidebar
reuse, and the top-bar settings button — three call sites that all flip
the active tab's URL now consistently reset `icon: undefined` so a
mini-app logo never sticks onto an unrelated route.
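
The keep-alive behaviour described above can be sketched as two pure functions, assuming the list is stored least-recently-used first. `touchKeepAlive` and `renderOrder` are hypothetical names, and the choice never to evict the app that was just opened is an assumption, not something the PR states.

```typescript
// Moves openedId to the most-recent end, then evicts the oldest unpinned
// entries until the list fits the cap. Pinned ids are never evicted, so the
// list may stay above the cap when everything else is pinned.
function touchKeepAlive(
  lru: readonly string[],
  openedId: string,
  cap: number,
  pinned: ReadonlySet<string>
): string[] {
  const next = lru.filter((id) => id !== openedId)
  next.push(openedId)
  while (next.length > cap) {
    const victim = next.findIndex((id) => id !== openedId && !pinned.has(id))
    if (victim === -1) break // only pinned apps (and the opened one) remain
    next.splice(victim, 1)
  }
  return next
}

// Render order is derived from the ids alone, so an LRU touch changes
// which app is "recent" without changing where its node sits in the list.
function renderOrder(keepAlive: readonly string[]): string[] {
  return [...keepAlive].sort()
}
```

Because `renderOrder` ignores recency, touching an app reorders only the eviction queue and leaves the rendered sequence untouched.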

**Tradeoffs**
- `useMiniApps` still exposes `miniapps` (region-filtered
enabled+pinned) and `disabled` (region-filtered). These are display-only
views. Renamed/typed wrappers were considered but deferred — the
refactor to command-style writes already eliminated the bug class that
motivated the rename.
- The `applyReorderedList` integration test for
`reorderMiniAppsByStatus` was dropped — `MockUseDataApiUtils` doesn't
fill the SWR cache that `useReorder.readCurrent` reads. Splice logic is
straightforward and the server-side `applyScopedMoves` test covers the
contract.
- Sidebar primitives in `@cherrystudio/ui`-adjacent layout still accept
`miniAppTabs` / `onMiniAppTabClick` props (defensive defaults — render
nothing without a producer). Removing these from the primitive's API is
a separate refactor not in scope.

### Breaking changes

User-visible changes are auto-migrated by the v2 migration framework —
no manual user action required:
- Sidebar icon literal `'minapp'` → `'mini_app'` (rewritten by the
`sidebar_icons_rename` complex preference transform)
- Preference key rename `feature.minapp.*` → `feature.mini_app.*`
(auto-migrated via `classification.json`)
- Custom-app logos stripped from v1 Redux are recovered from
`custom-minapps.json` during migration

One product-shape change is documented under
`v2-refactor-temp/docs/breaking-changes/`:
- `2026-05-07-miniapp-sidebar-running-list-removed.md` — the sidebar no
longer surfaces opened mini-apps under the mini-app entry. Open apps are
accessed exclusively via the top tab bar; pin a tab to keep its state
across switches.

The legacy v1 preference `showOpenedMinappsInSidebar` is reclassified as
`status: deleted` in the migration pipeline; v1 values are dropped
during v1→v2 migration with no v2 destination.

### Special notes for your reviewer

**Verified end-to-end on a real dev profile**: v1 Redux state +
`custom-minapps.json` → v2 SQLite, including pinned-app cross-group
dedup (a v1 pinned app appears in both `pinned` and `enabled` Redux
arrays; the migrator counts duplicates as skipped so the engine's
`targetCount >= sourceCount - skippedCount` invariant holds — without
this, any user with pinned miniapps was blocked from migrating).
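
A hedged sketch of the dedup accounting described above. `dedupeLegacyGroups`, the row shape, and the rule that the first group to mention an id wins are assumptions; only the invariant `targetCount >= sourceCount - skippedCount` is taken from the text.

```typescript
interface LegacyMiniApp {
  id: string
}

interface DedupeResult {
  rows: LegacyMiniApp[]
  sourceCount: number
  skippedCount: number
}

// Flattens the legacy groups, keeping the first occurrence of each id and
// counting every later occurrence as skipped rather than silently dropping it.
function dedupeLegacyGroups(groups: readonly LegacyMiniApp[][]): DedupeResult {
  const seen = new Set<string>()
  const rows: LegacyMiniApp[] = []
  let sourceCount = 0
  let skippedCount = 0
  for (const group of groups) {
    for (const app of group) {
      sourceCount++
      if (seen.has(app.id)) {
        skippedCount++
        continue
      }
      seen.add(app.id)
      rows.push(app)
    }
  }
  return { rows, sourceCount, skippedCount }
}

// The engine-side check quoted in the text.
function passesCountInvariant(targetCount: number, sourceCount: number, skippedCount: number): boolean {
  return targetCount >= sourceCount - skippedCount
}
```

With one app present in both `pinned` and `enabled`, four source entries produce three rows and one skip, so the check passes; without the skip count it would compare 3 against 4 and fail.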

**Drizzle migrations** are throwaway in dev per `CLAUDE.md`.
`migrations/sqlite-drizzle/0020_even_hulk.sql` is the single regenerated
migration; it will be wiped to a clean initial migration before release.

**Review history**: 28 line-comments across multiple formal review
rounds. All resolved. The most consequential fixes:
- `applyScopedMoves` in `MiniAppService.reorder` — rejects cross-status
batches with `VALIDATION_ERROR` instead of silently splitting them.
- `update()` reassigns `orderKey` to a fresh tail in the target
partition on status change.
- Empty-string substitution in migrator mappings is now caught by the
post-transform validity check; bad rows are skipped + warned, never
inserted.
- Migrator validation switched from `limit(5)` sample to full `count(*)
WHERE empty-fields` — bad rows can no longer pass validation by virtue
of being beyond the sample window.
- Keep-alive cap exempts pinned tabs (#3198809321 + the kangfenmao
keepalive review); render order in `MiniAppTabsPool` is `appId`-stable
so LRU touches don't move `<webview>` nodes (this was the root cause of
"switching tabs reloads the webview").

**Out of scope**:
- The remaining `@renderer/store/tabs` import in
`PaintingsRoutePage.tsx` is pre-existing v1 residual (not introduced or
touched by this PR).

### Checklist

- [x] PR: description rewritten to reflect the final architecture +
integration with the AppShell tab system
- [x] Code: command-style writes (`updateAppStatus` /
`setAppStatusBulk`); see `useMiniApps`, `MiniAppService`,
`MiniAppMigrator`, `MiniAppTabsPool`, `useMiniAppPopup` for the main
entry points
- [x] Refactor: ~1500 lines of dead/legacy code removed
(`Tab/TabContainer`, `TabsService`, `MiniAppPopupContainer`,
`TopViewMiniAppContainer`, legacy LRU singleton, `PinnedMiniApps`, dead
`userOverrides` / `MiniAppRegistryService`, unused `Signal.ts`)
- [x] Upgrade: v1 → v2 migration verified end-to-end on a real dev
instance
- [x] Documentation: architecture covered by `docs/references/data/`;
one user-visible behavior change documented in
`v2-refactor-temp/docs/breaking-changes/`
- [x] Self-review: multi-agent review via `/gh-pr-review` (twice); all
28 review comments resolved

### Release note

```release-note
NONE - Internal v2 data refactor. User-facing renames (route, sidebar icon, preference keys) are auto-migrated. The sidebar no longer shows a running-mini-apps strip; opened apps live in the top tab bar.
```

---------

Signed-off-by: suyao <sy20010504@gmail.com>
Signed-off-by: chengcheng84 <hello_world0000@outlook.com>
Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
2026-05-07 20:45:20 +08:00
jd
bfa25bc83c refactor(prompt-management): simplify prompt management (#13430)
Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-05-07 19:43:48 +08:00
LiuVaayne
28df7e872b refactor(agent-schemas): migrate agent/session/task IDs to UUID v4 and remove builtin-agent special logic (#14669)
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 20:34:09 +08:00
槑囿脑袋
434d4a938f refactor(knowledge-data): adjust knowledge v2 data and service (#14719)
Co-authored-by: fullex <0xfullex@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-05-01 19:24:48 +08:00
SuYao
1ccb306b30 feat(topics): migrate ordering + pin to canonical fractional-indexing pattern (#14627)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-30 21:18:25 +08:00
fullex
5677ab62bd refactor(data-api): tighten R1/R3 violations in agent/topic/message data
Apply NOT NULL constraints where NULL has no domain meaning:
- agent / agentSession: description, instructions, mcps, allowedTools,
  configuration, accessiblePaths, slashCommands (session only)
- agentGlobalSkill.tags
- message.searchableText, message.siblingsGroupId
- topic.name, topic.{isNameManuallyEdited, sortOrder, isPinned, pinnedOrder}
- miniapp.sortOrder

Drop rowMapper "?? fallback" patterns; preserve genuine T|null contracts
(agentSessionMessage.agentSessionId now passes NULL through, with the Zod
entity tightened to .nullable() to match).

Migrate product-chosen DB DEFAULTs to the service layer:
- agentTask.status DB DEFAULT removed; service was already supplying 'active'
- agentGlobalSkill.isEnabled DB DEFAULT flipped from true to false to match
  SkillService.install behavior

Drop Zod .default([]) from CreateAgentSchema.accessiblePaths so the
service-layer computeWorkspacePaths() is the single runtime default source.

Update FTS5 triggers to COALESCE the group_concat result to '' so
messages with no main_text blocks don't violate the new NOT NULL on
searchable_text.

Refs: docs/references/data/best-practice-default-values-and-nullability.md
2026-04-29 08:35:46 -07:00
jd
2e0f7579a7 fix(assistant-api): align patch defaults with entity schema (#14689)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-29 20:19:03 +08:00
fullex
12e4e60005 feat(data-api): add group and pin resources with polymorphic pin table
Two sibling resources land together because their migration SQL is
generated as one unit and their API-schema / handler wiring has to go
in the same commit to keep the repo compiling.

group: upgrade the existing group table from `sort_order INT` to
`order_key TEXT` (`scopedOrderKeyIndex('group', 'entityType')`) and
expose it as a first-class resource under /groups. Each entityType
owns an independent orderKey sequence. Reorder delegates to the new
applyScopedMoves helper.

pin: brand-new polymorphic table `(id, entityType, entityId, orderKey,
timestamps)` with UNIQUE(entityType, entityId) enforcing idempotency
at the DB layer. One table serves arbitrarily many consumers — same
precedent as entity_tag. No FK to consumer tables; every consumer
service's delete path must call `pinService.purgeForEntity(tx,
entityType, entityId)` to keep the two tables in sync (signature is
tx-first, the mainstream ORM convention; tagService.removeEntityTags
is the project's historical tx-last outlier and is left as-is).

PinService.pin is idempotent and concurrent-safe: a fast-path SELECT
returns existing rows, and a UNIQUE collision under concurrent INSERT
is caught, classified, and re-SELECTed so the caller never sees a
constraint error.

Handler layer is a thin Zod-parse shell — all scope inference, row
lookup, and orderKey computation live in the services.
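
The idempotent pin flow can be sketched as follows, under stated assumptions: `PinStore`, the `'UNIQUE'` error code, and the synchronous signatures are hypothetical stand-ins for the real transaction and driver error classification, and `orderKey` handling is omitted.

```typescript
interface PinRow {
  id: string
  entityType: string
  entityId: string
}

// Hypothetical storage seam; insert throws an error with code 'UNIQUE'
// when (entityType, entityId) already exists.
interface PinStore {
  find(entityType: string, entityId: string): PinRow | undefined
  insert(row: PinRow): void
}

function isUniqueViolation(err: unknown): boolean {
  return typeof err === 'object' && err !== null && (err as { code?: unknown }).code === 'UNIQUE'
}

function pin(store: PinStore, entityType: string, entityId: string, newId: string): PinRow {
  // Fast path: an existing pin is returned as-is.
  const existing = store.find(entityType, entityId)
  if (existing) return existing

  const row: PinRow = { id: newId, entityType, entityId }
  try {
    store.insert(row)
    return row
  } catch (err) {
    // A concurrent writer won the race: classify, then re-select its row.
    if (!isUniqueViolation(err)) throw err
    const winner = store.find(entityType, entityId)
    if (!winner) throw err
    return winner
  }
}

// In-memory store used only to exercise the sketch.
class InMemoryPinStore implements PinStore {
  private rows = new Map<string, PinRow>()

  find(entityType: string, entityId: string): PinRow | undefined {
    return this.rows.get(`${entityType}:${entityId}`)
  }

  insert(row: PinRow): void {
    const key = `${row.entityType}:${row.entityId}`
    if (this.rows.has(key)) {
      throw Object.assign(new Error('UNIQUE constraint failed'), { code: 'UNIQUE' })
    }
    this.rows.set(key, row)
  }
}
```

The fast path alone is not enough under concurrency, which is why the UNIQUE constraint stays as the source of truth and the catch branch converts a collision into a normal return value.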
2026-04-21 23:28:14 -07:00
fullex
11be90175f fix(drizzle-config): exclude test files from schema glob
The previous glob `./src/main/data/db/schemas/*` matched the
`__tests__` directory, and drizzle-kit's `prepareFilenames` expands any
matched directory by reading its immediate files regardless of
extension. That pulled `_columnHelpers.test.ts` into the schema load
path, and its `import ... from 'vitest'` blew up because drizzle-kit
loads schemas via CJS `require()` (vitest is ESM-only).

Switch to a recursive pattern that also excludes `*.test.ts` by name,
so the config tolerates future subdirectory-organised schemas without
re-introducing the same class of bug.
2026-04-21 23:11:25 -07:00
fullex
375478371c refactor(data-schema): tighten createUpdateTimestamps with NOT NULL
Add `.notNull()` to `createdAt` / `updatedAt` in the shared
`createUpdateTimestamps` helper so Drizzle `$inferSelect` produces
`number` instead of the misleading `number | null`. `deletedAt` in
`createUpdateDeleteTimestamps` stays nullable (soft-delete semantics).

Generated migration 0013 rebuilds 26 affected tables via the standard
SQLite table-recreation pattern; FK / CHECK / INDEX constraints are
preserved across rebuild. No backfill is added (project is in the
development phase; null pre-existing rows are accepted as a "wipe DB"
signal rather than engineered around).

Fix upstream in `KnowledgeMappings.toTimestamp` so it returns a
`Date.now()` fallback instead of `undefined` — otherwise future Dexie
-> v2 migrator runs would try to insert undefined into NOT NULL
columns. Three test assertions updated from `undefined` to
`expect.any(Number)`.
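
A rough sketch of a mapper that never returns `undefined`, assuming the legacy value may be a number, a date string, or missing. The accepted input types and the injectable `now` parameter are assumptions; the real `KnowledgeMappings.toTimestamp` may accept different inputs.

```typescript
// Converts a legacy timestamp-ish value to epoch milliseconds, falling back
// to the current time so the result can always go into a NOT NULL column.
function toTimestamp(value: unknown, now: () => number = Date.now): number {
  if (typeof value === 'number' && Number.isFinite(value)) return value
  if (typeof value === 'string') {
    const parsed = Date.parse(value)
    if (!Number.isNaN(parsed)) return parsed
  }
  return now()
}
```

Injecting `now` keeps the fallback deterministic in tests; assertions on production call sites can only check for `expect.any(Number)`, as noted above.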

Sweep 18 downstream call sites across 9 functions that were carrying a
dead `?? new Date().toISOString()` fallback:

- AssistantService, KnowledgeBaseService, KnowledgeItemService,
  MessageService, TopicService, TranslateHistoryService,
  TranslateLanguageService (the original Pattern A set)
- McpServerService and MiniAppService.rowToMiniApp (reclassified from
  Pattern B: the domain types stay `optional` to accommodate builtin
  literals in the renderer, but `string` assigns legally into
  `string | undefined`, so the switch is safe)

Keep `MiniAppService.builtinToMiniApp` on `timestampToISOOrUndefined`
— its `dbRow?: MiniAppSelect` semantics ("the preference row may not
exist at all") is genuinely optional, not a disguised "nullable column".

Also remove a Pattern C that neither the plan nor the grep audit
caught: `TagService.ensureTagTimestamp` was a self-rolled defense
layer that threw INTERNAL_SERVER_ERROR on null timestamps. The DB
now refuses to produce such rows, so the defense — and the test
named "should surface timestamp anomalies instead of masking them" —
are dead code. Removed both.

Update three docs to reflect the new defaults:

- `services/utils/README.md` — drop the "DB still nullable" table row
  and the predictive paragraph; reframe Pattern B around
  "whole row may not exist"
- `services/utils/rowMappers.ts` JSDoc — same reframing
- `docs/references/data/data-api-in-main.md` — delete the fallback
  code samples and simplify Convention §3
2026-04-21 03:21:05 -07:00
LiuVaayne
9f3be4bc19 feat: migrate agents storage to v2 main database (#14159)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: SuYao <sy20010504@gmail.com>
2026-04-20 21:43:49 +08:00
SuYao
0c6e62181b refactor(data): unify legacy model ID conversion and add FK references to user_model (#14257)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 21:50:48 +08:00
SuYao
13b333e9b3 feat(v2): decouple assistant from topic with dedicated table and migration (#13851)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-13 12:55:06 +08:00
槑囿脑袋
798a15e919 feat(v2): knowledge service backend (#14090)
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-10 21:24:06 +08:00
jidan745le
780a884c67 feat(data): provider/model data migration and registry service (backend only) (#14115)
### What this PR does

Before this PR:
- v2 migration did not include provider/model data migration from legacy
`llm` state.
- Provider/model data APIs and handlers were incomplete.
- `@cherrystudio/provider-registry` (formerly provider-catalog) package
was not integrated into the data layer.

After this PR:
- Add provider/model migration path (`ProviderModelMigrator` + mappings)
and register it in v2 migrator flow.
- Add `@cherrystudio/provider-registry` package with JSON-based registry
data, Zod-validated schemas, and lifecycle-managed
`ProviderRegistryService`.
- Complete provider/model schemas, services, handlers, shared API
schemas/types, and model merger utility.
- Complete provider API endpoints (`registry-models`, `auth-config`,
`api-keys`) aligned with lifecycle DI patterns.

**Note:** This PR is intentionally scoped to backend/data-layer only.
Renderer consumer migration will be submitted in a separate PR to
maintain domain separation.

Fixes #

N/A

### Why we need it and why it was done in this way

The following tradeoffs were made:
- Kept migration and data API implementation within current v2
architecture (handler -> service -> db schema) instead of adding
temporary compatibility layers.
- Replaced protobuf toolchain with JSON + Zod validation for simpler
data pipeline and better debuggability.
- Converted all numeric enums to string-valued `as-const` objects
(EndpointType, ModelCapability, Modality, etc.) for runtime
debuggability.
- Unified separate `baseUrls`, `modelsApiUrls`, `reasoningFormatTypes`
fields into a single `endpointConfigs` map keyed by EndpointType.

The following alternatives were considered:
- Keep protobuf-based registry data; rejected due to complexity of proto
toolchain and poor debuggability of binary data.
- Include renderer consumer migration in same PR; deferred to separate
PR for cleaner domain boundaries.

Links to places where the discussion took place:
- Original combined PR: #14034

### Breaking changes

None.

### Special notes for your reviewer

- This is a backend-only extraction from #14034, which contained both
backend and renderer consumer code. The renderer migration will follow
in a separate PR.
- Please focus review on migration flow (`ProviderModelMigrator`),
provider/model service contracts, and the registry package design.
- The `@cherrystudio/provider-registry` package was renamed from
`provider-catalog` and uses JSON data files instead of protobuf.

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [ ] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: Not required (internal data layer, no user-facing
changes)
- [x] Self-review: I have reviewed my own code

### Release note

```release-note
NONE
```

---------

Signed-off-by: jidan745le <420511176@qq.com>
Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-10 19:12:33 +08:00
hello_world
2851f4e68c rerefactor(miniapp): Miniapp V2 (#13468)
Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-04 16:19:58 +08:00
槑囿脑袋
e08016c5ef Refactor(v2): add knowledge db schema (#13640)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-03-30 16:47:39 +08:00
SuYao
24dc1ce663 feat(v2): add MCP Server data API service and migrate renderer to v2 hooks (#13734)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 23:08:05 +08:00
Phantom
4ca6f56ec9 feat(data): add translate history and language backend API with langCode PK (#13739)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 20:29:14 +08:00
Phantom
3c90efcc56 refactor(data): migrate translate data to v2 cache/preference/dataapi architecture (#13264)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: suyao <sy20010504@gmail.com>
2026-03-23 12:24:29 +08:00
SuYao
4abf143fc7 feat(migration): add McpServerMigrator for v2 data migration (#13303)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 11:03:57 +08:00
fullex
3dfd5c7c2b feat: add custom SQL handling for triggers and virtual tables
- Introduced a new method `runCustomMigrations` in `DbService` to execute custom SQL statements that Drizzle cannot manage, such as triggers and virtual tables.
- Updated `database-patterns.md` and `README.md` to document the handling of custom SQL and its importance in maintaining database integrity during migrations.
- Refactored `messageFts.ts` to define FTS5 virtual table and associated triggers as idempotent SQL statements for better migration management.
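
A minimal sketch of what "idempotent SQL statements" can mean here, assuming a synchronous executor. The `SqlExecutor` interface, the table, column, and trigger names, and the standalone `runCustomMigrations` signature are illustrative assumptions, not the contents of `messageFts.ts` or `DbService`.

```typescript
// Hypothetical seam over whatever database driver is in use.
interface SqlExecutor {
  run(sql: string): void
}

// Each statement is safe to re-run: creation is guarded with IF NOT EXISTS,
// and the trigger is dropped before being recreated so its body can change.
const CUSTOM_SQL_STATEMENTS: readonly string[] = [
  `CREATE VIRTUAL TABLE IF NOT EXISTS message_fts USING fts5(searchable_text)`,
  `DROP TRIGGER IF EXISTS message_fts_ai`,
  `CREATE TRIGGER message_fts_ai AFTER INSERT ON message BEGIN
     INSERT INTO message_fts(rowid, searchable_text)
     VALUES (new.rowid, new.searchable_text);
   END`
]

// Runs every custom statement in order; intended to be called on each boot
// after the ORM-managed migrations have been applied.
function runCustomMigrations(
  db: SqlExecutor,
  statements: readonly string[] = CUSTOM_SQL_STATEMENTS
): void {
  for (const sql of statements) {
    db.run(sql)
  }
}
```

Running the same list on every startup avoids tracking custom SQL in a migration journal, at the cost of requiring every statement to tolerate an existing object.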
2026-01-04 01:07:04 +08:00