11 Commits

Author SHA1 Message Date
fullex
9b9570116a refactor(db): replace libsql with better-sqlite3 + sqlite-vec (#16626) 2026-07-02 13:21:13 +08:00
fullex
714c1a3dd3 docs(data): add database-construction reference and consolidate migration/custom-SQL/FTS docs
Add docs/references/data/database-construction.md as the single home for how the SQLite DB is built and evolved: boot init order, drizzle migrations (regenerate-never-rename, CI gates, additive-vs-rebuild), the CUSTOM_SQL_STATEMENTS every-boot replay (~0.1ms O(1)), and the FTS5 fts_rowid rowid-stability rule, plus a gotchas table. Move the Migrations and Custom SQL sections out of database-patterns.md into it (left as pointers), and index it from data/README.md and src/main/data/db/README.md.

Fix stale references found while consolidating: wrong generate command, customSql.ts vs customSqls.ts, columnHelpers.ts vs _columnHelpers.ts, a nonexistent messageFts.ts, yarn vs pnpm, the v2-todo single-0000 claim, the generated-column wording, and v1 data.blocks vocabulary in the testing doc.
2026-06-17 05:52:01 -07:00
fullex
fd81de8a32 feat(db-service): add withWriteTx for serialized writes (libsql #288)
libsql client-ts upstream issue #288 makes PRAGMA busy_timeout ineffective
for async transactions, so concurrent db.transaction() calls reliably surface
SQLITE_BUSY. Introduce DbService.withWriteTx as a serialized write helper:

- Process-wide FIFO mutex (async-mutex) serializes write transactions.
- libsql client's default BEGIN IMMEDIATE protects against read-then-write
  tx upgrade failures (no override needed at the drizzle layer).
- Single 50ms BUSY retry guards against transient external locks.

Reads do NOT need this — WAL gives readers snapshot isolation that is never
blocked by writers.

Includes unit tests (FIFO ordering, finally release on throw, single BUSY
retry, persistent BUSY rethrow, non-BUSY passthrough) plus a real-libsql
integration test. Updates the DbService test mock with a passthrough
withWriteTx so dependent services do not throw "is not a function" in
tests. Documents the API in database-patterns.md and points
CLAUDE.md / data-api-overview.md at the new pattern.
2026-05-21 04:54:43 -07:00
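
The serialized-write helper described in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions: the commit says the real implementation uses `async-mutex`, whereas this sketch substitutes a hand-rolled promise chain so it stays self-contained. The `isBusyError` check, the `sleep` helper, and the callback shape are hypothetical, not the actual `DbService.withWriteTx` signature.

```typescript
// Hypothetical sketch of a serialized write helper; not the project's real code.
const BUSY_RETRY_DELAY_MS = 50;

// Assumption: BUSY failures carry a `code` string containing "SQLITE_BUSY".
function isBusyError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && code.includes("SQLITE_BUSY");
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Tail of the FIFO queue: every caller waits for the previous caller to finish.
let tail: Promise<void> = Promise.resolve();

async function withWriteTx<T>(run: () => Promise<T>): Promise<T> {
  const previous = tail;
  let release!: () => void;
  // Enqueue synchronously so that call order equals execution order.
  tail = new Promise<void>((resolve) => {
    release = resolve;
  });
  await previous;
  try {
    try {
      return await run();
    } catch (err) {
      // Non-BUSY errors pass straight through.
      if (!isBusyError(err)) throw err;
      // Single retry after a short delay; a second BUSY is rethrown.
      await sleep(BUSY_RETRY_DELAY_MS);
      return await run();
    }
  } finally {
    // Always release, even when the transaction throws.
    release();
  }
}
```

In this sketch the retry happens while the queue slot is still held, so a retried write keeps its FIFO position instead of re-queuing behind later writers.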
fullex
e1ba04463e docs(data-api): codify registry sub-resource & nullability conventions
- data-api-in-main: add Registry Sub-Resource Endpoints section -
  GET-only for stateless reads, AIP-136 colon notation for derived
  views, registry packages are main-only (bundle waste + merge
  already in main)
- best-practice-layered-preset-pattern: preset-only static fields
  must merge in rowToEntity rather than via parallel endpoint;
  document acceptable exceptions for catalog and specialised
  surfaces
- data-ordering-guide section 2: drop user_provider.isEnabled from
  the Live partition list - the table is whole-table ordered
  (already correct in section 7)
- database-patterns: flag boolean columns without .notNull() as a
  common R3 offender, with concrete wrong/right example
2026-05-10 21:56:51 -07:00
fullex
ee15daeef2 docs(data-api): correct drizzle-kit DEFAULT-change rationale
The "DB defaults are near-permanent" guidance previously claimed
drizzle-kit cannot auto-generate the SQLite table rebuild for DEFAULT
changes. That's incorrect: drizzle-kit emits the full PRAGMA / CREATE
__new_xxx / INSERT SELECT / DROP / RENAME / re-create-indexes sequence
automatically.

Rewrite the supporting argument from "tooling can't do it" to:
- every change forces a full-table rebuild at runtime (schema lock,
  ~2x temporary disk, slow on multi-GB tables);
- DEFAULT changes never touch existing rows;
- legacy NULL backfill must be hand-written into the rebuild's
  INSERT SELECT line via COALESCE - drizzle-kit will not synthesize
  that.

Conclusion (near-permanent) and Safe bias remain unchanged - only the
underlying mechanics are corrected.

Drop the drizzle-orm#2489 reference and the "drizzle-kit generate
--custom" workaround it implied.
2026-04-29 08:36:03 -07:00
fullex
c21eb8139e docs(data-api): elevate "DB DEFAULT is near-permanent" guidance
Codify what was implicit: putting a value into a DB column DEFAULT for the first time costs nothing, but changing it later in SQLite is expensive and asymmetric (no ALTER COLUMN SET DEFAULT; drizzle-kit emits only an explanatory comment without naming the affected column — drizzle-orm#2489). So the first write is effectively the final one, and the placement bias should flip from "DB by default" to "service unless certain".

Verified via drizzle-orm maintainer (Andrii Sherman) Medium article, drizzle-orm GitHub issues #2489 / #5360 / #1313, and Drizzle docs via context7. Empirical: this repo's 7 existing migrations are all ALTER TABLE ADD COLUMN — zero ALTER COLUMN — confirming the team has so far avoided the manual rebuild path organically.

Spec changes in best-practice-default-values-and-nullability.md:
- New section "DB defaults are near-permanent" between DM2 and Quick chooser
- DM2 → DB DEFAULT row note tightened, links to new section
- Quick chooser flipped: "unsure" now → Service ?? (was "try DB first")
- Standard Layered Design: emoji moved from DB DEFAULT to service ?? (product-chosen value)
- Anti-patterns: emoji-mask row's Correct column updated; new row for "speculative DB DEFAULT thinking I can tune later"
- Case Study A's Fix description aligned with new bias
- Related References: drizzle-orm#2489 added

Companion update in database-patterns.md: same precision for the DB .default('X') cell in "Where the default value lives".
2026-04-29 03:35:23 -07:00
fullex
69e9cb6184 docs(data-api): codify default values & nullability rules
Establish team standards for placing default values across the data stack (DB DEFAULT / Drizzle $defaultFn / Zod .default() / service ??) and judging column nullability.

Originating context: PR #14689 fixed a PATCH leakage bug rooted in defaults living in three places at once (DB, Zod Create schema, row mapper) for the assistant entity. The follow-up discussion recovered general principles that other entities (agent, message) also violate; this doc captures them as a reference for future schema/service work.

- New: best-practice-default-values-and-nullability.md — Five rules (R1-R5), decision matrices for nullability and default placement, standard layered design example, anti-patterns table, case studies (assistant, modelId, agent.accessiblePaths)
- api-design-guidelines.md: refine Rule C Update derivation guidance; add Rule E discouraging Zod .default() on entity / Create / Update schemas
- data-api-in-main.md: upgrade row-mapper ?? fallbacks from "needs hand-write" tolerance to anti-pattern; add Write-path defaults section codifying R4
- database-patterns.md: add Column Nullability and Defaults section; add R3 no-fabricated-fallbacks bullet to Row → Entity Mapping
- README.md: index entry under Reference Guides

No code changes. Implementation follow-up will land in separate PRs that apply these rules entity by entity.
2026-04-29 02:49:11 -07:00
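
A minimal sketch of the write-path default idea mentioned in this commit (the service resolves the default once with `??`, and the row mapper does not fabricate fallbacks). All names here (`CreateAssistantInput`, `AssistantRow`, `DEFAULT_EMOJI`, `toInsertRow`, `rowToEntity`) are hypothetical and chosen only for illustration; the referenced best-practice document is the authority on the actual rules.

```typescript
// Hypothetical types and names; not taken from the repository.
interface CreateAssistantInput {
  name: string;
  emoji?: string;
}

interface AssistantRow {
  name: string;
  emoji: string;
}

// Assumed product-chosen value that lives in the service, not in a DB DEFAULT.
const DEFAULT_EMOJI = "🤖";

// Write path: the default is applied exactly once, before the insert.
function toInsertRow(input: CreateAssistantInput): AssistantRow {
  return { name: input.name, emoji: input.emoji ?? DEFAULT_EMOJI };
}

// Read path: the mapper returns what is stored and adds no fallback values.
function rowToEntity(row: AssistantRow): AssistantRow {
  return { ...row };
}
```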
fullex
f8f0e4b19a refactor(data-services): consolidate row-to-entity mapping utilities
Introduce shared `services/utils/rowMappers.ts` exporting three helpers:
`nullsToUndefined` (renamed from the previous `stripNulls`, with a
corrected type signature that preserves `notNull()` columns unchanged),
`timestampToISO` for guaranteed-present timestamps, and
`timestampToISOOrUndefined` for nullable ones.

Migrate 9 services off hand-rolled null/date handling:

- Eliminate the `stripNulls` duplicate in MiniAppService
- Replace 18 repetitive createdAt/updatedAt ternaries with helper calls
- Fix McpServerService misuse where an optional domain field was being
  forced to a synthesized "now" value; restore honest undefined semantics
- Simplify KnowledgeBaseService.rowToKnowledgeBase via the spread idiom,
  bypassing `clean` for the `T | null` embeddingModelId field

Document the paradigm in docs/references/data/data-api-in-main.md with
standard and advanced skeletons, plus an explicit "when NOT to spread"
list covering services with field renaming, custom fallbacks, computed
fields, or sensitive-data sanitization. Per-helper design decisions
(shallow vs. recursive, rejected alternatives) live in
services/utils/README.md.
2026-04-21 03:21:05 -07:00
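
The three helpers named in this commit might look roughly like the sketch below. The function names come from the commit message, but the bodies and type signatures are assumptions: the sketch assumes a shallow null-to-undefined conversion and timestamps stored as epoch milliseconds, neither of which the message confirms.

```typescript
// Hypothetical implementations; the real services/utils/rowMappers.ts may differ.

// Nullable fields become optional-style `undefined`; non-null fields keep their type.
type NullsToUndefined<T> = {
  [K in keyof T]: null extends T[K] ? Exclude<T[K], null> | undefined : T[K];
};

// Shallow conversion (assumption): only top-level null values are replaced.
function nullsToUndefined<T extends Record<string, unknown>>(row: T): NullsToUndefined<T> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(row)) {
    out[key] = value === null ? undefined : value;
  }
  return out as NullsToUndefined<T>;
}

// For timestamps that are guaranteed present (assumed epoch milliseconds).
function timestampToISO(ms: number): string {
  return new Date(ms).toISOString();
}

// For nullable timestamps: absent stays absent instead of becoming "now".
function timestampToISOOrUndefined(ms: number | null | undefined): string | undefined {
  return ms == null ? undefined : timestampToISO(ms);
}
```

Keeping the nullable variant separate makes it harder to repeat the McpServerService mistake noted above, where a missing value was silently replaced by a synthesized timestamp.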
fullex
40719bd77c fix(MessageService): map snake_case columns from raw SQL recursive CTEs
Drizzle's `casing: 'snake_case'` config only applies to the ORM channel.
Raw SQL via `db.all(sql`...`)` returns SQLite's native snake_case columns
with no runtime mapping — the TypeScript generic on `db.all<T>()` is a
compile-time assertion only. The recursive CTEs in `getTree`,
`getBranchMessages`, and `getPathToNode` used `SELECT *` and asserted
results to `messageTable.$inferSelect` (camelCase), so downstream code
silently read `undefined` from `parentId` and `siblingsGroupId`. This
broke multi-model message grouping whenever the renderer rebuilt
branches from the database (page refresh, topic switch).

Switch to the "CTE for IDs, ORM for rows" pattern: the recursive CTE
collects IDs only (single-word columns are casing-safe), then full rows
are fetched via `db.select().from(messageTable).where(inArray(...))`
where Drizzle applies camelCase mapping automatically. CTE order is
restored via a Map since SQL IN-list does not preserve order. Rename
`tree_depth` to `treeDepth` for naming consistency.

Document the pattern as the project standard in
`docs/references/data/database-patterns.md` and add regression tests
covering all three methods plus the multi-model siblings scenario.
2026-04-14 22:49:06 -07:00
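
The order-restoration step of the "CTE for IDs, ORM for rows" pattern can be sketched as below. The helper name `restoreOrder` and the `{ id: string }` row shape are hypothetical. The sketch assumes the recursive CTE has already produced an ordered ID list and the ORM query has returned the matching rows in arbitrary order.

```typescript
// Hypothetical helper; illustrates restoring CTE order with a Map.
function restoreOrder<Row extends { id: string }>(
  orderedIds: readonly string[],
  rows: readonly Row[],
): Row[] {
  // Index the ORM rows by ID for O(1) lookup.
  const byId = new Map<string, Row>();
  for (const row of rows) {
    byId.set(row.id, row);
  }

  // Walk the IDs in CTE order; skip IDs with no matching row.
  const ordered: Row[] = [];
  for (const id of orderedIds) {
    const row = byId.get(id);
    if (row !== undefined) ordered.push(row);
  }
  return ordered;
}
```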
fullex
3c140fc3be refactor(data): redesign database seeding architecture with SeedRunner and versioned seeders
Replace the ad-hoc seeding system with a journal-based architecture that
tracks seed versions via app_state table and skips unchanged seeds on startup.

- Introduce ISeeder interface with name/version/description/run() contract
- Add SeedRunner orchestrator with journal-based version tracking
- Rename ISeed -> ISeeder, migrate() -> run() (align with industry conventions)
- Rename *Seed -> *Seeder classes, *Seeding.ts -> *Seeder.ts files
- Move seeders into seeding/seeders/ subdirectory for better organization
- Add hashObject utility for auto-computing version from static data sources
- PreferenceSeeder/TranslateLanguageSeeder: auto checksum via hashObject()
- PresetProviderSeeder: lazy getter using RegistryLoader.getProvidersVersion()
- Simplify DbService.onInit() to single SeedRunner.runAll() call
- Add SeedRunner tests and PreferenceSeeder tests
- Add database-seeding-guide.md with version strategy documentation

Signed-off-by: fullex <0xfullex@gmail.com>
2026-04-10 06:29:11 -07:00
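
A rough sketch of the journal-based seeding flow described in this commit follows. The `ISeeder` members (`name`, `version`, `description`, `run()`) and the `runAll` name come from the commit message. Everything else is an assumption: the in-memory `Map` stands in for the `app_state` table, and the `runAll` parameter and return value are hypothetical.

```typescript
// Contract as listed in the commit message.
interface ISeeder {
  readonly name: string;
  readonly version: string;
  readonly description: string;
  run(): Promise<void>;
}

// Hypothetical runner: a Map replaces the app_state journal for illustration.
class SeedRunner {
  private readonly journal: Map<string, string>;

  constructor(journal: Map<string, string>) {
    this.journal = journal;
  }

  // Runs each seeder whose recorded version differs; returns the names that ran.
  async runAll(seeders: readonly ISeeder[]): Promise<string[]> {
    const executed: string[] = [];
    for (const seeder of seeders) {
      // Unchanged version: skip on this startup.
      if (this.journal.get(seeder.name) === seeder.version) continue;
      await seeder.run();
      // Record the version only after the seeder finished without throwing.
      this.journal.set(seeder.name, seeder.version);
      executed.push(seeder.name);
    }
    return executed;
  }
}
```

Writing the journal entry after `run()` means a failed seeder is retried on the next startup, which is one plausible design choice rather than a confirmed detail of the real runner.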
亢奋猫
a83f98fd24 docs: consolidate bilingual docs, add link checker and architecture overview (#14138)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-09 16:01:40 +08:00