CherryHQ-cherry-studio/docs/references/data/data-api-in-main.md
fullex 25fe3343fc docs(data): add canonical pagination guide and wire references
Pagination docs were scattered across api-types.md (types + cursor
semantics), data-api-in-renderer.md (hooks), data-api-in-main.md (offset
example + keyset note), api-design-guidelines.md (query params), and
data-ordering-guide.md (cache shapes + determinism), with no single
discoverable home for the offset-vs-cursor model.

Add docs/references/data/data-pagination-guide.md as the canonical hub
(mirrors data-ordering-guide.md): two modes, four-layer quickstart, wire
contract, server impl (offset + keyset cursor + multi-band caveat),
renderer consumption, FTS pagination, gotchas, and a see-also map. Other
docs keep their authoritative slice and link to the guide; the migrated
conceptual prose is removed from api-types.md to avoid duplication.

Also fix two pre-existing broken anchors found while verifying links
(database-patterns withWriteTx; ordering guide section number).
2026-06-22 06:41:27 -07:00

# DataApi in Main Process
This guide covers how to implement API handlers and services in the Main process.
## Architecture Layers
```
Handlers → Services → Database
```
- **Handlers**: Thin layer, extract params, call service, transform response
- **Services**: Business logic, validation, transaction coordination, data access via Drizzle ORM
- **Database**: Drizzle ORM + SQLite
## Transport Adapters
ApiServer is transport-agnostic. Adapters in `api/core/adapters/` bridge specific transports (IPC, HTTP) to ApiServer. Each adapter implements `Disposable` for automatic lifecycle cleanup. See `IpcAdapter.ts` JSDoc for design rationale and extension guide.
## Implementing Handlers
### Location
`src/main/data/api/handlers/`
### Handler Responsibilities
- Extract parameters from request
- Delegate to business service
- Transform response for IPC
- **NO business logic here**
### Handler Type Annotation
Every per-module handler record **MUST** be annotated with `HandlersFor<XxxSchemas>`. This is the canonical shape for all files in `src/main/data/api/handlers/` — not a convention to choose among alternatives.
`HandlersFor<XxxSchemas>` enforces two invariants:
- **Paths are narrowed to the module's own schema.** Path strings outside `XxxSchemas` (typos, cross-module leaks) produce a compile error.
- **Methods are exhaustive.** Every `path + method` declared in `XxxSchemas` must have a handler; adding an endpoint to the schema without a matching handler is a compile error.
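As a rough illustration of how a mapped type can enforce both invariants, here is a minimal sketch. `HandlersForSketch`, `RequestOf`, `ResponseOf`, `Note`, and `NoteSchemas` are hypothetical names invented for this example; the real `HandlersFor` in `@shared/data/api/apiTypes` may be defined differently.

```typescript
// Hypothetical, simplified stand-in for HandlersFor (assumption, not the real type).
type RequestOf<E> = { body: E extends { body: infer B } ? B : undefined }
type ResponseOf<E> = E extends { response: infer R } ? R : never

type HandlersForSketch<S> = {
  // Every path in the schema is required; unknown paths are excess properties.
  [Path in keyof S]: {
    // Every method declared for the path is required as well.
    [Method in keyof S[Path]]: (
      req: RequestOf<S[Path][Method]>
    ) => Promise<ResponseOf<S[Path][Method]>>
  }
}

interface Note {
  id: string
  text: string
}

type NoteSchemas = {
  '/notes': {
    GET: { response: Note[] }
    POST: { body: { text: string }; response: Note }
  }
}

const notes: Note[] = []

const noteHandlers: HandlersForSketch<NoteSchemas> = {
  '/notes': {
    GET: async () => notes,
    POST: async ({ body }) => {
      const note: Note = { id: String(notes.length + 1), text: body.text }
      notes.push(note)
      return note
    }
    // Deleting POST, or adding a '/nots' key next to '/notes', fails type-checking.
  }
}
```

The design point is that the schema type is the single source of truth: the handler record cannot drift from it without the compiler noticing.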
### Example Handler
```typescript
// handlers/topics.ts
import { topicService } from '@data/services/TopicService'
import type { HandlersFor } from '@shared/data/api/apiTypes'
import type { TopicSchemas } from '@shared/data/api/schemas/topics'

export const topicHandlers: HandlersFor<TopicSchemas> = {
  '/topics': {
    GET: async ({ query }) => {
      const { page = 1, limit = 20 } = query ?? {}
      return await topicService.list({ page, limit })
    },
    POST: async ({ body }) => {
      return await topicService.create(body)
    }
  },
  '/topics/:id': {
    GET: async ({ params }) => {
      return await topicService.getById(params.id)
    },
    PUT: async ({ params, body }) => {
      return await topicService.replace(params.id, body)
    },
    PATCH: async ({ params, body }) => {
      return await topicService.update(params.id, body)
    },
    DELETE: async ({ params }) => {
      await topicService.delete(params.id)
    }
  }
}
```
### Register Handlers
```typescript
// handlers/index.ts
import { topicHandlers } from './topics' // matches handlers/topics.ts above
import { messageHandlers } from './message'

export const allHandlers: ApiImplementation = {
  ...topicHandlers,
  ...messageHandlers
}
```
## Implementing Services
### Location
`src/main/data/services/`
### Service Responsibilities
- Business validation
- Transaction coordination
- Domain workflows
- Data access via Drizzle ORM
**Scope limit:** A DataApi service is the **data** business-logic layer — its domain workflows orchestrate **SQLite reads/writes only**, never fs/network/process/external-service side effects, even alongside a legitimate DB write and no matter how deeply nested. See [Hard Rule: No Non-Data Side Effects](./api-design-guidelines.md#hard-rule-no-non-data-side-effects).
### Cross-Service Table Access
Each table has exactly **one owning service** — the rule is split by access kind:
- **Writes (`insert` / `update` / `delete`) to a table you do not own: forbidden.** Call the owner's method (pass `tx` for transactional writes — owners' mutation methods accept `Pick<DbType, 'delete' | 'insert' | ...>` as the first arg). If a needed shape is missing, add a method on the owner; bulk needs get a bulk method (e.g. `purgeForEntitiesTx`).
- **Reads from a table you do not own: allowed when inlining is the simpler path.** A cross-table JOIN that combines the owner's table into your query in one round-trip is fine; reach for the owner's read API only when the read needs business logic the owner already encapsulates.
Why writes are strict: the owning service is the single source of truth for the table's invariants (unique indices, `orderKey` semantics, soft-delete, audit timestamps) and emits its mutation logs. Foreign writes split that knowledge across every caller and silence the log narrative.
- ✅ `ProviderService.delete` → `pinService.purgeForEntitiesTx(tx, 'model', ids)`
- ✅ `AssistantService.list` JOINs `entity_tag` + `tag` inline to load tags per assistant
- ❌ `ProviderService.delete` → `tx.delete(pinTable).where(...)` directly
If you're tempted to write "going through `XxxService` would be over-engineering" — stop. A 5-line method on the owner is not over-engineering; a foreign service writing to its table is.
#### Breaking a circular dependency (`dataServiceRegistry`)
When two services call **each other** (A→B and B→A), a top-level `import { bService } from './BService'` forms a value-level import cycle the bundler cannot order. Do **not** paper over it with `await import('./BService')` at the call site — that infects the caller with `async`, hides the edge from static tooling, and is easy to reintroduce.
Resolve the sibling lazily through `dataServiceRegistry` instead:
- the sibling **self-registers** at the bottom of its module: `registerDataService('BService', bService)`
- the caller **resolves at call time**: `const bService = getDataService('BService')`
The registry imports services only as `import type`, so it stays a sink in the import graph and no value cycle can form. **Only the services that form a cycle are added to the registry and self-register; every other data service stays a plain direct-import singleton and never touches it.** Acyclic cross-calls keep using a plain direct import (e.g. `pinService` above) — reach for the registry **only** when a real cycle exists.
**Tests:** the registry is populated by module load — in production each service is loaded by its DataApi handler before any call runs. A unit test that drives a cross-service path must load the sibling so it self-registers, via a side-effect import:
```ts
import '@data/services/BService' // self-registers; otherwise getDataService throws "not registered yet"
```
Contract and rationale: `src/main/data/services/dataServiceRegistry.ts`.
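The register-then-resolve flow can be sketched roughly as follows. This is a hypothetical, single-file illustration: `ServiceMapSketch`, the `Map`-backed storage, and both demo services are assumptions made for this example, and the real `dataServiceRegistry.ts` may differ in shape and error wording.

```typescript
// Hypothetical sketch of a lazy service registry (not the real dataServiceRegistry).
interface CycleService {
  label(): string
  describe(): string
}

// In the real registry these types would come from `import type`, so no value import is needed.
interface ServiceMapSketch {
  AService: CycleService
  BService: CycleService
}

const registry = new Map<keyof ServiceMapSketch, unknown>()

function registerDataService<K extends keyof ServiceMapSketch>(key: K, service: ServiceMapSketch[K]): void {
  registry.set(key, service)
}

function getDataService<K extends keyof ServiceMapSketch>(key: K): ServiceMapSketch[K] {
  if (!registry.has(key)) {
    throw new Error(`Data service "${key}" not registered yet`)
  }
  return registry.get(key) as ServiceMapSketch[K]
}

// Each service resolves its sibling at call time instead of importing it at module top level.
const aService: CycleService = {
  label: () => 'A',
  describe: () => `A sees ${getDataService('BService').label()}`
}

const bService: CycleService = {
  label: () => 'B',
  describe: () => `B sees ${getDataService('AService').label()}`
}

// Self-registration, as each module would do at its bottom.
registerDataService('AService', aService)
registerDataService('BService', bService)
```

Because the lookup happens inside `describe()`, neither service needs the other to exist while its own module is being evaluated.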
### Example Service
> The `list()` below shows the **offset** pagination shape for illustration. The
> real `TopicService` is cursor-paginated (`listByCursor`); a production offset
> list looks like `AssistantService.list`. See the
> [Pagination Guide](./data-pagination-guide.md) for both server patterns.
```typescript
// services/TopicService.ts
import { eq, desc, sql } from 'drizzle-orm'
import { application } from '@application'
import { topicTable } from '@data/db/schemas/topic'
import { DataApiErrorFactory } from '@shared/data/api'

export class TopicService {
  private get db() {
    return application.get('DbService').getDb()
  }

  async list(options: { page: number; limit: number }) {
    const { page, limit } = options
    const offset = (page - 1) * limit
    // Fetch the page and the total row count in parallel
    const [items, countResult] = await Promise.all([
      this.db.select().from(topicTable)
        .orderBy(desc(topicTable.updatedAt))
        .limit(limit).offset(offset),
      this.db.select({ count: sql<number>`count(*)` }).from(topicTable)
    ])
    return { items, total: countResult[0].count, page, limit }
  }

  async getById(id: string) {
    const [topic] = await this.db.select().from(topicTable)
      .where(eq(topicTable.id, id)).limit(1)
    if (!topic) {
      throw DataApiErrorFactory.notFound('Topic', id)
    }
    return topic
  }

  async create(data: CreateTopicDto) {
    this.validateTopicData(data)
    const [topic] = await this.db.insert(topicTable).values(data).returning()
    return topic
  }

  async update(id: string, data: Partial<UpdateTopicDto>) {
    await this.getById(id) // Throws if not found
    const [topic] = await this.db.update(topicTable)
      .set(data).where(eq(topicTable.id, id)).returning()
    return topic
  }

  async delete(id: string) {
    await this.getById(id) // Throws if not found
    await this.db.delete(topicTable).where(eq(topicTable.id, id))
  }

  private validateTopicData(data: CreateTopicDto) {
    if (!data.name?.trim()) {
      throw DataApiErrorFactory.validation({ name: ['Name is required'] })
    }
  }
}

export const topicService = new TopicService()
```
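The offset arithmetic used by `list()` can be checked in isolation. The following is a minimal in-memory sketch; `paginateOffset` is a hypothetical helper written for this example and does not exist in the codebase.

```typescript
// Hypothetical helper: the same page/limit -> offset math as list(), over an array.
interface OffsetPage<T> {
  items: T[]
  total: number
  page: number
  limit: number
}

function paginateOffset<T>(all: readonly T[], page: number, limit: number): OffsetPage<T> {
  if (!Number.isInteger(page) || page < 1) throw new RangeError('page must be an integer >= 1')
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError('limit must be an integer >= 1')
  const offset = (page - 1) * limit // page 1 starts at offset 0
  return {
    items: all.slice(offset, offset + limit),
    total: all.length, // stands in for the separate count(*) query
    page,
    limit
  }
}

const secondPage = paginateOffset(['t1', 't2', 't3', 't4', 't5'], 2, 2)
```

Here `secondPage.items` holds the third and fourth entries, and `total` stays at 5 regardless of the page requested.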
### Write-path defaults
`service.create()` passes a value into `db.insert(...).values({...})` **only** for columns that are `NOT NULL`, have neither a DB `DEFAULT` nor a `$defaultFn`, and are not already supplied by the DTO:
```ts
async create(dto: CreateXxxDto) {
  return await this.db.insert(xxxTable).values({
    ...dto,
    settings: dto.settings ?? DEFAULT_XXX_SETTINGS // service-owned default for a tunable product value
  }).returning()
}
```
For everything else — fields with DB DEFAULTs, `$defaultFn` columns, or genuinely nullable columns — **omit the field from `values({...})`**. Drizzle leaves it out of the SQL; the DB applies its own default (or NULL for nullable columns). Restating the DB's knowledge in app code creates drift risk when defaults later change.
For the cross-layer placement decision tree, see [Default Values & Nullability](./best-practice-default-values-and-nullability.md).
### Row → Entity Mapping
Each Entity Service provides a `rowToEntity` function that bridges a Drizzle row to its domain entity. Use `nullsToUndefined` (from `services/utils/rowMappers.ts`) for the SQLite NULL → TypeScript `undefined` translation.
**Standard skeleton:**
```ts
import { nullsToUndefined, timestampToISO } from './utils/rowMappers'

function rowToMcpServer(row: typeof mcpServerTable.$inferSelect): McpServer {
  const clean = nullsToUndefined(row)
  return {
    ...clean,
    type: clean.type as McpServer['type'], // narrow enum
    installSource: clean.installSource as McpServer['installSource'],
    createdAt: timestampToISO(row.createdAt),
    updatedAt: timestampToISO(row.updatedAt)
  }
}
```
Audit columns generated by `createUpdateTimestamps` are DB-level `.notNull()`, so `row.createdAt` / `row.updatedAt` narrow to `number` and `timestampToISO` is the default. `timestampToISOOrUndefined` is reserved for construction paths where the entire source row may be absent (e.g. `MiniAppService.builtinToMiniApp` merging a builtin definition with an optional `dbRow`).
**Advanced skeleton — preserving `T | null` fields:**
When the domain type declares a field as `T | null` (e.g. `KnowledgeBaseSchema.embeddingModelId: z.string().nullable()`), bypass `clean` for that field and reference `row` directly. `nullsToUndefined` narrows all top-level `null`s to `undefined` and would break the `T | null` contract if the field came from `clean`.
```ts
function rowToKnowledgeBase(row: typeof knowledgeBaseTable.$inferSelect): KnowledgeBase {
  const clean = nullsToUndefined(row)
  return {
    ...clean,
    // Preserve `string | null` contract: bypass clean (which would narrow null → undefined)
    embeddingModelId: row.embeddingModelId,
    createdAt: timestampToISO(row.createdAt),
    updatedAt: timestampToISO(row.updatedAt)
  }
}
```
Rule of thumb: **domain field typed `T | null` → use `row.x`; domain field typed `T?` or `T` → use `clean.x` (or `...clean`).**
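To make the shallow behaviour concrete, here is a minimal sketch of a helper in the spirit of `nullsToUndefined`. The name `nullsToUndefinedSketch`, the `NullsToUndefined` type, and the sample row are all assumptions made for illustration; the real helper in `services/utils/rowMappers.ts` may be implemented and typed differently.

```typescript
// Hypothetical type: top-level `T | null` fields become `T | undefined`.
type NullsToUndefined<T> = {
  [K in keyof T]: null extends T[K] ? Exclude<T[K], null> | undefined : T[K]
}

// Hypothetical shallow mapper: only top-level nulls are translated.
function nullsToUndefinedSketch<T extends Record<string, unknown>>(row: T): NullsToUndefined<T> {
  const out: Record<string, unknown> = {}
  for (const key of Object.keys(row)) {
    const value: unknown = row[key]
    out[key] = value === null ? undefined : value
  }
  return out as NullsToUndefined<T>
}

// Sample row (invented for this example).
const row = {
  id: 'kb1',
  description: null as string | null,
  embeddingModelId: null as string | null,
  meta: { note: null as string | null } // nested null is left untouched (shallow)
}

const clean = nullsToUndefinedSketch(row)

// `T?` field comes from clean; `T | null` field bypasses clean and reads row directly.
const entity = { ...clean, embeddingModelId: row.embeddingModelId }
```

In this sketch `entity.description` is `undefined` while `entity.embeddingModelId` stays `null`, which mirrors the rule of thumb above.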
**When `nullsToUndefined + spread` is NOT a fit:**
Some `rowToEntity` functions do too much to benefit from spread. Keep them hand-written when any of the following apply:
- **Field renaming**: `row.parameters → domain parameterSupport` (ModelService)
- **Computed / merged fields**: `authType` derivation, `apiFeatures` merging from defaults (ProviderService)
- **Sensitive data sanitization**: `apiKeys` stripping — `...clean` would leak unsanitized values
- **Discriminator-driven field stripping with brand validation**: branded discriminated union where each variant declares only its own fields — `nullsToUndefined + spread` would emit absent fields as `undefined` and break the BO shape. Dispatch on the discriminator and call `schema.parse` per variant. Example: `FileEntryService.rowToFileEntry` for `FileEntry` (variants on `origin`); see `src/shared/data/types/file/fileEntry.ts` header (§"DB row vs Business Object") for the full DB-CHECK / BO-narrow rationale.
**Anti-pattern — `??` fallbacks for fabricated defaults:**
`row.x ?? '🌟'` / `row.x ?? []` inside `rowToEntity` is **forbidden**. Such a fallback is a sign that the column should be `NOT NULL` with a DB DEFAULT or `$defaultFn`; see [Default Values & Nullability § R3](./best-practice-default-values-and-nullability.md). The legitimate exception is when the entity field is genuinely `T | null` (e.g. `assistant.modelId`); then bypass `clean` and reference `row.x` directly to preserve the NULL contract. That is the **Advanced skeleton** above, not a `??` fallback.
**Conventions:**
1. **DB NULL ↔ domain `undefined` boundary.** Domain types under `@shared/data/types/*` use optional fields (`?:`) rather than `T | null`, aligning with the [Google TypeScript Style Guide](https://google.github.io/styleguide/tsguide.html) and keeping `null` from leaking to the renderer via IPC. `nullsToUndefined(row)` is the only place this translation happens.
2. **Batch vs single-field null handling.** For processing an entire row, always use `nullsToUndefined(row)` + spread — do NOT hand-write per-field `?? undefined`. For single values that are NOT from a row (DTO fields, computed values, function returns), inline `value ?? undefined` is enough — TypeScript narrows `T | null` to `T | undefined` automatically at the `??` expression. Do NOT wrap the single-field case in a helper.
3. **Date fields: two helpers, clear boundary.** `timestampToISO(value: number | Date): string` is the default for `rowToEntity` — audit columns from `createUpdateTimestamps` are `.notNull()`, so the DB row hands back a real `number`. `timestampToISOOrUndefined(value: number | Date | null | undefined): string | undefined` is reserved for merge paths where the source row itself may be absent (e.g. a builtin/preset definition without a preference row). Do NOT use `timestampToISOOrUndefined` as a "safer default" — if your input is a DB row, it always has these fields.
For function signature details and design-decision history (e.g. why shallow-not-recursive, why not `dnull`), see [services/utils/README.md](../../../src/main/data/services/utils/README.md).
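The boundary between the two date helpers can be sketched as below. The signatures follow the ones quoted in convention 3, but the bodies are assumptions: in particular, this sketch assumes numeric timestamps are epoch milliseconds, and the `Sketch` suffix marks these as illustrative rather than the real implementations.

```typescript
// Hypothetical body for the non-null path (assumes epoch milliseconds for numbers).
function timestampToISOSketch(value: number | Date): string {
  const date = value instanceof Date ? value : new Date(value)
  return date.toISOString()
}

// Hypothetical body for merge paths where the whole source row may be absent.
function timestampToISOOrUndefinedSketch(value: number | Date | null | undefined): string | undefined {
  return value === null || value === undefined ? undefined : timestampToISOSketch(value)
}

// A DB row always has audit columns, so the strict helper applies.
const fromRow = timestampToISOSketch(0)

// A builtin definition without a DB row has no timestamp to convert.
const dbRow: { createdAt: number } | undefined = undefined as { createdAt: number } | undefined
const fromMerge = timestampToISOOrUndefinedSketch(dbRow?.createdAt)
```

Keeping the strict helper as the default means a missing timestamp on a real row surfaces as a type error instead of silently becoming `undefined`.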
**Cursor (keyset) pagination.** List endpoints that page by a `(sortKey, id)` tuple use the shared codec and ordering helper in `services/utils/keysetCursor.ts`: `decodeListCursor` / `encodeCursor` for the `<key>:<id>` wire format, and `keysetOrdering(keyCol, idCol, { major, tie })`, which returns both the strict-tuple WHERE predicate (`.where(cursor)`) and its matching `orderBy`, derived from one direction spec. Do NOT hand-write cursor encode/decode, the keyset WHERE tuple, or the `ORDER BY` in a service. See [services/utils/README.md](../../../src/main/data/services/utils/README.md) for the list-vs-search decode policy split and boundaries, and the [Pagination Guide](./data-pagination-guide.md) for the end-to-end pagination model (offset + cursor).
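For intuition only, the strict-tuple idea behind keyset pagination can be sketched over an in-memory array. Everything here is hypothetical: `encodeCursorSketch`, `decodeCursorSketch`, and `pageAfter` are invented for this example, the sort key is assumed to be numeric, and both columns are assumed to sort descending. Services should keep using the shared helpers in `keysetCursor.ts` rather than code like this.

```typescript
interface TopicRow {
  id: string
  updatedAt: number
}

// Hypothetical codec for a `<key>:<id>` cursor (assumes a numeric key).
function encodeCursorSketch(row: TopicRow): string {
  return `${row.updatedAt}:${row.id}`
}

function decodeCursorSketch(cursor: string): { key: number; id: string } {
  const sep = cursor.indexOf(':') // split on the first colon; the id may contain more
  const key = Number(cursor.slice(0, sep))
  if (sep <= 0 || !Number.isFinite(key)) throw new Error(`Malformed cursor: ${cursor}`)
  return { key, id: cursor.slice(sep + 1) }
}

// In-memory equivalent of ORDER BY updatedAt DESC, id DESC with a strict-tuple predicate.
function pageAfter(rows: readonly TopicRow[], limit: number, cursor?: string) {
  const sorted = [...rows].sort(
    (a, b) => b.updatedAt - a.updatedAt || (a.id < b.id ? 1 : a.id > b.id ? -1 : 0)
  )
  const after = cursor ? decodeCursorSketch(cursor) : undefined
  // Strict tuple comparison: (updatedAt, id) < (after.key, after.id)
  const rest = after
    ? sorted.filter((r) => r.updatedAt < after.key || (r.updatedAt === after.key && r.id < after.id))
    : sorted
  const items = rest.slice(0, limit)
  const last = items[items.length - 1]
  const nextCursor = rest.length > limit && last ? encodeCursorSketch(last) : undefined
  return { items, nextCursor }
}

const sampleRows: TopicRow[] = [
  { id: 'a', updatedAt: 10 },
  { id: 'b', updatedAt: 20 },
  { id: 'c', updatedAt: 20 },
  { id: 'd', updatedAt: 5 }
]

const page1 = pageAfter(sampleRows, 2)
const page2 = pageAfter(sampleRows, 2, page1.nextCursor)
```

The `id` tie-breaker is what keeps rows with equal `updatedAt` (here `b` and `c`) from being skipped or repeated across pages.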
### Service with Transaction
```typescript
async createTopicWithMessage(data: CreateTopicWithMessageDto) {
  const db = application.get('DbService').getDb()
  // Both inserts commit or roll back together
  return await db.transaction(async (tx) => {
    const [topic] = await tx.insert(topicTable).values(data.topic).returning()
    const [message] = await tx.insert(messageTable).values({
      ...data.message,
      topicId: topic.id
    }).returning()
    return { topic, message }
  })
}
```
### Transaction Method Naming
Service methods accepting a Drizzle transaction:
| Rule | |
|---|---|
| Parameter position | `tx` is the **first** parameter |
| Method name | ends with `Tx` |
| Parameter type | `Pick<DbType, '...'>` with the minimum operations needed |
| Non-Tx wrapper | optional; thin `db.transaction(...)` wrapper, only when a caller needs to own the transaction |
```ts
// ✅
async purgeForEntityTx(tx: Pick<DbType, 'delete'>, entityType: EntityType, entityId: string): Promise<void>
// ❌ tx not first
async purgeForEntity(entityType: EntityType, entityId: string, tx: Pick<DbType, 'delete'>)
// ❌ missing Tx suffix
async purgeForEntity(tx: Pick<DbType, 'delete'>, entityType: EntityType, entityId: string)
// ❌ over-broad type
async purgeForEntityTx(tx: DbType, entityType: EntityType, entityId: string)
```
Optional non-Tx wrapper:
```ts
async purgeForEntity(entityType: EntityType, entityId: string): Promise<void> {
  await this.db.transaction((tx) => this.purgeForEntityTx(tx, entityType, entityId))
}
```
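How a `...Tx` method composes into a caller-owned transaction can be sketched with a toy, synchronous stand-in for the database. `DbTypeSketch`, `FakeDb`, `pinServiceSketch`, and `deleteProvider` are all hypothetical and exist only for this example; the real `DbType` is the Drizzle database type and its methods look different.

```typescript
// Hypothetical, heavily simplified stand-in for DbType (assumption for illustration).
interface DbTypeSketch {
  insert(table: string, row: string): void
  delete(table: string, match: string): number
  transaction<T>(fn: (tx: DbTypeSketch) => T): T
}

class FakeDb implements DbTypeSketch {
  readonly tables = new Map<string, string[]>()

  insert(table: string, row: string): void {
    this.tables.set(table, [...(this.tables.get(table) ?? []), row])
  }

  delete(table: string, match: string): number {
    const rows = this.tables.get(table) ?? []
    const kept = rows.filter((r) => r !== match)
    this.tables.set(table, kept)
    return rows.length - kept.length // number of rows removed
  }

  transaction<T>(fn: (tx: DbTypeSketch) => T): T {
    return fn(this) // no rollback here; the sketch only shows how tx is threaded
  }
}

// Owner of the "pin" table: tx first, Tx suffix, narrowest Pick that still works.
const pinServiceSketch = {
  purgeForEntityTx(tx: Pick<DbTypeSketch, 'delete'>, entityId: string): number {
    return tx.delete('pin', entityId)
  }
}

// Caller owns the transaction and delegates the foreign-table write to the owner.
function deleteProvider(db: DbTypeSketch, providerId: string): number {
  return db.transaction((tx) => {
    const purged = pinServiceSketch.purgeForEntityTx(tx, providerId)
    tx.delete('provider', providerId)
    return purged
  })
}
```

Typing `tx` as `Pick<..., 'delete'>` documents, at the signature level, that the method cannot insert or update through the handle it is given.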
## Repository Pattern (Strongly Discouraged)
> **⚠️ Do NOT create Repository files by default.** Services handle both business logic and data access directly via Drizzle ORM. This is an intentional design decision.
>
> Only create a separate Repository when you are **1000% certain** it is absolutely necessary — e.g., extremely complex multi-table queries with joins/CTEs that would make the Service unreadable, AND the query logic is reused across multiple services.
>
> If in doubt, keep it in the Service. The overhead of an extra architectural layer is not justified for this project's scale (Electron desktop app + SQLite).
### Registry Services (Supplementary)
> In rare cases where a handler needs to merge **read-only preset data**
> (shipped JSON/TS) with database data, a Registry Service may be introduced.
> This is uncommon — the vast majority of services are Entity Services.
Registry Services:
- **Do NOT own a database table** and **do NOT access the database directly**
- Obtain DB data by calling the owning Entity Service
- Named `{Domain}RegistryService` (e.g., `ProviderRegistryService`)
- Primary data source is static preset data (JSON files, TS constants)
- All methods are read-only (no inserts, updates, or deletes)
See [Layered Preset Pattern](./best-practice-layered-preset-pattern.md) for the general architecture.
### Registry Sub-Resource Endpoints
Registry data reaches the renderer through sub-resource endpoints on the
owning entity. Three rules govern their shape.
**GET only.** Registry endpoints are stateless reads — preset merged with
DB rows. `POST` is reserved for state changes; using it for reads breaks
SWR caching, request dedup, and retry safety. For composite IDs containing
`/`, use the greedy path form `:id*` (see
[Greedy Path Parameters](./api-design-guidelines.md#greedy-path-parameters)).
For batched lookups exceeding URL limits, split into multiple GETs — DataApi
dedup makes burst reads cheap.
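Splitting a large lookup into several GETs could look roughly like the sketch below. `chunkIds` is a hypothetical helper, and chunking by a fixed count is a simplifying assumption; a real limit would depend on the encoded URL length rather than on the number of IDs.

```typescript
// Hypothetical helper: split IDs into groups small enough for one GET each.
function chunkIds(ids: readonly string[], maxPerRequest: number): string[][] {
  if (!Number.isInteger(maxPerRequest) || maxPerRequest < 1) {
    throw new RangeError('maxPerRequest must be an integer >= 1')
  }
  const chunks: string[][] = []
  for (let i = 0; i < ids.length; i += maxPerRequest) {
    chunks.push(ids.slice(i, i + maxPerRequest))
  }
  return chunks
}

// Each inner array would back one GET request.
const batches = chunkIds(['m1', 'm2', 'm3', 'm4', 'm5'], 2)
```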
**Colon-notation for derived views.** When the sub-resource name is
ambiguous, disambiguate with AIP-136 colon notation:
| Shape | Use for |
|---|---|
| `GET /:parent/:id/:sub` | List the merged collection |
| `GET /:parent/:id/:sub:action` | Compute a derived view |
| `GET /:parent/:id/:sub/:childId` | Look up one merged item |
**Registry packages are main-only.** Packages like
`@cherrystudio/provider-registry` ship the preset data Registry Services
merge against. **Renderer code must not import them.** Two reasons:
- **Bundle waste.** Registry packages are large (preset catalogs, vendor
metadata, icons). Importing them in the renderer ships the same payload
twice — once in the main bundle, once in every renderer entry that
touches it — for data the renderer already gets via DataApi.
- **Merge already lives in main.** Registry Services merge preset + DB
rows on the main side. Re-doing the merge in the renderer duplicates
logic and re-introduces preset-version drift this layer was designed
to remove.
The merged result reaches the renderer exclusively through these endpoints.
## Error Handling
### Using DataApiErrorFactory
```typescript
import { DataApiErrorFactory } from '@shared/data/api'

// Not found
throw DataApiErrorFactory.notFound('Topic', id)

// Validation error
throw DataApiErrorFactory.validation({
  name: ['Name is required', 'Name must be at least 3 characters'],
  email: ['Invalid email format']
})

// Database error
try {
  await db.insert(table).values(data)
} catch (error) {
  throw DataApiErrorFactory.database(error, 'insert topic')
}

// Invalid operation
throw DataApiErrorFactory.invalidOperation(
  'delete root message',
  'cascade=true required'
)

// Conflict
throw DataApiErrorFactory.conflict('Topic name already exists')

// Timeout
throw DataApiErrorFactory.timeout('fetch topics', 3000)
```
## Adding New Endpoints
### Step-by-Step
1. **Define schema** in `src/shared/data/api/schemas/`
```typescript
// schemas/topics.ts
export type TopicSchemas = {
  '/topics': {
    GET: { response: PaginatedResponse<Topic> }
    POST: { body: CreateTopicDto; response: Topic }
  }
}
```
2. **Register schema** in `schemas/index.ts`
```typescript
export type ApiSchemas = AssertValidSchemas<TopicSchemas & MessageSchemas>
```
3. **Create service** in `services/`
4. **Implement handler** in `handlers/`
5. **Register handler** in `handlers/index.ts`
## Best Practices
1. **Keep handlers thin**: Only extract params and call services
2. **Put logic in services**: All business rules and data access belong in services
3. **Do NOT create separate Repository files**: Services own data access directly via Drizzle ORM
4. **Always use `.returning()`**: Get inserted/updated data without re-querying
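5. **Support transactions**: Expose `...Tx` methods that take `tx` as the first parameter so callers can compose them into their own transaction (see [Transaction Method Naming](#transaction-method-naming))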
6. **Validate in services**: Business validation belongs in the service layer
7. **Use error factory**: Consistent error creation with `DataApiErrorFactory`
8. **Use `nullsToUndefined` in `rowToEntity`**: Canonical SQLite NULL → `undefined` translation; shallow, not recursive (see [Row → Entity Mapping](#row--entity-mapping))