17 Commits

Author SHA1 Message Date
槑囿脑袋
2fbc7bda1c feat(knowledge): optional embedding model with BM25-only fallback (#16553)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-07-02 20:19:21 +08:00
fullex
9b9570116a refactor(db): replace libsql with better-sqlite3 + sqlite-vec (#16626) 2026-07-02 13:21:13 +08:00
Phantom
4ef2889fd3 refactor(file-ref): split persistent ref ownership (#16532)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eurfelux <eurfelux@gmail.com>
2026-06-30 18:44:44 +08:00
fullex
ecffd880f0 refactor(store): delete the deprecated v1 Redux store
With ImportService migrated to DataApi (#16415), the v1 Redux store at
src/renderer/store/ has zero runtime consumers. Delete the directory (30
files) along with the already-stubbed main-process ReduxService bridge.

Clean up the now-stale residue:
- Remove dead vi.mock('@renderer/store') factories from 14 test files
  whose code under test no longer imports the store.
- Drop Redux from the CLAUDE.md "Removing" enumerations and refresh the
  renderer-architecture / knowledge-service / v2-todo docs.
- Fix dangling store/ references in the MiniAppMigrator and
  builtinMcpServers comments.

The v2 migration's Redux Persist readers (ReduxStateReader etc.) are left
untouched: they read serialized on-disk data, not this store.
2026-06-26 01:08:20 -07:00
槑囿脑袋
407b07652d feat(knowledge): re-attribute v1 directory vectors to per-file children on v2 migration (#16093) 2026-06-18 14:00:55 +08:00
槑囿脑袋
9e5fb7a071 feat(knowledge): url/note snapshots, note data source, and flat raw material storage (#16062)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-16 14:23:22 +08:00
槑囿脑袋
20035a83ff feat(knowledge): engine-portable per-base index store and retrieval cutover (#15973)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-15 19:43:10 +08:00
槑囿脑袋
1382a8dd7c feat(knowledge): route embeddings and reranking through the AI service (#15796)
### What this PR does

Before this PR:

- Knowledge embeddings and reranking ran through the legacy
embedjs-based
knowledgeV1 stack with their own provider clients, independent of the
app's
  AI service.
- File-processing intake accepted several heterogeneous input shapes,
and
knowledge file items were tracked by FileEntry ids, coupling file
content to
  the file-manager entry/cache.

After this PR:

- Embeddings and reranking are routed through the unified `AiService`
(with
cherryin rerank support) and guarded by strict embedding-dimension
validation
  that rejects stale/mismatched vectors.
- File-processing intake is collapsed to a single path-based model;
knowledge
  file items are stored by base-relative path under the knowledge-base
directory, and v1 uploads are copied into the v2 base dir during
migration so
  migrated items stay reindexable/restorable.
- Legacy `knowledgeV1` is removed; the orchestration services were
renamed to
  `KnowledgeService` / `FileProcessingService`.
- Chat -> knowledge attach is temporarily disconnected (tracked TODO)
while the
  v2 file-manager bridge is rebuilt.

Fixes #N/A (no linked issue)

### Why we need it and why it was done in this way

Routing embeddings/rerank through `AiService` unifies provider handling
and
credentials and removes the parallel embedjs client stack and its v1
coupling.
Storing knowledge files by base-relative path (instead of FileEntry ids)
makes
each knowledge base self-contained and portable.

The following tradeoffs were made:

- A large, coordinated refactor plus a migration step that physically
copies v1
uploads into the v2 base dir, in exchange for removing the parallel
client
  stack and making bases self-contained.
- Base-relative path storage required a fail-fast/dedup strategy for
same-named
  files and a guard for blank legacy filenames.

The following alternatives were considered:

- Keeping the embedjs stack behind an adapter — rejected; perpetuates
the
  parallel client and v1 coupling.
- Keeping FileEntry-id storage — rejected; couples knowledge files to
the
  file-manager cache and blocks portability.

### Breaking changes

- `knowledgeV1` is removed. Legacy v1 knowledge data reaches v2 only
through the
  v2 migrators; there is no v1 fallback.
- The v2 knowledge HTTP API (API gateway) now returns v2-native
per-entry fields
(`embeddingModelId`, `createdAt` on base entries; `chunkId`,
`scoreKind`,
  `rank` on search results). The response envelope (`knowledge_bases`,
  `searched_bases`, `total`) is unchanged. See

`v2-refactor-temp/docs/breaking-changes/2026-06-05-knowledge-api-v2.md`.

### Special notes for your reviewer

- This branch went through several rounds of multi-agent code review.
The most
recent 6 commits address review findings: directory-import path
collisions,
migrated-file source copying + blank `relativePath` guard, addItems
rollback
error preservation, eager `document_to_markdown` output-target
validation, a
`CompletedKnowledgeBase` type guard, and breaking-changes doc
corrections.
- Chat -> knowledge attach is intentionally disconnected for now
(tracked in
  `v2-refactor-temp/docs/knowledge/knowledge-todo.md`).
- Local full `pnpm lint`/`pnpm test` was not run per the project's
review
  conventions; please rely on CI / `pnpm build:check`.

### Checklist

- [x] Branch: This PR targets the correct branch — `main` for active
development, `v1` for v1 maintenance fixes
- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: A user-guide update was considered and is present
(link) or not required.
- [x] Self-review: I have reviewed my own code before requesting review
from others

### Release note

```release-note
NONE
```

---------

Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 14:04:29 +08:00
槑囿脑袋
e818c374e6 refactor(knowledge): remove sitemap source from v2 (#15682)
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-06 16:26:15 +08:00
Yiran
61c013bd5b feat(knowledge-base): redesign knowledge workspace (#15518)
Co-authored-by: eeee0717 <chentao020717Work@outlook.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Co-authored-by: 槑囿脑袋 <70054568+eeee0717@users.noreply.github.com>
Signed-off-by: akazaakari950718-dev <akazaakari950718@gmail.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-06-02 16:03:37 +08:00
槑囿脑袋
2250ccb52f feat(knowledge): integrate file processing for document ingestion (#15470)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
Signed-off-by: icarus <eurfelux@gmail.com>
2026-05-31 23:10:36 +08:00
槑囿脑袋
9a54295c02 refactor(knowledge): knowledge workflow jobs round2 (#15371)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
2026-05-28 20:33:18 +08:00
槑囿脑袋
adefbb7efb refactor(knowledge): knowledge v2 rfc and impletent round 1 (#15309)
Co-authored-by: Phantom <eurfelux@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
Signed-off-by: icarus <eurfelux@gmail.com>
2026-05-27 11:54:25 +08:00
槑囿脑袋
434d4a938f refactor(knowledge-data): adjust knowledge v2 data and service (#14719)
Co-authored-by: fullex <0xfullex@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-05-01 19:24:48 +08:00
槑囿脑袋
7f5486ca51 fix(v2): remove knowledge libsql vector indexes (#14280) 2026-04-15 21:42:56 +08:00
槑囿脑袋
798a15e919 feat(v2): knowledge service backend (#14090)
Co-authored-by: SuYao <sy20010504@gmail.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-10 21:24:06 +08:00
亢奋猫
a83f98fd24 docs: consolidate bilingual docs, add link checker and architecture overview (#14138)
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
2026-04-09 16:01:40 +08:00