54 Commits

Author SHA1 Message Date
zyairehhh
9333b31720 docs: add WeChat community QR code (#133) 2026-07-03 10:18:12 +08:00
kero-ly
8087110620 Add optional avatar background removal
Support immersive mode in Video Creation (视频创作)
2026-06-29 21:11:18 +08:00
zyairehhh
5473bb2665 feat: add local F5-TTS provider (#128) 2026-06-28 23:21:10 +08:00
zyairehhh
5516cd5675 docs: reorganize deployment guides (#127) 2026-06-28 13:22:46 +08:00
charm-ch
66bf0e5cd5 docs: add OpenTalking WeChat memory persona import guide 2026-06-25 14:25:21 +08:00
zyairehhh
7f37c3b49d feat: add local CosyVoice TRT sidecar deployment (#119) 2026-06-23 23:06:58 +08:00
cwang10
61f4007965 feat: add local CosyVoice runtime tuning 2026-06-23 18:15:27 +08:00
charm-ch
23429cde77 docs: add OpenTalking Huangshan digital human guide 2026-06-22 11:44:12 +08:00
kero-ly
71f87984c1 Fix missing mem0 pkg & fix hfdownload pkg version 2026-06-20 00:18:23 +08:00
zyairehhh
2128e1d256 docs: split quickstart paths 2026-06-19 18:36:06 +08:00
lyfics
f65f2bb5d0 feat: add LightRAG runtime config and quickstart updates
Squash the branch changes into a single commit.

Includes the LightRAG/memory workflow branch state, runtime-config API/UI, and quickstart service hardening.
2026-06-19 14:07:53 +08:00
charm-ch
e27b1c6501 Improve Mem0-backed memory workflows 2026-06-17 16:39:01 +08:00
Le0der
1f26a5c6d4 docs: add missing resnet18-5c106cde.pth to MuseTalk model directory layout 2026-06-17 16:37:31 +08:00
zyairehhh
61e85527b3 docs: enable versioned docs publishing (#103) 2026-06-16 17:55:29 +08:00
zyairehhh
2e8c9eb371 docs: refresh README and QuickTalk docs (#101) 2026-06-16 09:58:00 +08:00
Le0der
5d7181c214 docs: add WSL2 network mode selection guide for Windows deployment
Add section 1.3 documenting NAT vs Mirrored network mode behavior in WSL2, covering WebRTC ICE connectivity, browser microphone access, and service startup compatibility based on real-world testing on Windows 11 + WSL2 Ubuntu 24.04 + RTX 3060.
2026-06-13 17:30:43 +08:00
zyairehhh
5cdcd8dd3d fix quicktalk local assets and support QuickTalk on Apple Silicon (#98) 2026-06-12 16:41:50 +08:00
zyairehhh
b6ffab2bb4 feat: improve IndexTTS and QuickTalk video creation (#95) 2026-06-12 00:59:07 +08:00
lyfics
1f42a4e73f feat: add LightRAG knowledge retrieval 2026-06-11 17:36:50 +08:00
cwang0810
3519989ba3 feat: add Persona Package support (#87) 2026-06-10 20:41:18 +08:00
zyairehhh
d7ebea81ab Improve MuseTalk deployment setup (#83) 2026-06-06 15:12:31 +08:00
kero
91b4cc4b13 add en page, optimize homepage for deploy 2026-06-06 15:11:04 +08:00
lyfics
0b301b2ce6 feat: adapt knowledge base asset workflow 2026-06-05 23:01:38 +08:00
zyairehhh
7d29d78b28 docs: place V100 guide in deployment recipes 2026-06-05 10:42:47 +08:00
zyairehhh
5fb0c51ed2 docs: reorganize model deployment guides (#79) 2026-06-05 08:29:52 +08:00
lyfics
566fe3d8b6 feat: add agent knowledge and audio video exports (#78)
* feat: add agent knowledge and audio video exports

* fix: satisfy mypy for video creation config parsing

* fix: avoid sync wait for external audio render sessions

* fix: remove offline bundle export controls from web
2026-06-04 21:54:29 +08:00
lucaszhu-hue
0ef7587aba docs: add Atlas Cloud as an OpenAI-compatible LLM provider option
Atlas Cloud exposes an OpenAI-compatible chat-completions endpoint, so it can
be used as a drop-in LLM backend via the existing OPENTALKING_LLM_* settings.

- README.md / README.en.md: Atlas Cloud blurb, link, and logo in the
  Supported Models section.
- docs/{en,zh}/model-deployment/llm-stt.md: provider table row, .env example,
  and the full Atlas chat model list.
- .env.example: commented Atlas Cloud example.

The example uses deepseek-ai/deepseek-v4-pro (a reasoning model — give it
enough max_tokens).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 21:37:00 +08:00
zyairehhh
5f81d36d59 docs: add WebUI video workflows (#75) 2026-06-04 00:16:55 +08:00
zyairehhh
ca33fdcd26 docs: reorganize model deployment guides (#73) 2026-06-03 23:11:54 +08:00
kero
71abcd5690 homepage design 2026-06-03 21:44:04 +08:00
zyairehhh
d0bf6bea4a feat: add FasterLivePortrait video clone workflow (#70) 2026-06-03 09:12:53 +08:00
lyfics
d700882655 docs: add benchmark guide, WSL2 fix, test results, and windows deployment guide
- Add E2E benchmark run instructions and WSL2 VRAM fix to benchmark.md
- Add benchmark test results table (RTX 3090/4090/3050, NPU 910B2)
- Add windows-deployment.md for WSL2 deployment guide
- Sync all changes to English counterparts
2026-05-28 13:38:46 +08:00
zyairehhh
3c893c522d feat: add local STT/TTS QuickTalk runtime config (#61) 2026-05-26 21:08:13 +08:00
charm-ch
ed6914deb2 feat: add local musetalk backend support 2026-05-26 00:01:52 +08:00
zyairehhh
f3532c19e9 docs: update QuickTalk weight download instructions (#58) 2026-05-22 21:01:07 +08:00
lyfics
58d6e53d8f Feat/docs update (#55)
* add product-demo-live-sales

* modify product-demo-live-sales

* docs: add three new example tutorials
2026-05-22 11:09:24 +08:00
kero
2be1148e69 website structure update 2026-05-21 23:24:52 +08:00
zyairehhh
f16f786889 feat: align avatar cache prewarm flow 2026-05-21 20:07:33 +08:00
zyairehhh
288cdf68cd feat: align local wav2lip runtime parity 2026-05-18 22:54:58 +08:00
keroly
41a3b5a53a readme update & support local quicktalk, wav2lip (#43)
* readme update
* feat: support quicktalk pth local backend
* feat: add local wav2lip adapter
2026-05-17 21:55:25 +08:00
zyairehhh
b1a859e823 feat: add realtime FasterLivePortrait support (#45) 2026-05-17 21:14:22 +08:00
cwang0810
b235184428 docs: restructure documentation navigation (#44) 2026-05-17 11:36:34 +08:00
zyairehhh
e0da9184ef feat: route QuickTalk through OmniRT audio2video (#41)
* feat: route quicktalk through omnirt
* docs: add quicktalk quickstart
2026-05-15 12:03:39 +08:00
charm-ch
7030897f60 Adapt to MuseTalk 2026-05-14 11:49:27 +08:00
cwang0810
ff20603d70 docs: fix install and deployment guidance (#34) 2026-05-13 19:56:28 +08:00
cwang0810
8231d3fbe5 Docs: rebuild model deployment docs and add Pages publishing (#33)
* docs: rebuild model deployment docs and add Pages publishing
* ci: install docs dependencies in workflows
2026-05-13 17:54:11 +08:00
cwang0810
c9ac195708 Docs: restructure model backend documentation (#32)
* docs: restructure model backend documentation

* docs: clarify lightweight evaluation paths

* docs: split model deployment module

* docs: simplify models navigation

* docs: expand deployment runbook

* docs: simplify homepage capabilities

* docs: improve model docs and zh defaults
2026-05-13 17:25:22 +08:00
keroly
3fb0e12020 bugfix & update readme (#30)
Co-authored-by: kero <keroly950928@gmail.com>
2026-05-13 10:47:19 +08:00
keroly
587d1fb16b Refactor/architecture v2 (#27)
* docs: add architecture review and refactor plan

* chore: snapshot baseline test/lint output before refactor

* chore: catalog import sites that depend on deletable code

* refactor: remove src/opentalking/engine (FlashTalk local inference)

* refactor: remove local model implementations (musetalk/wav2lip/flashtalk-local)

* chore: remove demo media, multitalk_utils, duplicate env examples; relocate images

* refactor: consolidate configs to root configs/, drop src duplicate

* refactor: drop OPENTALKING_FLASHTALK_MODE; rebuild model registry shim

* refactor: remove dead src/opentalking/server (superseded by apps/api)

* refactor: drop src/opentalking/cli stubs in favor of apps/cli

* refactor: move worker test to tests/unit; stash bailian_clone for relocation

* chore: temporarily restore bailian_clone (will relocate in phase D)

* fix: stub legacy registry get_adapter/register_model for import compat

* feat: unified .env.example, hardware profiles, omnirt endpoint catalog

* feat: install/up/down/ensure_omnirt scripts + cuda/dev compose

* docs: rewrite README + quickstart + architecture pointer; drop flashtalk-omnirt

* chore: drop refactor scratch files

* refactor: relocate src/opentalking → packages/opentalking (no internal restructure)

* feat: core/registry.py + STTAdapter/SynthesisAdapter interfaces

* fix: untrack 'packages/' from gitignore so packages/opentalking/ commits

* refactor(D): reorganize into providers/{stt,tts,llm,rtc,synthesis} + media; mass-rewrite imports

* refactor(E): avatars→avatar, voices→voice (singular naming)

* refactor(F): worker → pipeline/{session,speak,recording} + runtime/

* refactor(G): drop legacy wav2lip official_runtime imports; remove empty configs/worker dirs

* feat(registry): wire all providers via core.registry decorators + bootstrap

* chore(cli): drop dead generate_video / gradio_app (engine removed)

* refactor(pipeline): extract pure helpers from synthesis_runner (audio_utils/env_helpers/idle_frames/tts_openers)

* docs(env): rebuild .env.example aligned with actual Settings model + .env

* refactor: flatten packages/opentalking → opentalking (drop empty packages/ wrapper)

* docs: refresh architecture diagram for flat opentalking/ layout

* feat: two-path deploy (docker mock-default + python venv) — mock synthesis, opentalking-doctor, run_omnirt.sh

* docs: rewrite Quickstart around 3 paths (mock / lightweight / high-quality); update Project layout to flat layout

* docs(env): rewrite .env.example to match README's 3 paths (progressive complexity)

* fix(mock): wire OPENTALKING_INFERENCE_MOCK end-to-end (Settings + API + MockFlashTalkClient + task_consumer)

* fix: restore idle_generator.py (thin client, deleted by mistake in phase B)

* docs: clarify OPENTALKING_FLASHTALK_WS_URL (active) vs OMNIRT_ENDPOINT (placeholder, not wired)

* feat(omnirt): wire OMNIRT_ENDPOINT end-to-end (URL resolver + WS auth headers + path-based model routing)

* fix: omnirt_auth_headers import (real symbol is auth_headers; aliased only in providers/synthesis/__init__)

* fix(omnirt): align default path template with OmniRT actual routing (/v1/avatar/{model})

* fix: 3 UX issues — avatar/model decouple by input form, drop OPENTALKING_INFERENCE_MOCK, doctor loads .env

* fix(sessions): drop avatar/model compatibility gate entirely (full decoupling)

* fix(tts): decouple Edge voices (whitelist→format check) + silently ignore tts_model on Edge

* fix(runtime): preserve user's chosen model_type (musetalk/wav2lip/mock no longer relabeled to flashtalk)

* chore(config): drop unused OPENTALKING_DEFAULT_MODEL (model always supplied per-session)

* feat(api): SUPPORTED_MODELS allowlist (mock,flashtalk) — reject musetalk/wav2lip with 400 instead of silent FlashTalk fallback

* feat(web): surface backend 400 detail in toast (e.g. 'model not yet supported')

* feat(avatars): allow deleting custom avatars (DELETE /avatars/{id} + frontend × button)

* Adapt to wav2lip384 (#23)

* feat(wav2lip): integrate avatar metadata for architecture v2
* fix(wav2lip): validate mouth metadata freshness
* docs(wav2lip): add chinese pr summary
* ci: update refactor lint paths
* ci: remove missing worker test path

* feat(wav2lip): add preprocessed video avatar support (#25)

* feat: route avatar models through omnirt audio2video

* feat(wav2lip): route postprocess mode via audio2video

* feat: streamline omnirt audio2video setup

* feat: support configurable wav2lip modes and refresh assets

* Add QuickTalk model adapter

* Fix QuickTalk prefetch type annotation

* Keep QuickTalk init asynchronous

* Document QuickTalk configuration

* Refine FlashHead adapter integration

* Update model status test for QuickTalk

* Update lockfile for QuickTalk dependencies

* Fix QuickTalk OpenCV fourcc typing

---------

Co-authored-by: kero <keroly950928@gmail.com>
Co-authored-by: zyairehhh <zyaireliu@outlook.com>
Co-authored-by: cwang10 <cwang10@mail.ustc.edu.cn>
2026-05-11 23:25:25 +08:00
cwang10
973a51528a Document QuickTalk configuration 2026-05-10 13:58:09 +08:00