datascale-ai-opentalking

mirror of https://github.com/datascale-ai/opentalking.git synced 2026-07-03 15:22:34 +08:00

Author	SHA1	Message	Date
zyairehhh	9333b31720	docs: add WeChat community QR code (#133)	2026-07-03 10:18:12 +08:00
kero-ly	8087110620	Add optional avatar background removal; support immersive mode in Video Creation	2026-06-29 21:11:18 +08:00
zyairehhh	5473bb2665	feat: add local F5-TTS provider (#128)	2026-06-28 23:21:10 +08:00
zyairehhh	5516cd5675	docs: reorganize deployment guides (#127)	2026-06-28 13:22:46 +08:00
charm-ch	66bf0e5cd5	docs: add OpenTalking WeChat memory persona import guide	2026-06-25 14:25:21 +08:00
zyairehhh	7f37c3b49d	feat: add local CosyVoice TRT sidecar deployment (#119)	2026-06-23 23:06:58 +08:00
cwang10	61f4007965	feat: add local CosyVoice runtime tuning	2026-06-23 18:15:27 +08:00
charm-ch	23429cde77	docs: add OpenTalking Huangshan digital human guide	2026-06-22 11:44:12 +08:00
kero-ly	71f87984c1	Fix missing mem0 pkg & fix hfdownload pkg version	2026-06-20 00:18:23 +08:00
zyairehhh	2128e1d256	docs: split quickstart paths	2026-06-19 18:36:06 +08:00
lyfics	f65f2bb5d0	feat: add LightRAG runtime config and quickstart updates Squash the branch changes into a single commit. Includes the LightRAG/memory workflow branch state, runtime-config API/UI, and quickstart service hardening.	2026-06-19 14:07:53 +08:00
charm-ch	e27b1c6501	Improve Mem0-backed memory workflows	2026-06-17 16:39:01 +08:00
Le0der	1f26a5c6d4	docs: add missing resnet18-5c106cde.pth to MuseTalk model directory layout	2026-06-17 16:37:31 +08:00
zyairehhh	61e85527b3	docs: enable versioned docs publishing (#103)	2026-06-16 17:55:29 +08:00
zyairehhh	2e8c9eb371	docs: refresh README and QuickTalk docs (#101)	2026-06-16 09:58:00 +08:00
Le0der	5d7181c214	docs: add WSL2 network mode selection guide for Windows deployment Add section 1.3 documenting NAT vs Mirrored network mode behavior in WSL2, covering WebRTC ICE connectivity, browser microphone access, and service startup compatibility based on real-world testing on Windows 11 + WSL2 Ubuntu 24.04 + RTX 3060.	2026-06-13 17:30:43 +08:00
zyairehhh	5cdcd8dd3d	fix quicktalk local assets and support QuickTalk on Apple Silicon (#98)	2026-06-12 16:41:50 +08:00
zyairehhh	b6ffab2bb4	feat: improve IndexTTS and QuickTalk video creation (#95)	2026-06-12 00:59:07 +08:00
lyfics	1f42a4e73f	feat: add LightRAG knowledge retrieval	2026-06-11 17:36:50 +08:00
cwang0810	3519989ba3	feat: add Persona Package support (#87)	2026-06-10 20:41:18 +08:00
zyairehhh	d7ebea81ab	Improve MuseTalk deployment setup (#83)	2026-06-06 15:12:31 +08:00
kero	91b4cc4b13	add en page, optimize homepage for deploy	2026-06-06 15:11:04 +08:00
lyfics	0b301b2ce6	feat: adapt knowledge base asset workflow	2026-06-05 23:01:38 +08:00
zyairehhh	7d29d78b28	docs: place V100 guide in deployment recipes	2026-06-05 10:42:47 +08:00
zyairehhh	5fb0c51ed2	docs: reorganize model deployment guides (#79)	2026-06-05 08:29:52 +08:00
lyfics	566fe3d8b6	feat: add agent knowledge and audio video exports (#78) * feat: add agent knowledge and audio video exports * fix: satisfy mypy for video creation config parsing * fix: avoid sync wait for external audio render sessions * fix: remove offline bundle export controls from web	2026-06-04 21:54:29 +08:00
lucaszhu-hue	0ef7587aba	docs: add Atlas Cloud as an OpenAI-compatible LLM provider option Atlas Cloud exposes an OpenAI-compatible chat-completions endpoint, so it can be used as a drop-in LLM backend via the existing OPENTALKING_LLM_* settings. - README.md / README.en.md: Atlas Cloud blurb, link, and logo in the Supported Models section. - docs/{en,zh}/model-deployment/llm-stt.md: provider table row, .env example, and the full Atlas chat model list. - .env.example: commented Atlas Cloud example. The example uses deepseek-ai/deepseek-v4-pro (a reasoning model — give it enough max_tokens). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 21:37:00 +08:00
zyairehhh	5f81d36d59	docs: add WebUI video workflows (#75)	2026-06-04 00:16:55 +08:00
zyairehhh	ca33fdcd26	docs: reorganize model deployment guides (#73)	2026-06-03 23:11:54 +08:00
kero	71abcd5690	homepage design	2026-06-03 21:44:04 +08:00
zyairehhh	d0bf6bea4a	feat: add FasterLivePortrait video clone workflow (#70)	2026-06-03 09:12:53 +08:00
lyfics	d700882655	docs: add benchmark guide, WSL2 fix, test results, and windows deployment guide - Add E2E benchmark run instructions and WSL2 VRAM fix to benchmark.md - Add benchmark test results table (RTX 3090/4090/3050, NPU 910B2) - Add windows-deployment.md for WSL2 deployment guide - Sync all changes to English counterparts	2026-05-28 13:38:46 +08:00
zyairehhh	3c893c522d	feat: add local STT/TTS QuickTalk runtime config (#61)	2026-05-26 21:08:13 +08:00
charm-ch	ed6914deb2	feat: add local musetalk backend support	2026-05-26 00:01:52 +08:00
zyairehhh	f3532c19e9	docs: update QuickTalk weight download instructions (#58)	2026-05-22 21:01:07 +08:00
lyfics	58d6e53d8f	Feat/docs update (#55) * add product-demo-live-sales * modify product-demo-live-sales * docs: add three new example tutorials	2026-05-22 11:09:24 +08:00
kero	2be1148e69	website structure update	2026-05-21 23:24:52 +08:00
zyairehhh	f16f786889	feat: align avatar cache prewarm flow	2026-05-21 20:07:33 +08:00
zyairehhh	288cdf68cd	feat: align local wav2lip runtime parity	2026-05-18 22:54:58 +08:00
keroly	41a3b5a53a	readme update & support local quicktalk, wav2lip (#43) * readme update * feat: support quicktalk pth local backend * feat: add local wav2lip adapter	2026-05-17 21:55:25 +08:00
zyairehhh	b1a859e823	feat: add realtime FasterLivePortrait support (#45)	2026-05-17 21:14:22 +08:00
cwang0810	b235184428	docs: restructure documentation navigation (#44)	2026-05-17 11:36:34 +08:00
zyairehhh	e0da9184ef	feat: route QuickTalk through OmniRT audio2video (#41) * feat: route quicktalk through omnirt * docs: add quicktalk quickstart	2026-05-15 12:03:39 +08:00
charm-ch	7030897f60	Adapt MuseTalk	2026-05-14 11:49:27 +08:00
cwang0810	ff20603d70	docs: fix install and deployment guidance (#34)	2026-05-13 19:56:28 +08:00
cwang0810	8231d3fbe5	Docs: rebuild model deployment docs and add Pages publishing (#33) * docs: rebuild model deployment docs and add Pages publishing * ci: install docs dependencies in workflows	2026-05-13 17:54:11 +08:00
cwang0810	c9ac195708	Docs: restructure model backend documentation (#32) * docs: restructure model backend documentation * docs: clarify lightweight evaluation paths * docs: split model deployment module * docs: simplify models navigation * docs: expand deployment runbook * docs: simplify homepage capabilities * docs: improve model docs and zh defaults	2026-05-13 17:25:22 +08:00
keroly	3fb0e12020	bugfix & update readme (#30) Co-authored-by: kero <keroly950928@gmail.com>	2026-05-13 10:47:19 +08:00
keroly	587d1fb16b	Refactor/architecture v2 (#27 ) * docs: add architecture review and refactor plan * chore: snapshot baseline test/lint output before refactor * chore: catalog import sites that depend on deletable code * refactor: remove src/opentalking/engine (FlashTalk local inference) * refactor: remove local model implementations (musetalk/wav2lip/flashtalk-local) * chore: remove demo media, multitalk_utils, duplicate env examples; relocate images * refactor: consolidate configs to root configs/, drop src duplicate * refactor: drop OPENTALKING_FLASHTALK_MODE; rebuild model registry shim * refactor: remove dead src/opentalking/server (superseded by apps/api) * refactor: drop src/opentalking/cli stubs in favor of apps/cli * refactor: move worker test to tests/unit; stash bailian_clone for relocation * chore: temporarily restore bailian_clone (will relocate in phase D) * fix: stub legacy registry get_adapter/register_model for import compat * feat: unified .env.example, hardware profiles, omnirt endpoint catalog * feat: install/up/down/ensure_omnirt scripts + cuda/dev compose * docs: rewrite README + quickstart + architecture pointer; drop flashtalk-omnirt * chore: drop refactor scratch files * refactor: relocate src/opentalking → packages/opentalking (no internal restructure) * feat: core/registry.py + STTAdapter/SynthesisAdapter interfaces * fix: untrack 'packages/' from gitignore so packages/opentalking/ commits * refactor(D): reorganize into providers/{stt,tts,llm,rtc,synthesis} + media; mass-rewrite imports * refactor(E): avatars→avatar, voices→voice (singular naming) * refactor(F): worker → pipeline/{session,speak,recording} + runtime/ * refactor(G): drop legacy wav2lip official_runtime imports; remove empty configs/worker dirs * feat(registry): wire all providers via core.registry decorators + bootstrap * chore(cli): drop dead generate_video / gradio_app (engine removed) * refactor(pipeline): extract pure helpers from synthesis_runner 
(audio_utils/env_helpers/idle_frames/tts_openers) * docs(env): rebuild .env.example aligned with actual Settings model + .env * refactor: flatten packages/opentalking → opentalking (drop empty packages/ wrapper) * docs: refresh architecture diagram for flat opentalking/ layout * feat: two-path deploy (docker mock-default + python venv) — mock synthesis, opentalking-doctor, run_omnirt.sh * docs: rewrite Quickstart around 3 paths (mock / lightweight / high-quality); update Project layout to flat layout * docs(env): rewrite .env.example to match README's 3 paths (progressive complexity) * fix(mock): wire OPENTALKING_INFERENCE_MOCK end-to-end (Settings + API + MockFlashTalkClient + task_consumer) * fix: restore idle_generator.py (thin client, deleted by mistake in phase B) * docs: clarify OPENTALKING_FLASHTALK_WS_URL (active) vs OMNIRT_ENDPOINT (placeholder, not wired) * feat(omnirt): wire OMNIRT_ENDPOINT end-to-end (URL resolver + WS auth headers + path-based model routing) * fix: omnirt_auth_headers import (real symbol is auth_headers; aliased only in providers/synthesis/__init__) * fix(omnirt): align default path template with OmniRT actual routing (/v1/avatar/{model}) * fix: 3 UX issues — avatar/model decouple by input form, drop OPENTALKING_INFERENCE_MOCK, doctor loads .env * fix(sessions): drop avatar/model compatibility gate entirely (full decoupling) * fix(tts): decouple Edge voices (whitelist→format check) + silently ignore tts_model on Edge * fix(runtime): preserve user's chosen model_type (musetalk/wav2lip/mock no longer relabeled to flashtalk) * chore(config): drop unused OPENTALKING_DEFAULT_MODEL (model always supplied per-session) * feat(api): SUPPORTED_MODELS allowlist (mock,flashtalk) — reject musetalk/wav2lip with 400 instead of silent FlashTalk fallback * feat(web): surface backend 400 detail in toast (e.g. 
'model not yet supported') * feat(avatars): allow deleting custom avatars (DELETE /avatars/{id} + frontend × button) * Adapt wav2lip384 (#23) * feat(wav2lip): integrate avatar metadata for architecture v2 * fix(wav2lip): validate mouth metadata freshness * docs(wav2lip): add chinese pr summary * ci: update refactor lint paths * ci: remove missing worker test path * feat(wav2lip): add preprocessed video avatar support (#25) * feat: route avatar models through omnirt audio2video * feat(wav2lip): route postprocess mode via audio2video * feat: streamline omnirt audio2video setup * feat: support configurable wav2lip modes and refresh assets * Add QuickTalk model adapter * Fix QuickTalk prefetch type annotation * Keep QuickTalk init asynchronous * Document QuickTalk configuration * Refine FlashHead adapter integration * Update model status test for QuickTalk * Update lockfile for QuickTalk dependencies * Fix QuickTalk OpenCV fourcc typing --------- Co-authored-by: kero <keroly950928@gmail.com> Co-authored-by: zyairehhh <zyaireliu@outlook.com> Co-authored-by: cwang10 <cwang10@mail.ustc.edu.cn>	2026-05-11 23:25:25 +08:00
cwang10	973a51528a	Document QuickTalk configuration	2026-05-10 13:58:09 +08:00


54 Commits