mirror of
https://github.com/datascale-ai/opentalking.git
synced 2026-07-03 15:22:34 +08:00
* docs: add architecture review and refactor plan
* chore: snapshot baseline test/lint output before refactor
* chore: catalog import sites that depend on deletable code
* refactor: remove src/opentalking/engine (FlashTalk local inference)
* refactor: remove local model implementations (musetalk/wav2lip/flashtalk-local)
* chore: remove demo media, multitalk_utils, duplicate env examples; relocate images
* refactor: consolidate configs to root configs/, drop src duplicate
* refactor: drop OPENTALKING_FLASHTALK_MODE; rebuild model registry shim
* refactor: remove dead src/opentalking/server (superseded by apps/api)
* refactor: drop src/opentalking/cli stubs in favor of apps/cli
* refactor: move worker test to tests/unit; stash bailian_clone for relocation
* chore: temporarily restore bailian_clone (will relocate in phase D)
* fix: stub legacy registry get_adapter/register_model for import compat
* feat: unified .env.example, hardware profiles, omnirt endpoint catalog
* feat: install/up/down/ensure_omnirt scripts + cuda/dev compose
* docs: rewrite README + quickstart + architecture pointer; drop flashtalk-omnirt
* chore: drop refactor scratch files
* refactor: relocate src/opentalking → packages/opentalking (no internal restructure)
* feat: core/registry.py + STTAdapter/SynthesisAdapter interfaces
* fix: untrack 'packages/' from gitignore so packages/opentalking/ commits
* refactor(D): reorganize into providers/{stt,tts,llm,rtc,synthesis} + media; mass-rewrite imports
* refactor(E): avatars→avatar, voices→voice (singular naming)
* refactor(F): worker → pipeline/{session,speak,recording} + runtime/
* refactor(G): drop legacy wav2lip official_runtime imports; remove empty configs/worker dirs
* feat(registry): wire all providers via core.registry decorators + bootstrap
* chore(cli): drop dead generate_video / gradio_app (engine removed)
* refactor(pipeline): extract pure helpers from synthesis_runner (audio_utils/env_helpers/idle_frames/tts_openers)
* docs(env): rebuild .env.example aligned with actual Settings model + .env
* refactor: flatten packages/opentalking → opentalking (drop empty packages/ wrapper)
* docs: refresh architecture diagram for flat opentalking/ layout
* feat: two-path deploy (docker mock-default + python venv) — mock synthesis, opentalking-doctor, run_omnirt.sh
* docs: rewrite Quickstart around 3 paths (mock / lightweight / high-quality); update Project layout to flat layout
* docs(env): rewrite .env.example to match README's 3 paths (progressive complexity)
* fix(mock): wire OPENTALKING_INFERENCE_MOCK end-to-end (Settings + API + MockFlashTalkClient + task_consumer)
* fix: restore idle_generator.py (thin client, deleted by mistake in phase B)
* docs: clarify OPENTALKING_FLASHTALK_WS_URL (active) vs OMNIRT_ENDPOINT (placeholder, not wired)
* feat(omnirt): wire OMNIRT_ENDPOINT end-to-end (URL resolver + WS auth headers + path-based model routing)
* fix: omnirt_auth_headers import (real symbol is auth_headers; aliased only in providers/synthesis/__init__)
* fix(omnirt): align default path template with OmniRT actual routing (/v1/avatar/{model})
* fix: 3 UX issues — avatar/model decouple by input form, drop OPENTALKING_INFERENCE_MOCK, doctor loads .env
* fix(sessions): drop avatar/model compatibility gate entirely (full decoupling)
* fix(tts): decouple Edge voices (whitelist→format check) + silently ignore tts_model on Edge
* fix(runtime): preserve user's chosen model_type (musetalk/wav2lip/mock no longer relabeled to flashtalk)
* chore(config): drop unused OPENTALKING_DEFAULT_MODEL (model always supplied per-session)
* feat(api): SUPPORTED_MODELS allowlist (mock,flashtalk) — reject musetalk/wav2lip with 400 instead of silent FlashTalk fallback
* feat(web): surface backend 400 detail in toast (e.g. 'model not yet supported')
* feat(avatars): allow deleting custom avatars (DELETE /avatars/{id} + frontend × button)
* Adapt wav2lip384 (#23)
* feat(wav2lip): integrate avatar metadata for architecture v2
* fix(wav2lip): validate mouth metadata freshness
* docs(wav2lip): add chinese pr summary
* ci: update refactor lint paths
* ci: remove missing worker test path
* feat(wav2lip): add preprocessed video avatar support (#25)
* feat: route avatar models through omnirt audio2video
* feat(wav2lip): route postprocess mode via audio2video
* feat: streamline omnirt audio2video setup
* feat: support configurable wav2lip modes and refresh assets
* Add QuickTalk model adapter
* Fix QuickTalk prefetch type annotation
* Keep QuickTalk init asynchronous
* Document QuickTalk configuration
* Refine FlashHead adapter integration
* Update model status test for QuickTalk
* Update lockfile for QuickTalk dependencies
* Fix QuickTalk OpenCV fourcc typing
---------
Co-authored-by: kero <keroly950928@gmail.com>
Co-authored-by: zyairehhh <zyaireliu@outlook.com>
Co-authored-by: cwang10 <cwang10@mail.ustc.edu.cn>
24 lines
676 B
YAML
# Override applied with `docker compose --profile gpu up`. Wires api/worker
# to talk to the real omnirt service instead of the in-process mock.
#
# Usage:
#   docker compose --profile gpu \
#     -f docker-compose.yml -f docker-compose.gpu.yml up
#
# (the `up.sh` shortcut handles this; users only need to remember --profile gpu)

services:
  api:
    environment:
      OPENTALKING_INFERENCE_MOCK: "0"
      OMNIRT_ENDPOINT: http://omnirt:9000
    depends_on:
      omnirt: { condition: service_healthy }

  worker:
    environment:
      OPENTALKING_INFERENCE_MOCK: "0"
      OMNIRT_ENDPOINT: http://omnirt:9000
    depends_on:
      omnirt: { condition: service_healthy }
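
The `condition: service_healthy` entries above only resolve if the `omnirt` service declares a `healthcheck`; Docker Compose otherwise has no health state to wait on. The override does not show that definition, so the fragment below is only a sketch of what it might look like in the base `docker-compose.yml`. The image name, the `/health` path, the presence of `curl` in the image, and the timing values are all assumptions, not taken from the repository; only the service name, the `gpu` profile, and port 9000 come from the file above.

```yaml
# Hypothetical sketch of the omnirt service that api/worker wait on.
services:
  omnirt:
    image: example/omnirt:latest      # placeholder image name (assumption)
    profiles: ["gpu"]                 # only started with `--profile gpu`
    healthcheck:
      # Assumes curl exists in the image and that omnirt serves a /health
      # route on port 9000; adjust to whatever the real service exposes.
      test: ["CMD", "curl", "-fsS", "http://localhost:9000/health"]
      interval: 10s
      timeout: 5s
      retries: 12
      start_period: 30s               # grace period while the service starts
```

With a healthcheck like this in place, `api` and `worker` would start only after the probe has succeeded, rather than as soon as the `omnirt` container has been created.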