Skip to content

QuickTalk Local Deployment

Local mode loads the QuickTalk adapter inside the OpenTalking process. Use it for single-machine CUDA validation, avatar-cache debugging, and confirming the Web/API pipeline before introducing OmniRT.

Use Cases

  • You have already validated mock and now need real talking-head output.
  • GPU, WebUI, and API run on the same machine.
  • You need to prewarm QuickTalk cache for commonly used shared avatars with opentalking-prepare-cache.

Weight Preparation

Place weights under repository-root models/quicktalk/. Set HF_ENDPOINT when Hugging Face access is slow.

Terminal
cd "$DIGITAL_HUMAN_HOME/opentalking"
mkdir -p models/quicktalk/checkpoints

uv pip install -U "huggingface_hub[cli]"
export HF_ENDPOINT="${HF_ENDPOINT:-https://hf-mirror.com}"

hf download datascale-ai/quicktalk \
  quicktalk.pth \
  repair.npy \
  chinese-hubert-large/config.json \
  chinese-hubert-large/preprocessor_config.json \
  chinese-hubert-large/pytorch_model.bin \
  --local-dir models/quicktalk/checkpoints

Prepare InsightFace buffalo_l separately:

Terminal
mkdir -p /tmp/opentalking-insightface models/quicktalk/checkpoints/auxiliary/models
curl -L \
  -o /tmp/opentalking-insightface/buffalo_l.zip \
  https://github.com/deepinsight/insightface/releases/download/v0.7/buffalo_l.zip
unzip -q -o /tmp/opentalking-insightface/buffalo_l.zip \
  -d /tmp/opentalking-insightface
rsync -a /tmp/opentalking-insightface/buffalo_l/ \
  models/quicktalk/checkpoints/auxiliary/models/buffalo_l/

Start Command

Terminal
cd "$DIGITAL_HUMAN_HOME/opentalking"
uv sync --extra dev --extra models --extra quicktalk-cuda --python 3.11

export OPENTALKING_TORCH_DEVICE=cuda:0
export OPENTALKING_QUICKTALK_ASSET_ROOT="$DIGITAL_HUMAN_HOME/opentalking/models/quicktalk"
export OPENTALKING_QUICKTALK_WORKER_CACHE=1

bash scripts/start_unified.sh --backend local --model quicktalk --api-port 8210 --web-port 5280

Open http://localhost:5280, select a shared avatar, and choose the quicktalk model. If a fixed template video is required, confirm the template asset is reachable from the session or deployment configuration.

Verification

Terminal
curl -fsS http://127.0.0.1:8210/health
curl -s http://127.0.0.1:8210/models | jq '.statuses[] | select(.id=="quicktalk")'

Expect backend=local and connected=true. To prepare cache ahead of time:

Terminal
opentalking-prepare-cache \
  --model quicktalk \
  --avatars-root examples/avatars \
  --quicktalk-model-root models/quicktalk \
  --device cuda:0 \
  --model-backend pth \
  --verify

Common Errors

Symptom Action
connected=false Check OPENTALKING_QUICKTALK_ASSET_ROOT, the CUDA device, and models/quicktalk/checkpoints.
Long first turn Enable OPENTALKING_QUICKTALK_WORKER_CACHE=1 or run opentalking-prepare-cache in advance.
Avatar load failure Check that the avatar is readable; if a fixed template video is configured, confirm that path is reachable.
Hugging Face download fails Configure HF_ENDPOINT, or download offline and sync into the same directory.