LTX2 LONG AVATAR (HEYGEN AT HOME) - IMAGE TO VIDEO WITH TALKING AND SINGING (ComfyUI)

🦙Rishappi
02/05/2026
ComfyUI
New & Trending
Lip-sync
LTX-2
Detailed Introduction

The LTX2 Long Avatar Workflow (HeyGen at Home) is an advanced image-to-video (I2V) pipeline built on a fine-tuned LTX2 model, designed specifically for creating long-form talking and singing avatar videos from your custom audio or music. This workflow is an evolution of earlier LTX2 avatar setups and is optimized to generate stable, consistent videos lasting 1–2 minutes with significantly reduced generation times.

A key improvement in this workflow is the removal of the heavy Gemma text encoder dependency. Instead, it relies on a lightweight, free external API (setup notes included inside the workflow), which greatly reduces system load while preserving strong prompt understanding and audio–visual alignment.

Key Capabilities:

🎭 Long-Form Avatar Video Generation (1–2 Minutes)
Create extended talking or singing avatar videos driven by your own audio, with consistent facial motion, lip sync, and overall visual stability.

🎧 Talking & Singing with Audio
Generates videos with custom audio, performing best in English, while also supporting German, Spanish, French, and other languages with reasonable consistency.

⚡ Optimized Performance & Faster Inference
Consistently produces long videos faster than WAN-based workflows (excluding cold start), making it suitable for production-style use.

🖼️ HD Resolution Support
Tested at resolutions up to 1920×1088 without out-of-memory errors, enabling high-quality avatar output.

📝 Prompt-Driven Quality Control
Best results are achieved using detailed prompt structures recommended by LTX2. Clear instructions for dialogue, tone, pacing, and performance significantly improve realism and coherence.

🚀 Lightweight Architecture
By offloading text processing to a free API, the workflow reduces VRAM usage and improves scalability for long-duration avatar generation.
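
As a rough illustration of the resolution and duration figures above, the sketch below shows one way to pick an output size and a frame count before queuing a long run. It rests on assumptions that this page does not state: that width and height should be multiples of 32 and that frame counts should have the form 8n + 1, as with earlier LTX-Video releases, and that the video runs at 24 fps. The helper names snap_up and snap_frames are hypothetical and are not part of the workflow; check the notes inside the workflow for the actual constraints.

```python
import math


def snap_up(value: int, multiple: int = 32) -> int:
    """Round value up to the nearest multiple (assumed size constraint: 32)."""
    return math.ceil(value / multiple) * multiple


def snap_frames(duration_s: float, fps: int = 24, step: int = 8) -> int:
    """Largest frame count of the form step * n + 1 that fits in the audio length.

    The 8n + 1 pattern and the 24 fps default are assumptions, not values
    taken from this workflow.
    """
    raw = int(duration_s * fps)
    return max(1, (raw - 1) // step * step + 1)


if __name__ == "__main__":
    # 1080 is not a multiple of 32, which would explain a 1088-pixel height.
    print(snap_up(1920), snap_up(1080))
    # Frame counts for 60 s and 120 s of audio at the assumed 24 fps.
    print(snap_frames(60), snap_frames(120))
```

Under these assumptions, a 1080-pixel height rounds up to 1088, which matches the tested resolution quoted above.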

Ideal Use Cases:

  • AI presenters and virtual avatars with your custom audio
  • Talking-head and singing avatar videos
  • Educational and explainer content
  • AI influencer and virtual host creation
  • Long-form character-driven videos

Performance Notes:

  • First run (cold start) may take longer; subsequent runs are much faster.
  • Stable results observed during 1–2 minute generation tests.
  • Experimenting with prompts quickly improves output quality and control.
  • Setup and usage notes are included inside the workflow.

The LTX2 Long Avatar Workflow (HeyGen at Home) brings long-duration avatar video creation into ComfyUI, offering a fast, open alternative for generating talking and singing AI avatars in HD from your own audio.

Details
App: ComfyUI (v0.11.0)
Update Time: 02/05/2026
File Space: 13.5 GB
Models: 3
Extensions: 9