This workflow generates a Wan 2.1 video from your image and applies lipsync to it in a single pass!
On an ultra machine, it takes 17 minutes to render a 10-second video.
All you have to do is choose your audio file, your image, and the number of frames you want to render, and you are good to go.
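
If you want the video to span the whole audio clip, the frame count can be estimated from the audio length. The snippet below is a minimal sketch, not part of the workflow itself: the helper names are hypothetical, it assumes a WAV input, and the default of 16 frames per second is an assumption that should be replaced with whatever frame rate your workflow is set to.

```python
import math
import wave


def audio_duration_seconds(wav_path: str) -> float:
    """Return the length of a WAV file in seconds (WAV input is an assumption)."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def frames_for_duration(duration_s: float, fps: float = 16.0) -> int:
    """Return the number of video frames needed to cover duration_s seconds.

    fps=16 is an assumed default; use the frame rate configured in the workflow.
    """
    if duration_s <= 0 or fps <= 0:
        raise ValueError("duration_s and fps must be positive")
    # Round up so the video is never shorter than the audio.
    return math.ceil(duration_s * fps)
```

For example, `frames_for_duration(audio_duration_seconds("voice.wav"))` would give the value to enter as the number of frames, where `voice.wav` is a placeholder file name.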