HumanDiT – Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation

https://agnjason.github.io/HumanDiT-page

Given a single character image and a template pose video, our method generates vocal avatar videos with both pose-accurate rendering and realistic body shapes.
