CapTalk: Text-Guided Stylization and Speech-Driven 3D Head Animation

ArXiv cs.CV, 2026-05-29. Authors: Xuangeng Chu, Yuan Gan, Ziteng Cui, Shuhong Liu, Jian Wang, Bing Zhou, Tatsuya Harada

Abstract

arXiv:2605.29316v1 (announce type: new)

Audio-driven 3D facial animation aims to generate synchronized lip movements and vivid facial expressions from arbitrary audio clips. While existing methods can produce synchronized lip motions, they often rely on predefined identity or style latent features, which limits users' ability to freely control speaking styles. Moreover, applying a fixed style or identity to an entire audio segment typically results in facial animation styles that do not adapt to the emotional content of the audio. To address these challenges, we revisit the entanglement between style and emotion, construct a large-scale dataset with textual descriptions of both style and emotion, and propose a novel talking head generation framework that enables separate control over style and emotion.
