Speech gesture generation from the trimodal context of text, audio, and speaker identity · Paper

2020 · ACM Transactions on Graphics · Citations: 298
Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation

Speech gesture generation from the trimodal context of text, audio, and speaker identity · Authors