T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences 文章

ArXiv CS.CV2026-06-08NEWSen作者: Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Gregory Rogez

摘要

arXiv:2406.00636v2 Announce Type: replace Abstract: In this paper, we address the challenging problem of long-term 3D human motion generation. Specifically, we aim to generate a long sequence of smoothly connected actions from a stream of multiple sentences (i.e., paragraph). Previous long-term motion generating approaches were mostly based on recurrent methods, using previously generated motion chunks as input for the next step. However, this approach has two drawbacks: 1) it relies on sequential datasets, which are expensive; 2) these methods yield unrealistic gaps between motions generated at each step. To address these issues, we introduce simple yet effective T2LM, a continuous long-term generation framework that can be trained without sequential data. T2LM comprises two components: a 1D-convolutional VQVAE, trained to compress motion to sequences of latent vectors, and a Transformer-based Text Encoder that predicts a latent sequence given an input text.

T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品查看全部 (1)

相关技术查看全部 (5)