MultiAct: Text-to-Motion Generation from Composite Text via Tailored Attention Guidance

arXiv cs.CV, 2026-06-01. Authors: Nathan Sala, Ofir Abramovich, Ariel Shamir, Daniel Cohen-Or, Andreas Aristidou, Sigal Raab

Abstract

arXiv:2605.30925v1

Text-to-motion generation has progressed rapidly in recent years, offering an expressive interface for animation and human-computer interaction. However, current models remain brittle when handling prompts that describe multiple actions occurring at the same time. Rather than realizing all components of a composite description, models frequently prioritize a single dominant action and neglect the rest, leading to incomplete or ambiguous motion. We present MultiAct, an unpaired, inference-time framework for compositional text-to-motion synthesis that operates directly on pretrained motion generators without retraining or architectural modification. Our method counteracts semantic collapse by adaptively amplifying cross-attention scores associated with underrepresented prompt components.
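To make the idea of amplifying cross-attention for underrepresented prompt components more concrete, here is a minimal NumPy sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a single cross-attention map of pre-softmax scores with shape (frames, tokens), a hypothetical grouping of token indices into prompt components, and a simple hypothetical rule: each component's scores get a log-ratio boost toward the attention mass of the dominant component. The names `amplify_underrepresented`, `component_mass`, `groups`, and `strength` are all illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def component_mass(attn, groups):
    # Average (over motion frames) of the attention mass each
    # prompt component receives, one value per component.
    return np.array([attn[:, idx].sum(axis=1).mean() for idx in groups])

def amplify_underrepresented(scores, groups, strength=1.0, eps=1e-8):
    # ASSUMPTION: illustrative rule only, not the MultiAct method.
    # scores: (frames, tokens) pre-softmax cross-attention scores.
    # groups: list of token-index lists, one per prompt component.
    attn = softmax(scores)
    mass = component_mass(attn, groups)
    # Components with less mass than the dominant one get a positive
    # additive boost; the dominant component gets a boost of zero.
    boost = strength * np.log((mass.max() + eps) / (mass + eps))
    adjusted = scores.copy()
    for idx, b in zip(groups, boost):
        adjusted[:, idx] += b
    return softmax(adjusted), mass

# Toy example: component 0 (tokens 0-2) dominates component 1 (tokens 3-5).
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 6))
scores[:, :3] += 3.0
groups = [[0, 1, 2], [3, 4, 5]]

new_attn, old_mass = amplify_underrepresented(scores, groups)
new_mass = component_mass(new_attn, groups)
```

In this toy setup the neglected component's share of attention rises after the adjustment, while every row of the attention map still sums to one. The boost here is computed once from a single map; an adaptive scheme like the one the abstract describes would presumably recompute it per layer or per denoising step, which this sketch does not attempt.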
