Multimodal Large Language Model-Enabled Video Translation: A Role-Oriented Survey (Event)

Name: Multimodal Large Language Model-Enabled Video Translation: A Role-Oriented Survey
Start: 2026-06-02

Type: PRODUCT_LAUNCH
Date: 2026-06-02
Impact: MEDIUM

arXiv:2604.11283v2
Announce Type: replace

Abstract: Recent progress in multimodal large language models (MLLMs) is reshaping video translation from a cascaded pipeline of automatic speech recognition, machine translation, text-to-speech, and lip synchronization into a unified multimodal reasoning and generation problem. High-quality video translation requires not only semantic fidelity, but also temporal alignment

Category: Artificial Intelligence
