MotionEnhancer: Leveraging Video Diffusion for Motion-Enhanced Vision-Language Models
Event: PRODUCT_LAUNCH
2026-06-08
Impact: MEDIUM
arXiv:2606.06853v1 Announce Type: new
Abstract: The new era has witnessed a remarkable capability to extend Vision-Language Models (VLMs) for tackling tasks of video understanding. While current VLMs excel at event- or story-level understanding, their ability to capture fine-grained motion details remains limited, primarily due to their focus on high-level static semantic structures and macro-event logic. In c