ViCA: Efficient Multimodal LLMs with Vision-Only Cross-Attention (Event)

Name: ViCA: Efficient Multimodal LLMs with Vision-Only Cross-Attention
Start: 2026-05-28

PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2602.07574v2, Announce Type: replace

Abstract: Modern multimodal large language models (MLLMs) adopt a unified self-attention design that processes visual and textual tokens at every Transformer layer, incurring substantial computational overhead. In this work, we revisit the necessity of such dense visual processing and show that projected visual embeddings are already well-aligned with the language space, while effective vi
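The overhead claim in the abstract's first sentence can be illustrated with simple arithmetic. The sketch below is not ViCA's method, since the abstract text here is cut off before the method is described. It only counts query-key pairs in one unified self-attention layer, assuming full (unmasked) attention, and reports what share of those pairs involves at least one visual token. The function name `attention_pairs` and the token counts (576 visual, 64 text) are hypothetical values chosen for illustration.

```python
def attention_pairs(num_visual: int, num_text: int) -> dict:
    """Count query-key pairs in one unified self-attention layer.

    Assumes full (unmasked) attention over the concatenated sequence;
    causal masking would roughly halve every count but keep the ratio similar.
    """
    total = (num_visual + num_text) ** 2   # every token attends to every token
    text_only = num_text ** 2              # pairs where both tokens are text
    visual_share = 1 - text_only / total   # pairs touching at least one visual token
    return {"total": total, "text_only": text_only, "visual_share": visual_share}


if __name__ == "__main__":
    # Hypothetical sizes: 576 visual tokens and 64 text tokens.
    stats = attention_pairs(num_visual=576, num_text=64)
    print(stats["total"], stats["text_only"], round(stats["visual_share"], 2))
```

Under these assumed sizes, nearly all attention pairs in a layer involve a visual token, which is the kind of per-layer cost the abstract says it sets out to revisit.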

Artificial Intelligence
