LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

ArXiv CS.CV · 2026-05-29 · Authors: Feng Han, Zhixiong Zhang, Zheming Liang, Yibin Wang, Jiaqi Wang
