Divide-and-Conquer Inference for Large-Scale Visual Recognition with Multimodal Large Language Models

Event: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM

arXiv:2605.24799v1. Announce Type: new.

Abstract: Multimodal Large Language Models (MLLMs) have demonstrated strong capabilities across a wide range of vision-language tasks. However, when applied to large-scale image classification, their performance degrades significantly as the label space expands, a phenomenon we define as Performance Collapse in Long Sequence Recognition. Through an informati
