VisualNeedle: Benchmarking Active Visual Search in Information-Dense Scenes (Event)

Name: VisualNeedle: Benchmarking Active Visual Search in Information-Dense Scenes
Start: 2026-05-27

PRODUCT_LAUNCH, 2026-05-27, Impact: MEDIUM

arXiv:2605.26380v1. Announce Type: new.

Abstract: Frontier multimodal large language models (MLLMs) have been reported to achieve over 90% accuracy on fine-grained perception benchmarks. However, such scores do not necessarily imply faithful use of visual evidence. Prior studies have identified three shortcuts that inflate benchmark performance. First, linguistic priors and lexical cues in questions often enable models to

Artificial Intelligence
