ODOV: Benchmark the Open-Domain Open-Vocabulary Object Detection

ArXiv CS.CV | 2026-05-27 | Authors: Yupeng Zhang, Ruize Han, Fangnan Zhou, Wei Feng, Liang Wan

Abstract

arXiv:2508.01253v2 (announce type: replace). Existing studies typically investigate domain shift and category shift as independent problems. In real-world scenarios, however, the two types of shift often occur simultaneously and interact, leading to significant degradation in detection performance. To address this, we propose and systematically study a novel problem, Open-Domain Open-Vocabulary (ODOV) object detection, which aims to evaluate a model's ability to adapt to compound domain and category shifts in real-world environments. We construct a new benchmark, OD-LVIS, which contains 46,949 images spanning 15 diverse real-world scenarios and 1,203 categories, for assessing object detection performance. Furthermore, we propose a novel ODOV detection baseline that fully leverages the powerful multi-modal alignment capabilities of vision-language models (VLMs) and introduces two key mechanisms to enhance both category and domain generalization.
