OneFocus: Enabling Real-World X-ray Security Screening with a Unified Vision-Language Model 文章

ArXiv CS.CV2026-06-16NEWSen作者: Jiali Wen, Hongxia Gao, Litao Li, Yixin Chen, Kaijie Zhang, Qianyun Liu, Xiaoqin Wen

详细信息

来源站点
ArXiv CS.CV
作者
Jiali Wen, Hongxia Gao, Litao Li, Yixin Chen, Kaijie Zhang, Qianyun Liu, Xiaoqin Wen
文章类型
NEWS
语言
en
发布日期
2026-06-16

摘要

arXiv:2606.15663v1 Announce Type: new Abstract: X-ray contraband detection is critical for security in large-scale logistics and transportation, yet conventional detectors struggle to adapt to emerging contraband types and lack fundamental visual understanding. Vision-language models (VLMs) offer strong generalization but are hindered by the scarcity of high-quality X-ray image-caption data. To bridge this critical gap, we present MMXray, a meticulously curated benchmark of 52,124 image-caption pairs spanning 28 fine-grained classes of X-ray contraband. To enrich MMXray with realistic occlusion patterns, we further introduce CleanDET, a dedicated synthesis dataset containing clean foreground contraband images from 28 categories and background images with diverse density levels, together with AnyContraSyn, a controllable synthesis method designed to operate on CleanDET. We also develop OnePipe, an extensible pipeline for systematic data curation.

相关事件

暂无数据

相关公司

暂无数据

相关人物

暂无数据