COMBINER: Composed Image Retrieval Guided by Attribute-based Neighbor Relations

ArXiv cs.CV | 2026-06-04 | Authors: Zixu Li, Yupeng Hu, Zhiwei Chen, Haokun Wen, Xuemeng Song, Liqiang Nie

Abstract

arXiv:2606.04604v1 (announce type: new)

Composed Image Retrieval (CIR) is a challenging retrieval task that aims to locate specific images from multimodal inputs. Despite recent progress in CIR, prior approaches often overlook cases where images look visually alike yet differ in their attributes, which can undermine both multimodal feature fusion and similarity modeling. To mitigate this limitation, we design a unified representation of cross-modal features based on attribute prototypes. The task is far from straightforward, however, owing to three core issues: (1) entanglement of attribute-level semantics, (2) inconsistency across modalities, and (3) missing supervision signals. To tackle these obstacles, we introduce a COMposed image retrieval network guided By attrIbute-based NEighbor Relations (COMBINER).
