ROGLE: Robust Global-Local Alignment with Automated Region Supervision for Text-Based Person Search
Event: PRODUCT_LAUNCH
Date: 2026-06-02
Impact: MEDIUM
arXiv:2606.01825v1
Announce Type: new
Abstract: Text-Based Person Search (TBPS) aims to retrieve pedestrian images using natural language queries. However, existing TBPS models, especially those based on CLIP, struggle with fine-grained understanding due to global representational bias and semantic sparsity inherited from training on short captions. This results in weak fine-grained alignment, ex