High-Quality Entity Segmentation and Grounding 文章

ArXiv CS.CV2026-06-04NEWSen作者: Lu Qi, Yi-Wen Chen, Tao Zhang, Xiangtai Li, Xu Yang, Bo Du, Ming-Hsuan Yang

摘要

arXiv:2402.02555v2 Announce Type: replace Abstract: In this work, we propose ESG, a pipeline for high-quality entity segmentation and grounding supported by a new dataset EntitySeg. At first, the proposed dataset naming EntitySeg contains images spanning various image domains and entities, along with plentiful high-resolution images and high-quality mask annotations for training and testing. Then, the ESG mainly consists of two modules: CropFormer for high-quality entity segmentation whereas GELLA for accurate noun extraction from sentences and semantic matching between language and visual regions. Unlike existing grounding methods that jointly train a segmentation and a large language model, ESG adopts a two-stage decoupled design, preserving high-quality masks and grounding robustness without the trade-offs often introduced by joint training. CropFormer ensures high-quality entity segmentation results, which can then be encoded into the GELLA model for effective grounding.

相关事件查看全部 (1)

High-Quality Entity Segmentation and Grounding
2026-06-04PRODUCT_LAUNCH影响: MEDIUM

相关公司

暂无数据

相关人物

暂无数据