PinPoint: Prompting with Informative Interior Points
Event: PRODUCT_LAUNCH, 2026-05-27. Impact: MEDIUM
arXiv:2605.26689v1
Abstract: Modern referring image segmentation pipelines couple a vision-language model (VLM) for grounding with a promptable segmenter such as the Segment Anything Model (SAM) for mask generation. Prior training-free instances of this recipe consistently trail fine-tuned and reinforcement-learning (RL)-tuned specialists, and it has been unclear whether the gap comes from the VLM's grounding, SAM's capacit