CPPO: Contrastive Perception Policy Optimization for VLM Agents
Event type: PRODUCT_LAUNCH | Date: 2026-05-28 | Impact: MEDIUM
arXiv:2601.00501v2 (announce type: replace)
Abstract: We introduce CPPO, a Contrastive Perception Policy Optimization method for finetuning vision-language models (VLMs). Reliable perception is a core requirement for VLM-based agents that must reason and act in open-ended environments: faulty visual grounding cascades directly into faulty actions, hallucinated tool calls, and unsafe decisions. While reinforcement learning (RL) has s
Source: arXiv cs.CV, 2026-05-28