SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
Event: PRODUCT_LAUNCH | Date: 2026-06-08 | Impact: MEDIUM
arXiv:2606.07074v1 Announce Type: cross
Abstract: Deep research agents have demonstrated remarkable capabilities in complex information-seeking tasks, yet this power comes at a steep computational cost. Driven by accuracy-focused training paradigms, current models adopt brute-force strategies characterized by blind tool dependency and performative reasoning, generating long, redundant trajectories that are far from nec