SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating 文章

ArXiv CS.AI2026-06-08NEWSen作者: Zequn Xie, Junjie Wang, Dan Yang, Jie Feng, Yue Shen, Jian Wang, Jinjie Gu

详细信息

来源站点
ArXiv CS.AI
作者
Zequn Xie, Junjie Wang, Dan Yang, Jie Feng, Yue Shen, Jian Wang, Jinjie Gu
文章类型
NEWS
语言
en
发布日期
2026-06-08

摘要

arXiv:2606.07074v1 Announce Type: cross Abstract: Deep research agents have demonstrated remarkable capabilities in complex information-seeking tasks, yet this power comes at a steep computational cost. Driven by accuracy-focused training paradigms, current models adopt brute-force strategies characterized by blind tool dependency and performative reasoning-generating long, redundant trajectories that are far from necessary for resolving these tasks, leading to wasteful tool calls and excessive token consumption. To overcome this efficiency trap, we propose SlimSearcher, a principled framework that pushes the Pareto frontier between accuracy and computational cost across both Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). In the SFT stage, SlimSearcher employs Pareto-efficient filtration to distill trajectories that are both successful and economical, guiding the model toward inherently efficiency-aware search behaviors.

相关事件

暂无数据

相关公司

暂无数据

相关人物

暂无数据