Cross-Environment Neural Reranking for Sample-Efficient Action Selection in Text-Based Agents

arXiv cs.CL | 2026-06-02 | Author: Kan Shao

Abstract

arXiv:2606.02204v1

Large language model agents achieve strong performance on text-based benchmarks but incur prohibitive inference costs, motivating the use of compact neural rerankers for action selection. We investigate whether a single lightweight model can perform action selection across multiple diverse environments, a capability that would eliminate per-environment model maintenance. Training DeBERTa-v3 (184M-434M parameters) jointly on ALFWorld, WebShop, and ScienceWorld with minority-class upsampling, we find that rebalanced two-environment joint training substantially improves over single-environment ALFWorld performance (net gain of +0.412) while maintaining competitive WebShop performance (+0.214 vs. +0.249 for single-environment training). Three-environment training yields a mean combined net gain of +0.551 ± 0.024 across 4 seeds, with per-environment results approaching those of specialized single-environment models while providing positive cross-domain transfer.
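The abstract does not say exactly what is upsampled or how. As a minimal sketch, assuming the rebalancing is done per environment by resampling the smaller pools with replacement until they match the largest one, the idea could look like the following. The function name `upsample_to_largest`, the `"env"` field, and the example sizes are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def upsample_to_largest(examples, key, seed=0):
    """Resample each group with replacement until it matches the largest group.

    `examples` is a list of dicts; `key` names the field to balance on
    (assumed here to be the source environment, e.g. "env").
    """
    if not examples:
        return []
    rng = random.Random(seed)

    # Bucket the examples by the value of the balancing field.
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[key]].append(ex)

    target = max(len(group) for group in groups.values())

    balanced = []
    for name in sorted(groups):
        group = groups[name]
        # Keep every original example once.
        balanced.extend(group)
        # Fill the shortfall by drawing from the same group with replacement.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical pool sizes, chosen only to show the effect of rebalancing.
pool = [{"env": "alfworld", "id": i} for i in range(3)]
pool += [{"env": "webshop", "id": i} for i in range(10)]
balanced_pool = upsample_to_largest(pool, key="env")
```

After rebalancing, each environment contributes the same number of training examples, so the larger environment no longer dominates the joint training mix. If the paper instead balances on label classes rather than environments, the same function applies with a different `key`.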
