CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

ArXiv cs.AI, 2026-05-29. Authors: Vaishali Senthil, Ashutosh Hathidara, Sebastian Schreiber

Abstract

arXiv:2605.29271v1. Tool retrieval over large API catalogs is a core bottleneck for LLM agents: user queries arrive in colloquial, often underspecified language, while the catalog uses technical API vocabulary that no fixed encoder can bridge on its own. The two dominant training approaches, contrastive encoder fine-tuning and HyDE-style query expansion with a frozen LLM, address this problem from opposite ends and fail in complementary directions: the fine-tuned encoder excels when the query's surface form already matches the catalog but collapses when it does not, while zero-shot HyDE is more robust to underspecified queries yet generates catalog-unaware hypothetical descriptions that degrade retrieval when queries are well-formed.
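The vocabulary gap described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' method: the catalog entries are invented, `embed` is a bag-of-words stand-in for a dense encoder, and `rewrite` is a hard-coded stub in place of an LLM rewriter. It shows only why a colloquial query can score zero against technical tool descriptions while a hypothetical description of the intended tool does not.

```python
import math
from collections import Counter

# Hypothetical toy catalog; tool names and descriptions are invented.
CATALOG = {
    "send_sms": "send an sms message to a phone number",
    "get_forecast": "retrieve the weather forecast for a location",
    "create_event": "create a calendar event with a start time",
}

def embed(text):
    """Toy stand-in for a dense encoder: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(text):
    """Score every catalog entry against the text; return the best name and all scores."""
    q = embed(text)
    scores = {name: cosine(q, embed(desc)) for name, desc in CATALOG.items()}
    return max(scores, key=scores.get), scores

def rewrite(query):
    """Stub for an LLM rewriter (assumption): returns a canned hypothetical description."""
    canned = {"text my friend that i am late": "send an sms message to a contact"}
    return canned.get(query, query)

query = "text my friend that i am late"

# Direct retrieval: the colloquial query shares no tokens with any description.
direct_best, direct_scores = retrieve(query)

# HyDE-style retrieval: embed the hypothetical description instead of the raw query.
hyde_best, hyde_scores = retrieve(rewrite(query))

print("direct scores:", direct_scores)
print("HyDE-style best match:", hyde_best)
```

In this toy setting the raw query has no lexical overlap with the catalog, so every direct score is zero, whereas the rewritten text ranks `send_sms` first. A real dense encoder would not fail this abruptly, and a real rewriter could just as easily produce a description that misses the catalog's wording, which is the failure mode the abstract attributes to zero-shot HyDE.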
