Beyond Query Memorization: Large Language Model Routing with Query Decomposition and Historical Matching

arXiv cs.AI, 2026-05-26. Authors: Bo Lv, Jingbo Sun

Abstract

arXiv:2605.25558v1

Optimizing the trade-off between predictive performance and computational cost is a central concern in deploying Large Language Models (LLMs). Current routing methods primarily map queries directly to models based on surface-level features, which makes them susceptible to the memorization trap and leads to poor generalization on out-of-distribution (OOD) data. In this paper, we propose DecoR, a novel routing framework that recasts routing as a matching process: it retrieves similar queries from historical logs, which mitigates the memorization trap. To improve matching accuracy, we introduce a query capability deconstruction method that decouples linguistic surface forms from task-intrinsic requirements, so that matching operates on capability dimensions and routing decisions are grounded in essential task attributes.
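The idea of routing by matching against historical logs can be illustrated with a minimal sketch. This is not the DecoR implementation, which the abstract does not specify. It assumes each query has already been reduced to a vector of capability scores; the three dimensions used here (math, coding, knowledge), the names `LogEntry`, `route`, `LOGS`, and `COSTS`, the cosine-similarity matching, and the "cheapest model that clears an accuracy threshold" rule are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class LogEntry:
    """One historical record (hypothetical schema, not from the paper)."""
    capabilities: tuple[float, ...]  # assumed capability scores, e.g. (math, coding, knowledge)
    model: str                       # model that answered the logged query
    correct: bool                    # whether that answer was judged correct


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity between two capability vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query_caps, logs, costs, k=3, min_accuracy=0.6):
    """Pick a model using the k most similar logged queries per model.

    Assumed decision rule: choose the cheapest model whose accuracy on its
    nearest neighbors reaches min_accuracy; otherwise fall back to the model
    with the highest estimated accuracy.
    """
    estimates = {}
    for model in costs:
        entries = [e for e in logs if e.model == model]
        if not entries:
            continue  # no history for this model, so it cannot be estimated
        # Match on capability vectors rather than on the query's surface text.
        entries.sort(key=lambda e: cosine(query_caps, e.capabilities), reverse=True)
        neighbors = entries[:k]
        estimates[model] = sum(e.correct for e in neighbors) / len(neighbors)
    if not estimates:
        raise ValueError("no historical logs for any candidate model")
    good_enough = [m for m, acc in estimates.items() if acc >= min_accuracy]
    if good_enough:
        return min(good_enough, key=lambda m: costs[m])
    return max(estimates, key=lambda m: estimates[m])


# Toy data: the small model fails math-heavy queries but handles knowledge ones.
LOGS = [
    LogEntry((0.9, 0.1, 0.0), "small", False),
    LogEntry((0.8, 0.2, 0.1), "small", False),
    LogEntry((0.1, 0.1, 0.9), "small", True),
    LogEntry((0.0, 0.2, 0.8), "small", True),
    LogEntry((0.9, 0.1, 0.0), "large", True),
    LogEntry((0.8, 0.2, 0.1), "large", True),
    LogEntry((0.1, 0.1, 0.9), "large", True),
    LogEntry((0.0, 0.2, 0.8), "large", True),
]
COSTS = {"small": 1.0, "large": 10.0}  # hypothetical relative costs
```

With this toy data, `route((0.85, 0.15, 0.05), LOGS, COSTS, k=2)` returns `"large"` because the small model failed on the two nearest math-heavy queries, while `route((0.05, 0.1, 0.9), LOGS, COSTS, k=2)` returns `"small"` because the cheaper model succeeded on similar knowledge-heavy queries. Because the match is made on capability vectors, two queries with very different wording but the same requirements would be routed the same way.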