HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools

Source: ArXiv CS.CL | Published: 2026-06-16 | Type: NEWS | Language: en
Authors: Aashna Garg, Siddharth Singha Roy, Jinu Jang, Federico Brancasi, Shengyu Fu

Abstract

arXiv:2605.17106v2 (announce type: replace)

Production LLM deployments increasingly maintain heterogeneous model pools spanning order-of-magnitude cost differences. Existing routers make binary strong-vs-weak decisions and couple learned parameters to specific model identities, requiring retraining whenever the catalog changes. We present HyDRA (Hybrid Dynamic Routing Architecture), a framework that predicts fine-grained, multi-dimensional capability requirements per query and matches them against configuration-defined model profiles via shortfall matching. A ModernBERT encoder with K=4 independent sigmoid heads scores each query along reasoning, code generation, debugging, and tool use; a shortfall-matching algorithm then selects the cheapest model whose capabilities meet the predicted requirements.
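
The routing step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the shortfall formula, the profile format, or what happens when no model qualifies, so everything below is an assumption made for illustration: shortfall is taken to be the sum of positive gaps between required and available capability, the fallback picks the model with the smallest shortfall, and the names `ModelProfile`, `shortfall`, `route`, and the example pool are hypothetical. The encoder is replaced by raw logits passed through independent sigmoids.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence

# The four capability dimensions named in the abstract (K=4).
CAPABILITIES = ("reasoning", "code_generation", "debugging", "tool_use")


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical configuration-defined profile; not the paper's schema."""
    name: str
    cost: float                      # assumed relative cost per request
    capabilities: Dict[str, float]   # assumed scores in [0, 1]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def requirements_from_logits(logits: Sequence[float]) -> Dict[str, float]:
    """Stand-in for the K independent sigmoid heads: one score per capability."""
    return {cap: sigmoid(z) for cap, z in zip(CAPABILITIES, logits)}


def shortfall(required: Dict[str, float], profile: ModelProfile) -> float:
    """Assumed definition: total amount by which the model falls short."""
    return sum(
        max(0.0, required[cap] - profile.capabilities.get(cap, 0.0))
        for cap in CAPABILITIES
    )


def route(required: Dict[str, float], pool: Sequence[ModelProfile]) -> ModelProfile:
    """Pick the cheapest model with zero shortfall.

    Assumed fallback: if no model meets every requirement, pick the one
    with the smallest shortfall, breaking ties by cost.
    """
    adequate = [m for m in pool if shortfall(required, m) == 0.0]
    if adequate:
        return min(adequate, key=lambda m: m.cost)
    return min(pool, key=lambda m: (shortfall(required, m), m.cost))


# Hypothetical pool; costs and scores are made up for the example.
POOL = [
    ModelProfile("small", 1.0, dict(zip(CAPABILITIES, (0.4, 0.5, 0.4, 0.3)))),
    ModelProfile("medium", 5.0, dict(zip(CAPABILITIES, (0.7, 0.8, 0.7, 0.6)))),
    ModelProfile("large", 20.0, dict(zip(CAPABILITIES, (0.95, 0.95, 0.95, 0.95)))),
]

if __name__ == "__main__":
    easy = dict(zip(CAPABILITIES, (0.3, 0.4, 0.2, 0.1)))
    hard = dict(zip(CAPABILITIES, (0.9, 0.6, 0.5, 0.2)))
    print(route(easy, POOL).name)
    print(route(hard, POOL).name)
    print(route(requirements_from_logits([0.0, 0.0, 0.0, 0.0]), POOL).name)
```

Because the profiles live in plain configuration rather than in learned parameters, adding or removing a model in this sketch only changes `POOL`; the requirement predictor is untouched, which is the decoupling the abstract emphasizes.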
