Train Once, Reuse Everywhere: Generalizable Implicit In-Context Learning by Routing Attention

arXiv cs.CL | 2026-06-03 | Authors: Jiaqian Li, Yanshu Li, Ligong Han, Ruixiang Tang, Wenya Wang

Abstract

arXiv:2509.22854v2. Implicit in-context learning (ICL) has recently emerged as a promising paradigm that simulates ICL behaviors in the representation space of large language models (LLMs), aiming to attain few-shot performance at zero-shot cost. However, existing approaches largely rely on injecting shift vectors into residual flows, and these vectors are typically constructed from labeled demonstrations or task-specific alignment. Such designs do not exploit the structural mechanisms underlying ICL and generalize poorly. To address this, we propose In-Context Routing (ICR), a novel implicit ICL method that captures and utilizes generalizable ICL patterns at the level of attention logits. It extracts reusable structural directions that emerge during ICL and employs a learnable input-conditioned router to modulate attention logits accordingly, enabling an efficient train-once-and-reuse framework.
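To make the idea of modulating attention logits concrete, here is a minimal NumPy sketch of one way such routing could look. It is not the authors' implementation: the abstract does not specify the form of the modulation, so the low-rank additive bias, the `tanh` gate, the mean-pooled input summary `x_pooled`, and the names `routed_attention`, `U`, and `W_router` are all assumptions made for illustration. The directions in `U` are taken as given here; how they are extracted is not covered.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def routed_attention(Q, K, V, U, W_router, x_pooled):
    """Single-head attention with an input-conditioned bias on the logits.

    Q, K, V  : (T, d) query, key, and value matrices
    U        : (d, r) assumed pre-extracted "structural directions" (hypothetical)
    W_router : (r, d) weights of a hypothetical linear router
    x_pooled : (d,)   pooled summary of the input (assumed conditioning signal)
    """
    d = Q.shape[-1]
    base = Q @ K.T / np.sqrt(d)              # standard attention logits, (T, T)
    gates = np.tanh(W_router @ x_pooled)     # one gate per direction, (r,)
    Qp, Kp = Q @ U, K @ U                    # projections onto directions, (T, r)
    bias = (Qp * gates) @ Kp.T / np.sqrt(d)  # gated low-rank logit bias, (T, T)
    weights = softmax(base + bias, axis=-1)  # no causal mask in this sketch
    return weights @ V, weights, gates

# Example with random data; shapes only, no trained parameters.
rng = np.random.default_rng(0)
T, d, r = 4, 8, 2
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))
U = rng.normal(size=(d, r))
W_router = rng.normal(size=(r, d))
x_pooled = rng.normal(size=d)
out, weights, gates = routed_attention(Q, K, V, U, W_router, x_pooled)
```

In this sketch a router that outputs all-zero gates reduces to ordinary scaled dot-product attention, so the modulation acts purely as an additive correction to the logits and leaves the residual stream untouched.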
