Architecture-driven Shift: towards a lightweight selector for capturing the trends of logit shift

ArXiv CS.AI, 2026-05-28. Authors: Zhong Ye, Yu Hu, Ruilin Tang

Abstract

arXiv:2605.27469v1 (announce type: cross). Continual Learning (CL) is a practical paradigm for exploiting the power of deep pre-trained neural networks, but which pre-trained model better balances "Plasticity-Stability" and therefore deserves to be chosen? Logit shift serves as a natural proxy because it captures how a model's logits on prior tasks change in CL scenarios. However, obtaining the logit shift is computationally expensive, which hinders large-scale model selection. Existing theoretical analyses fail to offer an efficient alternative because they assume uniform hidden layer widths, ignoring the structural heterogeneity (variable width and depth) of real-world architectures. This raises a critical question: what theoretical relationship can be identified between heterogeneous architecture and logit shift on prior tasks (those the model has already been trained on)?
