Dynamic Short Convolutions Improve Transformers

ArXiv CS.CL · 2026-06-03 · Authors: Oliver Sieberling, Bharat Runwal, Rameswar Panda, Yoon Kim

Dynamic Short Convolutions Improve Transformers · Related Techniques