Latent Recurrent Transformer: Architecture Exploration, Training Strategies, and Scaling Behavior

ArXiv CS.CL · 2026-05-27 · Authors: Zeyi Huang, Xuehai He, LiLiang Ren, Yiping Wang, Baolin Peng, Hao Cheng, Shuohang Wang, Pengcheng He, Jianfeng Gao, Yong Jae Lee, Yelong Shen
