Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation (Event)

Name: Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation
Start: 2026-06-08

ACQUISITION | 2026-06-08 | Impact: HIGH

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation
arXiv:2606.06712v1
Announce Type: new
Abstract: We study the transformation of autoregressive models (ARLMs) into diffusion language models (DLMs). Rather than pretraining from scratch, prior work replaces the causal attention in ARLMs with bidirectional attention and then trains the resulting model using a DLM objective. However, these approaches incur two distribution shifts. First, transitioning from a next

Artificial Intelligence
