Extracting Training Data from Diffusion Language Models via Infilling
PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM
arXiv:2605.24173v1 (announce type: new)

Abstract: Memorization in large language models has been studied almost exclusively through prefix-conditioned extraction, a natural choice for autoregressive models. However, diffusion language models (DLMs) can denoise masked tokens at arbitrary positions. Thus, prefix-only probing reveals only one facet of memorization in DLMs and significantly underestimates the risk of training-data
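To make the contrast between prefix-conditioned probing and infilling concrete, here is a minimal sketch of how the two kinds of masked probes might be built from a known token sequence. It is an illustration only, not the paper's method: the `MASK` symbol and the helper names `prefix_probe` and `infill_probe` are hypothetical, and no model is called.

```python
from typing import List

# Hypothetical mask symbol; a real diffusion language model defines its own.
MASK = "[MASK]"


def prefix_probe(tokens: List[str], prefix_len: int) -> List[str]:
    """Keep the first `prefix_len` tokens and mask everything after them."""
    return tokens[:prefix_len] + [MASK] * (len(tokens) - prefix_len)


def infill_probe(tokens: List[str], start: int, end: int) -> List[str]:
    """Mask tokens[start:end] and keep the context on both sides."""
    return tokens[:start] + [MASK] * (end - start) + tokens[end:]


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog".split()
    # Prefix-style probe: only left context is visible.
    print(prefix_probe(sample, 3))
    # Infilling-style probe: left and right context are both visible.
    print(infill_probe(sample, 3, 6))
```

A prefix probe exposes only left context, whereas an infilling probe can place the masked span anywhere, which is the capability the abstract attributes to DLMs.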
Source: arXiv cs.CL, 2026-05-26