CoBit: Language Modeling with Bitstream Diffusion

ArXiv CS.CL | 2026-06-16 | NEWS | en | Authors: Georgios Batzolis, Mark Girolami, Luca Ambrogioni

Details

Source site: ArXiv CS.CL
Authors: Georgios Batzolis, Mark Girolami, Luca Ambrogioni
Article type: NEWS
Language: en
Publication date: 2026-06-16

Abstract

arXiv:2605.07013v2 (announce type: replace)

Diffusion language models (DLMs) promise parallel, order-agnostic generation, but on standard benchmarks they have historically lagged behind autoregressive models in sample quality and diversity. Recent continuous flow and diffusion approaches have narrowed this gap. In this work, we further close the autoregressive gap by modeling text as a continuous diffusion process over fixed-width binary bitstreams. We refer to the resulting model as CoBit (Continuous Bitstream Diffusion). Our approach represents semantic tokens as analog bit sequences and uses a matched-filter residual parameterization to isolate contextual learning from analytic independent-bit posteriors. Crucially, we adopt a stochastic sampler that applies Langevin-type corrections gated by the entropy-rate profile, concentrating stochasticity in high-information regions while remaining nearly deterministic elsewhere.
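
To make the bitstream representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a token id is written as a fixed-width, MSB-first binary code mapped to analog values in {-1, +1}, and that each bit is corrupted independently as z = alpha * x + sigma * eps with Gaussian eps. Under those assumptions, and a uniform prior on each bit, the independent-bit posterior mean has the closed form E[x | z] = tanh(alpha * z / sigma^2). The function names, the bit ordering, and the values of alpha and sigma are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math
import random


def token_to_bits(token_id: int, width: int) -> list[float]:
    """Encode a token id as `width` analog bits in {-1.0, +1.0}, MSB first."""
    if not 0 <= token_id < (1 << width):
        raise ValueError("token_id does not fit in the given width")
    return [
        1.0 if (token_id >> (width - 1 - i)) & 1 else -1.0
        for i in range(width)
    ]


def bits_to_token(bits: list[float]) -> int:
    """Decode analog bits by thresholding each value at zero (MSB first)."""
    token_id = 0
    for b in bits:
        token_id = (token_id << 1) | (1 if b > 0 else 0)
    return token_id


def add_noise(bits: list[float], alpha: float, sigma: float,
              rng: random.Random) -> list[float]:
    """Assumed forward corruption: z = alpha * x + sigma * eps, eps ~ N(0, 1)."""
    return [alpha * b + sigma * rng.gauss(0.0, 1.0) for b in bits]


def independent_bit_posterior_mean(z: list[float], alpha: float,
                                   sigma: float) -> list[float]:
    """Closed-form E[x | z] per bit for x uniform on {-1, +1}.

    The likelihood ratio p(z | x=+1) / p(z | x=-1) equals
    exp(2 * alpha * z / sigma**2), so the posterior mean is
    tanh(alpha * z / sigma**2). Context between bits is ignored.
    """
    return [math.tanh(alpha * zi / sigma ** 2) for zi in z]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = token_to_bits(1234, width=16)
    noisy = add_noise(clean, alpha=0.9, sigma=0.3, rng=rng)
    denoised = independent_bit_posterior_mean(noisy, alpha=0.9, sigma=0.3)
    print(bits_to_token(denoised))
```

In this sketch the per-bit posterior uses no context at all. One plausible reading of the abstract is that a learned network would only need to model what this analytic term misses, but how CoBit combines the two is not stated here.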