Context-Free Recognition with Transformers 文章

ArXiv CS.CL2026-06-01NEWSen作者: Selim Jerad, Anej Svete, Sophie Hao, Ryan Cotterell, William Merrill

摘要

arXiv:2601.01754v3 Announce Type: replace-cross Abstract: Transformers excel empirically on tasks that process well-formed inputs according to some grammar, such as natural language and code. However, it remains unclear how they can process grammatical syntax. In fact, under standard complexity conjectures, standard transformers cannot recognize context-free languages (CFLs), a canonical formalism to describe syntax, or even regular languages, a subclass of CFLs. Past work has shown that $\mathcal{O}(\log(N))$ looping layers (w.r.t. input length $N$) allow transformers to recognize regular languages, but the question of context-free recognition with looped transformers remained open. In this work, we show that looped transformers with $\mathcal{O}(\log(N))$ looping layers and $\mathcal{O}(N^6)$ padding symbols can recognize all CFLs. However, training and inference with $\mathcal{O}(N^6)$ padding symbols is potentially impractical.

Context-Free Recognition with Transformers 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (6)