LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling · Event

BREAKTHROUGH · 2026-06-04 · Impact: HIGH

arXiv:2606.04552v1 · Announce Type: new

Abstract: Genomic foundation models increasingly adopt large language model architectures, yet almost universally rely on fixed tokenization schemes such as $k$-mers, BPE, or single nucleotides, which impose arbitrary sequence boundaries that may obscure biologically relevant structure. We present LDARNet, a 120M-parameter hierarchical genomic foundation model that
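
To make the abstract's point about fixed tokenization concrete, here is a minimal sketch of two of the schemes it names: single-nucleotide tokens and non-overlapping $k$-mers. It is not LDARNet's tokenizer, which the abstract describes as learnable. The helper names and the example sequence are hypothetical and chosen only for illustration. The sketch shows that shifting the $k$-mer start by one base yields a completely different token sequence for the same DNA.

```python
# Illustrative only: fixed tokenization schemes named in the abstract.
# Function names and the example sequence are assumptions, not from the paper.

def single_nucleotide_tokens(seq: str) -> list[str]:
    """One token per base."""
    return list(seq)


def kmer_tokens(seq: str, k: int, offset: int = 0) -> list[str]:
    """Non-overlapping k-mers starting at `offset`.

    A trailing fragment shorter than k is dropped.
    """
    return [seq[i:i + k] for i in range(offset, len(seq) - k + 1, k)]


seq = "ATGGCGTACGTTAG"  # made-up 14-base sequence

frame0 = kmer_tokens(seq, k=3, offset=0)
frame1 = kmer_tokens(seq, k=3, offset=1)

print(single_nucleotide_tokens(seq))
print(frame0)
print(frame1)

# The two framings of the same sequence share no tokens at all,
# so where the boundaries fall is an arbitrary choice of the scheme.
print(set(frame0).isdisjoint(frame1))
```

The same motif can therefore be split across tokens or kept whole depending only on the chosen offset, which is the kind of arbitrary boundary the abstract argues may obscure biologically relevant structure.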
