Hot-Start Chinese Language Modeling: Visual Glyphs Accelerate Sample-Efficient Learning

ArXiv CS.CV | 2026-06-02 | Authors: Shuyang Xiang, Hao Guan

Abstract

arXiv:2601.09566v4

In this work, we study whether rendering Chinese characters as visual glyph images, rather than as the discrete token IDs used by mainstream LLMs, provides an inductive bias for character-level language modeling. Our central finding is double-edged: visual inputs produce a pronounced hot-start effect, more than doubling early-stage accuracy within the first epoch, at 0.4% of total training steps (12.3% for visual inputs vs. 5.8% for the index-based baseline), yet both approaches converge to essentially identical final accuracy (39%). This pattern holds across resolutions as low as 8x8 pixels, partial cropping of up to 50%, and model scales from 110M to 1.78B parameters. The mechanism we identify is that glyph rendering pre-encodes radical-based structure into the embedding space before any training (cosine similarity 0.27 vs. 0.002 for random embeddings), enabling faster alignment but not higher final capacity.
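The mechanism described in the abstract (structure present in the input representation before any training) can be illustrated with a toy sketch. This is not the authors' code or measurement: the 8x8 bitmaps below are hypothetical stand-ins rather than rendered Chinese glyphs, raw pixels stand in for embeddings, and names such as `toy_glyph` and `centered_cosine` are invented for illustration. It only shows why two images that share a component tend to have a much higher cosine similarity than two random vectors.

```python
import math
import random

# Hypothetical 8x4 half-glyph patterns ('#' = ink). These are toy stand-ins,
# not real Chinese radicals or components.
LEFT = ["#...", "#...", "####", "#...", "#...", "####", "#...", "#..."]
RIGHT_1 = [".##.", "#..#", "#..#", "#..#", "#..#", "#..#", "#..#", ".##."]
RIGHT_2 = ["####", "...#", "...#", "####", "#...", "#...", "#...", "####"]


def toy_glyph(left, right):
    """Join two 8x4 halves into a flattened 8x8 bitmap of 0.0/1.0 values."""
    return [1.0 if ch == "#" else 0.0
            for l_row, r_row in zip(left, right)
            for ch in l_row + r_row]


def centered_cosine(u, v):
    """Cosine similarity after subtracting each vector's own mean."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cu = [a - mu for a in u]
    cv = [b - mv for b in v]
    dot = sum(a * b for a, b in zip(cu, cv))
    norm = math.sqrt(sum(a * a for a in cu)) * math.sqrt(sum(b * b for b in cv))
    return dot / norm if norm else 0.0


# Two toy glyphs that share the same left-hand component.
shared_sim = centered_cosine(toy_glyph(LEFT, RIGHT_1), toy_glyph(LEFT, RIGHT_2))

# Baseline: mean similarity over pairs of random 64-dim Gaussian vectors,
# standing in for randomly initialized index-based embeddings.
rng = random.Random(0)


def random_vec(dim=64):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


random_sim = sum(centered_cosine(random_vec(), random_vec())
                 for _ in range(1000)) / 1000

print(f"shared-component glyphs: {shared_sim:.3f}")
print(f"random embeddings (mean): {random_sim:.3f}")
```

Vectors are mean-centered before comparison so that the non-negative pixel values do not inflate the similarity on their own. Reproducing the reported 0.27 vs. 0.002 figures would require the paper's actual glyph renderings and encoder, which this sketch does not attempt.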
