Procedural Pretraining: Warming Up Language Models with Abstract Data
Event: ACQUISITION | 2026-05-29 | Impact: HIGH
arXiv:2601.21725v2 (Announce Type: replace)
Abstract: Pretraining language models directly on web-scale corpora is the de facto paradigm. We study an alternative where the model is initially exposed to abstract structured data to ease the subsequent acquisition of rich semantic knowledge, much like humans learning simple logic and mathematics before higher reasoning. We focus on procedural data, generated by formal languages an