Understanding Data Temporality Impact on Large Language Models Pre-training (Event)

ACQUISITION | 2026-05-26 | Impact: HIGH

arXiv:2605.22769v2 | Announce Type: replace

Abstract: Large language models (LLMs) are typically trained on shuffled corpora, yielding models whose knowledge is frozen at train time and whose temporal grounding remains poorly understood. In this work, we study the impact of pre-training dynamics on the acquisition of time-sensitive factual knowledge, focusing specifically on data ordering. Our main contributions are twofo