Understanding Fact Recall in Language Models: Why Two-Stage Training Encourages Memorization but Mixed Training Teaches Knowledge

ArXiv CS.CL, 2026-05-29. Authors: Ying Zhang, Benjamin Heinzerling, Dongyuan Li, Kentaro Inui

Abstract

arXiv:2505.16178v2

While fine-tuning is the standard way to inject factual knowledge into large language models (LLMs), the mechanisms that enable reliable fact recall from unseen queries remain poorly understood. Common two-stage training strategies, which train first on the fact-storage format and then on the query format, often cause rote memorization. Mixed training, by contrast, optimizes both formats jointly and exhibits superior generalized recall. We investigate this success by comparing the two paradigms across LLMs ranging from 2.8B to 4B parameters and identify the core mechanism: the joint optimization objective in mixed training induces gradient consistency across the storage and query formats. This in turn drives representation consistency between the two formats, establishing a format-invariant retrieval process that maps unseen queries to stored facts. Two-stage training lacks such an objective, which results in inconsistent representations and failed recall.
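To make "gradient consistency" concrete, here is a minimal sketch that measures how well the gradient from a storage-format example aligns with the gradient from a query-format example. It assumes cosine similarity as the consistency measure and uses a toy linear model with squared-error loss; the abstract specifies neither, and the names `grad_squared_error`, `cosine`, `storage_example`, and `query_example` are hypothetical, chosen only for illustration.

```python
import math

def grad_squared_error(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w (toy linear model)."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def cosine(u, v):
    """Cosine similarity between two gradient vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy data: the same fact expressed in two formats.
# The third feature is shared; the first two are format-specific.
w = [0.0, 0.0, 0.0]
storage_example = ([1.0, 0.0, 1.0], 1.0)  # (features, target)
query_example = ([0.0, 1.0, 1.0], 1.0)

g_storage = grad_squared_error(w, *storage_example)
g_query = grad_squared_error(w, *query_example)

# Assumed consistency measure: cosine similarity of the two gradients.
consistency = cosine(g_storage, g_query)
print(f"gradient consistency: {consistency:.3f}")
```

In this toy setup the two gradients agree only on the shared feature, so the similarity is positive but below 1. A value near 1 would mean that an update computed on one format also reduces the loss on the other, which is the intuition behind the abstract's link between gradient consistency and format-invariant recall.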