Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions

Event

PRODUCT_LAUNCH 2026-06-01 Impact: MEDIUM

arXiv:2501.04661v3 Announce Type: replace

Abstract: The web-scale of pretraining data has created an important evaluation challenge: to disentangle linguistic competence on cases well-represented in pretraining data from generalization to out-of-domain language, specifically the dynamic, real-world instances less common in pretraining data. To this end, we construct a diagnostic evaluatio