Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions

ArXiv CS.CL, 2026-06-01. Authors: Wesley Scivetti, Melissa Torgbi, Austin Blodgett, Mollie Shichman, Taylor Hudson, Claire Bonial, Harish Tayyar Madabushi


Abstract

arXiv:2501.04661v3

The web scale of pretraining data has created an important evaluation challenge: to disentangle linguistic competence on cases well-represented in pretraining data from generalization to out-of-domain language, specifically the dynamic, real-world instances less common in pretraining data. To this end, we construct a diagnostic evaluation to systematically assess natural language understanding in LLMs by leveraging Construction Grammar (CxG). CxG provides a psycholinguistically grounded framework for testing generalization, as it explicitly links syntactic forms to abstract, non-lexical meanings. Our novel inference evaluation dataset consists of English phrasal constructions, for which speakers are known to be able to abstract over commonplace instantiations in order to understand and produce creative instantiations.
