VLMs May Not Globally Enhance Human Alignment over LLMs During Natural Reading

arXiv cs.CL | 2026-05-28 | Authors: Jinzhou Wu, Zhengwu Ma, Jixing Li, Baoping Tang, Zitong Lu

Abstract

arXiv:2605.28818v1

Large language models (LLMs) have become increasingly useful computational models of human language processing, but it remains unclear whether vision-language learning makes text representations more human-like during natural reading. Here, we address this question by comparing tightly matched LLM and vision-language model (VLM) pairs under a strictly text-only setting, allowing us to isolate the effect of multimodal training history from online visual input or cross-modal fusion. We evaluate model alignment with a human natural-reading dataset that includes whole-cortex fMRI responses and synchronized eye-tracking saccades. Our findings demonstrate that multimodal pretraining may not confer a uniform, global advantage in human alignment during natural reading, indicating that language-internal representations remain the key factor for modeling human text processing.
