FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

ArXiv CS.AI | 2026-06-02 | NEWS | en | Authors: Zeyu Wang, Jingye Xu, Xiaogang Li, Peiyao Xiao, Qinhao Kong, Ben Wang, Chengliang Xu, Zichao Chen, Bing Zhao, Hu Wei


Details

Source site: ArXiv CS.AI
Authors: Zeyu Wang, Jingye Xu, Xiaogang Li, Peiyao Xiao, Qinhao Kong, Ben Wang, Chengliang Xu, Zichao Chen, Bing Zhao, Hu Wei
Article type: NEWS
Language: en
Publication date: 2026-06-02

Original text

Abstract

arXiv:2604.03893v2 (announce type: replace)

Current multimodal benchmarks for scientific reasoning primarily evaluate local information extraction: models recognize symbols and values and then perform textual inference. They do not assess whether models can reason over the global structural properties of formal diagrams, such as topology, conservation constraints, and the consistent mapping between visual patterns and algebraic expressions. We introduce FeynmanBench, a benchmark of over 2,000 tasks centered on Feynman diagrams spanning the electromagnetic, weak, and strong interactions of the Standard Model. Each instance couples a diagram image with minimal textual conventions and requires models to recover the full physical content: vertex inventory, propagator types, topological connectivity, momentum routing, and the complete scattering amplitude.
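To make "global structural properties" concrete, here is a minimal Python sketch of how a diagram's vertex inventory and momentum routing could be represented and checked for conservation at each vertex. It is not taken from FeynmanBench: the `Line` class, the helper names, the `in:`/`out:` labels for external legs, and the numeric four-momenta are all assumptions made for illustration, and the benchmark's actual answer format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """Hypothetical directed line carrying momentum from `src` to `dst`."""
    particle: str    # e.g. "electron", "photon"
    src: str         # vertex id, or "in:<n>" for an incoming external leg
    dst: str         # vertex id, or "out:<n>" for an outgoing external leg
    momentum: tuple  # four-momentum (E, px, py, pz)


def is_external(node):
    """External endpoints are labelled 'in:...' or 'out:...' in this sketch."""
    return node.startswith(("in:", "out:"))


def vertex_inventory(lines):
    """Map each internal vertex to the sorted particle types attached to it."""
    inventory = defaultdict(list)
    for line in lines:
        for node in (line.src, line.dst):
            if not is_external(node):
                inventory[node].append(line.particle)
    return {vertex: sorted(parts) for vertex, parts in inventory.items()}


def conservation_violations(lines, tol=1e-9):
    """Return the internal vertices where inflow minus outflow is non-zero."""
    net = defaultdict(lambda: [0.0] * 4)
    for line in lines:
        for i, component in enumerate(line.momentum):
            if not is_external(line.dst):
                net[line.dst][i] += component  # momentum flowing into dst
            if not is_external(line.src):
                net[line.src][i] -= component  # momentum flowing out of src
    return sorted(v for v, total in net.items()
                  if any(abs(c) > tol for c in total))


# Illustrative t-channel photon exchange, e- mu- -> e- mu-, massless limit.
DIAGRAM = [
    Line("electron", "in:1", "v1", (5.0, 0.0, 0.0, 5.0)),
    Line("electron", "v1", "out:1", (5.0, 3.0, 0.0, 4.0)),
    Line("photon", "v1", "v2", (0.0, -3.0, 0.0, 1.0)),
    Line("muon", "in:2", "v2", (5.0, 0.0, 0.0, -5.0)),
    Line("muon", "v2", "out:2", (5.0, -3.0, 0.0, -4.0)),
]

if __name__ == "__main__":
    print(vertex_inventory(DIAGRAM))
    print(conservation_violations(DIAGRAM))
```

Each check here needs the whole graph rather than a single symbol read off the image: a wrong propagator momentum shows up as a violation at both vertices it connects.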
