SimulCost: A Cost-Aware Benchmark and Toolkit for Automating Physics Simulations with LLMs

ArXiv CS.AI | 2026-06-01 | News | en
Authors: Yadi Cao, Sicheng Lai, Jiahe Huang, Yang Zhang, Zach Lawrence, Rohan Bhakta, Izzy F. Thomas, Mingyun Cao, Chung-Hao Tsai, Zihao Zhou, Yidong Zhao, Hao Liu, Alessandro Marinoni, Alexey Arefiev, Rose Yu

Abstract

arXiv:2603.20253v2 (announce type: replace-cross)

Evaluations of LLM agents on scientific tasks have focused on token costs while ignoring tool-use costs such as simulation time and experimental resources. As a result, metrics like pass@k become impractical under realistic budget constraints. To address this gap, we introduce SimulCost, the first benchmark targeting cost-sensitive parameter tuning in physics simulations. SimulCost compares LLM-based tuning of cost-sensitive parameters against a traditional scanning approach in both accuracy and computational cost, spanning 2,947 single-round (initial guess) and 1,931 multi-round (adjustment by trial and error) tasks across 13 simulators from fluid dynamics, solid mechanics, and plasma physics. Each simulator's cost is analytically defined and platform-independent.
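To make the comparison between a scanning approach and a single-round guess concrete, here is a minimal Python sketch. It is not taken from SimulCost: the toy "simulator" (a midpoint-rule integral of sin on [0, pi]), the cost model (one unit per grid cell), and the names `toy_simulate` and `scan` are all assumptions chosen for illustration. It shows only why a scan pays for every intermediate run, while a good initial guess pays once.

```python
import math

def toy_simulate(n):
    """Hypothetical stand-in for a simulator: midpoint-rule integral of sin on [0, pi].

    Returns the result and an analytic, platform-independent cost
    (assumed here to be one unit per grid cell).
    """
    h = math.pi / n
    value = sum(math.sin((i + 0.5) * h) for i in range(n)) * h
    return value, n

def scan(tolerance, n_start=2, n_max=4096):
    """Double the resolution until successive results agree within tolerance.

    The total cost accumulates over every run in the scan, not just the last.
    """
    n = n_start
    value, total_cost = toy_simulate(n)
    while n < n_max:
        n *= 2
        new_value, cost = toy_simulate(n)
        total_cost += cost
        converged = abs(new_value - value) < tolerance
        value = new_value
        if converged:
            break
    return n, value, total_cost

# Scanning: pays for n = 2, 4, 8, 16 before accepting n = 16.
scan_n, scan_value, scan_cost = scan(tolerance=1e-2)

# Single-round guess: one run at a resolution chosen up front.
guess_value, guess_cost = toy_simulate(16)

print(scan_n, scan_cost, guess_cost)
```

In this toy setting the scan and the guess reach the same resolution, but the scan's accumulated cost is roughly double. A guess that is too coarse would miss the tolerance, and one that is too fine would overspend, which is the trade-off a cost-aware metric is meant to capture.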
