EngiAI: A Multi-Agent Framework and Benchmark Suite for LLM-Driven Engineering Design

ArXiv CS.AI | 2026-05-28 | NEWS | en | Authors: Gioele Molinari, Florian Felten, Soheyl Massoudi, Mark Fuge

Details

Source site: ArXiv CS.AI
Authors: Gioele Molinari, Florian Felten, Soheyl Massoudi, Mark Fuge
Article type: NEWS
Language: en
Publication date: 2026-05-28

Abstract

arXiv:2605.19743v2. Large Language Model (LLM) agents are increasingly applied to engineering design tasks, yet existing evaluation frameworks do not adequately address multi-agent systems that combine simulation, retrieval, and manufacturing preparation. We introduce a benchmark suite with three evaluation dimensions: (1) a workflow benchmark with seven prompt styles targeting distinct cognitive demands, including direct tool use, semantic disambiguation, conditional branching, and working-memory tasks; (2) a Retrieval-Augmented Generation (RAG) benchmark with gated scoring that isolates retrieval contributions to parameter selection; and (3) a High-Performance Computing (HPC) benchmark evaluating end-to-end ML training orchestration on a SLURM cluster.
