EngiAI: A Multi-Agent Framework and Benchmark Suite for LLM-Driven Engineering Design

ArXiv CS.AI | 2026-05-28 | NEWS | en | Authors: Gioele Molinari, Florian Felten, Soheyl Massoudi, Mark Fuge

Details

Source site
ArXiv CS.AI
Authors
Gioele Molinari, Florian Felten, Soheyl Massoudi, Mark Fuge
Article type
NEWS
Language
en
Publication date
2026-05-28

Abstract

arXiv:2605.19743v2, announce type: replace.

Large Language Model (LLM) agents are increasingly applied to engineering design tasks, yet existing evaluation frameworks do not adequately address multi-agent systems that combine simulation, retrieval, and manufacturing preparation. We introduce a benchmark suite with three evaluation dimensions: (1) a workflow benchmark with seven prompt styles targeting distinct cognitive demands, including direct tool use, semantic disambiguation, conditional branching, and working-memory tasks; (2) a Retrieval-Augmented Generation (RAG) benchmark with gated scoring that isolates retrieval contributions to parameter selection; and (3) a High-Performance Computing (HPC) benchmark evaluating end-to-end ML training orchestration on a SLURM cluster.
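As a reading aid, the three evaluation dimensions named in the abstract can be laid out as plain data. This is only a minimal sketch: the class `BenchmarkDimension`, the constants, and the helper `summarize` are hypothetical names chosen for illustration and are not taken from the EngiAI code base. The abstract names four of the seven workflow prompt styles, so only those four are listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkDimension:
    """One evaluation dimension of the suite (hypothetical structure)."""
    name: str
    focus: str
    # Only details stated explicitly in the abstract are recorded here.
    named_items: tuple[str, ...] = ()


# Four of the seven prompt styles are named in the abstract; the other
# three are not listed there, so they are left out rather than guessed.
NAMED_PROMPT_STYLES = (
    "direct tool use",
    "semantic disambiguation",
    "conditional branching",
    "working memory",
)
TOTAL_PROMPT_STYLES = 7

SUITE = (
    BenchmarkDimension(
        "workflow",
        "seven prompt styles targeting distinct cognitive demands",
        NAMED_PROMPT_STYLES,
    ),
    BenchmarkDimension(
        "rag",
        "gated scoring isolating retrieval contributions to parameter selection",
    ),
    BenchmarkDimension(
        "hpc",
        "end-to-end ML training orchestration on a SLURM cluster",
    ),
)


def summarize(suite):
    """Return one 'name: focus' line per dimension."""
    return [f"{d.name}: {d.focus}" for d in suite]


if __name__ == "__main__":
    for line in summarize(SUITE):
        print(line)
```

The sketch records only what the abstract states; how each dimension is scored (for example, what exactly gates the RAG score) is not specified there and is deliberately not modeled.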
