DiagFlowBench: Evaluating How Language Models Handle Off-Procedure Inputs in Grounded Diagnostic Dialogue

ArXiv CS.AI, 2026-06-17. Authors: Guillermo Gil de Avalle, Laura Maruster, Shaina Raza, Christos Emmanouilidis
