QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents 文章

ArXiv CS.CL2026-05-27NEWSen作者: Ye Yuan, Rui Song, Weien Li, Zeyu Li, Haochen Liu, Xiangyu Kong, Changjiang Han, Yonghan Yang, Zichen Zhao, Zixuan Dong, Fuyuan Lyu, Bowei He, Haolun Wu, Jikun Kang, Xue Liu

查看原文 →

关系图谱

摘要

arXiv:2605.27068v1 Announce Type: new Abstract: Social deduction games have become a popular testbed for probing reasoning, deception, coordination, and belief modeling in Large Language Model (LLM) agents. However, most environments are scored only by game outcomes such as win rates and largely remain to text-only interaction, making it difficult to tell whether an agent's language is actually grounded in what it perceived and did, or to identify the failure modes underlying its behavior. To address this gap, we introduce QUACK, an open-source environment and evaluation framework for auditing the grounding of agent language in multimodal social reasoning. QUACK evaluates agents at three levels: game outcomes, behavioral trajectories, and utterance-level consistency.

QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents 文章

摘要

相关事件查看全部 (2)

相关公司查看全部 (5)

相关人物

相关产品查看全部 (10)

相关技术查看全部 (24)