How reliable are LLMs when it comes to playing dice? (Event)

Name: How reliable are LLMs when it comes to playing dice?
Start: 2026-06-08

PRODUCT_LAUNCH | 2026-06-08 | Impact: MEDIUM

How reliable are LLMs when it comes to playing dice? arXiv:2606.07515v1. Announce Type: new. Abstract: We investigate the probabilistic reasoning capabilities of large language models through a controlled benchmarking study on discrete probability problems. We constructed two datasets, one of standard exercises and one of counterintuitive exercises designed to trigger heuristic reasoning, and evaluated 8 state-of-the-art models, each tested with and without Chain-of-Thought prom

Artificial Intelligence
