Design and Evaluation of Multi-Agent AI Oracle Systems for Prediction Market Resolution 文章

ArXiv CS.AI2026-06-01NEWSen作者: Tarun Kota

摘要

arXiv:2605.30802v1 Announce Type: cross Abstract: Prediction markets aggregate collective intelligence to forecast uncertain events, but their utility depends on reliable outcome resolution. Existing oracle systems tradeoff fast but brittle automation against accurate but costly human arbitration. Single-LLM oracles achieve meaningful accuracy but inherit all failure modes of their underlying model with no self-correction mechanism. We evaluate whether multi-agent LLM architectures can improve oracle resolution accuracy over single-model baselines. We compare independent aggregation and deliberative consensus against single-LLM baselines (GPT-5 Nano, DeepSeek V3, and Llama-3.3-70B) on 1,189 resolved prediction market questions from KalshiBench. All agents share a common evidence layer through Exa, with retrieval filtered by publication date to isolate reasoning from retrieval quality. Independent aggregation with confidence-weighted voting achieves the highest accuracy at 83.

Design and Evaluation of Multi-Agent AI Oracle Systems for Prediction Market Resolution 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品查看全部 (5)

相关技术