Evaluating Reliability Asymmetries in Chinese Factual Search and AI Answers

arXiv cs.CL, 2026-06-02. Authors: Geng Liu, Li Feng, Mengxiao Zhu, Francesco Pierri

Abstract

arXiv:2602.22221v2. Search engines and AI-powered systems increasingly mediate access to factual information, yet their reliability remains difficult to evaluate in realistic information-seeking settings. We study this problem in the Chinese web ecosystem by constructing a query-based fact-checking dataset from real Chinese search logs and comparing nine systems across traditional search engines, standalone large language models, and search-integrated AI Overviews. Focusing on Chinese-language factual Yes/No questions, we evaluate whether systems provide correct, incorrect, or uncertain decisions against evidence-derived ground truth. We find that systems are similarly accurate when they provide definitive answers, but differ sharply in how often they do so. Conditional accuracy ranges from 73.2% to 78.9%, yet search engines answer definitively on over 83% of queries, while Qwen-Max does so on fewer than half.
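
The abstract separates two quantities: how often a system commits to a definitive answer, and how accurate it is when it does. A minimal sketch of that arithmetic follows, assuming each query receives one of three labels, "correct", "incorrect", or "uncertain". The function name, label strings, and toy counts are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter

# Hypothetical label set mirroring the three outcomes named in the abstract.
LABELS = {"correct", "incorrect", "uncertain"}


def summarize_decisions(decisions):
    """Return definitive-answer rate, conditional accuracy, and overall accuracy.

    decisions: iterable of strings drawn from LABELS, one per query.
    """
    counts = Counter(decisions)
    unknown = set(counts) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    total = sum(counts.values())
    definitive = counts["correct"] + counts["incorrect"]

    return {
        # Share of queries where the system gave a Yes/No decision at all.
        "definitive_rate": definitive / total if total else 0.0,
        # Accuracy restricted to queries with a definitive decision.
        "conditional_accuracy": counts["correct"] / definitive if definitive else 0.0,
        # Accuracy over all queries, counting "uncertain" as not correct.
        "overall_accuracy": counts["correct"] / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy data only: 6 correct, 2 incorrect, 2 uncertain.
    toy = ["correct"] * 6 + ["incorrect"] * 2 + ["uncertain"] * 2
    print(summarize_decisions(toy))
```

Under this reading, two systems can share nearly the same conditional accuracy while differing widely in overall accuracy, because the one that answers "uncertain" more often has a lower definitive-answer rate.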
