MedFact: Benchmarking the Fact-Checking Capabilities of Large Language Models on Chinese Medical Texts (Event)

Event type: PRODUCT_LAUNCH | Date: 2026-06-01 | Impact: MEDIUM

arXiv:2509.12440v3 Announce Type: replace

Abstract: Deploying Large Language Models (LLMs) in medical applications requires fact-checking capabilities to ensure patient safety and regulatory compliance. We introduce MedFact, a challenging Chinese medical fact-checking benchmark with 2,116 expert-annotated instances from diverse real-world texts, spanning 13 specialties, 8 error types, 4 writin