Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies

ArXiv CS.CL | 2026-05-29 | Authors: Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson

Abstract

arXiv:2605.29712v1

Grounded claim factuality checking is important for large language model (LLM) applications such as retrieval-augmented generation, as it helps users assess the correctness of generated outputs. Existing metrics based on entailment classifiers require dataset-specific threshold tuning, while LLM-based approaches often rely on direct prompting, which underutilises the reasoning capabilities of LLMs. We address this by formulating grounded claim factuality checking as a true/false reading comprehension task and prompting LLMs with explicit test-taking strategies for efficient reasoning. Our method reduces token usage by over 80% compared to unguided open-ended reasoning and performs competitively with more expensive alternatives across two factuality benchmarks, setting a new state of the art on one. To further reduce inference cost, we train small language models (SLMs) to replace LLMs in the checking pipeline.
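As a rough illustration of the true/false reading comprehension framing, the sketch below builds a strategy-guided prompt and parses a verdict from a model reply. It is not the authors' prompt: the abstract does not list the strategies, so the entries in `STRATEGIES`, the prompt wording, the `Answer:` output convention, and the names `build_prompt` and `parse_verdict` are all hypothetical. No model is called; the reply is assumed to come from whatever LLM or SLM client is in use.

```python
import re
from typing import Optional

# Hypothetical strategy hints. The abstract does not state the actual
# test-taking strategies, so these are placeholders for illustration only.
STRATEGIES = [
    "Read the statement first and identify the key details to verify.",
    "Scan the passage for the sentences that mention those details.",
    "Answer False if any detail is contradicted or not supported by the passage.",
]


def build_prompt(passage: str, claim: str) -> str:
    """Frame grounded claim checking as a true/false reading comprehension item."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(STRATEGIES, start=1))
    return (
        "Reading comprehension: decide whether the statement is True or False "
        "according to the passage only.\n\n"
        f"Strategies:\n{steps}\n\n"
        f"Passage:\n{passage}\n\n"
        f"Statement:\n{claim}\n\n"
        "Finish with one line of the form 'Answer: True' or 'Answer: False'."
    )


def parse_verdict(response: str) -> Optional[bool]:
    """Return the last True/False verdict in a model reply, or None if absent."""
    matches = re.findall(r"answer:\s*(true|false)", response, flags=re.IGNORECASE)
    if not matches:
        return None
    return matches[-1].lower() == "true"
```

Taking the last `Answer:` line lets the model write a short reasoning trace first without confusing the parser, and returning `None` when no verdict is found keeps malformed replies from being counted as either label.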