MultiHaluDet: Multilingual Hallucination Detection via LLM Hidden State Probing

arXiv cs.CL, 2026-05-26. Authors: Riasad Alvi, Nurul Labib Sayeedi, Md. Faiyaz Abdullah Sayeedi

Abstract

arXiv:2605.24919v1

Hallucinations in Large Language Models (LLMs) represent a critical barrier to their reliable deployment, a vulnerability heavily exacerbated in non-English and resource-constrained contexts. Existing detection approaches that rely on output confidence heuristics or single-layer internal representations frequently fail to capture deep, complex factual inconsistencies across diverse languages. To address this, we introduce MultiHaluDet, a novel three-stage stacking framework that detects multilingual hallucinations by probing the full hidden state trajectories of frozen LLMs without requiring language-specific fine-tuning. Our method extracts sequential features across multiple layers and processes them via a hybrid architecture using multi-scale attention and self-attention pooling.
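To make the idea of pooling a hidden state trajectory concrete, the following is a minimal sketch of attention pooling over tokens and then over layers. It is not the authors' implementation: the abstract does not specify the architecture, so the function names (`attention_pool`, `pool_trajectory`), the array shapes, and the single query vector per pooling step are all assumptions, and the multi-scale attention and stacking stages are omitted. Random arrays stand in for the hidden states of a frozen LLM, and the query vectors, which would be learned in practice, are random here.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def attention_pool(states, query):
    """Pool an (n, d) sequence into one (d,) vector.

    Each row is scored against a query vector (an assumption: in practice
    this vector would be a learned parameter), and the rows are averaged
    with the resulting softmax weights.
    """
    scores = states @ query / np.sqrt(states.shape[-1])  # shape (n,)
    weights = softmax(scores)                            # shape (n,)
    return weights @ states, weights


def pool_trajectory(hidden_states, token_query, layer_query):
    """Reduce (num_layers, seq_len, hidden_dim) states to one feature vector.

    Step 1 pools over tokens within each layer; step 2 pools over the
    resulting per-layer vectors, i.e. over the layer trajectory.
    """
    per_layer = np.stack(
        [attention_pool(layer, token_query)[0] for layer in hidden_states]
    )  # shape (num_layers, hidden_dim)
    return attention_pool(per_layer, layer_query)


# Hypothetical stand-in for frozen-LLM hidden states: 4 layers, 6 tokens, 8 dims.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(4, 6, 8))
token_query = rng.normal(size=8)
layer_query = rng.normal(size=8)

features, layer_weights = pool_trajectory(hidden_states, token_query, layer_query)
```

The resulting `features` vector could then be passed to any downstream classifier that labels a response as hallucinated or not, and `layer_weights` indicates how much each layer contributed to the pooled representation.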