JuICE: A Benchmark for Evaluating LLM-Judge in Identifying Cultural Errors

ArXiv CS.CL · 2026-05-27 · Authors: Jiho Jin, Junho Myung, Juhyun Oh, Junyeong Park, Rifki Afina Putri, Sunipa Dev, Vinodkumar Prabhakaran, Alice Oh
