GRADE: Generalizable Reasoning-Aware Dialogue Evaluation for AI Tutors
Category: Open source | Date: 2026-05-28 | Impact: Medium
arXiv:2605.27866v1 (Announce Type: new)
Abstract: Evaluating AI tutor responses requires more than factual correctness: tutors must identify mistakes, locate errors, provide guidance, and offer actionable next steps. We present GRADE, a systematic study of open-source models for pedagogical ability assessment in student-tutor dialogues. Building on the BEA 2025 TutorMind setting, we evaluate 120 configurations across five lang
Source: arXiv cs.CL, 2026-05-28