A Multi-Probe Audit of Clinical-Interview Depression Detection Benchmarks

arXiv cs.CL | 2026-05-26 | Authors: Takehiro Ishikawa, Jon Duke

Abstract

arXiv:2605.23977v1 (announce type: new)

This paper audits benchmark evaluation in clinical-interview depression detection through four complementary probes across DAIC/E-DAIC, CMDC, ANDROIDS, MODMA, and PDCH. First, we re-evaluate E-DAIC under strict subject-disjoint leave-one-subject-out cross-validation. A lightweight hybrid text-plus-LLM-score model reaches macro-F1 = 0.723 (the highest reported under this protocol, to our knowledge), providing a conservative out-of-fold reference point that does not depend on the privileged official holdout. Second, we test whether the E-DAIC official split supports fine-grained leaderboard rankings by sweeping 96 model configurations across modality bundles, pooling strategies, and learners.
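To make the first probe concrete, the following is a minimal sketch of subject-disjoint leave-one-subject-out evaluation with macro-F1 computed over pooled out-of-fold predictions. It is not the authors' pipeline: the one-dimensional threshold learner, the helper names (`loso_splits`, `fit_threshold`, `loso_macro_f1`), and the toy data are all hypothetical stand-ins, and pooling predictions across folds before scoring is an assumption about how the metric is aggregated.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    by_subject = defaultdict(list)
    for i, sid in enumerate(subject_ids):
        by_subject[sid].append(i)
    for held_out, test_idx in by_subject.items():
        # Every row from the held-out subject is excluded from training.
        train_idx = [i for i, sid in enumerate(subject_ids) if sid != held_out]
        yield held_out, train_idx, test_idx


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def fit_threshold(scores, labels):
    """Hypothetical toy learner: midpoint between the two class means."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def loso_macro_f1(scores, labels, subject_ids):
    """Fit on all other subjects, predict the held-out one, score the pool."""
    out_of_fold = [None] * len(labels)
    for _, train_idx, test_idx in loso_splits(subject_ids):
        threshold = fit_threshold(
            [scores[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        for i in test_idx:
            out_of_fold[i] = int(scores[i] > threshold)
    return macro_f1(labels, out_of_fold)


# Toy data (fabricated for illustration only): one score per subject.
toy_subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]
toy_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_result = loso_macro_f1(toy_scores, toy_labels, toy_subjects)
```

Because each prediction comes from a model that never saw the held-out subject, the pooled score serves as an out-of-fold reference that does not rely on a fixed holdout split.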
