I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors

ArXiv CS.AI, 2026-05-28
Authors: Lelia Erscoi (Computational Speech Group, University of Eastern Finland), Tomi Kinnunen (Computational Speech Group, University of Eastern Finland)

Abstract

arXiv:2605.28064v1 (announce type: cross)

Automatic deepfake detection has received considerable research attention, yet the socio-technical environment in which humans actually encounter synthetic speech remains poorly understood. We investigate voice deepfake detection as a perceptual and contextual process, presenting a localization task in which 47 participants marked suspected synthetic segments across authentic, fully synthetic, and partially synthetic utterances under three manipulated trust cues: instructional framing, affective priming, and provenance labeling. Participants provided quality ratings on mechanicalness, expressiveness, intelligibility, clarity, calmness, and confidence of evaluation. Utterance class was the primary determinant of detection accuracy and perceptual quality; trust cues produced no main effects but motivated detection behavior. Fully synthetic speech was detected at below-chance levels.