Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech (paper)

2012 | IEEE Transactions on Audio, Speech, and Language Processing | Cited by 232

Speech Recognition and Synthesis | Speech and Audio Processing | Music and Audio Processing

Abstract

In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic speech. The SV systems are based on either the Gaussian mixture model–universal background model (GMM-UBM) or support vector machine (SVM) using GMM supervectors. We use a hidden Markov model (HMM)-based text-to-speech (TTS) synthesizer, which can synthesize speech for a target speaker using small amounts of training data through model adaptation of an average voice or background model. Although the SV systems have a very low equal error rate (EER), when tested with synthetic speech generated from speaker models derived from the Wall Street Journal (WSJ) speech corpus, over 81% of the matched claims are accepted. This result suggests vulnerability in SV systems and thus a need to accurately detect synthetic speech. We propose a new feature based on relative phase shift (RPS), demonstrate reliable detection of synthetic speech, and show how this classifier can be used to improve security of SV systems.
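The two quantities the abstract reports, the equal error rate (EER) and the share of synthetic-speech claims that are accepted, can be illustrated with a minimal sketch. It assumes each verification trial yields a single score and that a claim is accepted when the score is at or above a threshold. The function names and all scores below are hypothetical and made up for illustration; this is not the paper's evaluation protocol or data.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold.

    Assumption: a claim is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def acceptance_rate(scores, threshold):
    """Fraction of trials whose score passes the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Toy, made-up scores (NOT results from the paper).
genuine = [2.1, 1.8, 2.5, 1.2, 3.0]      # true speaker, human speech
impostor = [-1.0, -0.5, 0.3, -2.0, 0.1]  # other speakers, human speech
synthetic = [1.5, 0.9, 2.2, -0.2, 1.9]   # synthetic speech claiming the target

eer, threshold = equal_error_rate(genuine, impostor)
spoof_accept = acceptance_rate(synthetic, threshold)
print(f"EER={eer:.2f} at threshold={threshold}, synthetic accepted={spoof_accept:.0%}")
```

The threshold is fixed using human speech only and then applied to the synthetic trials, which is why a system can show a very low EER and still accept a large share of synthetic claims.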

Authors (4 of 5 listed)

Ibon Saratxaga

Inma Hernáez

Junichi Yamagishi

Michael Pucher
