Applying the harmonic plus noise model in concatenative speech synthesis (paper)

2001, IEEE Transactions on Speech and Audio Processing. Cited by 325

Speech Recognition and Synthesis; Phonetics and Phonology Research; Speech and dialogue systems

Abstract

This paper describes the application of the harmonic plus noise model (HNM) for concatenative text-to-speech (TTS) synthesis. In the context of HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. The decomposition of a speech signal into these two components allows for more natural-sounding modifications of the signal (e.g., by using different and better adapted schemes to modify each component). The parametric representation of speech using HNM provides a straightforward way of smoothing discontinuities of acoustic units around concatenation points. Formal listening tests have shown that HNM provides high-quality speech synthesis while outperforming other models for synthesis (e.g., TD-PSOLA) in intelligibility, naturalness, and pleasantness.
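The decomposition described in the abstract can be illustrated with a minimal sketch that builds one voiced frame as a sum of harmonics of the pitch plus time-modulated noise. This is not the paper's implementation: the function name `synthesize_hnm_frame`, the first-difference high-pass filter, the raised-cosine pitch-synchronous envelope, and all parameter values are assumptions chosen only to keep the example short.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, noise_gain,
                         sample_rate=16000, num_samples=320, seed=0):
    """Return (harmonic, noise, total) sample lists for one voiced frame.

    Hypothetical helper: harmonic part = sum of cosines at multiples of f0,
    noise part = crudely high-passed white noise shaped by a pitch-synchronous
    envelope. Both shaping choices are illustrative assumptions.
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch repeatable
    period = sample_rate / f0          # pitch period in samples
    harmonic, noise = [], []
    prev_white = 0.0
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic component: k-th harmonic sits at k * f0.
        h = sum(a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
                for k, (a, p) in enumerate(zip(amplitudes, phases)))
        # Noise component: first difference acts as a simple high-pass filter.
        white = rng.gauss(0.0, 1.0)
        high_passed = white - prev_white
        prev_white = white
        # Envelope in [0, 1] that repeats once per pitch period.
        envelope = 0.5 * (1.0 + math.cos(2.0 * math.pi * n / period))
        harmonic.append(h)
        noise.append(noise_gain * envelope * high_passed)
    total = [h + z for h, z in zip(harmonic, noise)]
    return harmonic, noise, total


harmonic, noise, total = synthesize_hnm_frame(
    f0=100.0, amplitudes=[1.0, 0.5], phases=[0.0, 0.0], noise_gain=0.1)
```

Keeping the two components in separate lists mirrors the point made in the abstract: each part can be modified with its own scheme (for example, changing `f0` for the harmonics while leaving the noise statistics alone) before the parts are summed.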

Author

Yannis Stylianou
