Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited (paper)

2002, cited by 220
Speech and Audio Processing; Phonetics and Phonology Research; Speech Recognition and Synthesis

Abstract

A simple new procedure called STRAIGHT (speech transformation and representation using adaptive interpolation of weighted spectrum) has been developed. STRAIGHT uses pitch-adaptive spectral analysis combined with a surface reconstruction method in the time-frequency region, and an excitation source design based on phase manipulation. It preserves the bilinear surface in the time-frequency region and allows for over 600% manipulation of such speech parameters as pitch, vocal tract length, and speaking rate, without further degradation due to the parameter manipulation.
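
To make the idea of pitch-adaptive analysis concrete, the sketch below builds an analysis window whose length follows the local fundamental period, so that a lower-pitched frame gets a longer window. It is only an illustration and not the STRAIGHT algorithm itself: the function name `pitch_adaptive_window`, the Gaussian shape, the default of two pitch periods, and the width factor 2.5 are all assumptions chosen for the example.

```python
import math

def pitch_adaptive_window(f0_hz, fs_hz, periods=2.0):
    """Return a Gaussian window spanning roughly `periods` pitch periods.

    Hypothetical helper for illustration: the window shape and the
    constants are assumptions, not values taken from the paper.
    """
    t0 = 1.0 / f0_hz                                 # fundamental period in seconds
    half = int(round(periods * t0 * fs_hz / 2.0))    # half-length in samples
    sigma = half / 2.5                               # assumed width so the edges decay
    # Symmetric window centred on sample index `half`, peak value 1.0.
    return [math.exp(-0.5 * (n / sigma) ** 2) for n in range(-half, half + 1)]

# A 100 Hz frame gets a window about twice as long as a 200 Hz frame.
low_pitch = pitch_adaptive_window(100.0, 16000.0)
high_pitch = pitch_adaptive_window(200.0, 16000.0)
print(len(low_pitch), len(high_pitch))
```

Tying the window length to the pitch period keeps the number of glottal cycles per frame roughly constant, which is the general motivation for making the spectral analysis pitch-adaptive.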