A parametric approach to vocal tract length normalization 论文
2002引用 260
Speech Recognition and SynthesisSpeech and Audio ProcessingMusic and Audio Processing
摘要
Differences in vocal tract size among individual speakers contribute to the variability of speech waveforms. The first-order effect of a difference in vocal tract length is a scaling of the frequency axis; a female speaker, for example, exhibits formants roughly 20% higher than the formants of from a male speaker, with the differences most severe in open vocal tract configurations. We describe a parametric method of normalisation which counteracts the effect of varied vocal tract length. The method is shown to be effective across a wide range of recognition systems and paradigms, but is particularly helpful in the case of a small amount of training data.