Speech probability distribution (paper)

2003, IEEE Signal Processing Letters, cited by 298
Speech and Audio Processing; Blind Source Separation Techniques; Speech Recognition and Synthesis

Abstract

It is demonstrated that the distribution of speech samples is well described by a Laplacian distribution (LD). The widely known speech distributions, i.e., LD, Gaussian distribution (GD), generalized GD, and gamma distribution, are tested as four hypotheses, and it is proved that speech samples during voice activity intervals are Laplacian random variables. A decorrelation transformation is then applied to speech samples to approximate their multivariate distribution. To do this, speech is decomposed using an adaptive Karhunen-Loeve transform or a discrete cosine transform. Then, the distributions of speech components in the decorrelated domains are investigated. Experimental evaluations prove that the statistics of speech signals resemble a multivariate LD. All marginal distributions of speech are accurately described by an LD in the decorrelated domains. While the energies of speech components are time-varying, the shape of their distribution remains Laplacian.
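The abstract does not say which test statistic was used to compare the hypotheses, so the following is only a minimal sketch of one common approach: fit each candidate by maximum likelihood and compare the mean log-likelihoods. It covers just two of the four candidates (LD and GD) and runs on synthetic samples rather than speech. The function names `gaussian_loglik`, `laplacian_loglik`, and `sample_laplacian` are hypothetical and chosen for illustration.

```python
import math
import random
import statistics


def gaussian_loglik(x):
    """Mean log-likelihood of x under a Gaussian with ML-fitted mean and variance."""
    mu = statistics.fmean(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    # At the ML estimate, the quadratic term averages to exactly 1/2.
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5


def laplacian_loglik(x):
    """Mean log-likelihood of x under a Laplacian with ML-fitted location and scale."""
    m = statistics.median(x)                      # ML location is the median
    b = sum(abs(v - m) for v in x) / len(x)       # ML scale is the mean absolute deviation
    # At the ML estimate, the |x - m| / b term averages to exactly 1.
    return -math.log(2.0 * b) - 1.0


def sample_laplacian(n, scale, rng):
    """Draw n zero-mean Laplacian samples (synthetic stand-in for speech samples)."""
    # The difference of two i.i.d. exponentials with mean `scale` is Laplacian.
    return [rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = sample_laplacian(20000, 1.0, rng)
    print("Laplacian fit:", laplacian_loglik(x))
    print("Gaussian fit: ", gaussian_loglik(x))
```

With real data, `x` would be samples taken from voice-active intervals, or the coefficients of one component after a decorrelating transform such as the DCT. The candidate with the higher mean log-likelihood is the better fit under this criterion. Extending the comparison to the generalized Gaussian and gamma hypotheses would need a numerical fit of their shape parameters, which this sketch omits.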