Within-class covariance normalization for SVM-based speaker recognition (paper)

2006 · Cited by 414
Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Abstract

This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.

Index Terms: kernel machines, support vector machines, feature normalization, generalized linear kernels, speaker recognition.
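The procedure summarized in the abstract (PCA split, WCCN in the PCA space, concatenation with a weighted PCA-complement) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `fit_wccn_pca` and `apply_wccn_pca`, the complement weight `beta`, the small `ridge` term added for numerical stability, and the choice to keep the complement in the original coordinates are all assumptions made for illustration. The abstract does not specify the weighting value or any smoothing.

```python
import numpy as np

def fit_wccn_pca(X, labels, k, ridge=1e-6):
    """Estimate a PCA basis and a WCCN transform in the PCA space.

    X: (n, d) training features; labels: (n,) speaker ids; k: PCA dimension.
    ridge is an assumed stabilizer, not something stated in the abstract.
    """
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Top-k principal directions from the SVD of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:k].T                      # (d, k), orthonormal columns
    Z = Xc @ U                        # coordinates in the PCA space
    # Within-class covariance: pooled scatter around each speaker's mean.
    W = np.zeros((k, k))
    for s in np.unique(labels):
        D = Z[labels == s] - Z[labels == s].mean(axis=0)
        W += D.T @ D
    W = W / len(Z) + ridge * np.eye(k)
    # Cholesky factor A with A @ A.T == inv(W); A.T @ z whitens within-class scatter.
    A = np.linalg.cholesky(np.linalg.inv(W))
    return mean, U, A

def apply_wccn_pca(X, mean, U, A, beta):
    """Return [WCCN-normalized PCA coordinates, beta * PCA-complement]."""
    Xc = X - mean
    Z = Xc @ U
    R = Xc - Z @ U.T                  # part of each vector outside the PCA space
    return np.hstack([Z @ A, beta * R])
```

A linear kernel on the output of `apply_wccn_pca` then combines a WCCN-normalized inner product in the PCA space with a scaled inner product in the complement. Setting `beta=0` corresponds to discarding the PCA-complement, the variant the abstract compares against.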