Deep learning for monaural speech separation (paper)

2014 · Cited by 435

Speech and Audio Processing, Speech Recognition and Synthesis, Music and Audio Processing

Abstract

Monaural source separation is useful for many real-world applications though it is a challenging problem. In this paper, we study deep learning for monaural speech separation. We propose the joint optimization of the deep learning models (deep neural networks and recurrent neural networks) with an extra masking layer, which enforces a reconstruction constraint. Moreover, we explore a discriminative training criterion for the neural networks to further enhance the separation performance. We evaluate our approaches using the TIMIT speech corpus for a monaural speech separation task. Our proposed models achieve about 3.8∼4.9 dB SIR gain compared to NMF models, while maintaining better SDRs and SARs.
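The masking layer and the discriminative criterion mentioned in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything here is an assumption: the masking layer is taken to be a soft time-frequency mask that rescales two raw network outputs so they sum to the mixture magnitude, and the discriminative criterion is taken to be a squared-error loss that also penalizes similarity to the other source. The names `soft_mask_layer`, `discriminative_loss`, `EPS`, and `gamma` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

EPS = 1e-8  # assumed small constant; guards against 0/0 in silent bins


def soft_mask_layer(
    y1_hat: Sequence[float], y2_hat: Sequence[float], mixture: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Rescale two raw magnitude estimates so that they sum to the mixture.

    Assumed form: m1 = |y1| / (|y1| + |y2|), s1 = m1 * z, s2 = (1 - m1) * z.
    """
    s1, s2 = [], []
    for a, b, z in zip(y1_hat, y2_hat, mixture):
        m1 = abs(a) / (abs(a) + abs(b) + EPS)  # mask value in [0, 1)
        s1.append(m1 * z)
        s2.append(z - m1 * z)  # remainder, so s1 + s2 reconstructs z
    return s1, s2


def sq_err(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


def discriminative_loss(
    s1: Sequence[float], s2: Sequence[float],
    t1: Sequence[float], t2: Sequence[float],
    gamma: float = 0.05,  # hypothetical weight on the cross-source terms
) -> float:
    """Reward closeness to the own target and distance from the other target."""
    return (sq_err(s1, t1) - gamma * sq_err(s1, t2)
            + sq_err(s2, t2) - gamma * sq_err(s2, t1))


# One frame with two frequency bins (made-up numbers).
est1, est2 = soft_mask_layer([1.0, 3.0], [1.0, 1.0], [2.0, 4.0])
loss = discriminative_loss(est1, est2, [1.0, 3.0], [1.0, 1.0])
```

Because the two masked outputs always add up to the mixture, the reconstruction constraint holds by construction rather than being learned. Setting `gamma` to zero reduces the loss to a plain squared error on both sources.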

Authors

Paris Smaragdis

Mark Hasegawa‐Johnson

Minje Kim

Po Sen Huang
