Merlin: An Open Source Neural Network Speech Synthesis System 论文
2016引用 298
Speech Recognition and SynthesisNatural Language Processing TechniquesTopic Modeling
摘要
We introduce the Merlin speech synthesis toolkit for neural network-based speech synthesis.The system takes linguistic features as input, and employs neural networks to predict acoustic features, which are then passed to a vocoder to produce the speech waveform.Various neural network architectures are implemented, including a standard feedforward neural network, mixture density neural network, recurrent neural network (RNN), long short-term memory (LSTM) recurrent neural network, amongst others.The toolkit is Open Source, written in Python, and is extensible.This paper briefly describes the system, and provides some benchmarking results on a freelyavailable corpus.