Chunyi Guo, Weiqian Liang, Ming Fan, and Kejun Liu
bilinear model, speaker adaptation, singular value decomposition (SVD), speech recognition
Speaker adaptation is a key technology for practical applications of speech recognition. The traditional bilinear model-based method decomposes a matrix composed of speaker-dependent HMMs into a style factor specific to each speaker and a content factor shared across all speakers. In this paper, we present a new framework for speaker adaptation using a uniform-style bilinear model, which decomposes the originally trained speaker-independent HMMs into a single style matrix that expresses the styles of all speakers and a content matrix that describes the speech content classes. Given adaptation data from a new speaker, the elements of the existing style matrix are adjusted to fit that speaker, rather than a new style matrix being generated. Experimental results on connected Chinese digit recognition demonstrate the effectiveness of the new method.
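The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the speaker-independent HMM means are stacked as the columns of a matrix `Y` (one column per content class) and factorized by truncated SVD into a style matrix `A` and a content matrix `B`. The function names `factorize` and `adapt_style`, the model order `J`, and the least-squares update used for adaptation are all hypothetical choices made for illustration; the abstract does not state how the style matrix is actually trained.

```python
import numpy as np

def factorize(Y, J):
    """Rank-J bilinear factorization Y ~= A @ B via truncated SVD.

    Y: (D, C) matrix, one column of stacked mean parameters per content class.
    Returns the style matrix A (D, J) and the content matrix B (J, C).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]   # scale the leading left singular vectors
    B = Vt[:J, :]          # leading right singular vectors
    return A, B

def adapt_style(A, B, Y_obs, obs_idx):
    """Adjust the existing style matrix toward a new speaker.

    Assumption: a least-squares correction stands in for the paper's
    training procedure. Only the content classes listed in obs_idx were
    observed; Y_obs holds their estimated columns for the new speaker.
    """
    B_obs = B[:, obs_idx]
    residual = Y_obs - A @ B_obs              # mismatch on observed classes
    return A + residual @ np.linalg.pinv(B_obs)

# Toy data: 6-dimensional parameters, 10 content classes, model order 3.
rng = np.random.default_rng(0)
D, C, J = 6, 10, 3
Y_si = rng.normal(size=(D, J)) @ rng.normal(size=(J, C))
A, B = factorize(Y_si, J)

# New speaker observed on two content classes only.
obs_idx = [0, 1]
Y_new = rng.normal(size=(D, J)) @ B
A_new = adapt_style(A, B, Y_new[:, obs_idx], obs_idx)
Y_adapted = A_new @ B   # adapted parameters for all classes, seen or unseen
```

Because the content matrix `B` is held fixed, the adjusted style matrix yields adapted parameters for content classes that never appeared in the adaptation data, which is the usual motivation for a style/content separation.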