An Optimized Soft Cutting Approach to Derive Syllables from Words in Text to Speech Synthesizer

S.P. Kawachale and J.S. Chitode (India)


Consonant-Vowel (CV) Structures, CV Structure Breaking Rules, Syllables, Textual Database, and Audio Database.


A syllable-based speech synthesizer is proposed in this paper. A text-to-speech (TTS) synthesizer must be capable of producing speech automatically by storing small segments of speech and splicing them together as required. There are two basic methods of speech synthesis: (1) Rule-based synthesis, which uses the rules of a particular language to generate synthetic speech; and (2) Dictionary-based synthesis, which stores the most commonly used words in an audio database. Rule-based synthesis has the drawback of reduced naturalness of the synthetic speech. Dictionary-based synthesis has the drawback of a large database, since every word must be stored. A syllable-based speech synthesizer, in contrast, can generate a larger number of words from a very small database, because different syllables can be combined to form new words; hence the original database need not be large. Soft cutting of syllables yields the 'from' and 'to' sample numbers that mark where each syllable lies within a recorded word, and these locations can then be stored in the database. In this way the database becomes more efficient and can generate more words through concatenation of the newly formed syllables.
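
The idea of soft cutting can be illustrated with a minimal sketch. It assumes that each syllable is stored only as a 'from' and 'to' sample position inside a recorded word, and that a new word is formed by concatenating the sample ranges of its syllables. The example words, the sample values, the names `recordings`, `syllable_index`, `fetch_syllable` and `synthesize`, and the choice of an exclusive 'to' position are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical recorded words: word -> audio samples (toy integers, not real audio).
recordings: Dict[str, List[int]] = {
    "kamal": list(range(0, 10)),
    "nayan": list(range(100, 112)),
}

# Soft-cut index: syllable -> (source word, 'from' sample, 'to' sample).
# Assumption: the 'to' position is exclusive. No syllable audio is copied out;
# only the two sample positions are stored.
syllable_index: Dict[str, Tuple[str, int, int]] = {
    "ka": ("kamal", 0, 4),
    "mal": ("kamal", 4, 10),
    "na": ("nayan", 0, 5),
    "yan": ("nayan", 5, 12),
}


def fetch_syllable(name: str) -> List[int]:
    """Return the samples of one syllable by slicing its source recording."""
    word, start, end = syllable_index[name]
    return recordings[word][start:end]


def synthesize(syllables: List[str]) -> List[int]:
    """Concatenate the sample ranges of the given syllables into one signal."""
    output: List[int] = []
    for name in syllables:
        output.extend(fetch_syllable(name))
    return output


# A word that was never recorded, built from syllables of two recorded words.
new_word = synthesize(["na", "mal"])
```

In this sketch two recorded words supply four syllables, which can be recombined into words that were never recorded, while the database holds only the original recordings and a small table of sample positions.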

