To stimulate research and innovation and to encourage the use of Kazakh in the digital domain, ISSAI developed a Kazakh speech dataset called “KazakhTTS” in 2021.
KazakhTTS is a high-quality open-source speech dataset that contains over 90 hours of audio recorded by professional speakers (male and female voices). The dataset has attracted great interest and has been downloaded more than 500 times in less than a year by academia and industry representatives.
Building on this success, we present a new version of the dataset, KazakhTTS2, which includes more data, speakers, and topics. Specifically, we have increased the data size from 90 hours to 271 hours, added three new professional speakers (two female and one male) with over 25 hours of transcribed data each, and diversified the topic coverage with a book and Wikipedia articles.
The KazakhTTS2 dataset can be used to develop Kazakh text-to-speech models for numerous applications, such as interactive smart assistants, navigation systems, announcement systems, and assistive technologies for people with special needs. Like the first version, KazakhTTS2 is freely available to both academic researchers and industry practitioners from the ISSAI website.
To demonstrate the utility of the KazakhTTS2 dataset, ISSAI has developed a demo program for Kazakh speech synthesis. The demo supports five different voices.
Instructions for Kazakh TTS demo:
ISSAI invites academic and industrial organizations to download the dataset and contribute to the use of the Kazakh language in the digital world.
Please note: this is the KazakhTTS DATASET, not a demo of the Kazakh text-to-speech conversion technology.