30 September 2020

Kazakh Speech Corpus featured in the news

A recent announcement about the Kazakh Speech Corpus and Automated Speech Recognition projects has been covered by local news portals including Egemen.kz, Informburo.kz, Elorda.info, inform.kz, today.kz, ainews.kz, Kazakh-tv, and other channels.

The articles cover the main information about the project, its aims, and its results. They also include links to try the Automated Speech Recognition system and to access the Kazakh Speech Corpus dataset.

In addition to the local media coverage, the project’s authors, ISSAI postdoctoral scholar Yerbolat Khassanov and computer engineer Almas Mirzhakhmetov, gave an interview to the national news channel “Khabar”, in which they explained the project’s details and demonstrated how it works.

As a reminder, on the occasion of its anniversary, ISSAI announced its new development: the Kazakh Speech Corpus and Automated Speech Recognition project.

The project was launched to support the use of Kazakh online, in interaction with computers such as digital assistants, and in smart home applications.

The ISSAI team has created the world’s largest digital dataset of Kazakh speech, using web-based technology to record and annotate over 300 hours of spoken Kazakh collected from more than 2000 native speakers. The dataset was then used to develop speech recognition and speech synthesis for the Kazakh language. Such technologies are used in virtual assistants, such as Siri and Alexa, and in voice- or text-enabled applications, and will be of great benefit to people with special needs.

The dataset is available online, and members of the public can test the performance of the Kazakh speech recognition system for themselves.