Publication

Speech Recognition for Turkic Languages Using Cross-Lingual Transfer Learning from Kazakh

Abstract:

This paper investigates the effectiveness of transfer learning for building automatic speech recognition (ASR) models for nine Turkic languages (Azerbaijani, Bashkir, Chuvash, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek) by leveraging large-scale training data from Kazakh. The performance of the models built using transfer learning from Kazakh was compared with that of models for three non-Turkic languages (Indonesian, Japanese, and Swedish) to which transfer learning from Kazakh was also applied. We also compared the performance of these models with that of models trained on English data. A total of 64 models were created. Most of the models built using transfer learning from Kazakh outperformed their monolingual baselines. The most notable improvement was observed for the Sakha model, which achieved reductions of 45.5% in word error rate and 22.8% in character error rate on the test set. The datasets and code used to train the models are available for download from https://github.com/IS2AI/CLTL_Turkic_ASR.
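
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal Python sketch of how word error rate (WER) and character error rate (CER) are commonly computed from an edit distance, and how a percentage reduction over a baseline can be derived. It is not taken from the paper's repository. All function names are hypothetical, and treating the reported reductions as relative (rather than absolute) is an assumption, since the abstract does not say which is meant.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate; assumes a non-empty, whitespace-tokenized reference."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref: str, hyp: str) -> float:
    """Character error rate; assumes a non-empty reference (spaces count)."""
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline` (assumed relative)."""
    return (baseline - improved) / baseline * 100
```

For example, under these definitions a hypothetical baseline WER of 0.40 and a transfer-learning WER of 0.30 would correspond to a 25% relative reduction. These numbers are illustrative only and are not results from the paper.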

Information about the publication

Authors:

Daniil Orel, Rustem Yeshpanov, Huseyin Atakan Varol