On December 10, 2024, the Institute of Smart Systems and Artificial Intelligence (ISSAI) at Nazarbayev University unveiled the Kazakh Large Language Model (ISSAI KAZ-LLM), marking a pivotal milestone in Kazakhstan’s journey into the global AI arena. The model represents Kazakhstan’s commitment to innovation, self-reliance, and the growth of its technological ecosystem.
Tailored to the country’s unique multilingual and multicultural context, the open-source ISSAI KAZ-LLM is designed for Kazakh, Russian, and English, with additional support for Turkish, bridging linguistic gaps and advancing generative AI for lower-resourced languages.
Key Features and Achievements
- Locally Developed Excellence: Built by a team of Kazakhstani researchers at ISSAI, the project gave local talent hands-on experience, bolstering the nation’s AI capabilities.
- Sophisticated AI Capabilities: Available in 8-billion and 70-billion parameter versions built on Meta’s Llama architecture, the model is optimized for both high-performance systems and resource-constrained environments.
- Vast Multilingual Training Dataset: Over 150 billion tokens were collected, curated, synthesized and translated by the ISSAI team, ensuring robust language performance.
- Benchmarked Leadership: ISSAI KAZ-LLM excels in Kazakh-language performance and achieves competitive results in Russian and English, rivaling global AI leaders.
- Open Access: Released under the CC-BY-NC license, six model versions are available for non-commercial use on Hugging Face, fostering global academic and research collaboration.
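To put the two model sizes in perspective, the following is a back-of-the-envelope sketch of how much memory the weights alone would occupy. The nominal parameter counts (8e9 and 70e9) come from the announcement; the numeric precisions listed are illustrative assumptions rather than details of the release, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: the precisions below are illustrative, not release details.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Nominal parameter counts for the two announced sizes.
MODEL_SIZES = {"8B": 8e9, "70B": 70e9}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def memory_table() -> dict:
    """Map (model size, precision) to the estimated weight memory in GB."""
    return {
        (name, precision): weight_memory_gb(count, precision)
        for name, count in MODEL_SIZES.items()
        for precision in BYTES_PER_PARAM
    }


if __name__ == "__main__":
    for (name, precision), gb in memory_table().items():
        print(f"{name} {precision}: ~{gb:.0f} GB")
```

Under these assumptions, the 8-billion-parameter version needs roughly 16 GB for its weights at 16-bit precision, while the 70-billion-parameter version needs roughly 140 GB, which is one way to read the split between resource-constrained and high-performance deployments.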
The project not only introduces a cutting-edge AI tool but also fosters the growth of Kazakhstan’s AI workforce. Researchers took part in every stage of the process, from data preparation to model deployment, building a foundation for sustainable AI innovation. Collaborations with leading Kazakhstani institutions produced benchmarking tools and datasets, which were adapted for Kazakh through a combination of neural and human translation.
ISSAI plans to develop next-generation AI systems, including language-vision models, and expand the models to support additional Turkic and regional languages. This effort aims to strengthen regional ties, promote language inclusion, and foster meaningful economic and technological impact in Kazakhstan and beyond.
The ISSAI KAZ-LLM project was made possible by the generous support of the NU and NIS Foundation, Astana Hub, and QazCode (Beeline), with development carried out independently of public funding. We extend our heartfelt gratitude to these sponsors for their confidence in this initiative. We also thank Nazarbayev University, whose dedication to fostering innovation and intellectual growth has been instrumental in achieving this milestone.
Contact us at issai@nu.edu.kz to learn more or discuss partnership opportunities.