On May 22, 2026, the Institute of Smart Systems and Artificial Intelligence (ISSAI) at Nazarbayev University hosted the LLM Workshop dedicated to the development of the Kazakh language in the era of digitalization and artificial intelligence.
The workshop took place in the Senate Hall of Nazarbayev University and brought together experts, researchers, and representatives of partner organizations working on the advancement of language technologies for the Kazakh language. The workshop was organized within the framework of the project “Development of a Large Language Model (LLM) to Support the Kazakh Language and Technological Progress.” The project is implemented by a consortium of scientific and research organizations, including ISSAI, Al-Farabi Kazakh National University, the Institute of Information and Computational Technologies, the A. Baitursynov Institute of Linguistics, and the Sh. Shayakhmetov National Scientific and Practical Center “Til-Qazyna.”
The event opened with welcoming remarks by Yerbol Absalyamov, ISSAI Executive Director, who greeted the participants and wished everyone a productive workshop filled with meaningful discussions and fruitful exchanges. The workshop program featured a series of presentations on large language models, multimodal artificial intelligence, data infrastructure, corpus linguistics, and applied AI solutions. Akylbek Maxutov, Data Scientist at ISSAI, presented “Qolda-AVL: Extending a Vision-Language Model with Audio Understanding,” highlighting the expansion of vision-language models toward audio-based understanding. Dr. Nazgul Toyganbaeva, Senior Lecturer at Al-Farabi Kazakh National University, delivered a talk titled “Methods for collecting, processing, and analyzing a Kazakh Language Corpus for training Large Language Models.” Kurmetkan Turdybek, PhD candidate at the Institute of Information and Computational Technologies, presented “Semantic Search & Response Generation System for Legal Queries (Kazakh + Russian),” focusing on AI-based tools for legal information retrieval and response generation. Further sessions addressed the role of specialized Kazakh-language datasets and linguistic expertise in the development of LLMs. Maqpal Zhumabai, General Director of Til-Qazyna, presented “Specifics of Training Large Language Models (LLMs) Based on Til-Qazyna Data.” Dr. Anar Fazylzhan, Director of the Institute of Linguistics, delivered a presentation titled “Achievements and Future Prospects of Kazakh Digital and Corpus Linguistics.”
Opening the afternoon session, Prof. Madina Mansurova, Head of the Department of Artificial Intelligence and Big Data at Al-Farabi Kazakh National University, joined the workshop online and addressed the participants. The session then continued with a presentation by Dr. Assel Ospan, Senior Lecturer at Al-Farabi Kazakh National University, titled “Development of an AI Assistant for Journalists Based on Large Language Models and Kazakh-Language Newspaper Articles.” Dr. Talgat Ramazan, Senior Researcher at the Institute of Linguistics, shared “A Linguist’s First Experience in Developing a Kazakh LLM,” offering insights into the linguistic perspective on the creation of Kazakh-language large language models.
Representing ISSAI, Vladimir Albrekht, Data Scientist, presented “FogGen: A Self-Evolving Edge-Cloud LLM Router with Predictive Confidence Tokens,” introducing an approach to efficient and adaptive LLM routing between edge and cloud environments.
As part of the workshop program, participants also visited the Block 1 Data Center, where they attended a demonstration of the “Irgetas” Supercomputer. The visit provided participants with an opportunity to learn more about the computing infrastructure supporting advanced AI research and model development.
The LLM Workshop in Astana continued the professional dialogue launched on November 7, 2025, at Al-Farabi Kazakh National University in Almaty. Like the previous event, the workshop served as a platform for knowledge exchange among academic, research, and applied organizations involved in developing language technologies for the Kazakh language. Special attention was given to the progress of the BR24993001 project, opportunities for future joint research, the development of technological infrastructure for the Kazakh language, and the strengthening of cooperation among consortium members. Participants emphasized the importance of creating domestic language models and AI tools capable of supporting the Kazakh language in the digital environment and contributing to Kazakhstan’s technological advancement.
Throughout the workshop, invited speakers and guests showed strong interest in the research presented and actively engaged with the presenters through questions and discussions. Many questions focused on the practical use of modern AI solutions for the development, preservation, and support of the Kazakh language, reflecting the growing demand for high-quality national language technologies.
The LLM Workshop at Nazarbayev University marked an important step in strengthening interinstitutional cooperation and building a scientific and technological foundation for advanced AI solutions tailored to the Kazakh language and Kazakhstan’s national priorities.