
29th October 2025

ISSAI presented two papers at the 51st Annual Conference of the IEEE Industrial Electronics Society in Madrid, Spain

ISSAI’s Lead Data Scientist Askat Kuzdeuov and Undergraduate Research Assistant Artur Muratov participated in the 51st Annual Conference of the IEEE Industrial Electronics Society (IECON). The event took place from October 14 to 17 at the Hotel Melia Castilla in Madrid, Spain. 

IEEE IECON is the flagship annual conference of the IEEE Industrial Electronics Society, devoted to the dissemination of new ideas, research, and works in progress in fields such as Robotics and Mechatronics, Power Electronics, Cybersecurity, Renewable Energy, Smart Grid, Digital Twins, AI for Industrial Processes, and Industry 5.0.

Askat Kuzdeuov presented his work, “Real-Time Multispectral Human Pose Estimation” (co-authored with Prof. Huseyin Atakan Varol), in the AI and Signal & Image Processing Methodologies session, which he also co-chaired. The work proposes a data-centric approach to training YOLO11-pose models for multispectral human pose estimation (MHPE). The resulting YOLO11x-pose model achieved an AP50:95-pose score of 95.23% on the OpenThermalPose2 test set, establishing a new benchmark for this dataset, and a score of 69.89% on the COCO validation set, slightly outperforming the original YOLO11x-pose model. The models were optimized and deployed on an NVIDIA Jetson AGX Orin 64GB, where TensorRT engines with half-precision floating point (FP16) offered the best balance of speed and accuracy, making them suitable for real-time applications. The pre-trained models are publicly available at https://github.com/IS2AI/multispectral-motion-analysis to support research in this area.
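As an illustration of the deployment path described above, the sketch below shows how a YOLO11-pose checkpoint can be exported to a TensorRT FP16 engine and used for inference via the Ultralytics Python API. This is a minimal sketch, not the authors’ exact pipeline: the checkpoint and image filenames are placeholders, and the actual pre-trained multispectral models live in the repository linked above.

```python
# Minimal sketch: export a YOLO11-pose checkpoint to TensorRT FP16 and run
# inference with it. Filenames are placeholders, not the paper's artifacts.
from ultralytics import YOLO

# Load a pose-estimation checkpoint (placeholder; the multispectral models
# are distributed via the IS2AI repository linked above).
model = YOLO("yolo11x-pose.pt")

# Export to a TensorRT engine with half-precision (FP16) weights; the paper
# reports that FP16 engines gave the best speed/accuracy balance on the
# Jetson AGX Orin. Requires an NVIDIA GPU with TensorRT installed.
engine_path = model.export(format="engine", half=True)

# Run the optimized engine on a frame (thermal or RGB).
trt_model = YOLO(engine_path)
results = trt_model("frame.jpg")
print(results[0].keypoints)  # per-person keypoint coordinates and confidences
```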

Artur Muratov presented his work, “Multilingual Speech Command Recognition with Language Identification” (co-authored with Askat Kuzdeuov and Prof. Huseyin Atakan Varol), in the same session. The work proposes a unified multitask model that performs speech command recognition (SCR) and language identification (LID) simultaneously, using a shared encoder and two task-specific output heads. The approach was evaluated on 15 languages: Kazakh, Russian, English, Tatar, Arabic, Turkish, French, German, Catalan, Spanish, Polish, Dutch, Persian, Kinyarwanda, and Italian. The model achieved an average accuracy of 90.73% for SCR and 90.99% for LID, outperforming both a multilingual SCR model without LID and a LID-only model. The source code and pre-trained models are available at https://github.com/IS2AI/Keyword-MLP-LangID to promote research in this area.
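To make the shared-encoder, two-head design concrete, here is a minimal PyTorch sketch of the multitask idea. The encoder, layer sizes, command count, and equal loss weighting are illustrative assumptions, not the published Keyword-MLP-LangID architecture; the repository above contains the real implementation.

```python
# Minimal sketch of a multitask model: one shared encoder, two task heads
# (speech command recognition and language identification). All sizes and
# the simple MLP encoder are illustrative assumptions.
import torch
import torch.nn as nn

class MultitaskSCRLID(nn.Module):
    def __init__(self, n_mels=40, d_model=128, n_commands=35, n_languages=15):
        super().__init__()
        # Shared encoder over log-mel features (stand-in for the
        # Keyword-MLP-style encoder used in the paper).
        self.encoder = nn.Sequential(
            nn.Linear(n_mels, d_model),
            nn.GELU(),
            nn.Linear(d_model, d_model),
            nn.GELU(),
        )
        # Two task-specific output heads on the pooled representation.
        self.scr_head = nn.Linear(d_model, n_commands)
        self.lid_head = nn.Linear(d_model, n_languages)

    def forward(self, x):  # x: (batch, time, n_mels)
        h = self.encoder(x).mean(dim=1)  # mean-pool over time
        return self.scr_head(h), self.lid_head(h)

model = MultitaskSCRLID()
features = torch.randn(8, 98, 40)  # a batch of log-mel spectrograms
scr_logits, lid_logits = model(features)

# Joint training sums the two cross-entropy losses (equal weights assumed).
criterion = nn.CrossEntropyLoss()
scr_targets = torch.randint(0, 35, (8,))
lid_targets = torch.randint(0, 15, (8,))
loss = criterion(scr_logits, scr_targets) + criterion(lid_logits, lid_targets)
loss.backward()
```

Because both heads share one encoder, a single forward pass serves both tasks, which is what lets the unified model outperform the separate SCR-only and LID-only baselines reported above without doubling the inference cost.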

Both papers were presented in oral and poster sessions and drew notable attention from the scientific community. The conference also gave our researchers the opportunity to exchange research ideas with top experts from different fields, including the renowned roboticist Prof. Toshio Fukuda.
