ABSTRACT:
Human pose estimation has many applications in action recognition, human-robot interaction, motion capture, augmented reality, sports analytics, and healthcare. Numerous datasets and deep learning models have been developed for human pose estimation in the visible domain. However, visible-light imaging degrades under poor lighting conditions and raises privacy concerns. Thermal cameras can address these challenges, but few annotated thermal human pose datasets are available for training deep learning models. Previously, we presented the OpenThermalPose dataset with 6,090 thermal images of 31 subjects and 14,315 annotated human instances. In this work, we extend OpenThermalPose with more thermal images, human instances, and poses. The extended dataset, OpenThermalPose2, contains 21,125 carefully annotated human instances in 11,391 thermal images of 170 subjects. To show the efficacy of OpenThermalPose2, we trained the YOLOv8-pose and YOLO11-pose models on the dataset. The experimental results showed that models trained on OpenThermalPose2 outperformed the previous YOLOv8-pose models trained on OpenThermalPose. Additionally, we optimized the YOLO11-pose models trained on OpenThermalPose2 by converting their checkpoints from PyTorch to TensorRT format. We deployed the PyTorch and TensorRT models on an NVIDIA Jetson AGX Orin 64GB and measured their inference time and accuracy. The TensorRT models using half-precision floating-point (FP16) achieved the best balance between speed and accuracy, making them suitable for real-time applications. We have made the dataset, source code, and pre-trained models publicly available at https://github.com/IS2AI/OpenThermalPose to support research in this field.
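
The inference-time comparison between PyTorch and TensorRT models described above can be illustrated with a small timing harness. The following is a minimal sketch under stated assumptions, not the measurement code used in this work: the function name benchmark_latency, the warm-up count, and the number of timed runs are all hypothetical, and the model is represented by an arbitrary callable that must block until its result is ready.

```python
import statistics
import time
from typing import Any, Callable, Dict, Sequence


def benchmark_latency(
    infer: Callable[[Any], Any],
    inputs: Sequence[Any],
    warmup: int = 5,
    runs: int = 50,
) -> Dict[str, float]:
    """Time a blocking inference callable (hypothetical helper).

    `infer` stands in for a deployed model. It must return only after
    the result is ready; for GPU backends that execute asynchronously,
    the callable itself has to synchronize before returning.
    """
    if not inputs:
        raise ValueError("inputs must not be empty")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up iterations are executed but not timed, so one-time costs
    # (lazy initialization, caching) do not distort the measurement.
    for i in range(warmup):
        infer(inputs[i % len(inputs)])

    timings_ms = []
    for i in range(runs):
        sample = inputs[i % len(inputs)]
        start = time.perf_counter()
        infer(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(timings_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(timings_ms),
        # Throughput derived from the mean per-image latency.
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }
```

Running the same harness on each model variant with the same inputs gives per-image latencies that can be set against the accuracy of each variant when judging the speed and accuracy trade-off.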