The research paper “A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data” by Madina Abdrakhmanova, Saniya Abushakimova, Yerbolat Khassanov, and Huseyin Atakan Varol was presented by ISSAI data scientist Madina Abdrakhmanova at Odyssey 2022: The Speaker and Language Recognition Workshop.
In the paper, the authors explore an approach to multimodal person verification using audio, visual, and thermal modalities. The combination of audio and visual modalities has already been shown to be effective for robust person verification. Building on this, the authors investigate the impact of further increasing the number of modalities by adding thermal images. Their experiments demonstrated the superior performance of the trimodal verification system. The authors make their code, pretrained models, and preprocessed dataset freely available in their GitHub repository to enable reproducibility of the experiments and to facilitate research into multimodal person verification.
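To illustrate the general idea of multimodal verification, below is a minimal sketch of score-level fusion: each modality (audio, visual, thermal) yields an embedding, per-modality cosine similarities are computed between enrollment and test embeddings, and the scores are combined into a single verification score. This is a generic illustration, not the paper's actual architecture; the function names, weights, toy embeddings, and threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def fused_score(enroll, test, weights=None):
    """Score-level fusion: weighted mean of per-modality cosine similarities.

    enroll, test: dicts mapping modality name -> embedding vector.
    Only modalities present in both inputs contribute to the score.
    """
    modalities = sorted(enroll.keys() & test.keys())
    if weights is None:
        weights = {m: 1.0 for m in modalities}  # equal weighting by default
    total = sum(weights[m] for m in modalities)
    return sum(weights[m] * cosine(enroll[m], test[m]) for m in modalities) / total

# Toy 2-D embeddings for three modalities (hypothetical values, not real data).
enroll = {"audio": [0.9, 0.1], "visual": [0.8, 0.2], "thermal": [0.7, 0.3]}
probe = {"audio": [0.85, 0.15], "visual": [0.82, 0.18], "thermal": [0.72, 0.28]}

score = fused_score(enroll, probe)
accept = score > 0.9  # the decision threshold is application-specific
```

In practice, each embedding would come from a modality-specific encoder network, and fusion could also happen at the feature level (e.g., concatenating embeddings before scoring) rather than at the score level as sketched here.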
Odyssey 2022: The Speaker and Language Recognition Workshop was hosted by Tsinghua University in Beijing, China, June 28 – July 1, 2022. Odyssey is a research workshop organized by the International Speech Communication Association (ISCA) and held in cooperation with the ISCA Speaker and Language Characterization special interest group.
The aim of this workshop is to promote interaction among researchers in the field of speaker and language recognition. Previously, the Odyssey workshops were held in different cities, such as Singapore (2012), Joensuu (2014), Bilbao (2016), Les Sables d’Olonne (2018), and Tokyo (2020).