A large-scale, publicly available dataset designed to encourage research in user authentication, facial recognition, speech recognition, and human-computer interaction.
Speaking Faces consists of well-aligned, high-resolution thermal and visual-spectrum image streams of faces, synchronized with audio recordings of each subject speaking 100 imperative phrases. Data was collected from 140 subjects, yielding 14,000 instances of synchronized raw data (7.5 TB).
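The figures above fit together with simple arithmetic, sketched below. The sketch assumes that every subject spoke exactly 100 phrases and that each (subject, phrase) pair counts as one instance. The 1-based integer IDs are hypothetical and are not the dataset's actual naming scheme. The average size per instance is only a rough estimate in decimal units.

```python
from itertools import product

# Figures taken from the description above.
NUM_SUBJECTS = 140
PHRASES_PER_SUBJECT = 100
TOTAL_SIZE_TB = 7.5

# Hypothetical ID scheme (1-based integers); the real dataset may label
# subjects and phrases differently.
subject_ids = range(1, NUM_SUBJECTS + 1)
phrase_ids = range(1, PHRASES_PER_SUBJECT + 1)

# Assumption: one synchronized instance per (subject, phrase) pair.
instances = list(product(subject_ids, phrase_ids))
num_instances = len(instances)

# Rough average size per instance, using 1 TB = 1000 GB.
avg_size_gb = TOTAL_SIZE_TB * 1000 / num_instances

print(f"instances: {num_instances}")
print(f"average size per instance: {avg_size_gb:.2f} GB")
```

Under these assumptions the count matches the 14,000 instances stated above, and each instance averages roughly half a gigabyte of raw data.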
Application Areas:
Biometric Authentication | Face Recognition | Speaker Recognition | Audio + Visual + Thermal Speech Recognition | Human-Computer Interaction | Domain Transfer | Image Translation | Lip Reading from Visual/Thermal Images
Speaking Faces: A Large-Scale Dataset of Voice Commands with Visual and Thermal Video Streams
M. Abdrakhmanova, A. Kuzdeuov, S. Jarju, M. Lewis, Y. Khassanov, H. A. Varol