Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds
Publication date
2020-10-29
Document Type
Contribution to conference
Abstract
This paper investigates different fusion strategies and provides insights into their effectiveness alongside standalone classifiers in the paralinguistic analysis of infant vocalizations. We explore combinations of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) based classifiers, as well as the weighted kernel version of the latter, training the systems on different acoustic feature representations and applying weighted score-level fusion to their predictions. The proposed framework is tested on the INTERSPEECH ComParE-2019 Baby Sounds corpus, a collection of HomeBank infant vocalization corpora annotated for five classes. Adhering to the challenge protocol and using a single test set submission, we outperform the challenge baseline Unweighted Average Recall (UAR) score and achieve a result comparable to the state of the art.
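As an illustration of the two technical notions named in the abstract, the following is a minimal sketch of weighted score-level fusion and of the UAR metric. It is not the authors' implementation: the function names, the single fusion weight, and the toy score matrices are all assumptions made for the example, and how the paper actually selects its fusion weights is not stated here.

```python
import numpy as np


def fuse_scores(scores_a, scores_b, weight):
    """Weighted score-level fusion of two systems (hypothetical helper).

    Each input is an (n_samples, n_classes) array of class scores;
    `weight` in [0, 1] is the share given to the first system.
    """
    return weight * scores_a + (1.0 - weight) * scores_b


def unweighted_average_recall(y_true, y_pred, n_classes):
    """UAR: the mean of per-class recalls, so each class counts equally."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes absent from the reference labels
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))


# Toy data (made up for illustration): 4 samples, 3 classes.
y_true = np.array([0, 0, 1, 2])
scores_a = np.array([[0.8, 0.1, 0.1],
                     [0.3, 0.4, 0.3],
                     [0.1, 0.8, 0.1],
                     [0.4, 0.3, 0.3]])
scores_b = np.array([[0.3, 0.4, 0.3],
                     [0.8, 0.1, 0.1],
                     [0.4, 0.3, 0.3],
                     [0.1, 0.1, 0.8]])

fused = fuse_scores(scores_a, scores_b, weight=0.5)

uar_a = unweighted_average_recall(y_true, scores_a.argmax(axis=1), 3)
uar_b = unweighted_average_recall(y_true, scores_b.argmax(axis=1), 3)
uar_fused = unweighted_average_recall(y_true, fused.argmax(axis=1), 3)
print(uar_a, uar_b, uar_fused)
```

In this toy case each system alone errs on different samples, so the fused scores recover all labels. In practice the fusion weight would be tuned on held-out development data rather than fixed in advance.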
Keywords
baby sounds classification, computational paralinguistics, information fusion, extreme learning machines, support vector machines
Citation
Kaya, H, Verkholyak, O, Markitantov, M & Karpov, A 2020, 'Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds', Paper presented at ICMI 2020 Workshop on Bridging Social Sciences and AI for Understanding Child Behavior, Utrecht, Netherlands, 29/10/20 - 29/10/20, pp. 509-513. https://doi.org/10.1145/3395035.3425182