Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds

Publication date

2020-10-29

Authors

Kaya, Heysem (ORCID 0000-0001-7947-5508; ISNI 000000049289651X)
Verkholyak, Oxana
Markitantov, Maxim
Karpov, Alexey

Document Type

Contribution to conference

Abstract

This paper investigates different fusion strategies and provides insights into their effectiveness alongside standalone classifiers in the framework of paralinguistic analysis of infant vocalizations. Combinations of classifiers based on Support Vector Machines (SVM) and Extreme Learning Machines (ELM), as well as a weighted kernel variant, are explored by training the systems on different acoustic feature representations and applying weighted score-level fusion to their predictions. The proposed framework is tested on the INTERSPEECH ComParE-2019 Baby Sounds corpus, a collection of HomeBank infant vocalization corpora annotated for five classes. Adhering to the challenge protocol and using a single test set submission, we outperform the challenge baseline Unweighted Average Recall (UAR) score and achieve a result comparable to the state of the art.
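
For readers unfamiliar with the two technical terms in the abstract, the following is a minimal, generic sketch of weighted score-level fusion and of the UAR metric. It is not the authors' implementation: the function names, the toy score matrices, and the use of a single convex weight are all illustrative assumptions, and the abstract does not specify how scores were normalized or how fusion weights were chosen.

```python
import numpy as np

def fuse_scores(scores_a, scores_b, weight):
    """Convex combination of two (n_samples, n_classes) score matrices.

    `weight` is applied to scores_a and (1 - weight) to scores_b.
    Hypothetical helper, not taken from the paper.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    return weight * scores_a + (1.0 - weight) * scores_b

def unweighted_average_recall(y_true, y_pred, n_classes):
    """Mean of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes absent from y_true
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))

# Toy data (made up for illustration): 4 samples, 3 classes.
y_true = np.array([0, 0, 1, 2])
scores_svm = np.array([[0.6, 0.3, 0.1],
                       [0.3, 0.6, 0.1],
                       [0.2, 0.7, 0.1],
                       [0.1, 0.2, 0.7]])
scores_elm = np.array([[0.5, 0.4, 0.1],
                       [0.8, 0.1, 0.1],
                       [0.6, 0.3, 0.1],
                       [0.2, 0.1, 0.7]])

uar_svm = unweighted_average_recall(y_true, scores_svm.argmax(axis=1), 3)
uar_elm = unweighted_average_recall(y_true, scores_elm.argmax(axis=1), 3)
fused = fuse_scores(scores_svm, scores_elm, weight=0.5)
uar_fused = unweighted_average_recall(y_true, fused.argmax(axis=1), 3)
print(uar_svm, uar_elm, uar_fused)
```

In this toy case each system misclassifies a different sample, so the fused scores recover both. In practice the weight would be tuned on a development set rather than fixed at 0.5.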

Keywords

baby sounds classification, computational paralinguistics, information fusion, extreme learning machines, support vector machines

Citation

Kaya, H, Verkholyak, O, Markitantov, M & Karpov, A 2020, 'Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds', Paper presented at ICMI 2020 Workshop on Bridging Social Sciences and AI for Understanding Child Behavior, Utrecht, Netherlands, 29/10/20, pp. 509-513. https://doi.org/10.1145/3395035.3425182