BAST-Mamba: Binaural Audio Spectrogram Mamba Transformer for binaural sound localization

Publication date

2025-10-14

Authors

Kuang, Sheng
Shi, Jie (ORCID 0009-0009-8522-820X)
van der Heijden, Kiki
Mehrkanoon, Siamak (ORCID 0000-0002-0516-0391; ISNI 0000000512552651)

Document Type

Article
Open Access

License

CC BY

Abstract

Accurate sound localization in reverberant environments is essential for human auditory perception. Recently, Convolutional Neural Networks (CNNs) have been used to model the binaural human auditory pathway. However, CNNs face limitations in capturing global acoustic features. To address this issue, we propose a novel end-to-end Binaural Audio Spectrogram Mamba Transformer (BAST-Mamba) model to predict sound azimuth in both anechoic and reverberant conditions. We explore two implementation modes: BAST-Mamba-SP and BAST-Mamba-NSP, which correspond to shared and non-shared parameter configurations, respectively. Our best model, BAST-Mamba-SP, equipped with subtraction-based interaural integration and a hybrid loss function, achieves a state-of-the-art angular distance (AD) error of 0.89° and a mean squared error of 0.0004, significantly outperforming baseline models. The model demonstrates generalization across acoustic environments, robust hemifield symmetry, and highly accurate real-time localization performance (<4° AD at 300 ms). Moderate noise augmentation at 30 dB SNR yields the strongest noise resilience. Explainability analyses highlight consistent frequency focus in the 2–3 kHz and 5.5–6.5 kHz bands, aligning with known neurophysiological cues. These results validate the potential of neurobiologically inspired Transformers for robust, high-precision sound localization and offer new insights into human sound localization.
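
The abstract reports an angular distance (AD) error and noise augmentation at a fixed SNR without defining either. The sketch below illustrates one common way to compute both. It is not the authors' implementation: it assumes the azimuth is represented as a 2-D direction vector on the horizontal plane (the abstract does not specify this), and the function names `angular_distance_deg` and `add_noise_at_snr` are hypothetical.

```python
import math
import random


def angular_distance_deg(pred_xy, true_xy):
    """Angle in degrees between two 2-D direction vectors (assumed representation)."""
    px, py = pred_xy
    tx, ty = true_xy
    dot = px * tx + py * ty
    norm = math.hypot(px, py) * math.hypot(tx, ty)
    if norm == 0.0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def mean_power(signal):
    """Average squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal power / noise power matches `snr_db`, then mix."""
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise must have non-zero power")
    # SNR_dB = 10 * log10(P_signal / P_noise_scaled)
    target_noise_power = p_signal / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(signal, noise)]


# Illustrative usage with synthetic data (a 440 Hz tone and Gaussian noise).
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
white = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = add_noise_at_snr(tone, white, snr_db=30.0)
```

Under these assumptions, a per-sample AD would be averaged over the test set to obtain a figure comparable in kind to the reported 0.89°, and the 30 dB setting corresponds to noise power one thousandth of the signal power.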

Keywords

Binaural integration, Sound localization, Transformer, Computer Science Applications, Cognitive Neuroscience, Artificial Intelligence

Citation

Kuang, S, Shi, J, van der Heijden, K & Mehrkanoon, S 2025, 'BAST-Mamba: Binaural Audio Spectrogram Mamba Transformer for binaural sound localization', Neurocomputing, vol. 650, 130804, pp. 1-9. https://doi.org/10.1016/j.neucom.2025.130804