A sound source localization method based on multi-scale cross-STFT complex-valued convolutional neural network

Mengran Liu, Chao Zhou, Hanghai Feng, Chuanqi Gong, Junhao Hu, Zeming Jian

Sensor Review

ISSN: 0260-2288

Article publication date: 31 January 2025

Abstract

Purpose

This paper aims to address the limitations of current deep learning algorithms for sound source localization (SSL), which focus on a single feature and frequency scale, neglecting the integration of multi-scale information. The method developed in this study enhances localization accuracy by effectively using the spatial information and spectral diversity provided by microphone arrays.

Design/methodology/approach

The method is based on a multi-scale cross-short-time Fourier transform (STFT) complex-valued convolutional neural network (CCNN). It uses cross-STFT spectra at different scales to capture detailed acoustic information across various frequencies. The effectiveness of the algorithm was validated through both simulations and experimental studies.
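The abstract does not define the cross-STFT or list the scales used, so the following is only a minimal sketch of how multi-scale cross-STFT features might be computed. It assumes the cross-STFT of two microphone channels is the product of one channel's STFT with the complex conjugate of the other's, and that "scale" means the STFT window length. The function names, the window lengths (256, 512, 1024), the Hann window and the 50% hop are all illustrative assumptions, not the authors' settings.

```python
import numpy as np


def stft(x, n_fft, hop):
    """Hann-windowed STFT of a 1-D signal; returns shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[k * hop : k * hop + n_fft] * window for k in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def cross_stft(x_i, x_j, n_fft, hop):
    """Assumed definition: X_i(f, t) * conj(X_j(f, t)) for one microphone pair."""
    return stft(x_i, n_fft, hop) * np.conj(stft(x_j, n_fft, hop))


def multi_scale_cross_stft(signals, scales=(256, 512, 1024)):
    """Cross-STFT spectra for every microphone pair at several window lengths.

    signals: array of shape (n_mics, n_samples).
    Returns a dict mapping window length -> complex array of shape
    (n_pairs, n_freq_bins, n_frames). The scales are hypothetical values.
    """
    n_mics = signals.shape[0]
    pairs = [(i, j) for i in range(n_mics) for j in range(i + 1, n_mics)]
    features = {}
    for n_fft in scales:
        hop = n_fft // 2  # 50% overlap, an assumption
        features[n_fft] = np.stack(
            [cross_stft(signals[i], signals[j], n_fft, hop) for i, j in pairs]
        )
    return features


# Usage: four simulated microphone channels of white noise, 4096 samples each.
rng = np.random.default_rng(0)
signals = rng.standard_normal((4, 4096))
features = multi_scale_cross_stft(signals)
for n_fft, spec in features.items():
    print(n_fft, spec.shape, spec.dtype)
```

Short windows give finer time resolution and long windows give finer frequency resolution, which is one plausible reading of how spectra at different scales capture complementary information. Because the spectra are complex-valued, the phase differences between microphones are kept, and this is the kind of input a complex-valued network can consume directly.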

Findings

Experimental results demonstrate that the proposed multi-scale cross-STFT CCNN not only outperforms the single-scale cross-STFT model but also delivers superior localization performance compared to other advanced methods, achieving consistently higher accuracy. The method shows excellent robustness across various signal-to-noise ratio (SNR) conditions and performs well even on imbalanced datasets, confirming its strong generalization capabilities.

Originality/value

This paper introduces a novel approach to SSL that integrates multi-scale information, addressing a key limitation of existing methods. The findings offer significant value to researchers and practitioners in the field of acoustic signal processing, particularly those focused on deep learning-based localization techniques.

Acknowledgements

This work was funded by the National Natural Science Foundation of China (Grant No. 51805154) and the Hubei Provincial Natural Science Foundation of China (Grant No. 2022CFB473).

Data availability: The data and code supporting the findings of this study are available from the corresponding author upon reasonable request.

Citation

Liu, M., Zhou, C., Feng, H., Gong, C., Hu, J. and Jian, Z. (2025), "A sound source localization method based on multi-scale cross-STFT complex-valued convolutional neural network", Sensor Review, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/SR-10-2024-0870

Publisher

Emerald Publishing Limited

Copyright © 2025, Emerald Publishing Limited
