Abstract:
In the age of smart cities, the integration of acoustic sensing, edge computing, and sound emission within IoT devices is transforming how we monitor and manage urban sound environments. Systems such as Audio Intelligence Monitoring at the Edge (AIME), deployed across Singapore, exemplify this shift—operating continuously to sense, process, and respond to complex auditory scenes. These intelligent sensors not only capture real-time aural data to complement CCTV systems but also provide the foundation for sound-aware infrastructure capable of improving urban livability.
This presentation outlines an end-to-end framework for intelligent urban sound management—starting with robust sensing, advancing to deep semantic understanding of the soundscape, and culminating in real-time, AI-driven noise mitigation strategies. Leveraging deep learning, these systems extract essential acoustic features such as noise type, event frequency, spatial direction, and sound pressure levels. Beyond passive monitoring, we introduce an active soundscape intervention device that uses adaptive ambisonic playback to emit contextually selected “acoustic perfumes”—natural sound maskers that dynamically respond to ambient noise conditions. Operating within an edge-cloud framework, the system is designed for low-latency feedback and minimal manual intervention, offering personalized and perceptually optimized soundscapes for residential and public spaces. The development of the Affective Responses to Augmented Urban Soundscapes (ARAUS) dataset further supports this direction by providing a benchmark for perceptual models that predict comfort, annoyance, and well-being.
This work illustrates a holistic approach to urban acoustic intelligence: from smart sensing and machine hearing to real-time, AI-guided soundscape intervention. It marks a paradigm shift in how cities can understand, interpret, and actively enhance their acoustic environments—transforming noise into information and mitigation into meaningful auditory experiences.