Neri, M. (2025). Scene Understanding with Sound using Artificial Intelligence Techniques.

Scene Understanding with Sound using Artificial Intelligence Techniques

Michael Neri

2025-04-30

Abstract

This PhD thesis investigates sound-based scene understanding with artificial intelligence techniques. It addresses the challenge of extracting relevant information from sound in environments where other sensory inputs, such as vision, are limited or occluded. The work contributes novel methods and models for Acoustic Scene Classification (ASC), Sound Event Detection (SED), Unsupervised Anomalous Sound Detection (UASD), and speaker distance estimation, with a focus on reducing system complexity while maintaining high performance. The core of the research is the design of low-complexity deep learning models, such as lightweight convolutional networks and methods leveraging Chebyshev moments, applied to a range of sound recognition tasks. Tested in noisy environments, these models prove robust, achieving state-of-the-art results while remaining computationally efficient. Beyond its theoretical contributions, the thesis explores practical applications of sound-based scene understanding in domains such as smart devices, security systems, and autonomous vehicles, where it can enhance human-computer interaction and safety. Directions for future research include the integration of multi-modal sensory data and the development of more interpretable AI systems.

Defense date: 30 April 2025
Doctoral cycle: 37
Doctoral program: ELETTRONICA APPLICATA (Applied Electronics)
Keywords: Audio Processing, Machine Learning, Anomaly Detection, Acoustics
Appears in types: 8.1 Tesi di Dottorato (Doctoral Thesis)

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11590/508216
