Convolutional neural networks represent the state of the art in multiple fields. Techniques that improve the training of these models are of prime interest because they can improve performance on a wide variety of tasks. In this paper, we investigate progressive resizing, a technique originally introduced in computer vision, applied to the training of convolutional neural networks for audio event classification. We evaluate the original resizing algorithm and introduce a novel one, comparing both against a baseline system. Two of the most relevant audio datasets are used to assess the proposed approach. Experimental results suggest that progressive resizing methods improve the performance of audio event classification models. The novel approach yields a complementary performance gain with respect to the original technique.
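The abstract does not describe either resizing algorithm in detail. As a rough illustration of the general idea behind progressive resizing as it is commonly understood in computer vision (train on small inputs first, then grow them toward full resolution), a minimal sketch on a spectrogram-like 2-D input might look as follows. The function names, the scale factors, and the nearest-neighbour interpolation are assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical representation: rows = frequency bins, columns = time frames.
Spectrogram = List[List[float]]


def size_schedule(final_size: Tuple[int, int],
                  scales: Sequence[float]) -> List[Tuple[int, int]]:
    """Return one (freq_bins, time_frames) target size per training stage."""
    bins, frames = final_size
    return [(max(1, round(bins * s)), max(1, round(frames * s))) for s in scales]


def resize_nearest(spec: Spectrogram, size: Tuple[int, int]) -> Spectrogram:
    """Nearest-neighbour resize of a 2-D spectrogram to `size` (assumed method)."""
    src_rows, src_cols = len(spec), len(spec[0])
    dst_rows, dst_cols = size
    return [
        [spec[r * src_rows // dst_rows][c * src_cols // dst_cols]
         for c in range(dst_cols)]
        for r in range(dst_rows)
    ]


def progressive_stages(spec: Spectrogram,
                       final_size: Tuple[int, int],
                       scales: Sequence[float] = (0.5, 0.75, 1.0),
                       ) -> Iterator[Tuple[int, Spectrogram]]:
    """Yield (stage_index, resized_spectrogram), smallest resolution first."""
    for stage, size in enumerate(size_schedule(final_size, scales)):
        yield stage, resize_nearest(spec, size)


if __name__ == "__main__":
    # Toy 8x8 "spectrogram"; a real one would come from an audio front end.
    spec = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    for stage, resized in progressive_stages(spec, (8, 8)):
        # A real training loop would run some epochs on `resized` here.
        print(stage, len(resized), len(resized[0]))
```

In a training setup, each stage would typically reuse the weights learned at the previous, lower resolution; how the stages are scheduled and how the novel variant differs from the original one are described in the paper itself, not in this sketch.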
Colangelo, F., Battisti, F., Neri, A. (2021). Progressive training of convolutional neural networks for acoustic events classification. In European Signal Processing Conference (pp. 26-30). European Signal Processing Conference, EUSIPCO [10.23919/Eusipco47968.2020.9287362].
Progressive training of convolutional neural networks for acoustic events classification
Colangelo F.; Battisti F.; Neri A.
2021-01-01