
Explainable Deep Learning Classification of Respiratory Sound for Telemedicine Applications

Lo Giudice M.: Writing – Original Draft Preparation
2022-01-01

Abstract

The recent pandemic crisis, combined with the explosive growth of Artificial Intelligence (AI) algorithms, has highlighted the potential benefits of telemedicine for decentralised, accurate and automated clinical diagnoses. One of the most common and essential diagnostic procedures is auscultation: a non-invasive, real-time and highly informative examination of the state of the respiratory system. For automated auscultation analysis to be feasible, explaining the decision-making of complex models (such as Deep Learning models) is crucial for trusted application in the clinical domain. In this context, we analyse the behaviour of a Convolutional Neural Network (CNN) in classifying the largest publicly available database of respiratory sounds, originally compiled to support the scientific challenge organised at the Int. Conf. on Biomedical Health Informatics (ICBHI17). It contains respiratory sounds (recorded by auscultation) of normal respiratory cycles and of cycles containing crackles, wheezes, or both. To capture the phonetically important features of breath sounds, the Mel-Frequency Cepstrum (MFC), a short-term power spectrum representation, was applied. The MFC allowed us to identify latent features without losing temporal information, so that each feature could easily be traced back to the original sound. The MFCs were used as input to the proposed CNN, which was able to classify the four above-mentioned respiratory classes with an accuracy of 72.8%. Beyond these results, the main focus of the present study was to investigate how the CNN achieved this classification. The explainable Artificial Intelligence (xAI) technique of Gradient-weighted Class Activation Mapping (Grad-CAM) was applied. xAI made it possible to visually identify the most relevant areas, especially for the recognition of abnormal sounds, which is crucial for verifying that the CNN has learned correctly.
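The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes librosa for audio handling, 13 cepstral coefficients, and recordings already segmented into single respiratory cycles; none of these choices (nor the file name) are taken from the paper itself.

```python
# Hedged sketch of the MFCC front end: parameters are assumptions, not the paper's.
import librosa
import numpy as np

def extract_mfcc(wav_path, sr=4000, n_mfcc=13):
    """Return an (n_mfcc x frames) MFCC matrix for one respiratory cycle,
    preserving the temporal axis so features map back to the source sound."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Per-coefficient standardisation; a common but assumed preprocessing choice.
    return (mfcc - mfcc.mean(axis=1, keepdims=True)) / (mfcc.std(axis=1, keepdims=True) + 1e-8)

# The 2-D MFCC matrix can then be fed to a CNN as a single-channel "image":
# features = extract_mfcc("cycle_0001.wav")[..., np.newaxis]
```

Similarly, the Grad-CAM step can be sketched for a generic Keras CNN; the layer name and model interface below are placeholders, not details from the paper. Gradients of the class score are globally pooled into weights for the last convolutional feature maps, yielding a heat map over the time-frequency plane.

```python
# Hedged Grad-CAM sketch for a Keras model; "last_conv" is a hypothetical layer name.
import tensorflow as tf

def grad_cam(model, mfcc_image, class_index, last_conv_layer="last_conv"):
    # Sub-model exposing the conv feature maps alongside the class scores.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer).output, model.output])
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(mfcc_image[tf.newaxis, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))   # global-average-pooled gradients
    cam = tf.nn.relu(tf.reduce_sum(conv_maps[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalised heat map
```

Upsampled to the input resolution, the resulting map highlights the regions of the MFCC representation most relevant to the predicted class, which is how the visual inspection of abnormal-sound recognition described in the abstract would proceed.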
Year: 2022
ISBN: 9783031248016
Lo Giudice, M., Mammone, N., Ieracitano, C., Aguglia, U., Mandic, D., Morabito, F.C. (2022). Explainable Deep Learning Classification of Respiratory Sound for Telemedicine Applications. In Communications in Computer and Information Science (pp. 391-403). Springer Science and Business Media Deutschland GmbH. doi:10.1007/978-3-031-24801-6_28
Files attached to this record:
No files are associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11590/470767
Citations
  • PMC: n/a
  • Scopus: 2
  • Web of Science: n/a