Parkinson's Project

Speech pre-processing for Parkinson's disease diagnosis 

Parkinson's disease (PD) is a neurological disorder that progressively impairs patients' ability to control their movements and consequently reduces their quality of life. Since there is no cure for PD, tools are needed to diagnose the disease at an early stage so that its symptoms can be managed. It is also important to monitor the progression of the disease once a patient has been diagnosed with PD.

Smartphones have recently been investigated as tools for measuring pathological voice, since they are ubiquitous, inexpensive devices with built-in, high-quality microphones. Compared to samples recorded in a clinic or a sound booth, however, smartphone recordings are subject to many types of linear and nonlinear distortion in most environments. Such distortion degrades the performance of algorithms designed to quantify medical symptoms from voice recordings. Speech recordings should therefore be pre-processed before they are passed to the recognition algorithms. The ultimate goal of this project is to develop algorithms that prepare degraded recordings for use in PD diagnosis.

A CNN-Based Approach to Identification of Degradations in Speech Signals

In this work, we proposed a new CNN-based approach to automatically identify the major types of degradations commonly encountered in speech-based applications, namely additive noise, nonlinear distortion, and reverberation. Experimental results using two different speech types, namely pathological voice and normal running speech, showed the effectiveness of the proposed method in detecting the presence and the type of degradations in speech signals. 

We also provided a visual analysis of how the network makes its decisions when identifying different types of degradation in speech signals, by highlighting the regions of the log-mel spectrogram that are most influential for the target degradation.
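
For readers unfamiliar with the log-mel spectrogram mentioned above, the following is a minimal NumPy sketch of how such a representation can be computed. The frame length, hop size, number of mel bands and the function names are illustrative assumptions, not the settings or code used in the paper.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(x, sr, n_fft=512, hop=160, n_mels=40):
    """Return a (n_mels, n_frames) log-mel spectrogram of a 1-D signal (len(x) >= n_fft)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power.T   # (n_mels, n_frames)
    return np.log(mel + 1e-10)                          # small offset avoids log(0)
```

A two-dimensional array of this kind (mel bands by time frames) is the sort of input on which a CNN can operate as if it were an image, which is also what makes region-level visualisations possible.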

Automatic Quality Control and Enhancement for Voice-Based Remote Parkinson's Disease Detection

The acoustic mismatch between training and operating conditions, caused mainly by degradation in test signals, degrades the performance of voice-based Parkinson's disease detection systems.

In this work, we addressed this mismatch by considering background noise, reverberation and nonlinear distortion, and investigated how these degradations influence the performance of a PD detection system. Given that the specific degradation is known, we explored the effectiveness of a variety of enhancement algorithms in compensating for this mismatch and improving the PD detection accuracy. Then, we proposed two approaches to automatically control the quality of recordings by identifying the presence and type of short-term and long-term degradations and protocol violations in voice signals. Finally, we experimented with using the proposed quality control methods to inform the choice of enhancement algorithm.
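
To make the three degradation types concrete, here is a minimal sketch of how each could be simulated on a clean recording. It is only an illustration under simple assumptions: hard clipping stands in for nonlinear distortion, and the room impulse response is synthetic (exponentially decaying noise). The function names and default parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def add_noise(clean, noise, snr_db):
    """Mix noise into clean speech so that the global SNR equals snr_db."""
    noise = noise[: len(clean)]
    gain = np.sqrt(np.sum(clean ** 2) / (np.sum(noise ** 2) * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

def clip(signal, level=0.5):
    """Hard clipping at a fraction of the peak: a simple memoryless nonlinear distortion."""
    threshold = level * np.max(np.abs(signal))
    return np.clip(signal, -threshold, threshold)

def reverberate(signal, sr, rt60=0.5, seed=0):
    """Convolve with a synthetic room impulse response (assumed, not measured)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(rt60 * sr)) / sr
    # Amplitude envelope that has decayed by 60 dB at t = rt60 (ln(1000) ~ 6.908).
    rir = rng.standard_normal(t.size) * np.exp(-6.908 * t / rt60)
    rir[0] = 1.0  # direct path
    return np.convolve(signal, rir)[: len(signal)]
```

Applying such functions to clean recordings is one way to create controlled test conditions in which the type of degradation is known in advance.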


Quality Control of Voice Recordings in Remote Parkinson's Disease Monitoring using the Infinite HMM

The performance of voice-based systems for remote monitoring of Parkinson’s disease is highly dependent on the degree of adherence of the recordings to the test protocols which probe for specific symptoms. 

In this work, we proposed an algorithm that automatically identifies short-term protocol violations in the signals with high accuracy.

Quality Control in Remote Speech Data Collection

Controlling and improving the quality of recordings in large speech databases is challenging, particularly when they are collected remotely and in an unsupervised manner. 

In this work, we proposed a simple and effective approach for identifying outliers in speech data sets, which can significantly reduce the manual effort required to control the quality of such data sets.

A Parametric Approach for Classification of Distortions in Pathological Voices

The presence of degradation in signals affects the acoustic features used for subsequent biomedical applications. Information on the type of degradation can help in compensating for its effects.

In this work, we proposed a parametric approach for classification of different types of degradation in speech signals by applying factor analysis to Gaussian mixture model mean supervectors.

ICASSP_2018_Distortion5.pdf
ICASSP_2018_SNR5.pdf

A Supervised Approach to Global SNR Estimation for Whispered and Pathological Voices

Most existing SNR estimation methods have been developed for normal speech and might not provide accurate estimates for special speech types such as whispered or disordered voices, particularly when they are corrupted by non-stationary noise.

In this work, we proposed a new SNR estimation algorithm which can provide accurate estimation and consistent performance for various speech types under different noise conditions.
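
As a reminder of the quantity being estimated, the global SNR is the ratio of total speech energy to total noise energy over the whole recording, expressed in decibels. The short sketch below computes this reference value when the speech and noise components are available separately; the function name is hypothetical, and this is the standard definition rather than the proposed estimator.

```python
import numpy as np

def global_snr_db(speech, noise):
    """Reference global SNR in dB, computed over the whole recording."""
    return 10.0 * np.log10(np.sum(speech ** 2) / np.sum(noise ** 2))
```

In practice only the noisy mixture is observed, so this value cannot be computed directly and has to be estimated from the mixture alone.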

Speech Enhancement by Classification of Noisy Signals Decomposed Using NMF and Wiener Filtering

Supervised NMF for speech enhancement is not very effective when the trained spectral models of speech and noise do not match the speech and noise actually present in the noisy observation.

In this work, we proposed a new single-channel speech enhancement algorithm that generalizes better to unseen noise types than supervised NMF.
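
For context, the supervised NMF baseline referred to above can be sketched as follows. This is a generic textbook-style illustration (KL-divergence multiplicative updates followed by a Wiener-style soft mask), not the proposed algorithm; the function names, iteration count and the use of non-negative spectrograms as input are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0):
    """KL-divergence NMF with multiplicative updates, V ~= W @ H.

    If W is given, it is kept fixed and only the activations H are estimated.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        if update_W:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def supervised_nmf_enhance(V_noisy, W_speech, W_noise, n_iter=200):
    """Estimate the speech spectrogram using pre-trained speech and noise bases."""
    W = np.hstack([W_speech, W_noise])               # fixed, concatenated dictionary
    _, H = nmf(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]                             # speech part of the model
    N = W_noise @ H[k:]                              # noise part of the model
    mask = S / (S + N + 1e-10)                       # Wiener-style soft mask in [0, 1]
    return mask * V_noisy
```

In this baseline, `W_speech` and `W_noise` would each be learned beforehand by calling `nmf` on training spectrograms of speech and of noise. If the noise in the observation differs from the training noise, `W_noise` models it poorly, which is the mismatch problem described above.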

Input PESQ: 1.91          Input SDR: 0 dB

Output PESQ: 3.04          Output SDR: 16.9 dB

Dominant Distortion Classification for Pre-processing of Vowels in Remote Biomedical Voice Analysis

In this work, we investigated the impact of four major types of linear and nonlinear degradation (commonly present during recording or transmission in voice analysis) on mel-frequency cepstral coefficients (MFCCs). We then proposed an algorithm to detect the most dominant degradation type in pathological speech signals.

Copyright Notice: The PDFs of the papers are provided for academic purposes ONLY. All the papers are copyrighted by the corresponding publishers.