Parkinson's Project
Speech pre-processing for Parkinson's disease diagnosis
Parkinson's disease (PD) is a neurological disorder that progressively impairs a patient's ability to control their movements and consequently decreases their quality of life. Since there is no cure for PD, tools are needed to diagnose the disease at an early stage so that its symptoms can be managed. It is also important to monitor the progression of the disease once a patient has been diagnosed with PD.
Smartphones have recently been investigated as tools for measuring pathological voice, since they are ubiquitous, inexpensive devices with built-in, high-quality microphones. Compared to samples recorded in a clinic or a sound booth, however, smartphone recordings are subject to many types of linear and nonlinear distortion in most environments. Such distortion degrades the performance of algorithms designed to quantify medical symptoms from voice recordings, so speech recordings should be pre-processed before they are passed to the recognition algorithms. The ultimate goal of this project is to develop algorithms that prepare degraded recordings for use in PD diagnosis.
A CNN-Based Approach to Identification of Degradations in Speech Signals
In this work, we proposed a new CNN-based approach to automatically identify the major types of degradations commonly encountered in speech-based applications, namely additive noise, nonlinear distortion, and reverberation. Experimental results using two different speech types, namely pathological voice and normal running speech, showed the effectiveness of the proposed method in detecting the presence and the type of degradations in speech signals.
We also provided a visual analysis of how the network makes its decisions when identifying different types of degradation in speech signals, by highlighting the regions of the log-mel spectrogram that are most influential for the target degradation.
Yuki Saishu, Amir H. Poorjam, and Mads G. Christensen
Preprint [PDF]
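The three degradation types named in the abstract above can be illustrated with a minimal sketch. This is not the data-generation procedure used in the paper; the function names, the white Gaussian noise, the hard-clipping model, and the synthetic exponentially decaying impulse response are all assumptions chosen only to make the example self-contained.

```python
import math
import random


def add_noise(signal, snr_db, rng):
    """Additive noise: add white Gaussian noise at a target global SNR (dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so that signal_power / scaled_noise_power == 10^(snr_db/10).
    scale = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(signal, noise)]


def hard_clip(signal, threshold):
    """Nonlinear distortion: clamp every sample to [-threshold, threshold]."""
    return [max(-threshold, min(threshold, x)) for x in signal]


def reverberate(signal, sample_rate, rt60, rng):
    """Reverberation: convolve with a synthetic decaying-noise impulse response."""
    length = round(rt60 * sample_rate)
    # The envelope falls by 60 dB (a factor of 1000) over rt60 seconds.
    decay = math.log(1000.0) / length
    impulse_response = [rng.gauss(0.0, 1.0) * math.exp(-decay * k) for k in range(length)]
    impulse_response[0] = 1.0  # direct path
    out = [0.0] * (len(signal) + length - 1)
    for i, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[i + k] += x * h
    return out


# Hypothetical test signal: 0.1 s of a 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
rng = random.Random(0)

noisy = add_noise(tone, snr_db=10.0, rng=rng)
clipped = hard_clip(tone, threshold=0.25)
reverberant = reverberate(tone, sample_rate, rt60=0.05, rng=rng)
```

A degradation classifier would be trained on features (for example log-mel spectrograms) extracted from signals like `noisy`, `clipped`, and `reverberant`, each labelled with its degradation type.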
Automatic Quality Control and Enhancement for Voice-Based Remote Parkinson's Disease Detection
The acoustic mismatch between training and operating conditions, caused mainly by degradation in test signals, degrades the performance of voice-based Parkinson's disease detection systems.
In this work, we addressed this mismatch by considering background noise, reverberation, and nonlinear distortion, and investigated how these degradations influence the performance of a PD detection system. Assuming that the specific degradation is known, we explored the effectiveness of a variety of enhancement algorithms in compensating for this mismatch and improving PD detection accuracy. We then proposed two approaches to automatically control the quality of recordings by identifying the presence and type of short-term and long-term degradations and protocol violations in voice signals. Finally, we experimented with using the proposed quality control methods to inform the choice of enhancement algorithm.
Amir H. Poorjam, Mathew S. Kavalekalam, Liming Shi, Yordan Raykov, Jesper R. Jensen, Max A. Little and Mads G. Christensen
in Speech Communication, vol. 127, 2021 [PDF]
Quality Control of Voice Recordings in Remote Parkinson's Disease Monitoring using the Infinite HMM
The performance of voice-based systems for remote monitoring of Parkinson’s disease is highly dependent on the degree of adherence of the recordings to the test protocols which probe for specific symptoms.
In this work, we proposed an algorithm that automatically identifies short-term protocol violations in the signals with very high accuracy.
Amir H. Poorjam, Yordan Raykov, Reham Badawy, Jesper R. Jensen, Mads G. Christensen, and Max A. Little
in Proc. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019. [PDF]
Quality Control in Remote Speech Data Collection
Controlling and improving the quality of recordings in large speech databases is challenging, particularly when they are collected remotely and in an unsupervised manner.
In this work, we proposed a simple and effective approach for identifying outliers in speech data sets, which can significantly decrease the effort required to control the quality of such data sets manually.
Amir H. Poorjam, Max A. Little, Jesper R. Jensen and Mads G. Christensen
IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 2, 2019. [PDF]
A Parametric Approach for Classification of Distortions in Pathological Voices
The presence of degradation in signals affects the acoustic features used for subsequent biomedical applications. Information on the type of degradation can help in compensating for its effects.
In this work, we proposed a parametric approach for classification of different types of degradation in speech signals by applying factor analysis to Gaussian mixture model mean supervectors.
Amir H. Poorjam, Max A. Little, Jesper R. Jensen and Mads G. Christensen
in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. [PDF]
A Supervised Approach to Global SNR Estimation for Whispered and Pathological Voices
Most existing SNR estimation methods have been developed for normal speech and might not provide accurate estimates for special speech types such as whispered or disordered voices, particularly when they are corrupted by non-stationary noise.
In this work, we proposed a new SNR estimation algorithm which can provide accurate estimation and consistent performance for various speech types under different noise conditions.
Amir H. Poorjam, Max A. Little, Jesper R. Jensen and Mads G. Christensen
in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. [PDF]
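For reference, the quantity being estimated in the work above, the global SNR, can be written down directly when the clean speech and the noise are available separately. The sketch below shows only this ground-truth definition, not the proposed supervised estimator, which has to work from the noisy signal alone; the function name is an assumption.

```python
import math


def global_snr_db(clean, noise):
    """Global SNR in dB: ratio of total speech energy to total noise energy."""
    speech_energy = sum(s * s for s in clean)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10(speech_energy / noise_energy)


# Toy example: noise amplitude is one tenth of the speech amplitude.
clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
snr = global_snr_db(clean, noise)  # approximately 20 dB
```

Because the energies are summed over the whole recording, the value is a single number per signal, in contrast to frame-wise (local) SNR measures.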
Speech Enhancement by Classification of Noisy Signals Decomposed Using NMF and Wiener Filtering
Supervised NMF for speech enhancement is not very effective when the trained spectral models of speech and noise do not match the speech and noise actually present in a noisy observation.
In this work, we proposed a new single-channel speech enhancement algorithm that generalizes better to unseen noise types than supervised NMF.
in Proc. 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, 2018.
Noisy recording at 0 dB
Input PESQ: 1.91 Input SDR: 0 dB
Enhanced signal by the proposed method
Output PESQ: 3.04 Output SDR: 16.9 dB
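The Wiener filtering step mentioned in the title of the work above can be sketched generically as a per-frequency-bin gain computed from speech and noise power estimates. This is the textbook Wiener gain, not the paper's full algorithm; in an NMF-based system the two power estimates would typically come from the speech and noise parts of the decomposition, and the function names and the `floor` parameter here are assumptions.

```python
def wiener_gain(speech_power, noise_power, floor=1e-12):
    """Per-bin Wiener gain: speech power divided by total (speech + noise) power."""
    # `floor` guards against division by zero in bins with no energy at all.
    return [s / max(s + v, floor) for s, v in zip(speech_power, noise_power)]


def apply_gain(noisy_magnitude, gain):
    """Attenuate each bin of the noisy magnitude spectrum by its gain."""
    return [g * m for g, m in zip(gain, noisy_magnitude)]


# Toy single-frame example with three frequency bins.
speech_power = [1.0, 0.0, 3.0]
noise_power = [1.0, 2.0, 1.0]
gain = wiener_gain(speech_power, noise_power)
enhanced = apply_gain([2.0, 2.0, 2.0], gain)
```

Bins dominated by noise receive a gain close to 0 and bins dominated by speech a gain close to 1, which is why the quality of the speech and noise estimates largely determines the enhancement result.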
Dominant Distortion Classification for Pre-processing of Vowels in Remote Biomedical Voice Analysis
In this work, we investigated the impact of four major types of linear and nonlinear degradation (commonly present during recording or transmission in voice analysis) on mel-frequency cepstral coefficients (MFCCs). We then proposed an algorithm to detect the most dominant degradation type in pathological speech signals.
Amir H. Poorjam, Jesper R. Jensen, Max A. Little and Mads G. Christensen
in INTERSPEECH 2017, pp. 289–293, Stockholm, Sweden, August 2017. [PDF]
Copyright Notice: The PDFs of the papers are provided for academic purposes ONLY. All the papers are copyrighted by the corresponding publishers.