Projects

Computational Behavior Science:

• Set up the Child Study Lab (CSL) at the Georgia Institute of Technology to collect spontaneous interactions between an adult and a child, recorded to study early indicators of autism spectrum disorders (ASDs). More than 150 sessions of interactions were collected.

• Developed a novel multimodal-multitemporal classifier fusion algorithm for emotion classification. The algorithm can train on and classify multimodal data analyzed at various temporal lengths (a toy illustration of the fusion step follows this list).

• Analyzed semi-structured adult-child interactive behaviors to estimate levels of engagement from linguistic and non-linguistic vocal cues along with visual cues, such as the direction of a child's gaze or the child's gestures.

• Developed an efficient method for synchronizing audio and video files collected in various environmental settings. The method was tested on more than 150 multimodal recordings made in noisy and uncontrolled environments, and all tested files were successfully synchronized with errors below 50 ms (a generic sketch of envelope-based synchronization follows this list).

• Developed detectors for paralinguistic events, such as laughter and fussing/crying, as well as for toddlers' speech, on a multimodal dyadic behavior dataset. Acoustic features were analyzed, and the optimal feature set was selected using a novel feature-analysis technique.
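The fusion algorithm itself is the contribution above; as a point of reference only, a generic late-fusion scheme over per-stream classifiers can be sketched as below, assuming hypothetical feature streams (one per modality and temporal scale) and using scikit-learn classifiers purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy late fusion: one classifier per (modality, window-length) stream.
# X_streams: list of (n_samples, n_features) arrays; y: labels.
def fit_streams(X_streams, y):
    return [LogisticRegression(max_iter=1000).fit(X, y) for X in X_streams]

def fuse_predict(models, X_streams, weights=None):
    # Average class posteriors across streams (optionally weighted);
    # returned values index into models[0].classes_.
    probs = np.stack([m.predict_proba(X) for m, X in zip(models, X_streams)])
    w = np.ones(len(models)) if weights is None else np.asarray(weights)
    fused = np.tensordot(w / w.sum(), probs, axes=1)  # (n_samples, n_classes)
    return fused.argmax(axis=1)
```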
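For the synchronization bullet, a common generic approach (not necessarily the exact method used here) is to cross-correlate short-time energy envelopes of the two audio tracks. A minimal sketch, assuming two mono signals already resampled to a common rate:

```python
import numpy as np

def estimate_offset(sig_a, sig_b, fs, hop=441):
    """Estimate the relative time offset (seconds) between two recordings
    by cross-correlating short-time energy envelopes.
    hop=441 gives 10 ms frames at fs=44100. A positive result means the
    shared content occurs later in sig_a than in sig_b."""
    def envelope(x):
        n = len(x) // hop
        return np.array([np.sum(x[i * hop:(i + 1) * hop] ** 2) for i in range(n)])
    ea, eb = envelope(sig_a), envelope(sig_b)
    ea -= ea.mean()
    eb -= eb.mean()
    xc = np.correlate(ea, eb, mode="full")
    lag_frames = xc.argmax() - (len(eb) - 1)  # zero lag sits at index len(eb)-1
    return lag_frames * hop / fs
```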


Acoustic Feature Extraction for Machine Learning:

• Designed a robust formant-tracking algorithm using Gaussian mixtures and maximum a posteriori (MAP) adaptation. The algorithm significantly reduced the root-mean-square error (RMSE) compared to the LPC-based formant tracker in Praat (the standard MAP mean-update step is sketched after this list).

• Analyzed formant-based acoustic features for emotion classification. Investigated various methods for extracting formant-based features and developed a novel method for transforming formant components (frequencies and amplitudes) into features via new low-level descriptors.

• Designed an algorithm for extracting spectral features using a multi-resolution sinusoidal transform, with the aim of estimating speech intelligibility for speakers undergoing cancer treatment.

• Developed a method for analyzing features to obtain an optimal feature set for training machine learning algorithms. The method was tested against popular feature-selection algorithms, such as minimal-redundancy-maximal-relevance (mRMR), and feature-projection algorithms, such as principal component analysis (PCA) and linear discriminant analysis (LDA); a minimal PCA/LDA baseline harness follows this list.
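The tracker itself is specific to the work above, but the MAP mean update it relies on is standard: each Gaussian mean is pulled toward the adaptation data in proportion to its soft occupancy count. A minimal sketch, with the relevance factor r as an assumed hyperparameter:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def map_adapt_means(gmm: GaussianMixture, X: np.ndarray, r: float = 16.0):
    """Standard MAP mean update for an already-fitted GMM:
    mu_k' = (n_k * xbar_k + r * mu_k) / (n_k + r),
    where n_k are soft counts from the posterior responsibilities."""
    resp = gmm.predict_proba(X)                 # (n_samples, n_components)
    n_k = resp.sum(axis=0)                      # soft occupancy per component
    xbar = (resp.T @ X) / np.maximum(n_k[:, None], 1e-10)
    alpha = (n_k / (n_k + r))[:, None]          # data-vs-prior interpolation
    gmm.means_ = alpha * xbar + (1 - alpha) * gmm.means_
    return gmm
```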
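The PCA and LDA baselines mentioned above are available directly in scikit-learn (mRMR requires a third-party package); a minimal comparison harness, with the dataset, classifier, and fold count chosen purely for illustration:

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)  # stand-in dataset for illustration
for name, proj in [("PCA", PCA(n_components=5)),
                   ("LDA", LinearDiscriminantAnalysis(n_components=2))]:
    # Project features, then score a downstream classifier by cross-validation.
    pipe = make_pipeline(StandardScaler(), proj, SVC())
    score = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```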


Neural Signal Processing for Speech:

• Developed a speech-activity detector driven by electrodes implanted in the cerebral cortex of locked-in volunteers studied in the laboratory.

• Developed a hidden-Markov-model (HMM)-based method to decode neural firing activity in the speech motor cortex (a generic Viterbi decoding sketch follows this list).
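The specific emission and state models are the project's own; as a generic reference point, decoding discrete states from binned spike counts with an HMM typically reduces to Viterbi decoding under a per-state Poisson firing-rate model. A minimal sketch, with the transition matrix, initial distribution, and rates as assumed inputs:

```python
import numpy as np
from scipy.stats import poisson

def viterbi_poisson(counts, log_A, log_pi, rates):
    """Viterbi decode. counts: (T, n_units) binned spike counts;
    log_A: (S, S) log transition matrix; log_pi: (S,) log initial probs;
    rates: (S, n_units) expected counts per bin for each state."""
    T, S = len(counts), len(log_pi)
    # Per-bin log-likelihood of the observed counts under each state.
    log_B = np.array([poisson.logpmf(counts, rates[s]).sum(axis=1)
                      for s in range(S)]).T            # (T, S)
    delta = log_pi + log_B[0]
    psi = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        trans = delta[:, None] + log_A                 # (S, S): prev -> next
        psi[t] = trans.argmax(axis=0)                  # best predecessor
        delta = trans.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):                     # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path
```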


Speech Modification using Sinusoidal Transform Coder (STC):

• Developed a speech synthesis and modification system that imposes controlled changes on the apparent vocal effort of an arbitrary speech signal.

• Developed a novel method of time-scale modification for polyphonic and multi-pitch audio signals. The method significantly reduces frequency-jitter artifacts by using multi-onset time estimations (a standard baseline is shown after this list).
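For context, the standard phase-vocoder baseline that such methods improve on is a one-liner in librosa; it stretches duration without changing pitch but tends to smear transients in polyphonic material, which is exactly the artifact class the multi-onset approach targets. The audio used here is just a demo clip shipped with librosa:

```python
import librosa

# Baseline phase-vocoder time-scale modification (illustration only).
y, sr = librosa.load(librosa.example("trumpet"))    # bundled demo clip
y_slow = librosa.effects.time_stretch(y, rate=0.8)  # 25% longer, same pitch
```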


Aquarium Project:

• Designed a real-time system that maps the movements of objects into musical soundscapes, bringing dynamic exhibits to visually impaired visitors (a toy mapping rule follows this list).

• Designed an automatic music-generation system driven by the mapped movements of target objects.
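A minimal version of such a movement-to-music mapping might assign pitch to vertical position and loudness to speed; the sketch below is purely illustrative, with the tracked coordinates, frame height, and scale choice all assumed rather than taken from the actual system:

```python
import numpy as np

def movement_to_notes(xy, frame_h=480, fs_track=30.0):
    """Map tracked object positions xy (T, 2) to per-frame (MIDI pitch,
    velocity): height in frame -> pitch on a pentatonic scale, speed -> loudness."""
    scale = np.array([0, 2, 4, 7, 9])                   # C-major pentatonic offsets
    y_norm = np.clip(1.0 - xy[:, 1] / frame_h, 0.0, 1.0)  # higher in frame = higher pitch
    degree = (y_norm * 14).astype(int)                  # ~3 octaves of scale degrees
    pitch = 48 + 12 * (degree // 5) + scale[degree % 5]
    speed = np.linalg.norm(np.diff(xy, axis=0, prepend=xy[:1]), axis=1) * fs_track
    velocity = np.clip(40 + speed, 40, 127).astype(int)
    return pitch, velocity
```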