Jada Sosa
Class of 2025
Humans have a remarkable ability to focus on one speaker in a crowded setting with multiple speakers, such as a cocktail party. However, people who are hearing-impaired and use hearing aids or other hearing devices cannot perform this task. This difficulty is called the cocktail party problem. Current hearing aids and assistive hearing devices are ineffective in these settings because they amplify all sounds, amplify only the speaker the listener is looking at, or can only suppress steady background noise such as the hum of an air conditioner. These limitations make it difficult for people who are hearing-impaired to have satisfying social lives, and they are more likely to struggle with mental illness.
Previous research has found great potential in using neural network models, which are machine learning methods that learn what to output through training, to build a hearing aid that lets the listener pay attention to one speaker. The models accomplish this by performing a classification task that identifies the speaker the listener is paying attention to, called the attended speaker. Two kinds of input are fed into the model: the speech envelopes, which are visual representations of each speaker's speech, and the EEG, a recording of the electrical activity of the listener’s brain. The model processes each input and maps it into the same embedding space so that the inputs can be easily compared. The model then identifies the speech envelope that corresponds to the EEG recording as belonging to the attended speaker. Previous research has found success using convolutional neural networks (CNNs) and CNNs combined with a long short-term memory (LSTM) layer. CNNs are proficient at processing complex visual information such as EEG recordings, while the LSTM layer is used to process temporal information such as a speech envelope over time.
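The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the embeddings are made-up numbers rather than outputs of a trained network, the function names are hypothetical, and cosine similarity is assumed as the comparison measure, which may differ from what the models in previous research use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def pick_attended_speaker(eeg_embedding, envelope_embeddings):
    """Return the index of the envelope embedding closest to the EEG embedding.

    Hypothetical helper: assumes the EEG and every speech envelope have
    already been mapped into the same embedding space by the model.
    """
    scores = [cosine_similarity(eeg_embedding, e) for e in envelope_embeddings]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Made-up embeddings for one EEG segment and two competing speakers.
eeg = [0.9, 0.1, -0.3]
envelopes = [
    [0.8, 0.2, -0.2],   # speaker 0: points in nearly the same direction as the EEG
    [-0.5, 0.7, 0.4],   # speaker 1: points in a different direction
]

attended, scores = pick_attended_speaker(eeg, envelopes)
print("attended speaker:", attended)
```

In this toy example the first envelope is selected because its embedding lies closest to the EEG embedding; in a real model, the embeddings would come from the trained CNN or LSTM layers.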
My study aims to increase the accuracy of the neural network model. I will test the classification accuracy of dilated CNNs (DCNNs) with different sets of parameters, which are factors such as the number of layers that affect the model’s performance and training time. I will also test the accuracy of a DCNN model with an LSTM layer and of a baseline CNN model.
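To illustrate what "dilated" means here, the sketch below implements a one-dimensional dilated convolution in plain Python and compares how much of the input a small stack of layers can see with and without dilation. The kernel size, the dilation rates (1, 3, 9), and the function names are assumptions chosen for illustration; they are not the parameters tested in this study.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1D convolution (no kernel flip, as in deep learning libraries)
    whose kernel taps are spaced `dilation` samples apart, with no padding."""
    span = (len(kernel) - 1) * dilation  # distance from first tap to last tap
    return [
        sum(w * signal[t + i * dilation] for i, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output of stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# With dilation 2, a 3-tap kernel reads samples t, t + 2, and t + 4.
output = dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], dilation=2)
print(output)

# Three layers with kernel size 3: ordinary versus dilated (assumed rates).
standard_rf = receptive_field(3, [1, 1, 1])
dilated_rf = receptive_field(3, [1, 3, 9])
print(standard_rf, dilated_rf)
```

With the same number of layers and weights, the dilated stack covers a longer stretch of the input, which is one reason dilation can be attractive for long signals such as EEG recordings and speech envelopes.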
Poster