This is a project I participated in during my studies at Cornell University. The original project name was A Closed-Loop Brain-Heart Machine Interface System, a broad concept aimed at detecting people's emotions in real time. The actual outcome from our group, however, was a system that uses machine learning to determine a person's emotion by analyzing electrocardiogram (ECG) and electroencephalography (EEG) signals. We showed that certain features of the ECG and EEG signals do carry emotional information and can be extracted to predict human emotions, and that a machine learning model trained on well-chosen features can classify emotions quite accurately.
My work mostly centered on the EEG side, including analyzing the relationship between emotions and EEG signals, selecting useful features for machine learning, classifying emotions from EEG waveforms with machine learning methods, and setting up the emotion-inducing experiments.
Above is the overall architecture of the project. First, we used a portable bioelectrical signal acquisition device called OpenBCI to collect ECG and EEG data from members of our research group. After importing the raw data into Matlab, we applied several de-noising methods to remove artifacts and filter the signals to the needed bandwidth. Different features of the EEG and ECG signals were then extracted and evaluated based on their relevance to emotion and their importance for prediction. Finally, different machine learning models were fitted and compared to find the most promising one. The detailed process is shown below:
The ECG and EEG signals were collected and preprocessed using OpenBCI, an open-source bio-sensing tool. A total of three channels were used: one for the ECG signal and two for the EEG signal. The real-time raw data can be displayed and saved on a laptop with the OpenBCI_GUI for further analysis. Data were collected at a sampling rate of 200 Hz.
The human body and OpenBCI are connected via electrodes. The electrode placement is shown on the right. The two channels of EEG signals are collected from two points on the forehead, with the earlobes serving as the reference. The ECG signal is collected by placing the positive electrode below the neck and the negative electrode on the left chest near the shoulder, with the reference electrode on the right forearm.
To obtain data from people in different emotional states, we used music clips, picture sets, and video clips from three categories (happy, sad, and tender) to induce specific emotions in the subjects. During the experiment, each subject received one inducing method (music, pictures, or video) consisting of three clips, each representing one type of emotion. Every clip is about 5 minutes long, with a 30-second break between clips for subjects to relax and reset their emotions.
Channel 1: ECG
Channel 2: EEG(1)
Channel 3: EEG(2)
ECG Signal Processing
The pre-processing of the ECG signal recognizes the P-Q-R-S-T wave patterns of the cardiograph using the Pan-Tompkins algorithm.
First, we passed the raw ECG signal through a bandpass filter (5-15 Hz) to remove baseline wander and muscle noise. Then we differentiated the filtered signal with a derivative filter to highlight the QRS complex. After that, a squaring filter was applied to make the differentiated signal positive and further emphasize the QRS complex. Next, we averaged the squared signal with a moving window (window length = 0.150 seconds) to suppress noise and smooth the output. Lastly, adaptive thresholds were used to search for the signal peaks and the noise peaks.
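Our processing was done in Matlab; the following is a minimal Python/SciPy sketch of the first four Pan-Tompkins stages described above (the adaptive-threshold peak search is omitted). The function name and the filter order are illustrative choices, not taken from our original code.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 200  # sampling rate (Hz), as used in this project

def pan_tompkins_stages(ecg):
    """First four Pan-Tompkins stages; returns the integrated signal
    on which the adaptive peak-threshold search would then be run."""
    # 1) Bandpass 5-15 Hz: removes baseline wander and muscle noise
    b, a = butter(2, [5 / (FS / 2), 15 / (FS / 2)], btype="band")
    filtered = filtfilt(b, a, ecg)

    # 2) Derivative: emphasizes the steep slopes of the QRS complex
    diff = np.diff(filtered)

    # 3) Squaring: makes all samples positive and amplifies large slopes
    squared = diff ** 2

    # 4) Moving-window integration over 150 ms smooths the squared signal
    win = int(0.150 * FS)
    integrated = np.convolve(squared, np.ones(win) / win, mode="same")
    return integrated
```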
EEG Signal Processing
Unlike the ECG de-noising process, the EEG signal cannot simply be passed through a bandpass filter, since the useful EEG information is mixed with artifacts within the same bandwidth. Although a notch filter or high-pass filter is commonly used to remove 50/60 Hz power-line interference or high-frequency noise, the wavelet transform is a better choice for separating the clean data from the artifacts.
The Stationary Wavelet Transform (SWT) was selected for its advantage of being translation invariant. Performing the SWT on each epoch with a level-8 decomposition, using Haar as the basis wavelet, yields a final approximation and eight groups of detail coefficients. This decomposition level was chosen because it extracts the bandwidth we wanted and divides the frequency range into segments matching the corresponding EEG categories, such as the alpha and beta signals.
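As an illustration (in Python with the PyWavelets library, rather than our original Matlab code), the decomposition might look like the sketch below. At a 200 Hz sampling rate, the detail levels line up roughly with the standard EEG bands; the padding strategy is an assumed choice.

```python
import numpy as np
import pywt

FS = 200  # sampling rate (Hz)

def swt_decompose(eeg_epoch, level=8):
    """Level-8 stationary wavelet transform of one EEG epoch (Haar basis).

    With FS = 200 Hz, the detail levels roughly cover:
      D3 ~ 12.5-25 Hz (beta),  D4 ~ 6.25-12.5 Hz (alpha),
      D5 ~ 3.1-6.25 Hz (theta), D6 ~ 1.6-3.1 Hz (delta).
    """
    # pywt.swt requires the length to be a multiple of 2**level,
    # so pad the epoch with its edge values
    n = len(eeg_epoch)
    pad = (-n) % (2 ** level)
    x = np.pad(eeg_epoch, (0, pad), mode="edge")

    # Returns a list of (cA, cD) pairs, ordered from the coarsest
    # level (cA8, cD8) down to the finest (cA1, cD1)
    return pywt.swt(x, wavelet="haar", level=level)
```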
ECG Selected Features
The ECG signal can be regarded as a time-domain signal carrying information about the heart's activity. By plotting the trend of each feature across the different emotions, we eliminated the features that could not separate the emotions and finally selected three important features for training the machine learning model.
The first is the “Q-R Ratio”. Since different people have different baselines in their ECG amplitudes, we divided the feature “ECG-Q-Mean” by the feature “ECG-R-Mean” to obtain the new feature “Q-R Ratio”. “ECG-Q-Mean” is the mean of all Q-waves in a given window, and “ECG-R-Mean” is the mean of all R-waves in a given window.
The second is “ECG-HR-Mean”, the mean of all heart rates in a given window.
The third is “ECG-HR-StDev”, the standard deviation of all heart rates in a given window.
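A minimal sketch of computing these three features for one window, assuming the Q/R amplitudes and R-peak sample locations from the Pan-Tompkins step are already available (the function name and signature are illustrative):

```python
import numpy as np

def ecg_features(q_amps, r_amps, r_locs, fs=200):
    """Three selected ECG features for one analysis window.

    q_amps, r_amps : amplitudes of the detected Q and R waves
    r_locs         : sample indices of the R peaks (from Pan-Tompkins)
    """
    # Q-R Ratio: normalizes out per-subject baseline amplitude differences
    qr_ratio = np.mean(q_amps) / np.mean(r_amps)

    # Instantaneous heart rate (beats per minute) from the R-R intervals
    rr_sec = np.diff(r_locs) / fs
    hr = 60.0 / rr_sec

    hr_mean = np.mean(hr)  # ECG-HR-Mean
    hr_std = np.std(hr)    # ECG-HR-StDev
    return qr_ratio, hr_mean, hr_std
```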
EEG Selected Features
Each candidate feature was preliminarily analyzed to see whether it correlated with human emotions. The final selection was based on each feature's contribution to machine learning performance. The selected EEG features come from both the frequency domain and the time domain, seven features in total.
In the frequency domain, we use the power spectral density (PSD) of the brain waves as features. There are five brain waves (Delta, Theta, Alpha, Beta, and Gamma), each playing a different role in brain activity. Since Gamma contributed little to machine learning performance, we selected only the other four waves.
Delta waves (1 to 3 Hz) are related to deep levels of relaxation and restorative sleep. Theta waves (4 to 8 Hz) commonly appear during daydreaming or sleep and are related to a relaxed mental state. Alpha waves (9 to 13 Hz) are related to calmness and deeper relaxation. Beta waves (14 to 30 Hz) are related to wakefulness, alertness, and focus.
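As an illustration (again in Python/SciPy rather than our Matlab pipeline), the per-band PSD features can be computed with Welch's method; the 2-second segment length is an assumed choice:

```python
import numpy as np
from scipy.signal import welch

FS = 200  # sampling rate (Hz)
BANDS = {"delta": (1, 3), "theta": (4, 8), "alpha": (9, 13), "beta": (14, 30)}

def band_powers(eeg_epoch):
    """Total PSD power in each selected EEG band for one epoch."""
    # Welch PSD estimate; 2 s segments give 0.5 Hz frequency resolution
    freqs, psd = welch(eeg_epoch, fs=FS, nperseg=2 * FS)
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs <= hi)
        # integrate the PSD over the band (trapezoidal rule)
        powers[name] = np.trapz(psd[mask], freqs[mask])
    return powers
```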
In the time domain, we use the three Hjorth parameters as features. The first parameter is activity, which serves as an indicator of whether many high-frequency components exist in the signal.
The second parameter is mobility, which gives the proportion of the standard deviation of the power spectrum.
The third parameter is complexity, which shows how similar a signal is to a sine wave. The more similar the signal, the closer the parameter's value converges to one.
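All three parameters follow from the variances of the signal and its derivatives; a minimal Python sketch, with the derivatives approximated by discrete differences:

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth time-domain parameters of one EEG epoch."""
    dx = np.diff(x)    # first derivative (discrete difference)
    ddx = np.diff(dx)  # second derivative

    activity = np.var(x)  # signal variance (power)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    # Complexity: mobility of the derivative relative to mobility of
    # the signal; equals 1 for a pure sine wave
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity
```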
We compared two basic machine learning models, SVM and KNN, to find the base model with the best performance on our training samples. SVM generally outperformed the KNN algorithm: with only ECG features, SVM's accuracy was about 7% higher than KNN's; with only EEG features, SVM scored about 15% higher; and with both ECG and EEG features, SVM again scored about 10% higher.
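For illustration, such a comparison can be set up with scikit-learn (a Python sketch, not our original code; the 5-fold cross-validation and default hyperparameters are assumptions):

```python
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

def compare_models(X, y):
    """Compare SVM and KNN accuracy on the same feature matrix.

    X : features (ECG-only, EEG-only, or both); y : emotion labels
    """
    models = {
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: accuracy {scores.mean():.3f} +/- {scores.std():.3f}")
```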
We also tested models trained on EEG data alone, ECG data alone, and the combination of EEG and ECG data, and added a voting system that chooses the most frequent result among the three models' outputs. If the votes are tied, the system chooses one of the tied results uniformly at random.
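The voting rule itself is simple; a sketch of the tie-breaking behavior described above:

```python
import random
from collections import Counter

def vote(predictions):
    """Majority vote over the three models' predictions for one sample;
    ties are broken uniformly at random."""
    counts = Counter(predictions)
    top = max(counts.values())
    winners = [label for label, c in counts.items() if c == top]
    return random.choice(winners)

# e.g. vote(["happy", "sad", "happy"]) -> "happy"
```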
Based on the results, we observed that the ECG-based prediction system performed best, probably due to the robustness of the signal and the more obvious separation of its selected features. The EEG-based system performed worst, probably because of the limited resolution of only two channels and its sensitivity to external environmental noise, which was difficult to separate completely from the clean underlying data. Among the three emotions, happiness was always classified most accurately, probably because that emotion is easier to induce, resulting in more distinctive features and more accurate labels.
In conclusion, we have demonstrated an offline integrated system that can sample both heart signals (ECG) and brain-wave signals (EEG) non-invasively by placing electrodes at different positions on the human body, extract the desired features from the raw data, and apply machine learning algorithms to predict a person's emotional state.
Based on these results and the accuracy of the final machine learning model, we can conclude that certain features of the ECG and EEG signals are highly correlated with a person's emotional state.