My REU Project

First Week of REU

This week has been pretty fun for me. I truly enjoyed the company of every one of my colleagues. They were all very welcoming, and we have already gone out to eat and shop together. I can already tell that the upcoming weeks working with them will be fun!
To add on, my project teammates are Ciabhan Connelly and Abhi Kumar, and so far it has been great working with them (not to mention how interested in and excited I am about the research topic we have chosen!).

Our data mining project focuses on two topics:

  1. Predicting the graduation rate within a 4-year span for STEM students.
  2. Evaluating the models we have chosen on a series of fairness metrics as we attempt to define what practical and ethical considerations need to go into ensuring algorithms are fair within the education domain.

Second Week of REU

Highlight of the Week: HACKATHON!!!


I had an incredible experience during our hackathon this week. The challenge was a real icebreaker, and it helped me understand the concept of Data Mining much better. I had no advanced experience in Python or Data Mining before this REU opportunity. However, these past 2 weeks have helped build my knowledge of both.
Our task in the hackathon was to predict wine quality given its physicochemical attributes. We were given 2 datasets (a training set and a testing set), which consist of hundreds of different wine attributes. We used the training set as a base for the machine learning techniques in order to predict the wine qualities in the testing set. My team specifically used two techniques, namely the Support Vector Machine (SVM) and the StandardScaler feature in sklearn. We first applied the SVM classification method to our data set, and it gave us a prediction accuracy of 72%. We then scaled the features with StandardScaler, and this increased our prediction accuracy from 72% to 77%. Our prediction accuracy was in fact the highest among all the students.
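To show why scaling can matter for an SVM, here is a minimal sketch of the same two steps. It is not our hackathon code: I am assuming scikit-learn's bundled `load_wine` dataset as a stand-in (it predicts the wine cultivar, not quality), and the split size and random seed are arbitrary choices.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: scikit-learn's bundled wine dataset (NOT the hackathon data).
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 1: SVM classifier trained on the raw, unscaled features.
raw_svm = SVC().fit(X_train, y_train)
acc_raw = raw_svm.score(X_test, y_test)

# Step 2: same classifier, but every feature is first standardized
# (zero mean, unit variance) using statistics from the training set only.
scaled_svm = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)
acc_scaled = scaled_svm.score(X_test, y_test)

print(f"raw: {acc_raw:.2f}  scaled: {acc_scaled:.2f}")
```

Putting the scaler inside a pipeline means it is fitted on the training set only, so nothing from the testing set leaks into training.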
The hackathon was truly fun, on top of being my first full experience in Data Mining. As of today, my knowledge of Python programming has also substantially increased since my first day in the REU. Words cannot explain how thankful I am to be a part of this amazing opportunity.

My accomplishments this week:

  1. I got to experience my first actual Data Mining task (big thanks to Dr. Snyder and Dr. Rangwala's REU Hackathon!)
  2. I was able to play around with the GMU data (which we have been given access to) and discovered cool mining tools in Python
  3. I read similar research papers related to my project (interesting papers!)
  4. Increased knowledge in Python programming! Yay

Third Week of REU

Deliverable: Critique a research paper related to my topic


After carefully reading multiple research papers on Google Scholar and the GMU Library Database, I have chosen to critique the article “Early Model of Student’s Graduation Prediction Based on Neural Network,” by Budi Rahmani and Hugo Aprilianto (2014), since it is closely related to my research topic. The article focuses on predicting the timing of student graduation based on GPA during the first three semesters. Using an Artificial Neural Network, they based their testing on 166 samples of student data (from 2011 and 2012) and reported a prediction accuracy of 99.9%. Rahmani and Aprilianto’s (2014) research is quite similar to my topic because both focus on graduation prediction based on student data. However, their topic narrows down to predicting when a student is expected to graduate based on GPA, while my topic narrows down to predicting the likelihood that a student drops out of their STEM major based on GPA, demographics, race, and so on.
As of today, my research team has obtained thousands of student records from the GMU database, which we have been given access to. We have also tried to merge and clean up our data across different cohorts. So far we have tried to predict the dropout rate of our students using the Random Forest method and cross-validation, which resulted in a prediction accuracy of 89%.
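For readers new to this combination, here is a rough sketch of Random Forest with cross-validation in scikit-learn. It is not our actual pipeline: the data is synthetic (generated with `make_classification`), and the number of features, trees, and folds are assumptions I picked for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for student records: 500 rows, 8 numeric features,
# and a binary label (1 = dropout, 0 = not). This is NOT the GMU data.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: each fold is held out once for scoring
# while the model is trained on the other four folds.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()

print(f"fold accuracies: {scores.round(2)}  mean: {mean_accuracy:.2f}")
```

Averaging over folds gives a steadier estimate than a single train/test split, since every record is used for testing exactly once.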

Photoshoot with my REU Team!

Fourth and Fifth Week of REU

Work in progress!


For both weeks, my team and I have done a substantial amount of coding, analyzing, and visualizing data from the 2009 to 2013 cohorts in order to obtain our desired results. Each member of my team had a different focus within our research topic. I was mostly focused on creating visualizations of CS student performance during their first year at GMU. I made separate visualizations for the students who dropped out and those who did not drop out of CS within a 4-year span. In our research paper, we define a student as a "dropout" if they dropped out of the CS major and/or did not graduate within 4 years. As a result, I found that there is a substantial difference in performance between dropouts and non-dropouts. The students who dropped out tend to have a low GPA distribution as well as a low grade distribution for both CS and MATH classes. Meanwhile, the students who did not drop out of CS tend to have a high GPA distribution and a high grade distribution for both CS and MATH classes.
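The dropout definition above can be sketched in pandas as follows. The records are made up and the column names (`left_cs_major`, `years_to_graduate`, `first_year_gpa`) are hypothetical, not the real GMU fields, so treat this as an illustration of the labeling rule and nothing more.

```python
import pandas as pd

# Made-up records; all column names here are hypothetical.
students = pd.DataFrame({
    "left_cs_major": [False, True, False, False, True, False],
    # None means the student did not graduate at all.
    "years_to_graduate": [4.0, None, 5.0, 3.5, None, 4.0],
    "first_year_gpa": [3.6, 2.1, 2.8, 3.9, 1.9, 3.2],
})

# A missing value compares as False, so non-graduates are not "within 4 years".
graduated_in_4 = students["years_to_graduate"].le(4.0)

# Dropout = left the CS major and/or did not graduate within 4 years.
students["dropout"] = students["left_cs_major"] | ~graduated_in_4
students["group"] = students["dropout"].map({True: "dropout", False: "non-dropout"})

# Compare first-year GPA between the two groups.
summary = students.groupby("group")["first_year_gpa"].agg(["count", "mean", "median"])
print(summary)
```

The same grouping column can then be passed to a plotting library to draw the two GPA distributions side by side.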
My colleague Abhi is focused on predicting the likelihood that CS and STEM students drop out, and my other colleague Ciabhan is focused on evaluating the models used on a series of fairness metrics.
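To give a feel for what a fairness metric looks like, here is a small sketch of two commonly used group measures: the gap in selection rate (demographic parity) and the gap in true positive rate (equal opportunity). This post does not list the metrics our team actually used, so these two are my own choice for illustration, and the labels, predictions, and groups "a" and "b" are toy values.

```python
import numpy as np

def selection_rate(y_pred):
    """Fraction of students the model flags as positive (1)."""
    return float(np.mean(y_pred))

def true_positive_rate(y_true, y_pred):
    """Among students who are truly positive, the fraction the model flags."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives]))

# Toy values only: true labels, model predictions, and a group attribute.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 0])
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

a, b = group == "a", group == "b"

# Demographic parity gap: difference in how often each group is flagged.
parity_gap = abs(selection_rate(y_pred[a]) - selection_rate(y_pred[b]))

# Equal opportunity gap: difference in true positive rate between groups.
tpr_gap = abs(
    true_positive_rate(y_true[a], y_pred[a])
    - true_positive_rate(y_true[b], y_pred[b])
)

print(f"parity gap: {parity_gap:.2f}  TPR gap: {tpr_gap:.2f}")
```

A gap of 0 means the model treats both groups the same on that measure; larger gaps point to a disparity worth investigating.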
I would like to share a sneak peek of some of my visualizations regarding the performance of dropout and non-dropout CS students:

Sixth Week of REU

Deliverable: Discuss one thing you have learned about research that you were not aware of before starting this journey


These past few weeks have been the most fun learning experience I've ever had in my Computer Science career. I never really thought that research could be fun! I've always thought that it would be much worse than a classroom setting wherein you are given nothing but exams and deadlines. I thought the whole process would be very stressful, especially since I had no experience or any idea of what to expect by the end of this research. Turns out, I was wrong.
I truly enjoyed engaging with my coworkers and mentors as I learned new ideas every day! So far, I have enhanced my knowledge in visualization techniques, data mining, machine learning, and even new programming languages throughout my REU journey. Prior to this journey, I honestly did not expect to enjoy any sort of learning process because I thought it would be much more like a classroom setting. Not that learning is never fun, but learning is never fun when you are bound by stress and deadlines (not to mention taking required classes you were never interested in in the first place).
However, this research experience was far different from what I initially thought. I was super interested in and excited about the research topic that I was assigned (Graduation Prediction for STEM Students). I looked forward to learning something new every day. My mentors, Prof. Rangwala and Prof. Snyder, are really great and effective teachers. I have learned so much from them and I honestly wish that my future mentors and all my professors at GMU would be as engaging as they are. Every working day, I feel highly motivated, excited, and enthusiastic about what I'm about to work on and who I'm about to work with!
I honestly feel blessed to be a part of this amazing REU opportunity. I believe I learned so much more than what I was supposed to, and I am truly thankful for that. It has been a great experience and I wish I could work with my mentors and coworkers again in the future.

Seventh and Eighth Week of REU

We gathered our results and created an informative video regarding our research topic (click on the hyperlink):
Graduation Prediction for STEM Students