The current trajectory of our summer research project.
Under the direction of Dr. Rangwala, Anlan Du and I are exploring the connection between faculty traits, such as gender and course evaluations, and student retention through data mining techniques. Before building any data mining models, we plan to clean and preprocess the data and perform feature selection to pare down its dimensionality.
This week, Anlan and I worked towards creating a collection of course evaluation data. George Mason publishes course evaluations for most of its courses on the Office of Institutional Effectiveness and Planning webpage, so we downloaded the CS department's course evaluations for our data's timespan. Working with anonymized data, we were able to begin creating our own database of faculty ratings and student course histories. We parsed the information from the evaluations into a new file and then merged each course evaluation with the students who took that course during their college career. We also read related research in the field of retention, especially for female students in traditionally male-dominated fields, to understand the context of our inquiry.
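Here is a rough sketch of what that evaluation-to-student merge could look like in pandas; the file names and column names are placeholders rather than our actual schema:

```python
import pandas as pd

# Hypothetical inputs -- the real evaluation and enrollment files are not shown here.
evals = pd.read_csv("cs_course_evaluations.csv")          # one row per evaluation item/section
enrollments = pd.read_csv("student_course_history.csv")   # one row per (student, course, term)

# Average the evaluation items for each course offering so every offering
# has a single rating row to join against.
eval_summary = (
    evals.groupby(["course_id", "term"], as_index=False)["overall_rating"]
         .mean()
)

# Attach the course's averaged rating to every student who took that
# course in that term.
merged = enrollments.merge(eval_summary, on=["course_id", "term"], how="left")
merged.to_csv("students_with_evaluations.csv", index=False)
```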
Early in week 3, Anlan and I began to actually run data mining models on our data. Because our project covers two different scopes--both long-term retention and shorter-term performance prediction in courses--we tackled different portions of our research inquiry. Anlan began running models on the larger retention question while I narrowed my scope to the course prediction portion. We both decided to document the process with visualizations of our data--including the point at which students left the major and how well the algorithms performed on different student groups--to understand its broad trends. These visualizations should give us a more focused sense of our next steps and, hopefully, clearer results from our mining next week!
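As an example of the kind of attrition visualization described above, here is a minimal sketch; the switched_out and last_cs_term columns are hypothetical placeholders, not our actual fields:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical input: one row per student, with a flag for leaving the major
# and the last term they were enrolled as a CS major.
students = pd.read_csv("students_with_evaluations.csv")
left_major = students[students["switched_out"] == 1]

# Count how many students left the major in each semester and plot it.
attrition_by_term = left_major["last_cs_term"].value_counts().sort_index()
attrition_by_term.plot(kind="bar")
plt.xlabel("Semester of departure from the CS major")
plt.ylabel("Number of students")
plt.title("When students leave the major")
plt.tight_layout()
plt.savefig("attrition_by_semester.png")
```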
This week, I continued and broadened my work on predicting whether students pass a course based on their previous course performance. In addition, I began adding the averaged course evaluations for all of the STEM courses in a student's semester, giving us more attributes to investigate. This work was supported by computing F-scores, mutual information scores, and decision-tree feature importances to understand the relationship between each variable and the student success metrics Anlan and I are investigating. Meeting with Dr. Rangwala cemented the work this week and gave direction for further areas to focus on during the rest of our time here. This information, along with the momentum from our preliminary results, allowed us to solidify the methods we will use for the rest of the summer.
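A minimal sketch of how those three feature scores could be computed with scikit-learn, assuming a hypothetical feature matrix and a binary pass/fail label; all column names are placeholders:

```python
import pandas as pd
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier

# Placeholder feature columns (student + evaluation attributes) and a
# binary success label (passed the course or not).
data = pd.read_csv("students_with_evaluations.csv")
feature_cols = ["hs_gpa", "sat_math", "prior_gpa", "avg_stem_eval"]
X, y = data[feature_cols], data["passed_course"]

f_scores, _ = f_classif(X, y)                        # ANOVA F-statistic per feature
mi_scores = mutual_info_classif(X, y, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

scores = pd.DataFrame({
    "feature": feature_cols,
    "f_score": f_scores,
    "mutual_info": mi_scores,
    "tree_importance": tree.feature_importances_,
})
print(scores.sort_values("mutual_info", ascending=False))
```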
The main theme of this week was experimenting with and analyzing models using a huge swath of metrics! Anlan and I both continued to work on predictions--passing a class, switching majors, and graduating in CS--and we spent a large portion of our time trying to understand the various metrics for scoring the performance of our models. I added a prediction of switching away from computer science on the basis of a math class's evaluation, while Anlan added a broader prediction of switching out of computer science.
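For illustration, here is a hedged sketch of the kind of metric comparison we mean, continuing the placeholder features from the earlier sketch; the random forest is just an example model, not necessarily the one we use:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# X and y as in the feature-selection sketch above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)
prob = model.predict_proba(X_test)[:, 1]

# Accuracy alone can be misleading on imbalanced retention data, so we
# look at several complementary metrics.
print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall   :", recall_score(y_test, pred))
print("f1       :", f1_score(y_test, pred))
print("roc auc  :", roc_auc_score(y_test, prob))
```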
The focus this week was on finding and organizing results from our experiments up to this point. Following an extremely productive meeting with Anlan and Dr. Rangwala early in the week, we were able to set up many more unique experiments with our data. We began comparing predictions based only on student features against predictions from models that also include course evaluation information. This is an improvement over simply predicting a "baseline" because it creates a control group within our data. In addition, because our evaluations are averaged over each class's individual evaluations, they cannot act as distinguishing features between students who take the same courses. This is an issue for ML models, so we created unique course IDs to stand in for the perceived quality of courses and to give each student a more distinct set of features. All of these developments were accompanied by continued reading of the literature in the field and further work on model quality and metrics.
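A rough sketch of how the baseline-versus-evaluation comparison and the unique course IDs could be set up, continuing the placeholder variables from the earlier sketches:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Encode a unique id for each (course, term) offering as an integer feature,
# so students who took different offerings become distinguishable.
data["course_uid"] = pd.factorize(
    data["course_id"].astype(str) + "_" + data["term"].astype(str))[0]

# Placeholder feature groups: student features only (the control group),
# student features plus averaged evaluations, and the unique course ids.
feature_sets = {
    "baseline (student features only)": ["hs_gpa", "sat_math", "prior_gpa"],
    "with averaged evaluations":        ["hs_gpa", "sat_math", "prior_gpa", "avg_stem_eval"],
    "with unique course ids":           ["hs_gpa", "sat_math", "prior_gpa", "course_uid"],
}

for name, cols in feature_sets.items():
    model = RandomForestClassifier(random_state=0)
    score = cross_val_score(model, data[cols], y, cv=5, scoring="f1").mean()
    print(f"{name}: mean F1 = {score:.3f}")
```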
This week was a mix of many different tracks for our research: visualization, adding experiments, and tuning. Keeping in mind our presentations at the end of this REU, I began making some visualizations of the students in our dataset. Seeing the spread of SAT scores and high school GPAs among our students, and the differing academic backgrounds these could imply, we decided to examine our models through the lens of algorithmic fairness. Although we were able to predict grades and graduation with some accuracy, we wanted to see whether this accuracy was equivalent across different groups of students. We added experiments to start testing this idea and also began tuning our models to improve the predictive power of our original experiments.
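A minimal sketch of that per-group fairness check, again continuing the placeholder variables from the earlier sketches; the gender column is an assumed example of a grouping attribute, not necessarily the one in our data:

```python
from sklearn.metrics import accuracy_score

# Line up the held-out test rows from the earlier sketch with their
# group labels and predictions.
test = data.loc[X_test.index].copy()
test["pred"] = pred

# Compare accuracy across groups; large gaps would flag a fairness issue.
for group, frame in test.groupby("gender"):
    acc = accuracy_score(frame["passed_course"], frame["pred"])
    print(f"{group}: accuracy = {acc:.3f} (n = {len(frame)})")
```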
With the write-up and poster deadlines set during Week 9, this week's focus was on homing in on our results and finding a narrative to tie together our findings for the summer. With many experiments behind us--baseline, evaluations, unique class IDs, and fairness tests--I began working on tables to display the key results most important to our research topic. While Anlan worked on our video presentation, I started working on our full poster and the visualizations it required (check out the longer Week 8 post to see a few!). Work on the poster went alongside work on our paper write-up.
This week Anlan and I completed the first draft of our paper, and we took part in a group paper critique with the rest of the REU. The rest of the week consisted of fine-tuning our write-up, finishing our poster, and making sure that all of our experiments lined up so that they would be comparable. This included deciding which experiments should go into our final analyses.