OULAD dataset (Open University Learning Analytics Dataset)
Link to dataset: https://analyse.kmi.open.ac.uk/open_dataset
Link to the same dataset on Kaggle: https://www.kaggle.com/datasets/anlgrbz/student-demographics-online-education-dataoulad
Contains data about:
Courses (called modules): duration, dates, etc.
Students: registration, gender, highest education, region, socio-economic deprivation of the student's region (IMD band, Index of Multiple Deprivation), age, previous attempts, etc.
Assessments: formative and summative assessment scores.
Interactions: student clicks on study materials in the Virtual Learning Environment (VLE) for seven selected courses.
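The tables above only become useful together once they are joined. The following is a minimal sketch of that step with pandas, not the official loading procedure. The rows are made up, and the column names (code_module, code_presentation, id_student, sum_click, and so on) are assumptions based on the published OULAD schema, so check them against the CSV files you download.

```python
import pandas as pd

# Made-up rows standing in for studentInfo.csv and studentVle.csv.
# Column names are assumed from the OULAD schema; verify them locally.
student_info = pd.DataFrame({
    "code_module": ["AAA", "AAA", "BBB", "BBB"],
    "code_presentation": ["2013J", "2013J", "2014B", "2014B"],
    "id_student": [101, 102, 103, 104],
    "gender": ["M", "F", "F", "M"],
    "imd_band": ["0-10%", "50-60%", "90-100%", "20-30%"],
    "final_result": ["Fail", "Pass", "Distinction", "Withdrawn"],
})
student_vle = pd.DataFrame({
    "code_module": ["AAA", "AAA", "AAA", "BBB"],
    "code_presentation": ["2013J", "2013J", "2013J", "2014B"],
    "id_student": [101, 101, 102, 103],
    "id_site": [1, 2, 1, 7],
    "date": [0, 3, 1, 2],
    "sum_click": [4, 6, 10, 25],
})

# A student can take several modules, so the join key is the full triple.
KEYS = ["code_module", "code_presentation", "id_student"]

# Collapse the click log to one row per student per module presentation.
total_clicks = (
    student_vle.groupby(KEYS, as_index=False)["sum_click"]
    .sum()
    .rename(columns={"sum_click": "total_clicks"})
)

# A left join keeps students with no VLE activity; they get 0 clicks.
merged = student_info.merge(total_clicks, on=KEYS, how="left")
merged["total_clicks"] = merged["total_clicks"].fillna(0).astype(int)

print(merged)
```

With the real files, the two DataFrames would come from pd.read_csv on the downloaded CSVs instead of being typed in by hand.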
Descriptive Analytics: The goal of this stage is to produce insightful graphs and plots that present the data well. Preprocess the data first, then choose plot types suited to what you want to show. Go through each data column and look for visual representations that reveal new insights into the dataset.
Examples:
Performance (Pass, Fail, Withdrawn, Distinction) vs Course offerings vs IMD Band
Actions in VLE vs Gender vs Performance
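Three-way comparisons like these usually start from a summary table that is then plotted. The sketch below shows one possible way to build such tables with pandas. The rows are invented for illustration, and the column names and outcome labels are assumptions based on the OULAD schema. The total_clicks column is a hypothetical per-student aggregate of VLE clicks.

```python
import pandas as pd

# Invented rows for illustration; not real OULAD records.
students = pd.DataFrame({
    "code_module": ["AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "imd_band": ["0-10%", "0-10%", "90-100%", "90-100%",
                 "0-10%", "0-10%", "90-100%", "90-100%"],
    "gender": ["M", "F", "M", "F", "M", "F", "M", "F"],
    "final_result": ["Fail", "Withdrawn", "Pass", "Distinction",
                     "Pass", "Fail", "Pass", "Pass"],
    # Hypothetical aggregate: total VLE clicks per student.
    "total_clicks": [10, 0, 120, 300, 80, 20, 150, 90],
})

# Performance vs course offering vs IMD band:
# share of each outcome within every (module, IMD band) group.
outcome_share = pd.crosstab(
    index=[students["code_module"], students["imd_band"]],
    columns=students["final_result"],
    normalize="index",
)
print(outcome_share.round(2))

# Actions in VLE vs gender vs performance:
# mean clicks for each (gender, outcome) pair; empty pairs stay NaN.
mean_clicks = students.pivot_table(
    index="gender",
    columns="final_result",
    values="total_clicks",
    aggfunc="mean",
)
print(mean_clicks.round(1))
```

Either table can be passed to a plotting library of your choice, for example as a stacked bar chart for the outcome shares or a heatmap for the mean clicks.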
Prediction: This phase focuses on predicting students' final course scores. You are expected to engineer new features and to keep experimenting to improve the performance of your predictive model.
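As a starting point, a feature-engineering loop with a simple baseline might look like the sketch below. Everything in it is an assumption made for illustration. The click log is randomly generated, final_score is a synthetic target invented so the script runs, EARLY_CUTOFF_DAY is an arbitrary choice, and ordinary least squares is only a baseline to beat, not a recommended model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N_STUDENTS, N_EVENTS = 200, 4000

# Randomly generated click log (assumed columns: id_student, date, sum_click).
vle = pd.DataFrame({
    "id_student": rng.integers(0, N_STUDENTS, N_EVENTS),
    "date": rng.integers(0, 100, N_EVENTS),
    "sum_click": rng.integers(1, 20, N_EVENTS),
})

EARLY_CUTOFF_DAY = 14  # arbitrary: activity in the first two weeks

# Engineered features: one row per student.
features = vle.groupby("id_student").agg(
    total_clicks=("sum_click", "sum"),
    active_days=("date", "nunique"),
)
early = vle[vle["date"] <= EARLY_CUTOFF_DAY].groupby("id_student")["sum_click"].sum()
features["early_clicks"] = early.reindex(features.index, fill_value=0)

# Synthetic target so the example runs; in the project the target
# would be derived from the assessment tables instead.
noise = rng.normal(0, 5, len(features))
features["final_score"] = (20 + 0.2 * features["total_clicks"] + noise).clip(0, 100)

# Simple hold-out split: first 150 students train, the rest test.
train, test = features.iloc[:150], features.iloc[150:]
cols = ["total_clicks", "active_days", "early_clicks"]

def design(df):
    """Feature matrix with an intercept column."""
    return np.column_stack([np.ones(len(df)), df[cols].to_numpy(dtype=float)])

# Ordinary least squares fit on the training students.
coef, *_ = np.linalg.lstsq(design(train), train["final_score"].to_numpy(), rcond=None)

pred = design(test) @ coef
y_test = test["final_score"].to_numpy()
mae_model = np.abs(pred - y_test).mean()
# Reference point: always predict the training mean.
mae_baseline = np.abs(train["final_score"].mean() - y_test).mean()

print(f"model MAE: {mae_model:.2f}, mean-baseline MAE: {mae_baseline:.2f}")
```

Comparing each new feature set or model against the same mean baseline and the same held-out students makes it easier to tell whether a change actually improves the prediction.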
Open-ended question: Here you are expected to formulate a question of your own about the dataset, address it with suitable methods and analytics, and answer it.
Some work done previously for this project: https://sites.google.com/view/etiitb-610-2021/course-project