Data Collection and Predictive Modeling 

Collecting high-quality, reliable data is one of the foundations of analytics. This section discusses data that any secondary school educator can collect for applications in predictive analytics. Students are complex, so even with high-quality data, predictive analytics should only inform decision-making and never be the sole decision maker. With sufficient data collection and triangulation, a reliable picture of an individual student can be built. Some of the following data sources are not new and have been used for decades, but with the proliferation of modern technology in schools, many more sources are available and can be analyzed more easily. The focus here is on data collection in physical schools rather than on e-Learning platforms and Learning Management Systems, since those platforms often include analytics features with tutorials provided.

Data Sources

Attendance and Punctuality Records

Frequent tardiness or absences can result from behaviour or wellbeing issues, family reasons, or extra-curricular activities such as competitions outside of school hours. Data can be collected every day or every lesson to inform teachers and administrators of any further action a student may need.
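As a rough illustration of how daily or per-lesson records could be turned into a flag for follow-up, the sketch below computes absence and lateness rates per student. The record layout, the student IDs, and the thresholds (25% absences, 50% late arrivals) are all assumptions made for this example, not recommended values; a school would set its own.

```python
from collections import defaultdict

# Hypothetical attendance log: (student_id, date, status).
# status is one of "present", "late", or "absent". All values are invented.
records = [
    ("S001", "2024-03-04", "present"),
    ("S001", "2024-03-05", "late"),
    ("S001", "2024-03-06", "absent"),
    ("S001", "2024-03-07", "absent"),
    ("S002", "2024-03-04", "present"),
    ("S002", "2024-03-05", "present"),
    ("S002", "2024-03-06", "present"),
    ("S002", "2024-03-07", "late"),
]

def attendance_summary(records):
    """Return the absence rate and late rate for each student."""
    counts = defaultdict(lambda: {"present": 0, "late": 0, "absent": 0})
    for student_id, _date, status in records:
        counts[student_id][status] += 1
    summary = {}
    for student_id, c in counts.items():
        total = c["present"] + c["late"] + c["absent"]
        summary[student_id] = {
            "absence_rate": c["absent"] / total,
            "late_rate": c["late"] / total,
        }
    return summary

def flag_students(summary, max_absence_rate=0.25, max_late_rate=0.5):
    """List students whose rates exceed the (assumed) thresholds."""
    return sorted(
        sid for sid, rates in summary.items()
        if rates["absence_rate"] > max_absence_rate
        or rates["late_rate"] > max_late_rate
    )

summary = attendance_summary(records)
flagged = flag_students(summary)
print(flagged)
```

A flag like this only marks a student for a conversation; it says nothing about the cause, which may be as benign as an outside competition.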

Academic Performance

Internal Data

Overall scores on assignments can indicate a student's level of understanding and engagement. Scores can be broken down by assignment type (e.g. MCQ, independent research, essay, presentation) to identify a student's strengths and weaknesses and to predict future scores on similar assignments. Trends and patterns can be found by comparing a student's data against their own historical data, peers' data in the same year, and historical data for the grade.
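One way to make the comparison by assignment type concrete is to compute each student's average per type and subtract the cohort average for that type. The sketch below does this with the standard library only; the gradebook rows, student IDs, and scores are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical gradebook rows: (student_id, assignment_type, score out of 100).
scores = [
    ("S001", "MCQ", 85), ("S001", "MCQ", 90),
    ("S001", "essay", 55), ("S001", "essay", 60),
    ("S002", "MCQ", 70), ("S002", "MCQ", 75),
    ("S002", "essay", 80), ("S002", "essay", 85),
]

def mean_by_type(rows):
    """Average score for each assignment type in the given rows."""
    grouped = defaultdict(list)
    for _student_id, assignment_type, score in rows:
        grouped[assignment_type].append(score)
    return {atype: mean(values) for atype, values in grouped.items()}

def student_profile(rows, student_id):
    """Difference between a student's mean and the cohort mean, per type.

    Positive values suggest a relative strength, negative a relative weakness.
    """
    cohort = mean_by_type(rows)
    own = mean_by_type([r for r in rows if r[0] == student_id])
    return {atype: own[atype] - cohort[atype] for atype in own}

profile = student_profile(scores, "S001")
print(profile)
```

With only a handful of assignments per type the differences are noisy, so a profile like this is better read as a prompt for closer attention than as a firm conclusion.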

External Data

In many schools around the world, curricula with standardized testing form part of graduation or university admission requirements. The scores of widely used standardized tests are a valuable source of data because a large number of students from various cultural, educational, and sociopolitical backgrounds write them. The data can be used to predict or suggest future academic pathways, university admission rates, or graduation rates.

Examples of standardized tests with publicly available data include: 

Extra-curricular/Co-curricular Activities (ECA/CCA)

Participation in these activities shows engagement and involvement in the school community. A decrease in attendance could indicate wellbeing or time-management issues.

Surveys & Questionnaires

Answers to regular surveys or questionnaires can be quantified into a score that flags potential concerns.
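A minimal sketch of such quantification, assuming a five-point agreement scale: each answer is mapped to points, negatively worded items are reverse-scored, and the average is compared with a cut-off. The survey items, the point mapping, and the cut-off of 2.5 are hypothetical and chosen only for the example.

```python
# Hypothetical mapping from agreement-scale answers to points.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def score_survey(responses, reverse_items=frozenset()):
    """Average points across items; higher means fewer concerns.

    Items listed in reverse_items are negatively worded, so their points
    are flipped (5 becomes 1, 4 becomes 2, and so on).
    """
    total = 0
    for item, answer in responses.items():
        points = LIKERT[answer.lower()]
        if item in reverse_items:
            points = 6 - points
        total += points
    return total / len(responses)

# Invented answers from one student.
responses = {
    "I enjoy coming to school": "disagree",
    "I feel supported by my teachers": "neutral",
    "I often feel overwhelmed": "strongly agree",
}

score = score_survey(responses, reverse_items={"I often feel overwhelmed"})
needs_follow_up = score < 2.5  # assumed cut-off, not a validated threshold
print(score, needs_follow_up)
```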

Teacher Observations

Although their observations are not necessarily quantifiable, teachers have the most contact time with students and can provide many facets of data about them.

Predictive Modeling & AI

Data modeling techniques are highly technical and beyond the scope of most secondary school educators' work. If you would like to read about them, the papers below are a good place to start.

Predictive Modeling in Teaching and Learning

Recent advances in Predictive Learning Analytics: A decade systematic review (2012–2022)

A Review on Predictive Modeling Technique for Student Academic Performance Monitoring  


Artificial Intelligence

Predictive Analytics

Artificial Intelligence (AI) is a disruptor in technology, and we are only just beginning to harness its full potential. Predictive Analytics uses mathematical models built on historical data to predict future events. When paired with AI, Predictive Analytics can become more powerful because AI is able to process more data, find hidden patterns and correlations, and predict events with higher accuracy. AI and Machine Learning (ML) algorithms can automate Predictive Analytics completely or in part. However, as with all AI algorithms, data quality must be ensured to prevent biases and discrimination in predictions.
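To illustrate what "predicting future events from historical data" means at its simplest, the sketch below fits a straight line to one student's past term scores by ordinary least squares and extrapolates one term ahead. This is a deliberately basic stand-in for the far richer models used in practice; the term numbers and scores are invented.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical historical data: average score in each of four past terms.
terms = [1, 2, 3, 4]
term_scores = [62, 66, 70, 74]

slope, intercept = fit_line(terms, term_scores)
predicted_term_5 = slope * 5 + intercept
print(round(predicted_term_5, 1))
```

A linear trend over four points ignores everything else known about the student, which is one reason a prediction like this should inform a decision rather than make it.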