An exhaustive list of potential data sources would be impossible, but this page has compiled a number of resources to help you get started. You should explore these sources to determine (a) what it takes to obtain access to their data, and (b) what kind of data they provide.
Please note that some of these sources have considerable gatekeeping procedures for obtaining this data. Please plan accordingly.
Here are some commercial online learning systems that are relatively well known in the US. They might give you an idea about the kinds of environments that are possible.
ABCmouse: ABCmouse is an online learning platform that provides educational resources and activities for K-12 students.
Coursera: Coursera is an online learning platform that offers access to a variety of courses from top universities and institutions.
DreamBox: DreamBox is an adaptive learning platform for K-12 mathematics.
edX: edX is a non-profit online learning platform that provides access to courses from top universities and institutions.
FutureLearn: FutureLearn is an online learning platform that provides access to courses from top universities and institutions in the UK and around the world.
Khan Academy: Khan Academy is a non-profit online learning platform that provides access to free educational resources, including video lessons and practice exercises.
Knewton: Knewton is an adaptive learning platform that provides personalized instruction to K-12 students. Knewton provides publicly available data on student performance and progress, which can be used to study student learning outcomes, course design, and more.
Newsela: Newsela is an online learning platform that provides access to news articles and other content at various reading levels for K-12 students.
NovoEd: NovoEd is an online learning platform that provides access to courses from top universities and institutions.
Open edX: Open edX is an open-source online learning platform that provides access to courses from top universities and institutions.
OpenLearning: OpenLearning is an online learning platform that provides access to courses from top universities and institutions.
OpenSesame: OpenSesame is an online learning platform that provides access to a variety of courses and training resources.
Skillshare: Skillshare is an online learning platform that provides access to courses and tutorials on creative and design-related topics.
ST Math: ST Math is a visual math learning platform for K-12 students.
Udacity: Udacity is an online learning platform that provides access to courses on technology and business-related topics.
Udemy: Udemy is an online learning platform that provides access to a variety of courses on a wide range of topics.
BROMPository: your source for BROMP observational data on student engagement.
Data.gov: Data.gov is a site run by the US government that provides access to a vast collection of open data from various US federal agencies.
DataShop: DataShop is a repository of educational data that is maintained by Carnegie Mellon University's Institute for Software Research. It contains data from a variety of educational technology and learning environments.
ELSi from IES/NCES: https://nces.ed.gov/ccd/elsi/
European Union Open Data Portal: The European Union provides access to a wide range of data on topics ranging from agriculture to energy and transportation.
European Union Open Data Portal: The European Union provides access to data on education, including enrollment, graduation rates, and the quality of education systems in different countries.
Eurostat: An official data page for the European Union: https://ec.europa.eu/eurostat/web/main/data/statistical-themes (includes a survey on adult education: https://ec.europa.eu/eurostat/web/microdata/adult-education-survey )
Harvard Dataverse: The Harvard Dataverse is a collection of open research data from a wide range of disciplines, including education.
IDEA US DOE: https://sites.ed.gov/idea/data/
IEEE e-Learning Data Repository: https://innovate.ieee.org/whats-new/?f_widget=1&f_cats=&f_tags=125
Kaggle Datasets: Kaggle is a popular platform for data science and machine learning competitions. They also have a large repository of free datasets that can be used for educational purposes.
National Center for Education Statistics (NCES): NCES is the primary federal entity for collecting and analyzing data related to education in the United States. It provides access to data on topics such as enrollment, graduation rates, and student achievement.
NationMaster: Stats from a variety of nations: https://www.nationmaster.com/
NSF's SESTAT Project: https://www.nsf.gov/statistics/sestat/
NYC Open Data: This data source is also not specific to education, so you will need to pick a data set that applies to this course. https://opendata.cityofnewyork.us/
Philadelphia School District Open Data Project: https://www.philasd.org/performance/programsservices/open-data/
Pittsburgh Science of Learning Center: The PSLC is a repository of educational data that is focused on the science of learning.
Programme for International Student Assessment (PISA): PISA is a large-scale assessment of student achievement that is administered by the Organisation for Economic Co-operation and Development (OECD). PISA assesses the knowledge and skills of 15-year-old students in reading, mathematics, and science.
Progress in International Reading Literacy Study (PIRLS): PIRLS is a large-scale assessment of student achievement that is administered by the IEA. PIRLS assesses the reading literacy of 4th grade students.
The Common Core of Data (CCD): CCD is a program of the NCES that collects data on all public elementary and secondary schools in the United States. The data includes information on student enrollment, teacher characteristics, and financial data.
The Education Longitudinal Study of 2002 (ELS:2002): ELS:2002 is a large-scale, nationally representative study of students who were in the 10th grade in 2002. The study collected data on students' academic, social, and economic experiences, including their high school experiences and postsecondary education.
The Integrated Postsecondary Education Data System (IPEDS): IPEDS is a system of interrelated surveys conducted annually by the National Center for Education Statistics (NCES) to collect data from all accredited postsecondary institutions in the United States.
The National Assessment of Educational Progress (NAEP): NAEP is a large-scale assessment of student achievement in the United States that provides data on student performance in subjects such as reading, mathematics, and science.
The National Household Education Surveys Program (NHES): NHES is a program of the NCES that collects data on the educational experiences of the U.S. population, including data on early childhood education, school enrollment, and educational attainment.
The National Student Clearinghouse Research Center: The National Student Clearinghouse Research Center provides access to a wealth of data on college enrollment and completion, including data on transfer rates, persistence, and graduation rates.
The Stanford Education Data Archive (SEDA): SEDA is a repository of educational data that is maintained by Stanford University's Graduate School of Education. It contains data from a wide range of sources, including large-scale assessments, educational surveys, and administrative data.
This is Statistics--Competition by the ASA: https://thisisstatistics.org/resource-roundup-after-the-bell/
Trends in International Mathematics and Science Study (TIMSS): TIMSS is a large-scale assessment of student achievement that is administered by the International Association for the Evaluation of Educational Achievement (IEA). TIMSS assesses the knowledge and skills of 4th and 8th grade students in mathematics and science.
UCI Machine Learning Repository: The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that can be used for educational purposes.
UNESCO Institute for Statistics: UNESCO provides access to data on education, including enrollment, literacy rates, and other indicators of educational attainment.
United Nations Data: The United Nations provides access to a wide range of data on global economic, social, and environmental trends.
University of California Irvine Machine Learning Repository: UCI's MLR is not specific to education, so you will need to pick a data set that applies to this course. https://archive.ics.uci.edu/ml/datasets.php
US DOE Data Express: https://eddataexpress.ed.gov/
US DOE's English Language Learners: https://www2.ed.gov/datastory/el-characteristics/index.html
US Gov Stats: https://www.usa.gov/statistics
World Bank Data: The World Bank provides access to data on education, including enrollment rates, education spending, and the quality of education systems.
Worldometer: World Level Stats: https://www.worldometers.info/ar/
Zenodo: This data source is not specific to education, so you will need to pick a data set that applies to this course: https://zenodo.org/
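Once you have downloaded a file from one of the sources above, a quick structural check can help answer question (b), what kind of data the source provides. The following is a minimal sketch, assuming the download is a CSV file with a header row. The function name `summarize_csv` and the sample columns (`student_id`, `problem_id`, `correct`) are hypothetical and do not come from any particular source on this page.

```python
import csv
import io


def summarize_csv(text_stream):
    """Return (row_count, {column: number_of_blank_cells}) for a CSV stream."""
    reader = csv.DictReader(text_stream)
    columns = reader.fieldnames or []
    blanks = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Count cells that are missing entirely or contain only whitespace
            if value is None or value.strip() == "":
                blanks[name] += 1
    return row_count, blanks


# Hypothetical sample standing in for a downloaded file
sample = io.StringIO(
    "student_id,problem_id,correct\n"
    "s1,p1,1\n"
    "s1,p2,\n"
    "s2,p1,0\n"
)
rows, blanks = summarize_csv(sample)
print(rows, blanks)
```

For a real file, you could pass `open("your_file.csv", newline="", encoding="utf-8")` instead of the in-memory sample. The path and encoding here are placeholders that depend on the source you choose.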
Here are some of the online learning systems that have appeared in the literature. This list might help you with search terms when you are trying to pick a learning environment to research.
I-MAESTRO
4C/APS
ABSTRACT SPIRIT
Abu Naser
AC-ware Tutor
Acharya
ACLS
ActiveMath
ActiveMath19
ActiveStats
Adaptive Learning Application (ALA)
ADAPTS
Adil (Automated Debugger in Learning System)
ADIS (Animated Data Structure Intelligent Tutoring System)
AeLF
ATS (Affective Tutoring Systems)
AHA!
AI-VT (AI-Virtual Trainer)
AIS-IFT
ALEKS
ALERT
Alge Brain
AlgebraLand
Alphie's Alley
AMDPC (Adaptation with Multidimensional Customization Criteria)
AmritaITS
ANATOM-Tutor
Andes
AnimalWatch
Aplusix
APOSDLE
APROPOS2
AquaLab
ARIES
ART (Affable Reading Tutor)
AJI-Tutor (Artificial Javanese Intelligent Tutor)
ASSISTments
ASTUS
ATEC
AttentiveReview
AutoTutor
AutoTutor Lite
AutoTutor-ARC (Adult Reading Comprehension)
AWE (AnimalWatch Web-based Environment; see also AnimalWatch)
BeSocratic
Betty's Brain
BGuILE
BioWorld
BITS (Bayesian Intelligent Tutoring System)
Blackboard
BRCA Gist (see also AutoTutor Lite)
BUGGY
C-Tutor
CABRI-Géomètre
Calcularis
Canvas
CASSET
CCAITS (Classical Cartography Algorithms ITS)
ChemTutor
CHENE (CHaîne ENErgétique = "Energy Chain")
Autotutor-China** (A Chinese dialogue-based mathematical ITS using the AutoTutor. **No name actually given)
CIMEL-ITS
CIRCSIM-Tutor
Civil Engineering Tutor
CLARE (Collaborative Learning & Research Environment)
ClassMATE
ClassMood
Cognitive Constructor
Cognitive Tutor
COLLECT-UML
COMET
Conceptual Helper
Corrosion Investigator
Course Insights (an LAD that links across several sources)
CPR Tutor
CREEK-Tutor
Crystal Island
CSAL AutoTutor
CSAVE (Computer Science Accessible Virtual Education)
CueThink
CUMPAPH
DB-ITS
DCG (Dynamic Courseware Generation)
DEBUGGY & iDebuggy
DeepTutor
DEPTHS
Design Pattern
DesignFirst-ITS
Digital Tutor
DM-Tutor (Decision-Making Tutor)
Dr. Proctor
Dragoon
Duolingo
EAGLE (Electronic Assistant for Game-Based Learning Experiences)
EASEx (Embodied Agents to Scaffold Education, x = any data-rich domain, e.g., physics, medicine, psychology, astronomy)
EcoMUVE
EDUCE
EduPal
EER-Tutor
EINO
eKidsPower
ELAi
ELIE (Enriched Learning & Information Environment)
ELM-ART (ELM = Episodic Learner Model)
Enlearn
eRater
ESOP
ET (for Italian language)
Example Analogy (EA)-Coach
ExPLoRAA
FA-Tutor (Financial Analysis Tutor)
FACT (Formative Assessment using Computational Technology)
FAVL, the Feedback Articulate Virtual Laboratory
FearNot
FieldDayLabs: (A suite of educational games developed at the University of Wisconsin)
FITS (Fraction Intelligent Tutoring System)
FUDAOWANG
FunctionLab
GASP-Tutor
GATutor
Gaze Tutor
GEO
GeoMe
Geometry Tutor
GIL (Graphical Instruction in LISP)
Grace
GroupMe
GRUNDY
GUIDON
HAL??????
Harmony Coach
Haskell-Tutor
Heráclito
HYDRIVE (HYDRaulics Interactive Video Experience)
HyperTutor
Ibigkas!
ICICLE
iCollab
ILESA
iList
ILONA
Imagine Learning
IMITS
IMITS (Interactive Multimedia Intelligent Tutoring System)
Ines
Infinite Campus
Inq-ITS
INQPRO
Integration-Kid
Intelligent Essay Assessor
Intelligent Flight Trainer (IFT)
Intelligent Tutoring/Expert System
Intellipath
InterBook
INTUITEL
INTUITION (INtelligent TUITION)
iReady
ISAC
iSTART (Interactive Strategy Trainer for Active Reading and Thinking)
ITADS
ITCDD
ITEAMS (An Intelligent Teaching Environment with Assessment Modules for Self-Study)
ITELS
ITSB
ITSCSBC
ItsLeader
IVRT
Java Sensei
JavaTutor
jGRASP
JITS (Java Intelligent Tutoring System)
KAS
KERMIT (Knowledge-Based Entity Relationship Modelling Intelligent Tutoring)
Korbit
L2Code
LARGO (Legal ARgument Graph Observer)
LCS (Learning Companion System)
LearnSmart
LeCo-EAD
Lemonade (Minnesota Educational Computing Consortium)
LES (Learning EcoSystem)
LICE
LILA (Learn Indian Language through Artificial Intelligence)
Lindquist
LISP tutor
LISPITS
LispTutor
LITES
Logic-Muse
LogicTutor
Loop Tutor
LUV (Lernen aus Unterrichtsvideos–learning from classroom videos)
MACiCAI
Management Tutor
MANIC
MathGirls
Mathia
Mathics
MaTHiSiS
MathSpring (see also Wayang Outpost)
MathWeb
Mathwise
Maya
MBITS (Multicriteria Bayesian Intelligent Tutoring System)
MENO Tutor and MENO-II
Metacognitive Mathematics
MetaDoc
MetaTutor
MeuTutor
MFD (Mixed numbers, Fractions, and Decimals)
MILE
Minecraft
MITS (Mixed Initiative Intelligent Tutoring System for Sudoku)
MITT (Microcomputer Intelligence for Technical Training)
ML-Tutor (Machine Learning Tutor)
Mode Monsakun
Moodle
Moodoo
MTS (Metacognitive Training System)
Music Matters
MyAccess
MYCIN
MyST (My Science Tutor)
NALS (Negotiation-based Adaptive Learning System)
NetCoach
NetsBlox
Newton’s Tablet
Nihongo Tutorial System
NLtoFOL SIP
NORMIT (Normalisation Intelligent Tutor)
Observational Learning (Show Me)
OmniSense
OMRaaT (One Mathematical Relationship at a Time)
OODLE: (Object-Oriented Design Learning Environment)
OOPS (Object Oriented Programming exercises)
OOST
OPERA
Operation ARA
OperationARIES (see also ARIES)
PACT (Parent and Child Tutor)
PACT Algebra
PALD (Personalized Adaptive Learning Dashboard)
Parent-EMBRACE (Enhanced Moved By Reading to Accelerate Comprehension in English)
Pascal Expert Tutorial System (aka, PETS)
PAT: Practical Algebra Tutor
PAT2Math (Brazilian)
PEDRO
PeerWise
PEGASE
PerSketchTivity
Persuasion
Invasion
Physics Playground
Piazza
PID-ITS
PIXIE
PLE (Programmed Learning Environment)
POSIT
PrepU
Problem Solving Tutor
PROJECTTUTOR
PROPA
ProPI
PROUST (Program Understanding for Students)
Prutor
PUSH
PUZZLED
Pyrenees
QED-Tutrix (QEDX)
QIS D2
QUIZ
RecursiveTutor
Reflection Assistant
RiPPLE
RMT (Research Methods Tutor)
SOPHIE (A SOPHisticated Instructional Environment)
SAMPLE: An intelligent educational system for electrical circuits
ScED-ASL (ScEd Adaptive Learning System)
SCHOLAR
SchoolKit
SCoT-DC
Sherlock
SHiB CBLE
Shufti
SIAL
SIAS
SICUN
Skills-based Talent Ecosystem for
Slide Tutor
Smart Sparrow
SmartTutor
Smithtown
Snooper Troops (Spinnaker) Factory (Sunburst)
Socratic Tutor ITS
Solar Punk (in development at Ateneo de Manila University)
SONATA
SOPHIE
SPADE
SPIRIT
SQL-PITS
SQL-Tutor (SQL (Structured Query Language) Tutor)
SQLT-Web (SQL-Tutor on the Web)
STATPLAY
STCEQ
Steamer
Story Station
StoryStation
Student Explorer
SuaCode
SuperTangrams
SYPROS for SYnchronization of parallel PROcesses
T-SKIRT
TACKLE (Teaching Algorithmics with a Computer using the Karel Language Environment)
TalkMoves
TANGOW
Tarski’s World
TARTA: Teacher Activity Recognizer from Transcriptions & Audio
Task Tutor Toolkit™
TeLoDe (Teaching Linear Ordinary Differential Equations)
TERENCE
TEx-Sys (Tutor-Expert System)
The CaBLE Tutor (Case-Based learning-by-doing environment)
The Conceptual Helper
The Fuel Cell Tutor
The Invest Program
The Macsyma Advisor
The MetaHistoReasoning Tool
Turing
TutorJ
Upskilling (STEP UP)
US: Understanding Signs
VC Prolog Tutor
VHLQ
Vitrine 2001
VLab (Virtual Chemistry Laboratory)
VLE
vMedic
VocaTest
VR-ENGAGE
Water Runoff Challenge
Wayang Outpost (now known as MathSpring)
Web F-SMILE
Web-PVT
WEST
Why/AutoTutor (see also AutoTutor)
WISEngineering
WITS
WITS (Whole-Course Intelligent Tutoring System)
WoPST
WORDMATH
Wumpus
WUSOR
WUSOR-II
XAIDA
Yixue Squirrel AI
ZOSMAT
While you should peruse the other lists on this page before you get started, two search engines may be particularly useful if you want to use real data.
You may also want to check out some of the data repositories that have been set up by members of the learning analytics community. These include:
ASSISTments, hosted by WPI
DataShop, hosted by CMU's Pittsburgh Science of Learning Center (PSLC)
MORF, hosted by the University of Pennsylvania's Penn Center for Learning Analytics (PCLA)
FieldDay's Open Game Data, hosted by the University of Wisconsin
The Open University's Learning Analytics Data Set (OULAD)
This is not an exhaustive list!
ALEKS
ASSISTments
Carnegie Learning (Mathia & Mathspring)
Code.org
CodeCombat
Duolingo
Edmentum (Study Island & Exact Path)
IXL Learning
Kahoot
Lexia Learning
Minecraft
Prodigy
Quizlet
Smart Sparrow
Thinkster Math
TypingClub