Datasets
This page is no longer maintained. We have moved the maintenance of the data sets to here, but we are leaving this page live because others may still be linking to it.
Professor Heffernan asks his graduate students to release their data and code to this website every time they publish a paper, as long as the data is 100% anonymized. He believes that makes for good science. We do have data sets that include student open-response data, in which a student could have disclosed personally identifiable information (PII), like this paper. For such a paper you can ask for a legally binding contract that requires you to do a few things, such as not attempting to look for PII and, if you find any, telling us about it and deleting it. All papers after 2012 should have a publicly available link to the data; if one does not, email all the authors, and Professor Heffernan, asking for the data. He is glad that researchers he does not know are starting to use these data sets and write their own papers. If you are writing a paper, or know of a paper, that uses ASSISTments data, please email the citation to etrials@assistments.org.
With all this concern about student privacy, Professor Heffernan liked this quote from Annie Murphy Paul, "But there’s another potential danger regarding student data that is much less frequently noted: the possibility that it will sit unused, inaccessible to parents, educators, the general public—and students themselves. "
ASSISTments seeks to share the data from its system. Of course, we scrub the data of all personally identifiable information to the best of our ability. Sometimes people release data that they think is scrubbed but are wrong (see this), so the terms of use of this data set are that if you find personally identifiable information, you must report it and delete it, and you cannot use that information in any way. For example, if a student typed in "I am Barack Obama and I live at 1600 Penn Ave in DC", that would be personally identifiable information that we did not catch. Please review the WPI Institutional Review Board approved terms of use that govern the use of the data.
If you want to hear more on open science, read this.
Click here to read about published work on our commonly used data sets.
How do you read log files from ASSISTments? Since many different projects use the same variables, here is the one place where we try to document the meaning of the different variables.
This data is released by Professor Neil Heffernan of WPI.
N. T. Heffernan, 2014.