Tutorial: Using DataShop to analyze educational data

(John Stamper)

The main goal of this tutorial is to introduce participants to the features of the Pittsburgh Science of Learning Center DataShop. DataShop is an open data repository with an associated set of visualization and analysis tools. It holds data from hundreds of thousands of students, derived from their interactions with online course materials and intelligent tutoring systems. The data is fine-grained, with student actions recorded roughly every 20 seconds, and longitudinal, spanning semester-long or yearlong courses. DataShop currently stores over 200 datasets, which together include over 42 million student actions. Most student actions are “coded,” meaning they are not only graded as correct or incorrect but also categorized in terms of the hypothesized competencies or knowledge components needed to perform that action. The DataShop hardware infrastructure will be used to support file downloads and uploads during the competition. We will give an in-depth overview of the application and show how DataShop has been used to advance the field of EDM. We will also give participants the information they need to add their own data to DataShop so that they can use its tools to further their own research.