2011 ASSISTments PFA Data

This data set, available at https://drive.google.com/file/d/0BxSUIEtxjoanTUdCZ3pjRkhkN28/edit?usp=sharing, is used in:

Gong, Y., Beck, J. E. & Heffernan, N. T. (2011) How to Construct More Accurate Student Models: Comparing and Optimizing Knowledge Tracing and Performance Factor Analysis. International Journal of Artificial Intelligence in Education 21(1-2): 27-46

The data set at https://docs.google.com/file/d/0BxSUIEtxjoana2h4TWJIeU1ySVE/edit contains the same data, except that each multi-skill question is split into multiple single-skill questions.

The zip file contains a training data set, a test data set, and an Excel file of students' pre-test and post-test scores.

  • skill_id (column A) - skill id

  • user_id (column B) - student id

  • correct (column C) - the correctness of the response (1 is correct, 0 is incorrect)

  • problem_log_id (column D) - the log id as it appears in the database. A smaller id means the problem was answered earlier in time.

  • problem_id (column E) - problem id

  • 1_s (column F) - the number of prior successes for skill 1

  • 1_f (column G) - the number of prior failures for skill 1
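
The prior success and failure counts described above can be illustrated with a minimal Python sketch. It assumes the counts are accumulated per student and per skill, in increasing problem_log_id order, and that each count excludes the current response. The source does not state how the columns were actually generated, so this is an illustration only. The function name prior_counts and the sample rows are hypothetical and are not taken from the data set.

```python
from collections import defaultdict


def prior_counts(rows):
    """Return one record per response with the prior success and failure
    counts for that student and skill (assumed definition, see lead-in)."""
    successes = defaultdict(int)  # (user_id, skill_id) -> successes so far
    failures = defaultdict(int)   # (user_id, skill_id) -> failures so far
    out = []
    # A smaller problem_log_id means the response happened earlier in time.
    for row in sorted(rows, key=lambda r: int(r["problem_log_id"])):
        key = (row["user_id"], row["skill_id"])
        out.append({
            "problem_log_id": int(row["problem_log_id"]),
            "user_id": row["user_id"],
            "skill_id": row["skill_id"],
            "prior_s": successes[key],
            "prior_f": failures[key],
        })
        # Update the counters only after recording the prior values.
        if int(row["correct"]) == 1:
            successes[key] += 1
        else:
            failures[key] += 1
    return out


# Hypothetical rows using the column names listed above.
rows = [
    {"skill_id": "1", "user_id": "u1", "correct": "1", "problem_log_id": "10"},
    {"skill_id": "1", "user_id": "u2", "correct": "0", "problem_log_id": "11"},
    {"skill_id": "1", "user_id": "u1", "correct": "0", "problem_log_id": "12"},
    {"skill_id": "1", "user_id": "u1", "correct": "1", "problem_log_id": "15"},
]

for record in prior_counts(rows):
    print(record)
```

Rows are passed as dictionaries of strings, the shape produced by csv.DictReader, so the same function could be applied to a CSV export of the training or test file if the header names match the column names listed here.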