This project develops a new framework for diagnostic instruments that evaluate student knowledge, and uses data from such an instrument to construct an aggregate network model of that knowledge.
One key challenge in Intelligent Tutoring Systems (ITS) and Adaptive Learning Systems (ALS) lies in evaluating student learning when multiple interconnected skills or pieces of knowledge are involved. Additionally, traditional systems only record "correct" vs "wrong" outcomes and do not capture misconceptions, despite misconceptions being a key focus of pedagogical research.
Building on traditional Item Response Theory (IRT), we propose a model that instead assumes each option on a multiple-choice question (MCQ) suggests the presence of one or more constructs, which in this context may be pieces of knowledge or misconceptions. This differs from traditional IRT, where a single latent ability scale is assumed across the entire test, each item is summarised by an Item Characteristic Curve, and responses are scored only as correct or incorrect. In essence, this framework treats each MCQ option as its own separate true/false question, and considers several independent difficulty scales, one for each skill or piece of knowledge.
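A minimal sketch of this option-level scoring idea is shown below. The question labels, construct names, and option-to-construct mapping are purely illustrative placeholders, not taken from the actual instrument, and the 2PL-style response function is one plausible choice rather than the study's confirmed model.

```python
import math

# Hypothetical option-to-construct map: each MCQ option is treated as its own
# true/false item, keyed to one or more constructs (knowledge pieces or
# misconceptions). Names here are illustrative only.
OPTION_CONSTRUCTS = {
    ("Q1", "A"): ["newton_first_law"],            # correct idea
    ("Q1", "B"): ["impetus_misconception"],       # common misconception
    ("Q1", "C"): ["force_equals_motion"],         # another misconception
    ("Q2", "A"): ["newton_third_law"],
    ("Q2", "B"): ["newton_third_law", "mass_weight_confusion"],
}

def option_response_prob(theta: float, difficulty: float,
                         discrimination: float = 1.0) -> float:
    """2PL-style probability that a student with ability `theta` on a given
    construct endorses an option tied to that construct. Each construct has
    its own independent ability/difficulty scale, rather than one test-wide
    scale as in classical IRT."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def score_response(question: str, chosen: str) -> dict:
    """Convert a single MCQ selection into binary indicators: 1 for constructs
    suggested by the chosen option, 0 for constructs tied to the other options."""
    indicators = {}
    for (q, opt), constructs in OPTION_CONSTRUCTS.items():
        if q != question:
            continue
        for c in constructs:
            indicators[c] = max(indicators.get(c, 0), int(opt == chosen))
    return indicators

# Example: a student choosing option B on Q1 signals the impetus misconception.
print(score_response("Q1", "B"))
# {'newton_first_law': 0, 'impetus_misconception': 1, 'force_equals_motion': 0}
```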
To validate this approach, preliminary testing was conducted with 1171 students across 8 schools on the topic of "Forces and Dynamics" in Physics. We produced a network of weights between constructs using pairwise Cramér's V tests (rather than raw chi-square statistics, whose magnitude scales with sample size). The outcomes suggest the presence of the expected associations between these theoretical constructs, as shown in the sketch and diagram below. In the diagram, blue arrows show connections where a relationship above a chosen threshold was found, with thicker lines indicating a stronger relationship; thin grey arrows indicate relationships that we expected to observe but did not find in the data.
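The following is a minimal sketch of how such a network could be assembled, assuming student responses have already been scored into binary construct indicators as above. The column names, random data, and threshold value are placeholders for illustration, not the study's actual figures or cut-off.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Cramér's V for two categorical variables: the chi-square statistic
    normalised by sample size and table dimensions, so values are comparable
    across pairs with different numbers of responses."""
    table = pd.crosstab(x, y)
    chi2, _, _, _ = chi2_contingency(table, correction=False)
    n = table.values.sum()
    r, k = table.shape
    return float(np.sqrt((chi2 / n) / (min(r, k) - 1)))

# Placeholder data: one row per student, one 0/1 column per construct,
# as would be produced by the option-level scoring sketched earlier.
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.integers(0, 2, size=(200, 4)),
    columns=["newton_first_law", "newton_third_law",
             "impetus_misconception", "mass_weight_confusion"],
)

THRESHOLD = 0.1  # illustrative cut-off for drawing an edge in the network
edges = []
for a, b in combinations(data.columns, 2):
    v = cramers_v(data[a], data[b])
    if v >= THRESHOLD:
        edges.append((a, b, round(v, 3)))  # weighted edge between constructs

print(edges)
```

Edges that clear the threshold would correspond to the thick blue arrows in the diagram, with the Cramér's V value determining line weight.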