LBLS Dataset

A Quality Data Set for Data Challenge: Featuring 160 Students’ Learning Behaviors and Learning Strategies (LBLS) in a Programming Course.

Introduction to the datasets

Emerging science requires data collection to support the research and development of advanced methodologies. In the educational field, conceptual frameworks such as Learning Analytics (LA) and Intelligent Tutoring Systems (ITS) also require data. Prior studies demonstrated the value of educational data for tasks such as predicting at-risk students and uncovering learning strategies. However, no publicly available data set existed for benchmarking these experiments. To contribute to educational science and technology research and development, we conducted a programming course series two years ago and collected learning data from 160 students. The data set draws on two well-designed learning systems and includes measurements of two well-defined learning strategies: Self-regulated Learning (SRL) and Strategy Inventory for Language Learning (SILL). We summarize this data set as the Learning Behavior and Learning Strategies data set (LBLS-160), where 160 indicates the total number of students. Compared with the data used in prior studies, the LBLS data set focuses on students' book-reading behaviors, programming behaviors, and the measured results of their learning strategies.

Challenges

To facilitate learning analytics research and development, we identified three learning analytics applications that have emerged in recent years and that could potentially be addressed with LBLS-160:

  1. Educational data visualization: In recent years, educational data visualization has become increasingly popular as a way to help learners monitor and track their learning status. Researchers have summarized several meaningful research questions on this topic, for example: "Who are the learners?" and "What do they do while learning?" (Schwendimann et al., 2016).

  2. Learning strategies unveiling: This is also a relatively young topic, dating from around 2017 (Jovanović et al., 2017). Researchers demonstrated that learners' book-reading behaviors provide evidence of their SRL strategy (Akçapinar, Chen, Majumdar, Flanagan, & Ogata, 2020). Measuring learning strategies from logs instead of questionnaires could be more timely and more reliable. Unveiling the correlation between students' learning behaviors and learning strategies is therefore a reasonable research question for the proposed data set.

  3. Cross-class risk / outstanding prediction: Prior studies showed that learning logs are valuable for identifying at-risk students in the classroom (Conijn et al., 2016; Lu et al., 2016). However, the prediction models in those studies were not shown to generalize across classes. Benchmarking model performance on a single open data set has also not been considered.
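
For the cross-class scenario in challenge 3, one possible evaluation protocol is to hold out one class at a time, train on the remaining classes, and test on the held-out class. The sketch below illustrates only that splitting step. The record layout (class id, student id, feature list, label), the function name leave_one_class_out, and the toy records are all assumptions made for illustration; they do not reflect the actual LBLS-160 file format or contents.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Hypothetical record layout, NOT the real LBLS-160 schema:
# (class_id, student_id, behavior_features, label)
Record = Tuple[str, str, List[float], int]


def leave_one_class_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Yield (held_out_class, train, test), holding out one class at a time."""
    by_class: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        by_class[rec[0]].append(rec)

    for held_out in sorted(by_class):
        test = by_class[held_out]
        # Training data comes only from the other classes.
        train = [
            rec
            for class_id, class_records in by_class.items()
            if class_id != held_out
            for rec in class_records
        ]
        yield held_out, train, test


# Made-up toy records, used only to show the shape of the splits.
toy_records: List[Record] = [
    ("A", "s1", [12.0, 3.0], 0),
    ("A", "s2", [2.0, 1.0], 1),
    ("B", "s3", [9.0, 4.0], 0),
    ("B", "s4", [1.0, 0.0], 1),
    ("C", "s5", [7.0, 2.0], 0),
]

splits = list(leave_one_class_out(toy_records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Any model can then be fitted on each train split and scored on the matching test split, so that the reported performance reflects students from a class the model has never seen.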

Reference