Data Challenge

Predicting Performance Based on the Analysis of Reading Behavior

As the adoption of digital learning materials in modern education systems increases, the analysis of reading behavior and its effect on student performance is gaining attention. The main motivation of this workshop is to foster research into the analysis of students’ interaction with digital textbooks, and to find new ways in which it can be used to inform and provide meaningful feedback to stakeholders: teachers, students, and researchers. The previous years’ workshops at LAK19 and LAK20 focused on reading behavior in higher education, and LAK21 on secondary school reading behavior. As the COVID-19 pandemic has brought about sudden change in learning environments around the world, participants of this year’s workshop will be given the unique opportunity to analyze the changes from onsite classes in 2019 to online classes in 2020 at the same educational institution. As in previous years, additional information on lecture schedules and syllabus will also enable the analysis of learning context for further insights into the preview, in-class, and review reading strategies that learners employ. Participant contributions will be collected as evidence in a repository provided by the workshop and will be shared with the wider research community to promote the development of research into reading analysis systems.

We welcome submissions on topics including, but not limited to, the following:

  • Student performance/at-risk prediction

  • Student reading behavior self-regulation profiles spanning the entire course

  • Preview, in-class, and review reading patterns

  • Student engagement analysis and behavior change detection

  • Visualization methods to inform and provide meaningful feedback to stakeholders

Participants will be encouraged to share their results and insights from analyzing the provided data, or other research related to reading behavior analysis, by submitting a paper for presentation at the workshop.

Participants will also be encouraged to contribute their programs/source code created in the workshop to an ongoing open learning analytics tool development project for inclusion as an analysis feature.

Important Dates

  • Initial paper submission: 4th January, 2022 (previously 17th December, 2021)

  • Notification of acceptance: 14th January, 2022

  • Registration deadline: TBA

  • Data Challenge Final Results Submission deadline: TBA

  • Camera-Ready deadline: TBA


  • 22 March at LAK 2022 in cyberspace, online (a link will be sent to participants by email before the workshop).


All times are on March 22nd in the PDT time zone (March 23rd JST times in brackets).

15:00 - 15:15 (JST 7:00 - 7:15) - Opening (Brendan Flanagan, Fumiya Okubo)

Session 1 (Data Challenge)

  • 15:15 - 15:35 (JST 7:15 - 7:35) - Exploring the use of probabilistic latent representations to encode the students' reading characteristics,
    Erwin Daniel Lopez Zapata, Tsubasa Minematsu, Yuta Taniguchi, Fumiya Okubo and Atsushi Shimada (FULL) PDF

  • 15:35 - 15:55 (JST 7:35 - 7:55) - Predicting student performance based on Lecture Materials data using Neural Network Models,
    Sukrit Leelaluk, Tsubasa Minematsu, Yuta Taniguchi, Fumiya Okubo and Atsushi Shimada (SHORT) PDF

  • 15:55 - 16:15 (JST 7:55 - 8:15) - A Preliminary Study on Student Classroom Reading Vs Digital Reading Pattern Behavior Analysis during Pandemic,
    Divanshi Wangoo and S.R.N Reddy (FULL) PDF


Session 2 (Behavior Analysis and Recommendation)

  • 16:30 - 16:50 (JST 8:30 - 8:50) - An analysis of reading process based on real-time eye-tracking data with web-camera: Focus on English reading at higher education level,
    Xiu Guan, Chaojing Lei, Yingfen Huang, Yu Chen, Hanyue Du, Shuowen Zhang and Xiang Feng (FULL) PDF

  • 16:50 - 17:10 (JST 8:50 - 9:10) - Preliminary Personal Trait Prediction from High school Summer Vacation e-learning Behavior,
    Kyosuke Takami, Brendan Flanagan, Rwitajit Majumdar and Hiroaki Ogata (FULL) PDF

  • 17:10 - 17:30 (JST 9:10 - 9:30) - Design of a User-Interpretable Math Quiz Recommender System for Japanese High School Students,
    Yiling Dai, Brendan Flanagan, Kyosuke Takami and Hiroaki Ogata (FULL) PDF

17:30 - 17:55 (JST 9:30 - 9:55) - Wrap-up session

17:55 - 18:00 (JST 9:55 - 10:00) - Closing


Data challenge track: Initial paper submissions should at least give an outline of work in progress with some preliminary analysis.

Research track: Paper submissions should be fully finalized papers.

  • Full paper: 8-10 pages (data challenge initial paper submission: 6 pages or more)

  • Short paper: 5-6 pages (data challenge initial paper submission: 4 pages or more)

  • Poster paper: 2-3 pages (data challenge initial paper submission: 1 page or more)

Submit papers using EasyChair:


All submissions to the workshop must follow the format of the Workshop Proceedings Template (download). The workshop proceedings will not be published in the LAK companion proceedings this year (LAK22 policy), and instead will be published through CEUR-WS (

Organizing Committee

  • Brendan Flanagan (Kyoto University, Japan)

  • Atsushi Shimada (Kyushu University, Japan)

  • Fumiya Okubo (Kyushu University, Japan)

  • Huiyong Li (Kyoto University, Japan)

  • Rwitajit Majumdar (Kyoto University, Japan)

  • Hiroaki Ogata (Kyoto University, Japan)

PC Members

  • TBA


By downloading and using our dataset, you agree to our Terms of Use.

The datasets for this data challenge have been synthetically generated from real data, and include 4 types of files:


- Logged activity data from students' interactions with the BookRoll system.


- Information about the length of the lecture materials used.


- Information about the schedule of the lectures. This can be used to analyze the preview/in-class/review reading behaviors.




- Data on the final score (0-100) or grade (A, B, C, D, F) for each student. This can be used as a label for each student when modeling data.

For a more detailed description of the columns, please refer to the README file in the dataset download.
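
As a rough illustration of how the lecture schedule could be combined with the activity logs, the following Python sketch labels reading events as preview, in-class, or review relative to one lecture session. The function name, the rule itself, and all timestamps are assumptions made up for illustration; they are not taken from the dataset or from OpenLA, and the actual column names and time formats are those given in the README.

```python
from datetime import datetime


def classify_reading_phase(event_time, lecture_start, lecture_end):
    """Label one reading event relative to a single lecture session.

    Assumed rule (hypothetical, not from the dataset documentation):
    before the lecture starts -> preview, during the lecture -> in-class,
    after the lecture ends -> review.
    """
    if event_time < lecture_start:
        return "preview"
    if event_time <= lecture_end:
        return "in-class"
    return "review"


# Made-up schedule entry for one lecture (not real data).
lecture_start = datetime(2020, 4, 10, 10, 30)
lecture_end = datetime(2020, 4, 10, 12, 0)

# Made-up timestamps of reading events for that lecture's material.
event_times = [
    datetime(2020, 4, 9, 21, 15),
    datetime(2020, 4, 10, 11, 5),
    datetime(2020, 4, 12, 18, 40),
]

phases = [
    classify_reading_phase(t, lecture_start, lecture_end) for t in event_times
]
print(phases)
```

In practice, each event would first need to be matched to the lecture in which its material was used, and a stricter definition (for example, limiting review to the days before the next lecture) may suit a given analysis better.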

A link to download the dataset will be provided after you have registered your contact information and agreed to the terms of use.

To handle the GradePoint files in the dataset with OpenLA, update OpenLA with the following command if you already have it installed.

pip install -U OpenLA

Acknowledgements: Ryusuke Murata (Kyushu University, Japan) contributed to the development of OpenLA and the generation of a part of the datasets.

Handy Tools:

The following is a Python library that can read BookRoll log files provided by this workshop, extract data, convert data, and perform simple visualization.


Developer: Laboratory for Image and Media Understanding, Kyushu University.

For more information about BookRoll and the learning analytics platform on which the data was collected, please refer to the following:

  • Brendan Flanagan, Hiroaki Ogata, Learning Analytics Platform in Higher Education in Japan, Knowledge Management & E-Learning (KM&EL), Vol.10, No.4, pp.469-484, 2018.

  • Digital teaching material delivery system "BookRoll"

  • Hiroaki Ogata, Misato Oi, Kousuke Mohri, Fumiya Okubo, Atsushi Shimada, Masanori Yamada, Jingyun Wang, and Sachio Hirokawa, Learning Analytics for E-Book-Based Educational Big Data in Higher Education, In Smart Sensors at the IoT Frontier, pp.327-350, Springer, Cham, 2017.