Data Challenge

Predicting Performance Based on the Analysis of Reading Behavior

As digital learning materials are increasingly adopted in modern education systems, the analysis of reading behavior and its effect on student performance is gaining attention. The main motivation of this workshop is to foster research into the analysis of students’ interaction with digital textbooks, and to find new ways in which it can be used to inform and provide meaningful feedback to stakeholders such as teachers, students, and researchers. Building on previous years’ workshops at LAK19 and LAK20, which focused on reading behavior in higher education, this year we offer participants a new challenge that focuses on secondary school reading behavior, using a synthetic dataset generated from actual data. Additional information on lecture schedules and the syllabus will also enable analysis of the learning context, giving further insight into the preview, in-class, and review reading strategies that learners employ. Participant contributions will be collected as evidence in a repository provided by the workshop and shared with the wider research community to promote the development of research into reading analysis systems.

We welcome submissions on topics including, but not limited to, the following:

  • Student performance/at-risk prediction

  • Student reading behavior self-regulation profiles spanning the entire course

  • Preview, in-class, and review reading patterns

  • Student engagement analysis and behavior change detection

  • Visualization methods to inform and provide meaningful feedback to stakeholders

Participants are encouraged to share the results and insights gained from analyzing the provided data, or other research related to reading behavior analysis, by submitting a paper for presentation at the workshop.

Participants will also be encouraged to contribute their programs/source code created in the workshop to an ongoing open learning analytics tool development project for inclusion as an analysis feature.

Evaluation Metrics

  • Predictions will be scored using root mean squared error (RMSE).
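For reference, RMSE is the square root of the mean of the squared differences between predicted and actual scores, so lower values are better. The following is a minimal sketch of that calculation in plain Python. The function name `rmse` and the sample values are illustrative assumptions, not part of the official scoring code.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return the root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical final scores and predictions for three students.
actual = [70.0, 85.0, 60.0]
predicted = [75.0, 80.0, 65.0]
score = rmse(actual, predicted)
print(score)
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones.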

Important Dates

  • Initial paper submission: February 9, 2021

  • Notification of acceptance: February 23, 2021

  • Registration deadline: TBA

  • Data Challenge Final Results Submission deadline: TBA

  • Camera-Ready deadline: TBA


  • 4pm (PDT), April 12, at LAK 2021, held online (a link will be sent to participants by email before the workshop).


  • Hiroyuki Watanabe, Li Chen, Yoshiko Goda, Atsushi Shimada and Masanori Yamada, Development of a Time Management Skill Support System Based on Learning Analytics (Full Paper)

  • Hiroyuki Kuromiya, Automatic Classification of the Learning Pattern -Time-Series Clustering of Students’ Reading Behaviors (Short Paper)

  • Data Challenge Overview (Brendan Flanagan)

  • SoLAR Asia SIG Overview (Rwitajit Majumdar)

  • General discussion on reading behavior analysis, and future directions.


Data challenge track: Initial paper submissions should at least give an outline of work in progress with some preliminary analysis.

Research track: Submissions should be finalized papers.

  • Full paper: 8-10 pages (data challenge initial paper submission: 6 pages or more)

  • Short paper: 5-6 pages (data challenge initial paper submission: 4 pages or more)

  • Poster paper: 2-3 pages (data challenge initial paper submission: 1 page or more)

Submit papers using EasyChair:

All submissions to the workshop must follow the format of the Companion Proceedings Template.

Organizing Committee

  • Brendan Flanagan (Kyoto University, Japan)

  • Atsushi Shimada (Kyushu University, Japan)

  • Rwitajit Majumdar (Kyoto University, Japan)

  • Hiroaki Ogata (Kyoto University, Japan)

PC Members

  • Gökhan Akçapınar (Hacettepe University)

  • Mohammad Nehal Hasnine (Hosei University)

  • Mei-Rong Alice Chen (National Taiwan University of Science and Technology)

  • Hiroyuki Kuromiya (Kyoto University)

  • Tsubasa Minematsu (Kyushu University)

  • Patrick Ocheja (Kyoto University)

  • Yuta Taniguchi (Kyushu University)

  • Min Lu (Kyushu University)

  • Masanori Yamada (Kyushu University)

  • Lakshmi T G (IIT Bombay)

  • Chengjiu Yin (Kobe University)

  • Rekha Ramesh (Mumbai University)

  • Ivica Boticki (University of Zagreb)

  • Louis Lecailliez (Kyoto University)


By downloading and using our dataset, you agree to our Terms of Use.

The dataset for this data challenge includes 2 types of files:


- Logged activity data from students' interactions with the BookRoll system.


- Data on the final score for each student. This can be used as a label for training and testing prediction models.

For a more detailed description of the columns, please refer to the README file in the dataset download.
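As a rough illustration of how the two file types fit together, the sketch below counts logged events per student and pairs each count with that student's final score, giving simple feature/label rows for a prediction model. The column names `userid`, `operationname`, and `score`, and the inline sample rows, are assumptions made for this example only; the actual file and column names are defined in the README.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for the two dataset files.
# Real column names are defined in the dataset README.
EVENT_CSV = """userid,operationname
u1,NEXT
u1,PREV
u1,NEXT
u2,NEXT
"""
SCORE_CSV = """userid,score
u1,82
u2,64
"""


def count_events_per_student(event_file):
    """Count how many logged events each student produced."""
    counts = Counter()
    for row in csv.DictReader(event_file):
        counts[row["userid"]] += 1
    return counts


def build_training_rows(event_file, score_file):
    """Pair each student's event count (feature) with the final score (label)."""
    counts = count_events_per_student(event_file)
    rows = []
    for row in csv.DictReader(score_file):
        uid = row["userid"]
        # Students with a score but no logged events get a count of zero.
        rows.append((uid, counts.get(uid, 0), float(row["score"])))
    return rows


training_rows = build_training_rows(io.StringIO(EVENT_CSV), io.StringIO(SCORE_CSV))
for uid, n_events, final_score in training_rows:
    print(uid, n_events, final_score)
```

With the real files, the `io.StringIO` objects would be replaced by files opened from the dataset download, and the single event count would typically be extended with richer features.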

A link to download the dataset will be provided after you have registered your contact information and agreed to the terms of use.


The following is a Python library that can read the BookRoll log files provided by this workshop, extract and convert the data, and produce simple visualizations.


Developer: Laboratory for Image and Media Understanding, Kyushu University.

For more information about BookRoll and the learning analytics platform on which the data was collected, please refer to the following:

  • Brendan Flanagan, Hiroaki Ogata, Learning Analytics Platform in Higher Education in Japan, Knowledge Management & E-Learning (KM&EL), Vol.10, No.4, pp.469-484, 2018.

  • Digital teaching material delivery system "BookRoll"

  • Hiroaki Ogata, Misato Oi, Kousuke Mohri, Fumiya Okubo, Atsushi Shimada, Masanori Yamada, Jingyun Wang, and Sachio Hirokawa, Learning Analytics for E-Book-Based Educational Big Data in Higher Education, In Smart Sensors at the IoT Frontier, pp.327-350, Springer, Cham, 2017.