LDI Workshop @ EDM 2021

The Learner Data Institute is funded by the National Science Foundation

Big Data, Research Challenges, & Science Convergence in Educational Data Science

The Second Workshop of the Learner Data Institute

A Half-Day Virtual Workshop @ EDM 2021

June 29, 2021 — online

11:30AM - 4:30PM Eastern Daylight Time (USA)
17:30 - 22:30 CEST (Central European Summer Time)

Proceedings URL: http://ceur-ws.org/Vol-3051/#LDI

For our past workshop, see LDI @ EDM '20.

Update - January 5, 2022 - Proceedings Available!

  • The Proceedings of the Second Workshop of the Learner Data Institute are published and available here!

Updates - June 28, 2021

  • The agenda for tomorrow's workshop is posted!

Updates - May 27, 2021

  • We're excited to announce that Dr. Ryan Baker (University of Pennsylvania) will deliver the keynote talk at our upcoming workshop. His talk is entitled "Opportunities for Learning Engineering and Educational Data Mining: Findings from an Asynchronous Virtual Convening." See details below.

  • We welcome submissions (especially, but not limited to, position papers) that address the topic of Dr. Baker’s keynote, highlighting opportunities for learning engineering and EDM.

  • The new deadline for submissions is Friday, June 11, 2021. We will notify authors of accepted submissions by Friday, June 18.

Keynote Speaker

Dr. Ryan S. Baker is Associate Professor in the Graduate School of Education at the University of Pennsylvania and Director of the Penn Center for Learning Analytics. His talk is entitled "Opportunities for Learning Engineering and Educational Data Mining: Findings from an Asynchronous Virtual Convening." We're planning for an interactive keynote with time for workshop attendees to discuss these opportunities.

Agenda (All times are Eastern Daylight Time [EDT] in the USA.)

11:30AM EDT - Workshop & Interactive Keynote Introduction [Stephen Fancsali]

Interactive Keynote Session

11:35AM EDT - Keynote Talk [Ryan S. Baker] - "Opportunities for Learning Engineering and Educational Data Mining: Findings from an Asynchronous Virtual Convening"

12:00PM EDT - Breakout Room Introduction [Stephen Fancsali]

12:05PM EDT - Breakout Room Sessions on Learning Engineering Opportunities from Keynote [All Participants]

12:25PM EDT - "Greatest Hits" Curated Sharing of Emerging Ideas from Breakout Room Sessions [Stephen Fancsali, Ryan S. Baker, & Vasile Rus]


Contributed Papers

1:15PM EDT - "The Learner Data Institute - Conceptualization: A Progress Report" [Vasile Rus, Stephen E. Fancsali, Deepak Venugopal, Arthur C. Graesser, Philip Pavlik Jr., Dale Bowman, Steve Ritter, & The LDI Team] [PDF]

1:45PM EDT - "Toward Scalable Improvement of Large Content Portfolios for Adaptive Instruction" [Stephen E. Fancsali, Hao Li, & Steven Ritter]

2:05PM EDT - BREAK [20 minutes]

2:25PM EDT - "Sentence Selection for Cloze Item Creation: A Standardized Task and Preliminary Results" [Andrew Olney]

2:45PM EDT - "Neuro-Symbolic Models: A Scalable, Explainable Framework for Strategy Discovery from Big Edu-Data" [Deepak Venugopal, Vasile Rus, & Anup Shakya]

3:05PM EDT - BREAK [20 minutes]

3:25PM EDT - "The Nature of Achievement Goal Motivation Profiles: Exploring Situational Motivation in An Algebra-Focused Intelligent Tutoring System" [Leigh Harrell-Williams, Christian Mueller, Stephen Fancsali, Steve Ritter, Xiaofei Zhang, & Deepak Venugopal]

3:45PM EDT - "ENGAGE: An API Capable Data Collection and Analysis System for Classroom Behavior" [Susan Elswick, Laura Casey, & William Hendrick]

4:05PM EDT - "Random Data Partition for Big Data Model Building" [Lih-Yuan Deng, Dale Bowman, & Ching-Chi Yang] [PDF] [PPT]

4:25PM EDT - Open Discussion & Closing

Workshop Summary

The Second Workshop of the Learner Data Institute (LDI) builds on the success of last year's virtual workshop and seeks to bring together researchers working across disciplines on data-intensive research of interest to the educational data science and educational data mining communities. In addition to welcoming work describing mature, data-intensive or “big data” research and emerging work-in-progress that spans traditional academic disciplines, the workshop organizers welcome case studies of interdisciplinary research programs and projects, including case studies of learning engineering efforts pursued by universities, learning technology providers, and others (both successful and unsuccessful), as well as position papers on important challenges for researchers harnessing “big data” and crossing disciplinary boundaries as they do so.

We convene researchers and developers from diverse fields who seek to “harness the data revolution” in educational data science and “grow convergence research,” aligning with (at least) two of the U.S. National Science Foundation’s “10 Big Ideas” for emerging research and development opportunities. “Convergence builds and supports creative partnerships and the creative thinking needed to address complex problems” (NSF’s 10 Big Ideas: Growing Convergence Research), and we expect that bringing together highly experienced researchers, as well as students and early-career researchers, will stimulate substantial growth and interest in state-of-the-art, data-intensive, transdisciplinary or “convergent” approaches to solving vexing societal problems related to education. We also seek to explore the big data and learning engineering frameworks that will enable convergent solutions.

Questions & Areas of Interest

  • How can we use massive and diverse datasets generated by adaptive instructional systems (AISs) to address core questions and challenges in learning science and engineering?

  • Are learners, teachers, and learning science researchers successfully interacting with cyber-learning technologies?

  • What are some critical challenges with respect to scaling the development of AISs across many domains and perhaps millions of learners?

  • What are the limitations of AISs and adaptive components of instructional systems?

  • Which aspects of learning are best handled by humans and which ones by cyber-learning technologies (and how do we enhance the interaction of the two)?

  • How can data from student and teacher interactions with cyber-learning technologies, in and outside the classroom, be collected in ways consistent with best practices—e.g., with respect to data fidelity, security, reliability, privacy, human subject research protocols, school policies, parental consent, HIPAA, FERPA, etc.?

  • Methodology, infrastructure, and workflows for “big data” and data-intensive educational research

  • Inter/multi/trans-disciplinary approaches to data-intensive educational research

  • Case studies of successful & unsuccessful efforts to practically harness insights from large datasets in settings where learning takes place (e.g., case studies of “learning engineering” efforts)

  • Emerging challenges for researchers working across disciplines with large datasets

  • Use cases, workflows, and case studies illustrating the need for new data infrastructure, or possible extensions to existing infrastructure, for research leveraging learner data, including data repositories, (open source) software and statistical libraries, innovative use of cloud computing resources, etc.


The Workshop Committee solicits three types of submissions:

  • Full papers (up to 8 pages): suitable for a 20-minute presentation, presenting mature research, extensive descriptions of data-intensive workflows, and learning engineering and convergence research efforts.

  • Short papers (4-6 pages): suitable for a 10-minute presentation, especially appropriate for work-in-progress and shorter case studies.

  • Position papers (up to 3 pages) describing approaches to convergence research (see below), emerging challenges (e.g., that the LDI might take on collaboratively with authors with future funding), “wishlist(s)” for transformative learning applications, resources like data repositories, and other infrastructure that would fuel innovative work (e.g., that LDI could collaboratively develop with future funding): suitable for a 5-minute presentation.

We hope that all papers, but especially position papers, will spark conversations and interactions to drive future collaborations between LDI researchers and workshop participants.

Important Dates

  • Submission Deadline: June 11, 2021 [Deadline Extended!]

  • Acceptance Notification: June 18, 2021

  • Workshop: June 29, 2021


Use the EDM paper templates.

Microsoft Word template: https://educationaldatamining.org/edm2020/wp-content/uploads/sites/4/2019/09/edm_word_template2020.doc

LaTeX template: https://educationaldatamining.org/edm2020/wp-content/uploads/sites/4/2019/09/edm_submission2020.zip

Submission System (EasyChair): https://easychair.org/conferences/?conf=ldiedm2021

More About the Workshop

The proceedings of the International Conference on Educational Data Mining (and related conferences, including the International Conference on Artificial Intelligence in Education, the ACM Conference on Learning at Scale, and Learning Analytics and Knowledge) demonstrate inherent linkages across traditional and emerging academic disciplines and research areas. Whether efforts are described as interdisciplinary, multidisciplinary, or trans-disciplinary, providing solutions to compelling challenges faced by learners, those individuals and institutions that facilitate learning, and other learning stakeholders must draw on expertise across boundaries of disciplines as diverse as, but not limited to, psychology, cognitive and learning science(s), mathematics, computer science (e.g., machine learning, artificial intelligence), statistics, human-computer interaction, public policy, education, neuroscience, social work, moral and political philosophy, and any of a number of sub-fields and research areas at the intersection of these disciplines.

The need for such multi/inter/trans-disciplinary solutions is even more relevant today, as the vast and diverse repositories of digital data now available can make such solutions viable. Indeed, recognizing both substantial scientific challenges and the need for innovative scientific frameworks to solve them, the U.S. National Science Foundation has identified the notion of “convergence” research as one of ten “big ideas” for its ongoing investment strategy. Two attributes are crucial to NSF’s notion of convergence research, namely that such research is “driven by a specific and compelling problem” and emphasizes “deep integration across disciplines.” Such integration is achieved when:

“... experts from different disciplines pursue common research challenges, [and] their knowledge, theories, methods, data, research communities and languages become increasingly intermingled or integrated. New frameworks, paradigms or even disciplines can form sustained interactions across multiple communities” (Convergence Research at NSF).

The NSF-funded LDI focuses on such science convergence solutions for major challenges in learning with technology.

About the Learner Data Institute

The Learner Data Institute (LDI) is an NSF-funded “data-intensive research in science and engineering” (DIRSE) initiative (NSF Award #1934745) seeking to set out compelling, specific, big data research challenges for educational data science researchers and large-scale scientific and data convergence approaches to address them.

The LDI will help us learn: (1) how to transform a far-flung group of interdisciplinary researchers, developers, and practitioners into a community of practice that can fully exploit the data revolution through data and science convergence; (2) how adaptive instructional systems and data science can be used as research vehicles to further our understanding of how learners learn; (3) how to explore the human-technology partnership with data and data science to improve learners’ and teachers’ ability to employ technology in a way that facilitates learning, while at the same time improving the affordability, effectiveness, and scalability of these systems; and (4) more generally, how to extend the frontiers of data science to include: new methods of data collection and design; more interpretable machine learning methods (e.g., by combining deep learning with more interpretable inference frameworks like Markov Logic); scalable new algorithms (e.g., for joint inference in Markov Logic Networks); and methods for identifying causal mechanisms from unstructured, semistructured, and structured data.

More specifically, LDI contributors from university-based research groups, industry, and government are focusing on cutting-edge, big data approaches to assessment, learner modeling, instructional design, modeling subject-area domains in instructionally useful ways, socio-cultural aspects of learning, ethical aspects of working with learner data, and the human-technology frontier, among other areas of interest.

Workshop & Review Committee

Vasile Rus, Ph.D., University of Memphis (Co-Chair)

Stephen E. Fancsali, Ph.D., Carnegie Learning, Inc. (Co-Chair)

Dale Bowman, Ph.D., University of Memphis

Jody Cockroft, AA, BS, CCRP, University of Memphis

Art Graesser, Ph.D., University of Memphis

Andrew Hampton, Ph.D., University of Memphis

Philip I. Pavlik Jr., Ph.D., University of Memphis

Chip Morrison, Ed.D., University of Memphis

Steven Ritter, Ph.D., Carnegie Learning, Inc.

Deepak Venugopal, Ph.D., University of Memphis

[Ad-hoc reviewers will be drawn from the group of LDI contributors and broader community as necessary.]