Workshop on Automated Evaluation of Learning and Assessment Content

AIED 2024 workshop |  Recife (Brazil), Hybrid

 

Call for Papers

The evaluation of learning and assessment content has always been important in the educational domain. Assessment content, such as questions and exams, is commonly evaluated both with traditional approaches such as Item Response Theory [8] and with more recent approaches based on machine learning [1, 3, 2]. However, the evaluation of learning content – such as single lectures, whole courses, and curricula – still relies heavily on experts from the educational domain. The same is true for several other components of the educational pipeline: for instance, distractors (i.e., the plausible incorrect options in multiple-choice questions) are commonly evaluated via manual labelling [4, 6, 7], since the automatic evaluation approaches proposed so far have notable limitations [11]. The need for accurate metrics to evaluate learning and assessment content has become even more pressing with the rapid growth and adoption of LLMs in the educational domain, both open (e.g., Gemma, Llama 2, and Vicuna) and closed (e.g., GPT-4). Previous research has shown that LLMs can be used for a variety of educational tasks – from feedback generation and automated assessment to question and content generation [10, 5, 9]. Being able to evaluate the output of LLMs accurately and automatically is therefore crucial to ensure the effectiveness of their application, since traditional approaches based on human feedback do not easily scale to large amounts of data.

Importantly, the evaluation needs to consider both the educational requirements of the generated content and the biases that might emerge from the generation models. For instance, the generated content must align with the learning objectives of the specific course (or exam) in which it is used, as well as with the language level suitable for the target students. Moreover, as when applying language models in other domains, the evaluation must assess the factual accuracy and the EDIB (Equity, Diversity, Inclusion, & Belonging) appropriateness of the generated text. This workshop focuses on approaches for automatically evaluating learning and assessment content.

We expect this workshop to attract professionals from both industry and academia, and to create a space for discussion of the common challenges in evaluating learning and assessment content in education. Through papers and debate, we aim to collect guidelines and best practices for the evaluation of educational content. We believe these will be a valuable contribution to the AIED community and a reference for future research on the evaluation (and generation) of learning and assessment content.

Topics of interest

Topics of interest include but are not limited to:

Human-in-the-loop approaches are welcome, provided that the evaluation also includes an automated component and the proposed approach addresses scalability. Papers on generation are also welcome, as long as they place substantial focus on the evaluation step.

Submission guidelines

There are two tracks, with different submission deadlines.

Full and short papers: We are accepting short papers (5 pages, excluding references) and full papers (10 pages, excluding references), formatted according to the workshop style (using either the LaTeX template or the DOCX template).

Extended abstracts: We also accept extended abstracts (max 3 pages) to showcase work in progress and preliminary results. Extended abstracts should be formatted according to the workshop style (using either the LaTeX template or the DOCX template).

Submissions should consist mostly of novel work, but some overlap with work submitted elsewhere is allowed (e.g., summaries, or a focus on the evaluation phase of a broader work). Each submission will be reviewed by members of the Program Committee, and the proceedings volume will be submitted for publication to CEUR Workshop Proceedings. Due to CEUR-WS.org policies, only full and short papers will be submitted for publication, not extended abstracts.

Submission URL: https://easychair.org/conferences/?conf=evallac2024

Important dates

References