Educational Testing Service
Making assessments useful in language education or “fine words butter no parsnips”
Assessments are developed to support score interpretations and uses, and the score-based claims are validated by stating the claims clearly and by evaluating their plausibility. Educational assessment programs can make use of at least three kinds of interpretations: descriptive interpretations in terms of observable attributes (based on classical and IRT models), explanatory interpretations based on theories of performance, and developmental interpretations in terms of learning progressions (e.g., the CEFR). They can also put the scores to a number of uses (e.g., grading, diagnosis, placement). Traditional psychometric analyses tend to focus on descriptive interpretations, but the explanatory and developmental interpretations may be more useful in practice. In any case, it is the interpretations and uses of scores that are validated (rather than the assessments or the scores per se), and it is important that the claims actually based on the scores are the same as those that have been validated.
Language program evaluation in contemporary language education: Current practices, future directions
An increasingly common requirement for language teachers is the integration of program evaluation into the delivery of language education. To this end, evaluation is meant to be a useful, practical tool that helps instructors, administrators, funders, and communities know or do something that enhances the quality of language instruction. Yet the reality of evaluation practice—particularly in the present climate of heightened scrutiny and accountability—often falls short of being a fully useful endeavor that changes language education for the better. As such, language researchers have tried to understand evaluation efficacy and how different approaches can best serve the aims of different language education stakeholders. Furthermore, researchers have periodically looked to mainstream evaluation for insight into the various methodological and contextual factors that make evaluation an effective and meaningfully useful activity. This plenary will likewise canvass current trends in mainstream evaluation practice and research (e.g., organizational learning, evaluation capacity-building, evaluability assessment, logic modelling) and discuss their relevance and application in contemporary approaches to language program evaluation and assessment.
University of Southampton
Developing students’ self-assessment skills: The role of the teacher
An important role of program evaluation is the facilitation of innovation in classrooms and learning activities. Successful innovations that transform programs and enhance learning require the support of teachers in terms of a commitment to making the innovation work. In this talk I focus on how teachers innovate, and the implications for program evaluation. I draw on recent research with teachers on the development of student self-assessment skills in language programmes. The issues raised relate to three themes relevant to developing useful program evaluations. First, the complexity of teachers’ practice makes externally motivated innovations difficult. Second, teachers’ evaluations of their teaching are based on personal and largely tacit frameworks, which become apparent through long-term reflective and collaborative professional development. Third, institution-level approaches to quality management and program evaluation are a significant factor for teachers: where they are considered to limit teachers’ autonomy and agency, it is likely they do just that.
New York University
The relationship between language proficiency and content knowledge
in the assessment of English language learners in schools
The relationship between language proficiency and content knowledge in assessment is a complicated one. Traditionally, language has been considered a source of construct irrelevant variance when assessing English language learners’ (ELLs) content knowledge. Similarly, content (or topical knowledge) has also been considered a potential source of construct irrelevant variance in the assessment of ELLs’ language proficiency. Thus, for the purpose of assessment, language proficiency and content knowledge have been viewed as separate and distinct constructs, with “academic language” serving as the bridge between the two. In this talk, I will discuss how evolving views of language and the introduction of new content standards that emphasize language as a critical component of content mastery are forcing us to rethink the language-content link and our construct definitions.
Center for Applied Linguistics
Uses for and consequences of language proficiency tests for students and teachers
This symposium will begin with a brief review of argument-based approaches to validity (Kane, 1993; Chapelle et al., 2008; Bachman & Palmer, 2010; Renn & MacGregor, 2014). It will then explore the assessment use argument and its influence on different tests of language proficiency developed for students and teachers. By focusing on the consequences of test results, the symposium will examine how language test developers design tests, items, and tasks intended to promote effective teaching practice and accurately reflect student ability. We examine this from the perspective of the test development process and the research that informs that process, from developing test items to examining test performance via operational data. Presenters will include CAL researchers, test developers, and psychometricians, providing both a holistic picture of the test design, development, and operationalization process and perspective-specific insights into the consequences of test results.
Meg Malone || Center for Applied Linguistics
Jennifer Renn || Center for Applied Linguistics
Consider the consequences: Applying an argument-based validation framework to assessments with different purposes
Justin Kelly, Jennifer Norton, Michele Kawood || Center for Applied Linguistics
Ensuring content validity: From construct definition to test item
David MacGregor || Center for Applied Linguistics
The role of pre-operational testing: Asking questions and seeking answers
Cary Lin || Center for Applied Linguistics
Psychometric analysis of test performances: Informing operational tests
Dorry Kenyon || Center for Applied Linguistics
King's College London
What's in a name? New constructs in language assessment
The idea of using assessment to promote learning has been receiving increasing attention in recent years in all areas of education. In the field of second/additional language education, there is now a growing body of work that ties assessment, learning and pedagogy closely together in a variety of ways, with the teacher playing an important central role in one way or another. Terms such as ‘Assessment for Learning’, ‘Dynamic Assessment’, ‘Embedded Assessment’, ‘Formative Assessment’, ‘Learning-oriented Assessment’ and ‘Teacher Assessment’ appear in research and professional journals regularly. The common theme that unites these pedagogically-linked assessment approaches is their commitment to promote learning. A key question is: Do these different terms refer to a common underlying concept or do they represent diverse conceptualisations and epistemologies in terms of learning, teaching and assessment? And if these nomenclatural differences do represent significant conceptual and theoretical differences, how do they influence and shape practice? In this colloquium we bring together a team of language assessment specialists from Australia, England, Hong Kong, New Zealand and the USA to address these issues. It is hoped that the discussions will help to build dialogues that can help identify commonalities and differences, with a view to enhancing pedagogic usefulness.
Constant Leung || King’s College London
Chris Davison || University of New South Wales
Assessment for learning: Building on the brand
Martin East || University of Auckland
Embedding assessment for learning into a high-stakes assessment system: Can it really work?
Liz Hamp-Lyons || University of Bedfordshire
PEST principles for implementing effective learning-oriented language assessment
Yongcan Liu, Michael Evans || Cambridge University
A conceptual framework for the use of Dynamic Assessment with EAL learners in schools in England
Jim Purpura || Columbia University
Evaluating technology-mediated language education
As technology-mediation becomes more prominent in all areas of education, there is an increasing need to demonstrate its value and justify the economic and time investment it usually requires. Language education is no exception. Since the late 1980s (e.g., Chapelle & Jamieson, 1986) the field of CALL has been trying to assess the effectiveness of tools, platforms, approaches, programs, and pedagogical choices, and although we have come a long way since then, there is no unified or standardized approach to evaluating technology-mediated language education. This colloquium addresses the need to evaluate technology-mediated language education from a programmatic view that considers the variables and factors that affect teaching and learning in contexts mediated by technology. The colloquium will address essential questions in evaluation such as: What is it exactly that we are evaluating (language learning, digital learning, outcomes, processes)? What are the best methods and tools to evaluate technology-mediated language courses? Can we borrow instruments and procedures from non-tech language courses? Do these environments provide unique affordances for or constraints on evaluation? Who are the intended users of the evaluation and how will it be utilized? If we want to keep the field of computer-assisted or technology-mediated language learning moving forward, these are essential questions that need to be addressed and resolved in the immediate future.
Jonathan Leakey || University of Ulster
A proposed model for evaluating CALL
A multi-variable framework for evaluating online language courses: the case of Voxy
Jim Ranalli || Iowa State University
Exploring adaptations of an argument-based validation approach to the task of CALL evaluation
|| National Chengchi University
A model to evaluate language learning MOOCs: MandarinX/edX
|| University of Hawai'i at Mānoa
Expert Panel Discussion
Theme: "Challenges and Prospects in Making Assessment and Evaluation Useful"
Invited Panelists: John Davis, Michael Kane, Dorry Kenyon, Richard Kiely, Constant Leung,
Marta Gonzalez-Lloret, Lorena Llosa, Meg Malone
Moderator: John Norris
This interactive panel discussion will provide GURT participants a unique opportunity to raise questions and engage in dialogue with a group of assessment and evaluation experts. Topics for discussion will be crowd-sourced from attendees over the first two days of the conference, and there will be time allocated for spontaneous Q&A. Please join us for this exciting event, and for the champagne reception to follow.
“USEFUL EVALUATION IN
John Davis, Amy Kim, Todd McKay, Mina Niu, Young A Son, Francesca Venezia
The workshop is introductory and relevant for language educators new to program evaluation and looking to learn practical techniques to implement in their courses and programs.
Participants will receive a certificate of attendance after completing the workshop.
Thursday, March 10
Intercultural Center 101
“Planning useful evaluation in college language programs: Clarifying evaluation users, uses, and foci”
Synopsis: Language program evaluation must be organized and planned in specific ways to ensure its usefulness and productiveness for interested stakeholders. This session will illustrate why language educators should adopt a use-oriented approach to program evaluation. That is, evaluation in language programs should proceed via a clear understanding of (a) why the program—or particular aspects of the program—is/are being investigated; (b) who specifically is going to use evaluation findings and processes; and (c) how evaluation information/processes will be used by intended evaluation users. Clarifying and identifying these elements helps to increase the likelihood that evaluation findings will actually be used and useful for evaluation users and program stakeholders.
Objectives: Participants will understand (a) the range of potential users and uses for language program evaluation projects, (b) strategies for identifying stakeholders in language program evaluation projects, and (c) elements of high-quality evaluation questions. Participants will be able to (a) identify intended users and uses of a potential/future evaluation project and (b) articulate high-quality evaluation questions.
“Identifying evidence—or “indicators”—of program quality, effectiveness, needs,
Young A Son, Francesca Venezia
Synopsis: The usefulness of evaluation is challenged when evaluators collect information in a way that stakeholders regard as untrustworthy or unrelated to evaluation project goals. To avoid this situation, evaluators can and should take specific steps to identify the relevant sources of information (i.e., “indicators”) that will help answer evaluation questions and systematically shed light on targeted program elements, doing so in ways that help specific users understand the phenomena under evaluation and that help them make decisions and take action.
Objectives: Participants (a) will be familiar with selected evaluation indicators commonly used in language program evaluation and (b) will be able to identify relevant, useful “indicators” for answering language program evaluation questions.
for collecting evaluation information in language programs: Interviews, focus groups, and questionnaires”
Synopsis: Interviews, focus groups, and questionnaires are the most commonly used tools for collecting the views and opinions of program stakeholders and constituents. Each, however, has its own strengths and weaknesses and is best used in particular circumstances to achieve specific project goals. This session will help participants identify which tool is best suited to specific evaluation aims. The session also provides best practices and how-to advice for implementing questionnaires, interviews, or focus groups in language program evaluation projects.
Objectives: Participants will (a) understand the strengths and purposes of interviews, focus groups, and questionnaires for collecting information for program evaluation purposes, and (b) be able to identify which method is best suited to shed light on a given evaluation question.
next steps: Strategies for getting evaluation started in language education programs”
McE. Davis, Mina Niu
Synopsis: While program evaluation can be a powerful tool for educational innovation and improvement, it requires intentional planning and particular conditions and practices to function successfully. The final session asks participants to identify specific strategies, next steps, and timelines for implementing evaluation in their programs. The session will partly involve participants analyzing the current capacity in their programs for conducting high-quality, useful evaluation (e.g., extant resources, expertise, infrastructures) and brainstorming about how to build evaluation capabilities to enhance the usefulness of future evaluation efforts.
Objectives: Participants will identify immediate next steps, time frames, and needed capacity for implementing evaluation in their language programs and institutions.