It is impossible to directly assess the knowledge and understandings in the brain of a student. Instead, teachers use carefully selected proxies (assessment tasks) to provide evidence from which to make valid inferences about the knowledge and understandings of the student (Christodoulou, 2016).
The idea of validity is the linchpin of all assessment tasks and of the inferences drawn from assessment data (Christodoulou, 2016). Three perspectives are considered in determining validity: “the form of the measure, the purpose of the assessment, and the population for which it is intended” (Dirksen, 2013). Masters (2013) argues that validity focuses on how fit for purpose the assessment is for the domain being assessed. Darr (2005) notes that “Judging validity cannot be reduced to a simple technical procedure. Nor is validity something that can be measured on an absolute scale. The validity of an assessment pertains to particular inferences and decisions made for a specific group of students” (p. 55). The inferences drawn from the data that assessment generates are the foundation of the ACT system.
Our assessment should be focused on the Unit Goals and Achievement Standards. We might need to take a step back from content descriptors, cross-curriculum priorities and the like to focus on the key ideas of the unit; this is often referred to as "the construct". We need to ensure that our assessment appropriately addresses the construct of our units.
Is your assessment task measuring what it is meant to be measuring?
Below is an example of Unit Goals for students to achieve at the A, T or M level for the unit Creativity in Media. It is imperative that assessment is linked to the goals of the unit at the appropriate level.
In addition to the specific unit goals, framework achievement standards must be addressed across the suite of assessment tasks for a unit.
The two major places where assessment can go wrong when it comes to the construct are:
Construct under-representation
Construct-irrelevant variance
Construct under-representation occurs when the suite of assessment tasks for a unit does not appropriately assess student understanding of the unit's goals due to a lack of depth; that is, key knowledge, skills and understandings of the unit have not been addressed in the suite of assessment tasks. The suite of assessment tasks therefore lacks validity because the content assessed does not reflect the goals of the unit.
Examples of construct under-representation include:
Trivial content
Rote memorisation for factual recall
Too few examination items to cover the domain
Teaching to the test
Covering only some required content
Construct-irrelevant variance occurs when the suite of assessment tasks for a unit assesses knowledge, understandings and skills that are not relevant to the unit goals. It is the introduction of extraneous, uncontrolled variables that affect assessment outcomes. The meaningfulness and accuracy of assessment results are adversely affected, the legitimacy of decisions based on those results is undermined, and validity is reduced.
Examples of construct-irrelevant variance include:
Poorly constructed assessment questions
Guessing
Item bias
Poor weighting of assessment marks
Requiring the use of skills outside the curriculum
AI and the Changing Construct
Consider whether AI is part of the construct or whether it undermines your ability to accurately assess the construct. Your answer will strongly shape how you design the task. For example, is the use of AI to assist in writing code in programming an industry-standard tool that students should learn to use well to enhance the interactions, volume, efficiency and scope of their work? Conversely, if you need to assess a student's grasp of grammar and expression in a language, the use of generative AI prevents insight into student understanding, so assessment conditions will need to minimise access to generative AI.
Below are links to pages which go into greater depth for each factor impacting the validity of an assessment task:
coverage of the curriculum
reliability
bias
provision for a range of thinking levels
student engagement
academic integrity