Validity

In Research:

In research, validity refers to the degree to which an intended test interpretation (or the concept or construct that the test is assumed to measure) matches the proposed purpose of the test. Evidence for validity is based on test content, response processes, internal structure, relations to other variables, and the consequences of testing.

For example, if a study makes claims about teacher motivation based on a survey instrument, validity refers to the degree to which those claims can reasonably be inferred from the survey instrument given the survey items, how the survey was administered, how subjects were selected for the survey, etc.

In Teaching:

In teaching, assessors ask themselves a similar question about validity: Does this assessment measure what I want to know about what students learned?

From Wiggins & McTighe (2005): [Validity is] the inferences one can confidently draw about student learning based on the results of an assessment. Does the test measure what it purports to measure? Do the test results correlate with other performance results educators consider valid? Does the sample of questions or tasks accurately correlate with what students would do if tested on everything that was taught? Do the results have predictive value; that is, do they correlate with likely future success in the subject in question? If some or all of these questions must have a “yes” answer, a test is valid.
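One of the questions above, whether test results correlate with other performance results, can be checked numerically. The following is a minimal sketch, not a method from the cited sources: it assumes the Pearson correlation coefficient is an acceptable measure of that relationship, and the function name `pearson_r` and both score lists are hypothetical, invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: test scores and a second performance measure
# (for example, portfolio ratings) for the same six students.
test_scores = [55, 62, 70, 78, 85, 93]
portfolio_ratings = [2, 3, 3, 4, 4, 5]

r = pearson_r(test_scores, portfolio_ratings)
print(f"r = {r:.2f}")
```

A coefficient near 1 would suggest the two measures rank students similarly. It is only one piece of evidence: it says nothing about whether the test samples the taught content well or whether its tasks are authentic.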

Because most tests provide a sample of student performance, the scope and nature of the samples influence the extent to which valid conclusions may be drawn. Is it possible to accurately and reliably predict from the performance on a specific task that the student has control over the entire domain? Does one type of task enable an inference to other types of tasks (say, one genre of writing to all others)? No. Thus, the typically few tasks used in performance assessment often provide an inadequate basis for generalizing. One solution is to use a wide variety of student work of a similar type or genre, collected over the year, as part of the summative assessment.

To be precise, it is not the test itself that is valid, but the inferences that educators claim to be able to make from the test results. Thus, the purpose of the test must be considered when assessing validity. Multiple-choice reading tests may well be valid if they are used to test the student’s comprehension ability or to monitor grade-level reading ability of a district’s population as compared to other large populations. They may not be valid as measures of a pupil’s repertoire of reading strategies and the ability to construct apt and insightful responses to texts. The format of the test can be misleading; an inauthentic test can still be technically valid. It may aptly sample from the subject domain and predict future performance accurately but nonetheless be based on inauthentic, even

Sources:

Creswell, John W. (2011-03-14). Educational Research: Planning, Conducting, and Evaluating Quantitative and Qualitative Research (4th Edition) (Page 630). Pearson.

Wiggins, Grant; McTighe, Jay (2005-03-22). Understanding by Design, Expanded 2nd Edition (Page 353). Association for Supervision & Curriculum Development. Kindle Edition.