Scoring and Interpreting Standardized Tests
Raw score-This refers to the actual number of items on a test a student answered correctly.
These scores are usually not reported because they are not meaningful information unless tests are on the same scale.
In order for raw scores to be meaningful, they have to be converted to some type of derived score, such as a percentile rank, grade equivalent score, or standard score, so that accurate comparisons can be made.
Correct/Incorrect possible score-The number of correct and incorrect responses out of the total number of test items is reported, typically expressed as a percentage.
Generally associated with informal, criterion-based tests
Commonly used to inform instruction
Standard score-This is the term for a variety of scores that represent performance by expressing how far an individual score deviates from the mean score of a norm group of the same chronological age or grade level.
These allow for the comparison of performance across tests
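To make the idea of "deviation from the norm-group mean" concrete, here is a minimal sketch of the conversion. It assumes the common scale with a mean of 100 and a standard deviation of 15 (consistent with the 85 to 115 average range used later in these notes); other tests use other scales. The function name and the norm-group numbers are hypothetical.

```python
def standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score on the chosen scale.

    scale_mean=100 and scale_sd=15 are assumptions; check the test manual.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    # z is the distance from the norm-group mean, in standard deviation units
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


# Hypothetical norm group: mean raw score of 40, standard deviation of 8
print(standard_score(38, norm_mean=40, norm_sd=8))  # 96.25
```

Because every test is rescaled to the same mean and standard deviation, a 96 on one test can be compared with a 96 on another, which is not true of raw scores.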
Stanine-A combination of the words standard and nine.
Scores range from one to nine, reflecting the bell curve
Stanines 4-6 represent the average range; stanines above that range are considered above average, and stanines below it are considered below average.
It may be helpful to think of each stanine in this format:
9 = Very superior
8 = Superior
7 = Very good
6 = Good
5 = Average
4 = Low average
3 = Considerably below average
2 = Poor
1 = Very poor
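As a rough illustration of how stanines follow the bell curve, the sketch below maps a percentile rank to a stanine using the conventional 4-7-12-17-20-17-12-7-4 percent split. The exact cut points are an assumption here, since publishers may handle boundaries slightly differently. The labels are the ones listed above; the function and variable names are hypothetical.

```python
from bisect import bisect_left

# Highest percentile rank that still falls in stanines 1 through 8.
# Anything above 96 is stanine 9. Assumes the 4-7-12-17-20-17-12-7-4 split.
STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

STANINE_LABELS = {
    9: "Very superior",
    8: "Superior",
    7: "Very good",
    6: "Good",
    5: "Average",
    4: "Low average",
    3: "Considerably below average",
    2: "Poor",
    1: "Very poor",
}


def stanine_from_percentile(percentile):
    """Return the stanine (1-9) for a percentile rank between 1 and 99."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # Count how many upper bounds lie strictly below the percentile
    return bisect_left(STANINE_UPPER_BOUNDS, percentile) + 1


stanine = stanine_from_percentile(42)
print(stanine, STANINE_LABELS[stanine])  # 5 Average
```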
Percentiles-Describe how a student compares with others in the same age group or grade. For example, a percentile of 80 means that 80% of the sample scored at or below the examinee’s score.
The most frequently used norm score
Range from a low of 1 to a high of 99 and reflect the bell curve (the 50th percentile is the average)
A percentile rank is the percentage of the norm group who had scores the same as or lower than the student being tested
Useful for describing the student’s relative standing in the population
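The "same as or lower than" definition can be sketched in a few lines. This is only an illustration with a made-up norm sample of ten scores; published tests use large norm groups and lookup tables, and they report values in the 1 to 99 range. The function name is hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm sample scoring at or below the given score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical norm sample of ten raw scores
norm_scores = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]
print(percentile_rank(25, norm_scores))  # 80.0
```

Here eight of the ten scores are at or below 25, so a raw score of 25 sits at the 80th percentile of this sample.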
Standard error of measurement (SEM)-An estimate of how much a student’s observed score is expected to vary from their true score because of measurement error
A low SEM is an indication of high reliability and a high SEM is an indication of low reliability.
Confidence interval-The range of scores within which a student’s true score is expected to fall with a given probability (for example, 95%).
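The link between SEM, reliability, and the confidence interval can be sketched with the classical test theory formula SEM = SD x sqrt(1 - reliability). These notes do not give a formula, so treat this as an assumption; test manuals usually report the SEM directly. The reliability of 0.91 is a made-up value, the function names are hypothetical, and the interval assumes normally distributed error (z = 1.96 for roughly 95%).

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(observed, sem, z=1.96):
    """Band around the observed score; z=1.96 gives roughly 95% confidence."""
    margin = z * sem
    return observed - margin, observed + margin


# Hypothetical test: standard deviation 15, reliability 0.91
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(98, sem)
print(f"SEM = {sem:.1f}, 95% interval = {low:.1f} to {high:.1f}")
# SEM = 4.5, 95% interval = 89.2 to 106.8
```

Raising the reliability shrinks the SEM and narrows the interval, which is why a low SEM indicates high reliability.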
Grade and age equivalent score-Scores in terms of grade/age levels, expressed in years and tenths of years
Grade and age equivalent scores are most meaningful when the test the student took is at the appropriate level and the score is no more than a year above or below average.
The APA has advocated that these scores not be reported because of their inadequate statistical properties.
For example, a student who achieves a reading comprehension grade-equivalent score of 6.6 is reading like a sixth grader in the sixth month of school.
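A grade-equivalent score reads as "grade.month" rather than as an ordinary decimal. The small sketch below splits a score such as 6.6 into its two parts. It takes the score as text to avoid rounding problems, and it assumes a single digit after the decimal point; the function name is hypothetical.

```python
def describe_grade_equivalent(ge):
    """Turn a grade-equivalent string such as '6.6' into a plain description."""
    grade_text, _, month_text = ge.partition(".")
    grade = int(grade_text)
    # The digit after the point is the month of the school year (0-9)
    month = int(month_text) if month_text else 0
    if not 0 <= month <= 9:
        raise ValueError("expected a single digit after the decimal point")
    return f"grade {grade}, month {month} of the school year"


print(describe_grade_equivalent("6.6"))  # grade 6, month 6 of the school year
print(describe_grade_equivalent("1.6"))  # grade 1, month 6 of the school year
```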
Things to include in an assessment report.
What type of standardized assessment is this?
Aptitude/IQ
Achievement
Norm-referenced
Criterion-referenced
What is the purpose of this assessment?
The purpose of this assessment is to determine a student’s ______ ability in the areas of _____.
The assessment also provides identification of strengths and weaknesses.
Questions include _____
What is the child’s stanine?
______ is in the 5th stanine, which is considered average. An average score falls between the 4th and 6th stanine.
What is the child’s percentile?
Overall, _____ was rated at the 42nd percentile when compared to their peers, which means that _____ performed as well as or better than 42% of peers of the same age or grade level.
What is the child’s grade equivalency?
_____ received a GE of 1.6, meaning their performance is similar to that of a student in the sixth month of 1st grade.
How did they perform compared to their peers?
_____’s standard score was 98. The average standard score is between 85 and 115. This means that _____ has an average standard score.
What areas of strength does the child have?
_______’s area of strength is Word Reading. _____ scored 88% for sight or irregular word meanings and 79% for decodable word meanings.
What areas of need does the child have?
_____’s weakest areas were in sentence comprehension, which included prepositions (0%), adverbs (0%), and complex (33%). _____ also showed difficulty when presented with passages, including questioning (33%), clarifying (36%), fiction (33%), science (50%), practical (50%), and medium-sized (27%) passages.
This doc includes links to Zoom recordings that describe terminology of score interpretation.