On Your Mark

This activity can be used in a faculty meeting. It is based on the book On Your Mark.

The Purpose of Grading and Reporting

The key to overcoming these difficulties is to begin the entire reform process by focusing on the purpose of grading and reporting (Brookhart, 2011a; Guskey & Bailey, 2010). Before considering any change in the format or structure of the report card, reformers must first clarify its purpose. In other words, why do we assign grades or marks to students' work, and why do we record those marks on report cards?

Researchers who have asked teachers and school leaders these questions generally find that their answers can be classified in six broad categories (see Airasian, 2001; Feldmesser, 1971; Frisbie & Waltman, 1992; Guskey & Bailey, 2001; Linn, 1983). As described in Developing Standards-Based Report Cards (Guskey & Bailey, 2010), these categories include:

1. To communicate information about students' achievement in school to parents and others. Grading and reporting provide parents and other interested persons (for example, guardians, relatives, and so on) with information about their child's achievement and learning progress in school. To some extent, they also serve to involve families in educational processes.

2. To provide information to students for self-evaluation. Grading and reporting offer students information about the level and adequacy of their academic achievement and performance in school. As a source of feedback, reports also serve to redirect and hopefully improve students' academic performance.

3. To select, identify, or group students for certain educational paths or programs. Grades and report cards are the primary source of information used to select students for special programs. High grades typically are required for entry into gifted education programs and honors or advanced classes. Low grades are often the first indicator of learning problems that may result in students' placement in special needs programs. Report card grades on transcripts are also used in determining admission to selective colleges and universities.

4. To provide incentives for students to learn. Although many educators debate the idea, extensive evidence shows that grades and other reporting tools are important factors in determining the amount of effort students put forth and how seriously they regard learning or assessment tasks (Brookhart, 1993; Cameron & Pierce, 1994, 1996; Chastain, 1990; Ebel, 1979; Natriello & Dornbusch, 1984).

5. To evaluate the effectiveness of instructional programs. Comparisons of grades and other reporting evidence are frequently used to judge the value and effectiveness of new programs, curriculums, and instructional strategies.

6. To provide evidence of students' lack of effort or inappropriate responsibility. Grades and other reporting devices are also used to document the inappropriate behaviors of students. In addition, teachers sometimes threaten students with poor grades in order to coerce more acceptable and suitable behavior.

While educators generally agree that all of these purposes may be legitimate, they seldom agree on which purpose is most important. If asked to rank-order these purposes in terms of their importance, school leaders and teachers often vary widely in their responses, even when they are staff members from the same school (see Guskey, 2013a). And that is precisely the problem.

When educators don't agree on the primary purpose of grades, they often try to address all of these purposes with a single reporting device, usually a report card, and end up achieving none very well (Austin & McCann, 1992). The simple truth is that no single reporting instrument can serve all of these purposes well. In fact, some of these purposes actually run counter to others.

Suppose, for example, that the educators in a particular school or school district work hard to help all students learn well. Suppose, too, that these educators are highly successful in their efforts and, as a result, nearly all of their students attain high levels of achievement and earn high grades. These positive results pose no problem if the purpose of grading and reporting is to inform parents about students' achievement or to provide students with self-evaluation information. The educators from this school or school district can take pride in what they've accomplished and can look forward to sharing those results with parents, students, and others.

This same positive outcome poses major problems, however, if the purpose of grading and reporting is to select students for different educational paths or to evaluate the effectiveness of instructional programs.

Selection and evaluation demand variation in grades. They require that the grades be widely dispersed in order to differentiate among students and programs. How else can selection take place or one program be judged as better than another? But if all students reach the same high level of achievement and earn the same high grades, there is no variation in the grades. Determining differences under such conditions is impossible. Thus, while one purpose is served well, another is not.

Mathematical Precision Versus Valid Grades

Consider the performance patterns of the seven students described in figure 7.1 (page 86). Their assessment results over five instructional units are shown on the left side of the table. The boldfaced scores on the right side of the table represent summary scores for these students calculated by three different tallying methods that reflect three different mathematical algorithms.

The first method is the simple arithmetic average of the unit scores, with all units receiving equal weight. The second is the median, or middle score, from the five units. In other words, if the five scores were arranged in rank order, the median would be the third of the five ranked scores. Because the median is positional rather than proportional, it is not influenced by extreme scores, as is an average. One extreme score might drastically affect the average, for example, but it would not affect the median, because the middle score in a set of ranked scores would stay the same.

The third method is the arithmetic average, deleting the lowest unit score in the group. This method is based on the assumption that no one, including students, performs at a peak level all the time (Canady & Hotchkiss, 1989). So this method gives students the advantage of having their lowest score removed from the grade calculation. These are the three tallying methods, or mathematical algorithms, most frequently used by teachers and most commonly employed in computerized grading programs and electronic gradebooks.
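To make these three tallying methods concrete, here is a minimal sketch in Python. The unit scores are hypothetical (they are not the figure 7.1 data), and the letter-grade cutoffs follow the grading categories listed after the student descriptions below.

```python
# Illustrative sketch only: hypothetical unit scores and the 90/80/70/60
# letter-grade cutoffs given later in this section.
from statistics import mean, median

def to_letter(score):
    """Convert a percentage score to a letter grade."""
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

def summarize(unit_scores):
    """Return the summary score produced by each of the three tallying methods."""
    return {
        "method 1: arithmetic average": mean(unit_scores),
        "method 2: median": median(unit_scores),
        "method 3: average, lowest score deleted": mean(sorted(unit_scores)[1:]),
    }

# A hypothetical "poor start / great finish" pattern over five units.
scores = [55, 55, 95, 95, 95]
for method, summary in summarize(scores).items():
    print(f"{method}: {summary:.1f} -> {to_letter(summary)}")
```

Even this single hypothetical pattern produces three different grades, a C (79.0), an A (95.0), and a B (85.0), depending solely on which algorithm is applied.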

Student 1 (Steady Improvement) struggled in the early part of the marking period but continued to work hard, improved in each unit, and performed excellently in unit 5.

Student 2 (Steady Decline) began with excellent performance in unit 1 but then lost motivation, declined steadily during the marking period, and received a failing mark for unit 5.

Student 3 (Consistent Performance) performed steadily throughout the marking period, receiving three Bs and two Cs, all near the B-C cutoff.

Student 4 (Poor Start / Great Finish) began the marking period poorly, failing the first two units, but with newfound interest performed excellently in units 3, 4, and 5.

Student 5 (Great Start / Poor Finish) began the marking period excellently, but then lost interest and failed the last two units.

Student 6 (Unexcused Absence) skipped school during the first unit, but performed excellently in every other unit.

Student 7 (Cheating) performed excellently in the first four units, but was caught cheating on the assessment for unit 5, resulting in a score of 0 for that unit.

Grading categories:

90%-100% = A
80%-89% = B
70%-79% = C
60%-69% = D

As can be seen, while all three of these tallying methods are mathematically precise, each one yields a very different pattern of grades for these seven students. If you use the arithmetic average (method 1), all seven students would receive the same grade of C. If you use the median (method 2), there would be two Cs, one B, and four As. And if you use an arithmetic average, deleting the lowest score (method 3), there would be one C, four Bs, and two As. Note, too, that the one student who would receive a grade of C using method 3 had unit grades consisting of just two Cs and three Bs. More importantly, no student would receive the same grade across all three methods. In fact, two students (4 and 5) could receive a grade of A, B, or C, depending on the tallying method and mathematical algorithm used!

Oddly, when asked to select the method they would likely use prior to seeing these data, the majority of teachers choose method 3: the arithmetic average deleting the lowest score. Most teachers believe that method would offer students the best chance of receiving a high grade.

But as the figure shows, using that method would result in only two students receiving a grade of A, and those two students would be the one who skipped school and the one who cheated. Would any teacher consider that to be an optimal outcome? And student 7 always prompts controversy. Some teachers believe that a single transgression, even one as serious as cheating, should be forgiven if appropriate reparations are made. Others vehemently oppose such leniency. They believe the likelihood is great that this student also cheated on all previous assessments but simply was not caught.

The teacher responsible for assigning grades to the performance of these seven students has to answer a number of difficult questions. For example, which of these three methods is truly fairest? Which method provides the most accurate summary of each student's achievement and level of performance? Do all seven students deserve the same grade, as the arithmetic average (method 1) indicates, or are there defensible reasons to justify different grades for certain students? And if there are reasons to justify different grades, can these reasons be clearly specified? Can they be fairly and equitably applied to the performance of all students? Can these reasons be clearly communicated to students before instruction begins? Would it be fair to apply them if they were not? The nature of the assessment information from which these scores are derived could make matters even more tangled. Might it make a difference, for example, if you knew the content of each unit assessment?

Alternatives to Averaging

  1. Give priority to the most recent evidence.
  2. Give priority to the most comprehensive evidence.
  3. Give priority to evidence related to the most important learning goals or standards (a brief sketch of these alternatives appears after this list).
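The sketch below shows one way these three alternatives might be translated into calculations. The weighting schemes and example scores are illustrative assumptions; the text itself does not prescribe particular formulas.

```python
# Illustrative only: simple interpretations of the three alternatives to averaging.

def most_recent(unit_scores):
    """Alternative 1: let the most recent evidence stand for current learning."""
    return unit_scores[-1]

def weighted_by_scope(unit_scores, scope_weights):
    """Alternative 2: weight each assessment by how comprehensive it is
    (for example, a cumulative exam counts more than a short quiz)."""
    return sum(w * s for w, s in zip(scope_weights, unit_scores)) / sum(scope_weights)

def weighted_by_priority(standard_scores, priority_weights):
    """Alternative 3: weight evidence by the importance of the standard it measures."""
    return sum(w * s for w, s in zip(priority_weights, standard_scores)) / sum(priority_weights)

# A steadily improving student ends at 95; alternative 1 reports 95 rather
# than the 75 that a straight average of these scores would yield.
print(most_recent([55, 65, 75, 85, 95]))            # 95
print(weighted_by_scope([80, 85, 90], [1, 1, 3]))   # cumulative exam weighted 3x -> 87.0
print(weighted_by_priority([70, 95], [1, 4]))       # priority standard weighted 4x -> 90.0
```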

Use of Zeros

Few teachers believe that grades should be used to punish students for their lack of effort or for demonstrating inadequate responsibility. At the same time, however, many teachers assign zeros to students' work that is missed, neglected, or turned in late (Canady & Hotchkiss, 1989; Stiggins & Duke, 1991). Obviously if grades are to represent how well students have learned, then the practice of assigning zeros clearly misses the mark (Raebeck, 1993; Reeves, 2004).

Zeros have an even more profound effect if combined with the practice of averaging in a percentage grading system (see chapter 2). Students who receive a single zero have little chance of success because such an extreme score so drastically skews the average. Note, for example, the scores of students 6 and 7 in figure 7.1 (page 86). That is why, when combining judges' scores in Olympic events like gymnastics, diving, or ice-skating, the highest and lowest judges' scores are always eliminated. If they were not, one judge could control the results of the entire competition simply by giving extreme scores.
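A small, hypothetical example (again, not the figure 7.1 data) shows how sharply a single zero skews a percentage average while leaving the median untouched:

```python
# Hypothetical scores illustrating the effect of one zero in a percentage system.
from statistics import mean, median

completed = [95, 92, 90, 94]     # all A-level work
with_zero = completed + [0]      # one missing assignment recorded as a zero

print(mean(completed))    # 92.75 -> an A under the 90/80/70/60 scale above
print(mean(with_zero))    # 74.2  -> a C; one zero pulls the average down two letter grades
print(median(with_zero))  # 92    -> unaffected, because the median is positional
```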

Some teachers defend the practice of assigning zeros by arguing that they cannot give students credit for work that is incomplete or not turned in, and that's certainly true. But there are far better ways to motivate and encourage students to complete assignments than by assigning zeros, especially considering the overwhelmingly negative effects.

One alternative to assigning zeros is to record an I or "Incomplete" grade, and then require students to do additional work to bring their performance up to an acceptable level (Guskey, 2004). Students who miss an assignment or neglect a project deadline, for example, might be required to attend after-school study sessions or special Saturday school programs in order to complete their work (Reeves, 2012). Parents, of course, are informed of this policy and the reasons behind it before it is implemented. In schools where after-school programs are not possible, students might be required to attend a special study session held during lunchtime. In other words, they are not let off the hook with a zero. Instead, students learn that they have responsibilities in school and that their actions have specific consequences. In addition, this approach helps make the grade a more accurate reflection of what students have learned.

Students should learn to accept responsibility for their actions or inaction and, within reason, should be held accountable for their work. Nevertheless, no evidence shows that assigning zeros helps teach students these lessons (Kohn, 1993). Unless we are willing to admit that we use grades to show evidence of students' lack of effort or inappropriate responsibility, alternatives to the practice of assigning zeros must be found.

The use of an I or "Incomplete" grade as an alternative to assigning zeros is both educationally sound and potentially quite effective. Students who miss an assignment or neglect a project deadline receive a grade of I and then are required to take the steps necessary to complete their work to a satisfactory level. Teachers stress that this policy is not a punishment, but a structured opportunity to ensure students succeed. And although implementing such a policy typically requires additional funding and support, the payoffs can be great. Not only is it more beneficial to students than simply assigning a zero, but it is also a lot fairer. Furthermore, it helps make the grade a more accurate reflection of what students have learned.