In April 2016, I was hired by an independent school to observe its middle and high school English department for two days and offer feedback, first in person and then in a written report. The following is a lightly edited version of the report’s section on assessment strategies.
The school avoids giving grades at all until eighth grade. Some teachers view this as a strength, while a few others view it more as a weakness. My own perspective leads me to label this a strength for many reasons, not least of which is that the absence of grades renders debates about the minutiae of quantitative assessment unnecessary.
Based on my attendance at the department-wide conversation on grading and on my interviews with faculty, it seems that despite the diversity of opinion within the department, everyone comes at the issue with a depth of thinking and the clear objective of doing right by the students. No one falls back on the argument that “it’s just how we do things here.”
The department’s practice of “cross-grading”―of having different teachers evaluate the same paper, with an eye toward consistency and common priorities―is (or at least should be) helpful in getting to the root disagreements within the department and crafting grading policy that at least honors the diversity of opinion.
The students at our lunch meeting reported that during student-teacher conferences, their English teachers typically go over what they did well alongside what they didn’t do as well. This demonstrates that the department is not stuck in a deficit-based mindset regarding assessment.
The department obviously struggles with the root questions that any thoughtful English department should be struggling with: Should we be giving grades? If so, which assignments should we grade? What kind of grades should we give? What do our grades mean? How objective are these grades? Who is the primary audience for these grades?
A related struggle is how to conduct and resolve conversations about these questions, and what to do when the department’s educational instincts veer away from what is assumed to be the imperative to use grades to help colleges determine which of the school’s students they should consider accepting.
The department also struggles with questions that, from my perspective, are not as essential. One of these questions is whether “grade inflation” is occurring and what to do about it. Another is how to explain the distinction between, as a few teachers put it, an 87% and an 89%. I consider these nonessential because the questions themselves are symptoms of the larger unresolved problems.
Some teachers at the department meeting I attended discussed the potential dangers of a “deficit-based model” of assessment. From my perspective, when teachers are on the hunt for deficiencies and then grade accordingly (perhaps in order to avoid giving too many A’s), they risk placing students in a fixed mindset about their abilities and intelligence, and over time, the lower grades become a self-fulfilling prophecy. This is one reason why overcoming the concerns about grade inflation is vital.
The students at our lunch meeting seemed quite confused about grades in English. One of them posed this question: “I’m supposedly a ‘B’ writer, but what does that mean?” These students expressed a desire for grading criteria to be more transparent. They also revealed to us that many students choose their elective courses based largely on “how hard a grader the teacher is.” The students discussed other issues, as well. One student expressed frustration that she took an intellectual risk with an argument, only to see her grammar mistakes bring her overall essay grade down to an 85%. Another student asked whether these grades are about the quality of their ideas or instead about how they express their ideas.
One specific question that seems to get a lot of attention is whether to allow students to submit two, three, or more revisions of a written piece, and, if so, which version(s) should be graded.
Another popular specific question: How might the department avoid grading formative assessments, and generally grade fewer assignments, without ending up with so few grades at the end of the semester that averaging them appears unfair and weights a single assignment too heavily?
Finally, this question: Since the department does not offer honors or AP courses, how are colleges to differentiate between good students and exceptional students if so many of them have A’s on their transcripts?
Letter grades, along with the corresponding plusses and minuses, are intended as shorthand. Yet what are they shorthand for? Is a grade intended to reflect mastery of skills? Conceptual understandings? Natural talent? Effort? Responsibility? Improvement? Does a student earn a grade as a reward for work, or does a student receive a grade as recognition of mastery or as information about how she is doing? Or is a grade simply intended to convey how a student ranks relative to her peers?
The next set of questions: What is each letter intended to signify? Does an A mean that the student has met all the requirements and has understood the material? Or is that a B? Does an A mean instead that the student has performed above and beyond the expectations? Or is that an A+? (And what are those expectations anyway?) Does a C mean that the student is doing well enough? Or is it a signal that intervention is necessary? These questions seem much more important than whether grades are “inflated” to the point where there are too many A’s.
Grades in any subject area, particularly in a class that focuses on writing, are inherently subjective to some degree. Rather than attempt to make grades as objective as possible, the department might more explicitly focus on how to strike the best balance between what colleges need and what students need. The department already appears to acknowledge that these two needs are different and largely incompatible, in that moving too far in favor of one need serves to undermine the other need.
The department’s practice of assigning percentages to student writing, particularly without the guidance of a rubric, strikes me (and the students we spoke with) as exceedingly arbitrary. What is the rationale for continuing to use percentages?
From my perspective, the issue of “grade inflation” is primarily an issue of incentives. A teacher’s job is to teach a discipline effectively enough that most if not all of that teacher’s students excel by the conclusion of the course. Therefore, it would follow that the best teachers are the ones whose students receive A’s; likewise, if a critical mass of students receives C’s and D’s, the teacher’s competence (or the curriculum’s appropriateness) is suspect. If an effort were made to combat “grade inflation” without first coming to consensus about what each letter grade actually means, then the incentive for the teacher would be either to teach less effectively so that more students would do less well, or to unfairly punish some students with lower grades in order to arrive at a more balanced bell curve. (That is, unless the department intends for grades to reflect natural ability, in which case teachers don’t really need to teach much in the first place!)
One approach to resolving many of the department’s concerns and confusions about grading would be to adopt a standards-based model. In this model, students are graded based on a variety of common standards, which are announced in advance and plotted on rubrics that stay consistent over the course of many assignments or even the entire year. This resolves the questions of what grades are shorthand for (mastery, not effort); of what each letter signifies (up to the department, but consistent throughout the department); and of the extent to which grades are subjective (somewhat, but tied to a rubric for maximum objectivity). Standards-based grading also makes the question of grade inflation moot.
While I do think standards-based grading would be an improvement over what currently exists in the English department, I also want to acknowledge a number of potential drawbacks of the approach. The use of rubrics can lead teachers to assess only what seems measurable, and we know that with writing, there are many valuable components that are unrubricable. Similarly, rubrics take the art of writing and break it down into long lists of isolated skills; even as students may improve in all of the skill areas, they may lose sight of their writing as a holistic endeavor. For these reasons, I see the school’s lack of rubrics as a strength, and I would be hesitant to advise the department to introduce them as an antidote to the grade questions.
With all this said, here are a number of questions that move beyond those discussed above and that imagine alternatives to the department’s current assumptions about grades and assessment. I am not necessarily endorsing any of these particular approaches; rather, I offer them as starting points from which deeper discussion might emerge.
Why must each assignment receive a single grade?
Imagine giving three, four, five, or even more grades to each piece of writing, with each grade serving as shorthand for a skill set in categories such as structure, grammar, or ideas. This approach would increase the sheer number of grades in the gradebook while allowing teachers to grade fewer assignments (namely summative assessments).
Why must grades be compiled based on assignment?
Imagine grading based on skill set (using the system proposed above) and at the end of the semester averaging not assignment grades but skill set grades. So, for example, argument structure might count for 40% of a student’s final semester grade, grammar for 25%, etc. This approach would make the final grade less arbitrary, based not on what the teacher happened to assign that semester but rather on how the student has been doing in specific areas of the curriculum.
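To make the arithmetic concrete, here is a minimal sketch of skill-set averaging. Only the 40% and 25% weights come from the example above; the third category ("ideas"), its 35% weight, the 4-point scale, and the function names are assumptions chosen purely for illustration, not a recommended scheme.

```python
# Hypothetical weights: 40% and 25% are from the example in the text;
# "ideas" at 35% is an assumed placeholder that fills in the "etc."
WEIGHTS = {
    "argument_structure": 0.40,
    "grammar": 0.25,
    "ideas": 0.35,
}


def skill_averages(graded_pieces):
    """Average each skill set across every graded piece that assessed it."""
    totals = {}
    counts = {}
    for piece in graded_pieces:
        for skill, score in piece.items():
            totals[skill] = totals.get(skill, 0.0) + score
            counts[skill] = counts.get(skill, 0) + 1
    return {skill: totals[skill] / counts[skill] for skill in totals}


def semester_grade(graded_pieces, weights=WEIGHTS):
    """Combine per-skill averages into one weighted semester grade."""
    averages = skill_averages(graded_pieces)
    missing = set(weights) - set(averages)
    if missing:
        raise ValueError(f"no grades recorded for: {sorted(missing)}")
    return sum(weights[skill] * averages[skill] for skill in weights)


# Two summative pieces on an assumed 4-point scale; the second piece
# was not graded for ideas, so that skill is averaged over one piece only.
pieces = [
    {"argument_structure": 3.0, "grammar": 2.0, "ideas": 4.0},
    {"argument_structure": 4.0, "grammar": 4.0},
]
final = semester_grade(pieces)
```

Because each skill is averaged separately before weighting, a piece graded in only some categories does not drag down the others, and the final number no longer depends on how many assignments happened to be given.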
Why must the final grade be an average of grades from individual assignments?
Imagine arriving at the final semester grade based not on the “data” from individual assignments but instead on overall patterns from the semester. This approach appears on the surface to be much more subjective than the typical system, yet the school’s process of student-teacher conferences provides the perfect opportunity to see the bigger picture, beyond just the sum of the parts. Were the department to couple this approach with a clear statement of what each letter grade means, it would change the incentive structure for assignments, away from one in which students feel compelled to please the teacher and toward one in which students feel encouraged to experiment and take risks, without fear of being penalized for it at the end of the semester. This approach would also allow teachers to focus on more aspects of learning than what seems measurable within the curriculum.
Why must the final grade be determined solely by the teacher?
Imagine working with each student to determine her final grade, based on a set of general criteria established at the beginning of the year and on a clear statement of what each letter grade means. One method is to have each student complete a reflection form about her semester, ending with a proposed letter grade. If her proposed grade matches the teacher’s own determination within a plus or minus, then the process is done, with no averaging necessary. In the (surprisingly rare) cases when they don’t match, the teacher can write a draft of the report, sit with the student as she reads it to confirm its accuracy and fairness, and then ask the student which letter grade matches the report best. Her answer will very likely mirror the teacher’s own conclusion. Teachers who employ a version of this approach often find that they learn more about each student’s learning process and experience of the course, enabling them not only to provide the most “accurate” grade but also to improve the course itself going forward.
Why must the teacher grade every written piece?
As a number of teachers pointed out during the department meeting I attended, one roadblock to having students write more pieces is that already time-crunched teachers would have to grade all of them. Yet once a department has established a difference between formative and summative assessment, students can write pieces in formative mode without necessarily receiving a grade―maybe, instead, a few comments here and there regarding an area of particular concern to the student. Then, once a student feels ready to demonstrate her mastery of a particular skill, she declares her next piece to be summative, and thus to be graded. (If the department adopts the approach discussed at the top of this list, then the piece might be summative in a specific skill set category.) This differentiated approach treats grades not as rewards for work but as acknowledgements of skill, and it promotes student-centered learning in a much deeper way. The approach also potentially settles the question of whether a student may submit more than one revision of a piece for credit: once she has declared herself grade-ready, the first version is the graded version, yet she has opportunities for drafting and revision in the non-graded pieces leading up to that.
Why must the transcripts all suggest the same course level?
This is admittedly an oddly worded question, but the gist of it is this: Imagine giving a student an “A/B” for the semester, and then giving her choices as to what goes on her transcript. Option 1 could be an A in English, and Option 2 could be a B in “honors” English. Since the primary concern of the department is how best to communicate performance and potential to colleges, this approach gives students and families control over how they are perceived, given the high level of all the school’s English courses. It also lessens the pressure on teachers to determine the most “accurate” grade, instead requiring only a grade range.
I come away from my visit extremely impressed with the school and the English department. The teachers are of the highest caliber pedagogically and intellectually. While they vary to some extent in their philosophical and stylistic approaches, they all care deeply about the education of their students and are, for the most part, asking the right set of questions about how they might improve the program for the benefit of the students. It also seems that this process of hosting outside visitors for observations, interviews, and group meetings was a positive one for the department. Teachers were very open with me about what they see as the weaknesses of the program while also being effusive in their admiration for each other and their excitement about working there. I’ll also say selfishly that I learned a great deal during my visit, and I come away a more thoughtful educator for having gone through this process of observation, questioning, and critique. I have high praise for the school for initiating this ongoing, cyclical process of departmental evaluation.
Final department grade: 87%. Or maybe 89…