The purpose of a Progress Report is to provide a snapshot in time that helps families, learners, and future educators understand learners' strengths and areas for growth in order to support them in their next steps.
Juab School District includes the following statement on the Progress Report to describe the purpose of the report:
When designing a new assessment, grading, and reporting structure, systems must consider the appropriate level of detail to report on. It might be helpful to return to the levels and depth of an outcome, which you can read more about here.
In the Common Core Framework, Standards (or sub-standards) are the smallest grain size of an outcome and are what we teach, assess, and provide feedback on. In a Competency-Based approach, these are usually the indicators or performance level descriptors at the skill or sub-skill level. However, reporting on every skill, subskill, indicator, or standard in a formal Progress Report could result in a lengthy report that overwhelms families and learners. Therefore, many systems choose to report at a higher level to provide a summary of progress.
In a Standards-Based approach, many systems report on domains, providing more specificity than summarizing at the course or subject level only, but less detail than sharing every standard on the report.
Some systems, like Mineola Public Schools, have identified essential standards – a selection of layer 4 in the image above – as part of their outcome process and report on all of them, believing that level of reporting is crucial for clarity on learners' progress relative to specific learning goals.
In a Competency-Based approach, it’s possible to summarize at any level for sharing. Some systems, such as Embark Education, report on individual skills or competencies. Others choose to summarize scores at a higher level; Waukesha East, for example, reports at the subject-area level.
Above are examples of Waukesha East’s and Embark Education’s competency frameworks, illustrating the variation in how systems taking a competency-based approach to outcomes can assess and report.
As you read each school’s profile in this section, you will see that the level at which each system reports varies greatly.
Once a system decides at what level of the outcomes framework they will assess and report, a discussion needs to take place about how assessment will be summarized on the Progress Report. In other words, educators and learners will be looking at evidence of where learners are on the outcomes over the course of an entire term. The question is, how will all of that evidence, collected over time, contribute to a final proficiency level evaluation, or a summary of progress, on the Progress Report?
Embark Education's Competency Critical Thinking, with 3 subskills, is used in this example to highlight how to summarize progress, using their 3 proficiency levels: (P) Practicing, (M) Meeting, (E) Extending.
In order to determine how to summarize progress on the Progress Report, a system must answer two questions:
How will we determine the learner’s current proficiency level on each of the individual outcomes, given multiple pieces of evidence?
How will we summarize the proficiency levels on each individual outcome up to a higher layer in the outcome framework?
Question 1: Using Embark’s competencies as an example, the first question asks: What is this learner’s overall or final proficiency level for the subskill “Inquire,” given the 4 different proficiency level scores they have received this trimester? In other words, if we are not averaging tests and quizzes together, how are we using evidence to determine proficiency?
Question 2: Given this learner’s proficiency levels for the subskills “Inquire,” “Evaluate,” and “Conclude,” what should their proficiency level be for “Critical Thinking,” which is the layer of outcomes reported on the Progress Report?
There are two main approaches to answering these questions:
A mathematical formula approach: a system uses technology and a mathematical formula to take all pieces of evidence and recommend a final proficiency level for the Progress Report that educators can typically override using professional judgment
A body of evidence approach: a system defines rules and/or holistic rubrics that support educators in making a professional judgment about a final proficiency level by looking at multiple pieces of evidence
Many systems take a mathematical formula approach as a way to track progress over time and leverage technology to save educators’ time when finalizing Progress Reports at the end of the term. However, mathematical formulas, much like traditional grades, cannot account for everything that happens in learning, so most systems incorporate structures that let educators use professional judgment to override scores. Other systems choose a body of evidence, or more holistic, approach that looks at all evidence over a period of time. These systems, however, must have clear guidelines in place to prevent final assessments from becoming subjective and unclear.
As a system discusses the different approaches to summarizing and reporting progress, it may need to return to previous decisions regarding the design of the outcomes, the levels of proficiency, and the decisions around what counts as evidence of learning. It’s important to treat this design process as cyclical, not linear.
A mathematical approach to summarizing progress means that educators input proficiency level scores (either numerical or descriptive) into a technology tool. These tools are typically an LMS (Learning Management System) or an SIS (Student Information System), although some systems use low-tech options such as spreadsheets. These technology tools use a mathematical formula to produce a final score, descriptive or numerical, for a Progress Report.
There are a variety of formulas that technology tools use, all with their own pros and cons. It’s important to use these technologies as intended: as tools that give professional educators recommendations so they can do their work more efficiently. Many systems that take a mathematical approach have structures in which educators review those calculations and can make professional judgments to override the technology. Ken O’Connor advocates for this in his book “A Repair Kit for Grading,” saying, “grading should not be merely a numerical, mechanical exercise; it should be an exercise in professional judgment” (O’Connor, 2022, p. 145).
In a traditionally graded system, everything is averaged, typically over an entire year, so one poor performance can bring down a grade even if it came early in the year. Researchers writing about grading reform have encouraged using the most recent evidence and leaning away from averaging, under which it is difficult for a learner to overcome an early low score (Guskey, 1996; O’Connor, 2022; Feldman, 2019). The mathematical formula a system chooses to determine a final proficiency level score on an individual outcome largely depends on the technology tool they use, as different platforms offer different options.
The following are the most commonly offered calculations (a short code sketch after the list shows how several of them behave):
Average: a straight average of all scores on the same outcome over a given period of time
Decaying Average: an average that weights the most recent score more heavily (usually 65-75%), counting the average of all previous scores on the same outcome over a given period of time for only the remaining 25-35%
Most Recent: this approach only takes the most recent score on an outcome into consideration, ignoring all previous scores as “practice” or “formative,” even if they were higher
Highest: this approach only takes the highest score on an outcome into consideration, ignoring any other scores, even if they were given recently
Average of X Highest: this approach takes the average of the highest scores (typically the 3-4 highest) within a given period of time, ignoring any lower scores in the calculation
Mode: this approach takes the score that has been given the most on the same outcome over a given period of time
Power Law: similar to a decaying average but based on Robert Marzano’s research and writing in his book Transforming Classroom Grading, this complex algorithm requires at least 3 pieces of evidence or scores in the technology tool and weighs more recent evidence and growth to predict a “best fit” score for where a learner is currently performing.
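To make the differences concrete, here is a minimal Python sketch of how a technology tool might implement several of these calculations (Power Law is omitted because its best-fit algorithm is considerably more involved). The function names, the 1-4 score scale, and the tie-breaking rule for Mode are illustrative assumptions, not any specific vendor’s implementation:

```python
from statistics import mean, multimode

# Illustrative sketch of roll-up formulas a technology tool might offer.
# `scores` is a chronological list of numeric proficiency scores on a
# single outcome (a 1-4 scale is assumed here).

def straight_average(scores):
    return mean(scores)

def decaying_average(scores, recent_weight=0.65):
    # Weight the most recent score ~65% and the average of all
    # previous scores ~35%, per the description above.
    if len(scores) == 1:
        return scores[0]
    return recent_weight * scores[-1] + (1 - recent_weight) * mean(scores[:-1])

def most_recent(scores):
    return scores[-1]

def highest(scores):
    return max(scores)

def average_of_x_highest(scores, x=3):
    return mean(sorted(scores, reverse=True)[:x])

def mode(scores):
    # multimode returns all tied values; break ties toward the most
    # recent score (one reasonable, but assumed, rule).
    tied = multimode(scores)
    return next(s for s in reversed(scores) if s in tied)

scores = [1, 2, 2, 3, 4]  # a learner growing across a semester
for formula in (straight_average, decaying_average, most_recent,
                highest, average_of_x_highest, mode):
    print(f"{formula.__name__:>20}: {formula(scores):.2f}")
```

On this single growth trajectory, the recommendations already span two full proficiency levels – a 2.40 straight average versus 4.00 for Most Recent or Highest – which is exactly the kind of divergence the table below illustrates.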
In the table below, you can see the implications of choosing different formulas: each produces a different final score for the same learner on an individual standard assessed multiple times in a semester.
There are many pros and cons to each of these calculations, which are well documented by technology companies; JumpRope’s (n.d.) documentation is particularly thorough.
It’s important to remember that the more complex the formula, the more difficult it will be to explain final scores to families and to support educators and learners in understanding the system they are using. Joe Feldman writes in his book “Grading for Equity” about the formulas behind traditional weighted grades, but the same could be said of any mathematical formula used to determine final scores:
“If the formula confuses us, imagine how confusing it is for students and parents, particularly those who have less educational background. It’s hard to see how this kind of a formula gives students any feeling of ownership or control” (Feldman, 2019, p. 48).
Feldman goes on to recommend that schools focus on more recent performance, stating, “In most cases when we measure and describe someone’s skill, we describe her most recent performance at that skill” (Feldman, 2019, p. 97). Ken O’Connor, in his book A Repair Kit for Grading, advocates for a formula, applied with professional judgment, that takes into account the mode and more recent scores (2022).
Whatever calculation is used, it should be applied over a shorter time frame, such as a trimester or a semester, rather than averaging across the entire year, so that the report charts progress and growth over time instead of punishing learners for earlier scores.
Another consideration when using a mathematical approach is that it may require a certain amount of assessment data for the formula to work as intended. For example, if only two pieces of evidence are documented in the LMS and an average is used, a low score will weigh far more heavily than it would if multiple pieces of evidence were taken into account. Many systems therefore set a required number of pieces of evidence for each outcome, as sketched below. This is what Waukesha East has done as part of defining the requirements for completing what they call a Backpack: typically, each skill in their competency framework requires 3-4 pieces of evidence at a specific proficiency level before learners can move on to the next level.
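As a rough sketch of what such a requirement could look like if encoded in a tool, the rule below gates advancement on a minimum count of evidence at a given level. The function name, signature, and default of 3 pieces are hypothetical, loosely modeled on the 3-4 pieces described above:

```python
def ready_to_advance(scores, level, min_pieces=3):
    """Return True once at least `min_pieces` pieces of evidence have
    been documented at (or above) the target proficiency level."""
    return sum(score >= level for score in scores) >= min_pieces

print(ready_to_advance([2, 3, 3], level=3))     # False: only 2 pieces at level 3
print(ready_to_advance([2, 3, 3, 3], level=3))  # True: 3 pieces at level 3
```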
Once final proficiency levels on individual outcomes have been determined, a system needs to consider how all of those individual outcomes will be summarized at a higher layer in the outcomes framework. There are two main approaches to this decision: average and percent met.
Average
The majority of systems that use a mathematical approach to summarizing progress take an average of all of the final proficiency level scores on individual outcomes (standards, subskills, skills, etc.) within a layer (such as a domain or a competency). This is represented in the image below with standards, but this approach can also be taken with competencies.
Percent Met
There are some systems that take a percentage approach. For each proficiency level on the Progress Report, they define the percentage of the individual outcomes (usually individual standards) in that layer that learners are expected to meet. If learners earn a proficiency level below meeting on a standard, it is counted as not met. This approach is illustrated in the image below using standards, but it can also be taken with competencies.
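To make the contrast concrete, here is a minimal sketch of both roll-ups on the same hypothetical set of standard scores. The 1-4 scale, the level names, and the percent-met cut-offs are illustrative assumptions, not any particular system’s policy:

```python
from statistics import mean

LEVELS = {1: "Beginning", 2: "Developing", 3: "Meeting", 4: "Exceeding"}

# Hypothetical final scores on five standards within one domain.
standard_finals = [3, 4, 2, 3, 3]

# Average approach: average the standards' finals, round to a level.
print("Average:", LEVELS[round(mean(standard_finals))])        # 3.0 -> Meeting

# Percent-met approach: any score below Meeting (3) counts as not met;
# the percentage met is mapped to a level via system-defined cut-offs.
pct_met = sum(s >= 3 for s in standard_finals) / len(standard_finals)
if pct_met >= 0.90:
    reported = "Exceeding"
elif pct_met >= 0.75:
    reported = "Meeting"
else:
    reported = "Developing"
print(f"Percent met: {pct_met:.0%} ->", reported)              # 80% -> Meeting
```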
First, it’s important to revisit Embark’s competency framework. At Embark, they report at the competency layer on their Progress Report but they assess on individual subskills in their LMS. They also use a series of indicators to describe proficiency on each subskill that are not listed in the image above. See here for more details about this model.
Question 1: How will we determine the learner’s current proficiency level on each individual subskill, given multiple pieces of evidence?
Educators at Embark use the indicators in the performance descriptors (see the bullet points) to create daily learning targets and support learners with feedback. When they review evidence of learning on the subskill “Inquire” or “Evaluate,” educators enter a proficiency level score in their LMS using their 3 proficiency levels: (P) Practicing, (M) Meeting, (E) Extending.
Let’s focus on the Subskill “Inquire.” Imagine that the learner provides evidence of “Practicing” in November on a summative assessment. The educator would input a score of “Practicing” in the LMS on the subskill “Inquire.” In a separate learning experience in December, the learner again provides evidence on the subskill “Inquire.” This time, they have listened to their educator’s feedback from November and they have worked more to define the problem. On this piece of evidence, they showed the “Meeting” level so the educator inputs a score of “Meeting” on the subskill “Inquire” into the LMS. The student continues to work on the subskill and in January, in two separate learning experiences, the learner demonstrates evidence of the “Extending” level of proficiency so they receive those scores in the LMS.
Embark has chosen to use a straight average formula. This means that the “Practicing,” “Meeting,” and two “Extending” scores earned throughout the trimester will be averaged in the LMS to produce a final trimester score on the subskill “Inquire.” For the technology to do this, a numeric scale of 1-3 has been attached to each descriptive level (Practicing = 1, Meeting = 2, Extending = 3). The system rounds the average to the nearest level, so in this case the learner’s scores of 1, 2, 3, and 3 average to 2.25, which rounds to a final score of “Meeting” on this subskill.
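Here is a minimal sketch of this first roll-up step, assuming the 1-3 mapping above and conventional rounding to the nearest level; the names are illustrative, not the actual implementation in Embark’s LMS:

```python
from statistics import mean

TO_NUM = {"P": 1, "M": 2, "E": 3}   # Practicing, Meeting, Extending
TO_LEVEL = {1: "Practicing", 2: "Meeting", 3: "Extending"}

def subskill_final(evidence):
    """Straight average of one trimester's scores on a single subskill,
    rounded to the nearest proficiency level."""
    return TO_LEVEL[round(mean(TO_NUM[e] for e in evidence))]

inquire = ["P", "M", "E", "E"]       # Nov, Dec, Jan, Jan
print(subskill_final(inquire))        # mean = 2.25 -> "Meeting"
```

In practice the LMS performs this calculation automatically; the sketch only makes the arithmetic visible.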
Embark chose a straight average because multiple educators across multiple learning experiences will provide learners with opportunities to demonstrate proficiency on the same outcomes. Although philosophically they believe in considering more recent evidence, they wanted all of those pieces of evidence across learning experiences to be considered equally. This is where the timeframe for the formula is also considered: for Embark, only scores earned within a trimester are averaged, so learners start each trimester with a clean slate, scores from earlier trimesters cannot bring down their average, and growth over time remains visible. Educators can also override final scores from the technology to account for recency; in this example, an educator might override the computed “Meeting” to “Extending,” since the learner’s most recent evidence on “Inquire” was at the “Extending” level.
Question 2: How will we summarize the proficiency levels on each individual subskill up to a higher layer, the competency?
Embark assesses in the day-to-day at the subskill layer and summarizes that progress on the Progress Report at the competency layer. For the competency “Critical Thinking,” there are three subskills. The learner has earned final proficiency levels of Extending, Meeting, and Meeting on these three subskills. To produce the final score in “Critical Thinking” for the Progress Report, the LMS will average these three subskills’ final scores (an average of about 2.33) and round accordingly. Therefore, this learner would have a final score of “Meeting” on the competency “Critical Thinking” on the Progress Report.
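Continuing the sketch above, the second roll-up step averages the three subskill finals to produce the competency-level score. The same mapping and rounding assumptions apply, and the assignment of levels to particular subskills is illustrative:

```python
from statistics import mean

TO_NUM = {"P": 1, "M": 2, "E": 3}
TO_LEVEL = {1: "Practicing", 2: "Meeting", 3: "Extending"}

# Finals for Inquire (after the educator override), Evaluate, Conclude.
subskill_finals = ["E", "M", "M"]
competency = TO_LEVEL[round(mean(TO_NUM[s] for s in subskill_finals))]
print(competency)   # mean = 7/3 ~= 2.33, rounds to 2 -> "Meeting"
```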
At Embark, educators meet in the final weeks of the trimester to ensure there are enough pieces of evidence documented over the term and to discuss the final scores recommended by the technology. They can override scores that they believe do not match the evidence the learner has demonstrated, and they can account for growth from the beginning of the trimester that the straight average may have lost.
Embark Education (Altitude Learning)
Mineola Elementary (Empower)
Synergy @ Mineola High School (Empower and Canvas Mastery Connect)
Bostonia Global (One Stone’s Growth Coach Tool)
Waukesha East (Beacon Learning)
Roseville City Schools (Otus and PowerSchool)
Ephrata Area Schools (PowerSchool)
Many systems take a body of evidence approach to determine final proficiency levels for reporting on progress. Rather than using mathematical calculations of scores over time, these systems look at evidence, which needs to be clearly defined, and use professional judgment to provide a final proficiency level on the outcomes.
To be clear, this is not educators randomly deciding that a learner has earned a level 2 or 3 on an outcome during Progress Report season. In this approach, educators look at concrete evidence of learning in a variety of ways: reviewing scores over time in a technology system and making their own calculation based on a set of norms or rules, looking at multiple large pieces of work, or looking at a portfolio of work and determining a final level of proficiency. In all of these systems, educators refer back to the evidence when providing a final proficiency level score.
For the International Big Picture Learning Credential, which is an alternative transcript and Progress Report, this means educators spend significant time unpacking the rubrics, called “Frames,” for each Learning Goal and deeply understanding the different proficiency levels. They then look at a minimum of 3 pieces of evidence, which need to be large in scope, such as a senior capstone project or thesis paper. Citing that evidence, they determine a holistic proficiency level from 1-5 for that Learning Goal. Educators providing these credentials go through an extensive professional learning program to practice this approach and ensure validity, which is underscored by the Frames’ development in partnership with psychometricians at the University of Melbourne. Although a technology tool – the RUBY Assessment Platform – is used to document and create the credential, no mathematical formula is used to calculate final levels of proficiency.
Educators at Hawaii Technology Academy and Surrey Schools input proficiency level scores into their system’s technology tool to track their body of evidence, but instead of having the technology provide a summary score using mathematical formulas, they look at all of the evidence accumulated over a term and use professional judgment to provide a final score.
This is also the approach taken, typically, when using a binary “earned” or “not yet earned” scale for proficiency levels. At Liberty Academy, which uses a binary scale, educators collect regular data on how learners are spending their time working towards competencies through Google Forms, which you can read more about here. They use that data as part of a body of evidence at the end of each 6-week term to determine what competencies learners have earned that are then documented in their Mastery Transcript as credits. No mathematical formula is used for that determination. Similarly, at Red Bridge, educators use evidence of assessments to provide learners their Learning Credit badges, which use no mathematical formulas or technology.
All systems that take a body of evidence approach have spent significant time thinking about, defining, and refining what evidence looks like and how they will use it to make a final determination of a student’s progress for the Progress Report. This could look like guiding educators to weigh more recent evidence of proficiency more heavily than evidence from the beginning of the year, providing robust rubrics, or setting clear guidelines around what counts as evidence. This is a key step in designing a Competency-Based assessment, grading, and reporting structure; if it is ignored, the structure can seem fluffy or incredibly subjective.