Welcome to the Second May 2023 Asynchronous Module
In this session, we will explore Theories of Action in detail – what they are, how we can measure them, and what steps we need to take to evaluate them correctly.
By the end of the session, Data Fellows will be able to:
List the assumptions behind their RSSP Theory of Action
Evaluate the veracity of those assumptions
Develop plans to use comparison group designs to create better estimates of RSSP intervention impact
Assess their district’s capacity to understand the validity of its Theory of Action
Resource Gathering
To successfully complete this session, you will need to have access to your district/charter's following RSSP Resources:
RSSP Theory of Action
Wildly Important Goal
Measurement plan
Implementation plan
Implementation data
Performance data
Case Study
At the beginning of SY22-23, the Lost Pines ISD RSSP team created the following Theory of Action to guide their RSSP process:
“If teachers implement high quality instructional materials in their classrooms with fidelity, students will achieve 10% growth on end of year assessments.”
Guided by their Theory of Action, the Lost Pines team implemented monthly professional development sessions designed to enhance teachers’ knowledge of HQIM as well as communicate expectations for how HQIM should be implemented within classrooms. Additionally, the RSSP team coordinated ongoing classroom walkthroughs that measured the degree to which teachers met the expectations conveyed in the professional development sessions. These walkthroughs revealed that at the beginning of the year 30% of teachers were meeting at least 80% of the criteria listed on the walkthrough form. By the end of the year, 70% of teachers were meeting at least 80% of the criteria listed on the walkthrough form.
At the end of the school year when summative assessment scores became available, the RSSP team learned that students experienced an average of 1% growth in math and 4% growth in reading when compared to their previous year’s test scores.
Reflection Question: Use your note-catcher to answer the following question: Given the circumstances described above, do you feel the Lost Pines RSSP team is able to determine if its Theory of Action is valid? Why or why not?
Throughout this session, you will be introduced to the concepts and frameworks needed to judge whether the Lost Pines RSSP team can determine if its Theory of Action is valid.
Theory of Action Overview
In the May Synchronous Session “Using Data for Goal Setting Part I”, we learned that a Theory of Action is:
The theory we create to guide stakeholder behavior toward the actions we assume will generate the outcomes we desire.
In the synchronous session, we further explained that a Theory of Action describes a series of linked assumptions. For example, the Theory of Action used by the Lost Pines RSSP team assumes that:
If professional development on HQIM is provided, it will improve teachers’ knowledge of HQIM
If teachers have a greater knowledge of HQIM, they will implement it in their classrooms with greater fidelity
If teachers implement HQIM with greater fidelity, students will receive more support for their learning needs
If students receive more support for their learning needs, they will experience greater academic growth
If students experience academic growth, that growth will appear on a summative assessment
To determine the validity of an RSSP Theory of Action, we should evaluate each embedded assumption. Doing so enables us to identify and target faulty assumptions, leading to positive change within the RSSP process.
Activity: Copy and paste into your note-catcher the linked assumptions you developed for your Theory of Action in the May Synchronous Webinar Using Data for Goal Setting: Part I. If you do not have access to your note-catcher from the previous webinar, please rewrite your district/charter's Theory of Action assumptions in this session's note-catcher.
How to Evaluate a Theory of Action
To effectively evaluate RSSP Theories of Action, we need to take the following steps:
Turn assumptions into questions
Collect the requisite data
Use correct analytical techniques
Turn Assumptions into Questions
Turning our assumptions into questions helps us develop clarity on how to evaluate them. For example, the assumption “If professional development on HQIM is provided, it will increase teachers’ knowledge of HQIM” can be turned into the question “Does providing professional development on HQIM increase teachers’ knowledge of HQIM?”
When turning assumptions into questions, specificity breeds clarity. Note the difference between the following:
“Does providing professional development on HQIM increase teachers’ knowledge of HQIM?”
“Does providing monthly professional development on HQIM that utilizes direct instruction and is held 30 minutes prior to school starting increase teacher knowledge of HQIM?”
The addition of details – monthly sessions that utilize direct instruction methods and are held immediately prior to school starting – surfaces potential pitfalls about the design of the PD.
Activity: Use the note-catcher to turn the assumptions embedded within your district/charter's RSSP Theory of Action into questions with specific details.
Collect Requisite Data
The inclusion of details helps us think of additional data we can collect to discover the weak points in the assumptions that power our Theories of Action. For example, asking teachers what time, frequency, and instructional methods would be most effective for their professional development could produce important insights into what adjustments could be made to ensure the first assumption in the Lost Pines Theory of Action is accurate.
When collecting data about our assumptions, it is recommended that we ask the following questions posed by Peter Rossi, Mark Lipsey, and Gary Henry in their book Evaluation: A Systematic Approach.
Do our data effectively represent the phenomena we are attempting to measure?
“Convenience and familiarity are not sufficient criteria for selecting a measure. In a recent study of the validation of measures of the effects of teacher training programs, most measures, including observational ratings of student teaching by university supervisors, survey of teacher candidates’ dispositions, or ratings of portfolios of teacher candidates’ work were unrelated to their subsequent performance as teachers. Only college grade point average and number of math courses were systematically related to the teacher candidates’ effectiveness in the classroom.”
Most of us would assume that the rating a student teacher receives from a university supervisor's observation would correlate with their performance as a full-time teacher. But in the study above, observational ratings were unrelated to eventual teacher performance. Surprisingly, college GPA and the number of math courses taken were the only measures systematically related to eventual effectiveness in the classroom. This finding is a caution: we have to be certain that the data we collect actually represent the underlying phenomena we are attempting to measure.
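One simple way to start checking whether a measure represents what we think it does is to see whether it moves together with the outcome we care about. The sketch below is only an illustration: the function name `pearson_r` and all of the numbers are made up for this example and do not come from Lost Pines or from the study quoted above. A correlation near zero would suggest the measure may not be capturing the phenomenon; a strong correlation is encouraging but, especially with so few data points, is not proof.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (left unscaled; the scaling cancels out in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_x * var_y)

# Hypothetical, made-up values for six teachers (not real data).
walkthrough_score = [55, 60, 70, 80, 85, 90]      # % of walkthrough criteria met
student_growth = [2.0, 1.0, 3.0, 2.5, 1.5, 3.5]   # average student growth, in points

r = pearson_r(walkthrough_score, student_growth)
print(f"correlation between walkthrough score and student growth: {r:.2f}")
```

In practice, a district would substitute its own walkthrough and growth data, and would want many more than six teachers before drawing any conclusion.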
Which assumptions are most urgent to test?
“It is seldom possible or useful to individually appraise each distinct assumption and expectation represented in a [Theory of Action]. But there are certain critical tests that can be conducted to provide assurance that it is sound.”
Because it is impractical to test every assumption embedded within our Theories of Action, it is up to us to triage. Creating a priority order in which RSSP assumptions should be tested directs the limited time and money we have as efficiently as possible.
Activity: Use the note-catcher to answer the following questions: What data can you collect to effectively evaluate each question? How can you confirm that the data you are collecting represents the phenomena you are attempting to measure?
Use Correct Analytical Techniques
Once we have identified and collected data on our RSSP Theory of Action, the next step is to determine the veracity of our assumptions by analyzing the data we collected.
It is essential that we use correct analytical techniques when analyzing the data. If we use incorrect techniques, we can come to faulty conclusions, leading our teams to make bad decisions and inhibit positive student outcomes.
Specifically, we recommend avoiding pre-post comparisons and instead leveraging comparison group designs wherever possible (as described below).
The fundamental concept in evaluating program interventions is called the counterfactual. The counterfactual is best illustrated by the following question:
“What would have happened had we not implemented the intervention?”
If we could watch both realities play out (what happened to student growth and proficiency when we implemented the intervention, and what happened to student growth and proficiency when we did not implement it), we would be able to credibly determine the impact of the intervention.
This idea is further explained in the following video developed by the 2021 Nobel Prize in Economics winner Joshua Angrist.
Why Pre-Post Comparisons Are Misleading
Using pre-post comparisons is a common practice in public education. A pre-post comparison is when school leaders measure student growth and proficiency before an intervention has been implemented (or at the beginning of a school year). They then measure student growth and proficiency after an intervention has concluded (or at the end of the school year). They then compare student growth and proficiency from before the intervention was implemented to growth and proficiency after the intervention was implemented. If student scores increased, the intervention is deemed a success. If student scores decreased, the intervention is deemed a failure.
Using pre-post comparisons to determine a program's impact is a highly misleading practice. Rossi, Lipsey, and Henry explain:
“The main drawback to simple pre-post (before and after) comparisons is that any improvement they reveal cannot be confidently ascribed to program effects. For example, one of the main reasons people choose to enter job training programs is that they are unemployed and experiencing difficulties obtaining employment. Hence, they are at a low point at the time of entry into the program, and their situation from there is more likely to improve than deteriorate with or without the assistance of the program. Pre-post comparisons for such programs thus almost always show an upward trend that may have little to do with program effects.”
Broadly speaking, pre-post comparisons do not isolate the effect of interventions on student performance because the intervention is not the only thing that changed in students' lives between one EOY summative assessment and the next. Below is a list of things that are likely to have also changed in a student’s life between EOY assessments:
Biological development – maturation
Teachers
Socio-economic status
Physical health
Mental health
Motivation
Friend group
Area of residence
School leadership
Thousands of other changes can happen to a student within the course of one year. Each change causes a positive, negative, or neutral shift in a student’s summative test scores. When we perform a simple pre-post comparison, we are measuring the combined effect of all the changes that happened in a student's life on their end-of-year test score.
Given this insight, it is clear that a pre-post comparison does not enable us to determine the unique impact a program intervention had on a student's test scores.
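To make this concrete, the following sketch simulates a pre-post comparison under stated assumptions. Everything in it is hypothetical: the function `simulate_pre_post`, the assumption that all other life changes add about 3 points of growth on average, and the 2-point program effect are invented for illustration. It shows that the observed pre-post change is the sum of the program effect and everything else, so a program that does nothing can still appear to "work."

```python
import random

def simulate_pre_post(n_students, program_effect, seed=0):
    """Return the average pre-post score change for simulated students.

    Each student's change is the program effect plus the combined effect of
    everything else that changed that year (maturation, new teacher, etc.).
    """
    rng = random.Random(seed)  # fixed seed so both scenarios use the same students
    changes = []
    for _ in range(n_students):
        # Assumption for illustration: other changes add about 3 points on
        # average, with a spread (standard deviation) of 2 points.
        other_changes = rng.gauss(3.0, 2.0)
        changes.append(program_effect + other_changes)
    return sum(changes) / n_students

no_effect = simulate_pre_post(1000, program_effect=0.0)
real_effect = simulate_pre_post(1000, program_effect=2.0)

print(f"pre-post change when the program does nothing: {no_effect:.1f}")
print(f"pre-post change when the program adds 2 points: {real_effect:.1f}")
```

In both scenarios the pre-post comparison shows growth. Looking at only one of the two numbers, we would have no way of knowing how much of it, if any, came from the program.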
Comparison Group Designs
If we want to credibly determine the impact of a program intervention, we would have to use a series of statistical methods from the field of Econometrics.
While describing econometric methods is beyond the scope of the Data Fellows program, there are some foundational concepts we can leverage to help us get very rough estimates of the effects RSSP has had on student outcomes. Although imprecise, these estimates are not as misleading as simple pre-post comparisons.
Rossi and his colleagues describe that we can obtain estimates of an intervention’s effect by comparing the group of students who received the intervention to a similar group of students who did not receive the intervention:
“Instead of individual-level counterfactual estimates, evaluators most often find it necessary to rely on group-level estimates. A common way of doing this is by constructing or identifying a group of individuals who did not participate in the program being evaluated whose outcomes of interest can be averaged to use as a counterfactual estimate for the average of the group that did participate in the program. The difference between those averages then becomes the estimate of the overall average program effect. Depending on the similarity of the groups and the potential for selection bias... this approach can yield good estimates of overall average program effects, and generally also for average program effects for some subgroups. However, it does not produce a counterfactual estimate for each individual in the program group.”
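The group-level estimate described in the quote above comes down to a difference between two averages. Here is a minimal sketch, assuming we already have a growth value for each student in both groups; the function names and the numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)

def estimated_program_effect(program_group, comparison_group):
    """Average outcome of the program group minus that of the comparison group.

    The comparison group's average stands in for the counterfactual: what the
    program group's average might have been without the program.
    """
    return mean(program_group) - mean(comparison_group)

# Hypothetical growth values (in points) for a handful of students.
program_growth = [4.0, 6.0, 5.0, 7.0]      # students who received the intervention
comparison_growth = [3.0, 4.0, 2.0, 5.0]   # similar students who did not

effect = estimated_program_effect(program_growth, comparison_growth)
print(f"estimated average program effect: {effect:.1f} points")  # 5.5 - 3.5 = 2.0
```

How much weight this estimate can bear depends on how similar the two groups really are; the more they differ in ways other than the intervention, the less the difference tells us about the program.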
To determine if there is a potential comparison group in your district/charter, you can ask yourself questions such as:
Can I compare the growth rates of students who were targeted in RSSP interventions to similar students who were not targeted in RSSP interventions?
If, for example, 3rd grade reading is an RSSP focus area and 4th grade reading is not an RSSP focus area, are the 3rd grade students in my district similar enough to the 4th grade students in my district that comparing their reading score growth would help me see the associated effect of RSSP on 3rd grade reading?
If students only participated in RSSP reading interventions, can I compare their reading growth rates to their math growth rates?
While comparison group designs produce better estimates of the unique influence an intervention had on student outcomes than simple pre-post comparisons, their results can only be considered correlational because the groups of students still have fundamental differences between them.
Despite these limitations, comparing similar groups can yield better estimates of program effects than simple pre-post comparisons, especially when the comparison is paired with historical data. For example, if you can compare the reading growth rates of 3rd grade students from 2017-2023 to the reading growth rates of 4th grade students from 2017-2023, you will be able to see whether any significant observable changes happened to 3rd grade reading scores, relative to 4th grade reading scores, when RSSP was implemented.
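One way to sketch the historical comparison described above is to track the gap between the two grades each year and ask whether that gap changed once RSSP began. This is a simplified version of what evaluators often call a difference-in-differences comparison, a term this module has not introduced. All values below are made up, and the first RSSP year (2022) is an assumption for illustration only.

```python
def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)

def change_in_gap(focus, comparison, first_program_year):
    """Average (focus - comparison) gap after the program began, minus the
    average gap before it began. Inputs are dicts of {year: average growth}."""
    years = sorted(set(focus) & set(comparison))  # only years present in both
    before = [focus[y] - comparison[y] for y in years if y < first_program_year]
    after = [focus[y] - comparison[y] for y in years if y >= first_program_year]
    if not before or not after:
        raise ValueError("need at least one year before and one year after")
    return mean(after) - mean(before)

# Hypothetical average reading growth by year (made-up numbers).
grade3 = {2017: 5.0, 2018: 5.5, 2019: 5.0, 2020: 5.5, 2021: 5.0, 2022: 7.0, 2023: 7.5}
grade4 = {2017: 4.0, 2018: 4.5, 2019: 4.0, 2020: 4.5, 2021: 4.0, 2022: 4.5, 2023: 4.5}

# Assumption for illustration: RSSP first targeted 3rd grade reading in 2022.
result = change_in_gap(grade3, grade4, first_program_year=2022)
print(f"change in the 3rd vs. 4th grade gap after RSSP began: {result:.2f} points")
```

With these made-up numbers, 3rd grade outgrew 4th grade by 1.0 point in every year before 2022 and by an average of 2.75 points afterward, so the gap widened by 1.75 points. That widening is associated with RSSP, but it remains a correlation: anything else that affected only 3rd grade starting in 2022 would produce the same pattern.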
Theory of Action -- Assessing our Understanding
Identifying our Ability to Understand the Validity of our Theory of Action
So far, this module has described the processes we can use to evaluate the validity of our RSSP Theories of Action. However, we recognize the difficulty associated with implementing these processes in practice. To help determine where we stand in our ability to understand the degree to which our Theory of Action is valid, we can use the Theory of Action Guide.
Activity: Use the Theory of Action Guide to answer the following question in your note-catcher:
“Which category and situation described in the Theory of Action Guide best describe the current status of your district’s understanding regarding its Theory of Action?”
Reflection Question: After identifying the category and situation that best describes your district/charter, use the note-catcher to answer the following question: “What can your district/charter RSSP team do in the upcoming school year to strengthen its understanding regarding the validity of your Theory of Action?”
Congratulations on completing the module. Please complete the Exit Ticket form by clicking on the link above. We will use the information you submit to track your completion.