Weighted-factor models (WFMs) are a spreadsheet tool for systematically comparing options (e.g., interventions, grant proposals, experts, potential advisors, or job candidates). WFMs are built by listing options and assessing them against quantified criteria, each weighted by its importance for the decision at hand.
A weighted factor model (WFM) is a decision-making tool that uses quantification to rank a set of options against a set of criteria weighted by their relevance. Imagine you have a few books in your reading pile and need to decide which one to take on a short vacation. A normal person might just pick one based on their preferences that day, but you are a researcher. You figure the decision should be based on a few factors, say:
Book quality
Appetite for reading it now
Length of book
Weight of book
You could stop there, but surely the book’s weight is not as important as your appetite for reading it or its quality. All of the books could be read within the vacation, but you slightly prefer one that isn’t very short. After a grueling and completely unnecessary process of converting scores measured in different units onto a common scale (by z-scoring, more on that later), you now have an answer. Happy days!
I input the data from handy sources:
And then apply some manipulations to the data so that it is standardized and weighted (more on this later).
How to choose what book to read next like an AIM researcher (i.e., you have too much time on your hands)
WFMs thus have three noteworthy components: 1. Options (in the example above, these were books), 2. Criteria (e.g., quality of the book), and 3. Weights assigned to each criterion. The scores given to each option under each criterion are standardized using one of many methods. If all your criteria are scored on the same scale (for instance, a binary 1 or 0, or a score out of 5), further standardizing may not be needed. When this is not the case, mathematical manipulations such as z-scores keep the value of each input comparable across disparate criteria.
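As a rough illustration of how options, criteria, and weights fit together, here is a minimal Python sketch. The book names, raw scores, and weights are all invented for illustration (they are not the figures from the example above), and the function names `z_scores` and `rank_options` are hypothetical. It assumes z-scoring as the standardization method and uses a negative weight for criteria where less is better.

```python
from statistics import mean, stdev

# Hypothetical options and raw scores; every number is made up for illustration.
# Criteria are in different units: quality (rating out of 5), appetite (1-10),
# length (pages), weight (grams).
books = {
    "Book A": {"quality": 4.5, "appetite": 6, "length": 320, "weight": 400},
    "Book B": {"quality": 3.8, "appetite": 9, "length": 210, "weight": 300},
    "Book C": {"quality": 4.1, "appetite": 4, "length": 540, "weight": 650},
}

# Hypothetical weights; a negative weight means "less is better".
weights = {"quality": 0.4, "appetite": 0.4, "length": 0.1, "weight": -0.1}


def z_scores(values):
    """Standardize values as (x - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def rank_options(options, weights):
    """Return (name, weighted score) pairs, best option first."""
    names = list(options)
    totals = {name: 0.0 for name in names}
    for criterion, w in weights.items():
        # Standardize each criterion across all options before weighting it.
        zs = z_scores([options[n][criterion] for n in names])
        for name, z in zip(names, zs):
            totals[name] += w * z
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_options(books, weights)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Changing the weights and rerunning is a quick way to see how sensitive the ranking is to your judgment about what matters.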
By using WFMs for decision-making in research, we can draw from different metrics, evidence, and judgments to make decisions, adjusting weights for how relevant those different criteria are to the decision at hand. WFMs allow decision-makers to combine disparate types of evidence (e.g., rational arguments and scientific evidence) and objective and subjective factors (for example, the cost of living of a city and personal excitement about its lifestyle).
Researchers are likely to use something like a WFM whenever we need to make a high-stakes choice between several options, like which interventions to prioritize or which country to recommend for a specific organization. This section’s core material goes into more detail about the benefits and drawbacks of the tool.
Here’s a serious example from AIM.
Example weighted factor model (Charity Entrepreneurship, n.d., para 4)
Up to this point, we have talked about constructing weighted factor models. However, spreadsheets can guide our decision-making even when we don’t assign specific weights and numerical values. By putting our options into a spreadsheet, setting criteria, and qualitatively describing how each option performs on each criterion, we get many of the benefits of a WFM, such as systematic comparison, transparency, and an emphasis on convergence. We especially recommend using a simpler spreadsheet like this instead of a weighted factor model under the following conditions:
when you are making important decisions, and this is the only tool you are using
when you have fewer options to compare (e.g., 15 rather than 100)
when by quantifying inputs into a single number, you lose valuable information
One way we use spreadsheets like this is to support our decisions about which charity ideas to recommend for new charities to implement. AIM’s internal research process culminates in deep reports, in which researchers use many different research methods and criteria to evaluate an idea. After completing each report, they summarize all the information in a decision-making spreadsheet. The factors taken into account differ slightly by cause area. As an illustration, for the last decision about large-scale global health and development direct delivery interventions, the CE research team assessed the following factors:
Intervention name
Short description of the intervention
Potential impact:
What scale could this charity reach
Results from the cost-effectiveness analysis
An assessment of how speculative the cost-effectiveness analysis is
Overall quality of evidence:
that the charity can make this change happen
that the charity has the expected effect if it implements the intervention
Overall likelihood of success
Expert views
Limiting factors:
Talent (founders & key hires)
Access to information and relevant stakeholders
Feedback loops
Funding
Scale of the problem
Neglectedness
Execution difficulty / Tractability / Paths to failure
Externalities & risks
Others:
Remaining uncertainties
Other notes
Core materials
Weighted Factor Models (Charity Entrepreneurship, n.d.) (read until section 7, ~14 minutes)
One of the main applications of WFMs at AIM is country prioritization (i.e., deciding which countries are most promising for a specific intervention). This helps AIM judge whether there are enough attractive options for the intervention, and it gives future implementers a head start in narrowing down which countries to scope once they begin their work.
This section details how to construct a WFM using geographic weighted factor models as an example. When it is time to conduct a geographic assessment, we will usually already have a ToC and have conducted some cursory cost-effectiveness modeling and evidence review. This is to say, we are starting to get a sense of the critical factors that impact the effectiveness and cost-effectiveness of an intervention.
Clarify your goal. Before starting, it is worth clarifying what exactly you are trying to achieve with the WFM. This helps you focus on the criteria of fundamental importance and avoid wasting time on rabbit holes. In our case, we aim to identify a list of ten or twenty countries where an intervention would be most suitable. Let’s use the example of a policy organization focused on reducing sodium consumption through sodium limits, an idea recommended in a recent AIM report.
Set up your options. List all options that must be evaluated and compared. In our case, this is all countries.
A note on using countries as options in geographic WFMs. Using countries is the most convenient option for us because of how data is often produced and shared. However, in some cases you will need to be aware of the limitations of using countries as the primary option in a geographic WFM:
Country sizes are very unevenly distributed. A primary subdivision of India, Nigeria, or Bangladesh is often larger than many countries. An organization could work in India for decades without reaching full national scale, yet achieve that in Lesotho within a few years.
Often, the metrics we care about are affected by inequality; country scores are averages across a population but may be hiding how the poorest quintiles, or rural populations, are doing on a given metric.
Figure out which criteria to use. The hardest part of making a good WFM is probably picking the right criteria. This is partly because the ideal criteria will differ according to the problem you aim to solve. Criteria used in a WFM can include anything from hard data, like population size, to very soft judgment calls, such as a general sense of logistical difficulty. We recommend drafting a long list and narrowing it down using the following three pieces of advice. Good criteria are:
Relevant: Good criteria tell us about factors pertinent to cost-effectiveness and the ToC. In our sodium reduction example, we would want to prioritize countries with a larger burden of cardiovascular disease and high salt consumption. Broader processed food consumption could be an alternative criterion, but salt consumption data was available, more granular, and relevant to the intervention.
Useful: Your criteria must have data for most of your options, as a column full of empty cells is not useful. They must also vary enough to facilitate decision-making (all else equal regarding relevance, we would prefer a criterion with lots of cross-country variation to one built on a three-point score where 90% of countries score a 2). In the sodium example, AIM researcher Morgan Fairless time-capped himself at two hours and tried to use existing data to construct a criterion for the number of relevant sodium policies in each country.
Practical: Sometimes, the criterion that is perfect in terms of relevance and usefulness is impractical for your research. You can’t collect primary data, and you cannot afford to spend ten to fifteen hours cleaning up a messy database to obtain a score. Some questions to ask yourself are: Can you get data on it? Is it more objective or subjective? Can others understand what the column indicates?
These sorts of factors can allow your model to be interpreted and criticized by whoever is using it. Given that the geographic WFM is ultimately a tool used by other actors (the implementer), we may want to provide some optionality. For instance, Morgan was unsure whether the list of options for the sodium policy non-profit should include High-Income Countries (HICs). Given that uncertainty, he added a criterion for whether a country was an HIC and made it toggleable so the spreadsheet user could adjust it to their preferences.
Source the data for each criterion. We usually rely on international datasets for this when conducting geographic WFMs, as they are (usually) complete for all country options. Academic papers sometimes build datasets that can be useful as well (e.g., we relied on academic estimates for the snakebite burden in this report, given the lack of available international datasets on the subject).
Manipulate the data so that it is standardized.
We use z-scores to normalize data across different criteria. Z-scores are a measure of a “value’s relationship to the mean of a group of values, measured in terms of standard deviations from the mean” (Charity Entrepreneurship, n.d., section 5.2). That is, a z-score of 0 indicates that the value is the same as the mean, and a z-score of 1 indicates that it is one standard deviation above the mean. Z-scores can be used informally to:
“Standardize values measured across multiple criteria so they can be combined into an overall score and compared to other ideas. For example, we can have an overall z-score for a given idea based on how it compares to an average in terms of CEA, expressed in $ per DALY; population size affected, expressed in millions; and crowdedness, expressed in percentage of the problem addressed by other entities.
Assess how a given idea scores compared to all the other ideas considered (including an average idea), for example, idea x is better than 70 percent of the ideas on our list.
Spot what values are anomalous. For example, if one of the factors in the scale was an objective number such as population size, a Z-score value would show which countries are outliers relative to others even though population size can differ by orders of magnitude.
Reduce the risk of some biases; for example, in a situation where the score is not converted to a z-score, we may use a higher range of values for one criterion but not another, effectively changing its weight. For example, suppose a given intervention is evaluated on each factor on an arbitrary scale of 1 to 10. However, one criterion, scale, varies significantly, and you tend to give out sevens and eights frequently. In contrast, on the criterion of tractability, you tend to give very consistent scores of four or five. The net effect is that even if you think tractability is more important, you weigh the scale higher. Converting this to a z-score takes care of this” (Charity Entrepreneurship, n.d., section 5.2).
The formula for the Z-score is as follows:
z-score = (X - mean) / standard deviation
In Google Sheets, for a data point in A2 and data in A2:A10, the formula would be:
=(A2-AVERAGE(A2:A10))/STDEV(A2:A10)
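For readers working outside a spreadsheet, the same calculation can be sketched in Python. This is a minimal illustration, not AIM's tooling: the function name `z_scores` and the raw values are hypothetical. It uses the sample standard deviation (`statistics.stdev`), which corresponds to what STDEV computes in Google Sheets.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return (x - mean) / standard deviation for each value.

    Uses the sample standard deviation, like STDEV in Google Sheets.
    """
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


# Hypothetical raw scores for one criterion across five options.
raw = [2.0, 4.0, 6.0, 8.0, 10.0]
print([round(z, 2) for z in z_scores(raw)])  # [-1.26, -0.63, 0.0, 0.63, 1.26]
```

The middle value equals the mean, so its z-score is 0, and the z-scores of the whole column sum to zero.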
Furthermore, in some scenarios we want to bound those z-scores. For example, India and China have huge populations and, therefore, also large numbers of people living in extreme poverty. That said, it is unclear whether working in India would be several times better than working in Nigeria, for example, as it would be hard to work across different states in India. This is why we often cap z-scores at -2 and 2 to prevent India and China from coming up as the top choices every time. The formula you can use in Google Sheets for this is (for a bound of 2 and a z-score in A2):
=MAX(MIN(A2,2),-2)
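The capping step can be sketched in Python in the same way. The function name `cap_z_score` and the z-score values below are hypothetical, and the default bound of 2 simply follows the text.

```python
def cap_z_score(z, bound=2.0):
    """Clamp a z-score to [-bound, bound], mirroring MAX(MIN(z, bound), -bound)."""
    return max(min(z, bound), -bound)


# Hypothetical z-scores; the 3.7 stands in for a very large country.
z_values = [-2.6, -0.4, 0.0, 1.3, 3.7]
capped = [cap_z_score(z) for z in z_values]
print(capped)  # [-2.0, -0.4, 0.0, 1.3, 2.0]
```

Values already inside the bounds pass through unchanged, so only the outliers are affected.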
Not all datasets will have data available for every country, which means that some cells in your spreadsheet will come up as NA. Since you can’t perform calculations on these cells, you can change NA fields to 0. Because a z-score of 0 means that the value equals the mean, the calculation will treat the country as sitting at the mean of the distribution for that criterion. This does not skew the result in either direction; it effectively takes that factor out of consideration for that country. The formula you can use in Google Sheets is as follows:
=IFNA(FORMULA,0)
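A minimal Python sketch of the same idea follows. It assumes missing data is represented as `None` and that the mean and standard deviation are computed only from the countries that have data; the function name `z_scores_with_missing` and the values are hypothetical.

```python
from statistics import mean, stdev


def z_scores_with_missing(values):
    """Z-score the available values and give missing entries (None) a neutral 0."""
    present = [v for v in values if v is not None]
    m, s = mean(present), stdev(present)
    return [0.0 if v is None else (v - m) / s for v in values]


# Hypothetical criterion with no data for the third country.
raw = [10.0, 20.0, None, 30.0]
print(z_scores_with_missing(raw))  # [-1.0, 0.0, 0.0, 1.0]
```

Note that the country with missing data ends up indistinguishable from a country that genuinely sits at the mean, which is worth flagging when many cells in a column are empty.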
Some variables, such as the populations of countries, naturally vary over multiple orders of magnitude. If we use them in a WFM, even after z-scoring, the largest countries will dominate: z-scoring is a linear rescaling, so a country that is 100 times larger still sits far above the rest of the distribution. This may be appropriate for some interventions where the size of the country really matters, and we want to give countries points in direct proportion to their size (e.g., top-down policy interventions). But for many interventions, that might not be true: Yes, India is big, but a direct-delivery charity may never manage to operate at the scale of the whole country. In these cases, we can use a log-transformed variable. The core materials cover log-transformations.
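To see what a log transform changes, the sketch below z-scores a set of populations twice, once on the raw values and once on their base-10 logarithms. The population figures are hypothetical round numbers, and base 10 is an assumption made for readability; any base gives the same z-scores.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardize values as (x - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical populations spanning several orders of magnitude.
populations = [2_000_000, 20_000_000, 200_000_000, 1_400_000_000]

raw_z = z_scores(populations)
log_z = z_scores([math.log10(p) for p in populations])

for p, rz, lz in zip(populations, raw_z, log_z):
    print(f"{p:>13,}  raw z = {rz:+.2f}  log z = {lz:+.2f}")
```

On the raw scale, the three smaller countries are bunched together and the largest one stands far apart. After the log transform, each tenfold increase in population adds the same amount to the score, so differences among smaller countries still register.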
Add weights to each criterion and calculate a final score. Each weight should reflect how important the criterion is to the goal you clarified in the first step. In Google Sheets, for a set of criterion scores in A3:D3 and weights in A2:D2, the formula would be:
=SUMPRODUCT(A3:D3,A2:D2)
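The equivalent of SUMPRODUCT can be sketched in Python as follows. The country names, standardized scores, and weights are hypothetical, as is the function name `weighted_score`.

```python
def weighted_score(scores, weights):
    """Sum of score * weight across criteria, like SUMPRODUCT in Google Sheets."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical weights for four criteria and standardized scores for two countries.
weights = [0.4, 0.3, 0.2, 0.1]
country_scores = {
    "Country A": [1.0, -0.5, 0.0, 2.0],
    "Country B": [0.5, 1.0, -1.0, 0.0],
}

totals = {name: weighted_score(s, weights) for name, s in country_scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking)  # ['Country A', 'Country B']
```

Sorting the totals in descending order gives the priority list that the next step treats as a starting point.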
Generally, the resulting list should be used as a starting point for deeper and more qualitative research, including reaching out to people who are familiar with the context and asking them about the likely tractability of, and need for, the intervention there. Of course, this is just one way to think about and use geographic assessments. The video in the core materials presents a more nuanced approach to building and drawing insights from geographic weighted factor models.
The templates we provide will help with all the relevant calculations and hopefully simplify data inputs quite a bit. Here’s an example of a geographic assessment for the sodium example used throughout this section.
Finally, here are some guiding questions we use when evaluating a geographic WFM.
Has the author reached useful conclusions about the priority order of countries?
Has the author chosen a sensible set of criteria?
Are the criteria pertinent to the delivery mechanism?
Are the criteria pertinent to the burden?
Has the author chosen appropriate weightings for the model?
Are the choices justified in the text?
Are the choices reasonable?
Do the weights match up with the ToC and CEA sections, that is, are the most important factors relevant to bottom-line cost-effectiveness?
Has the author handled the data in the assessment correctly?
z-scores, log-transforms, spreadsheet errors
Has the author identified key relevant players in this space?
Are the players described factually?
Core materials
Weighted Factor Model (Charity Entrepreneurship, n.d.)
How and why to use log transforms in WFMs (Filip Murár, 2024)
Geographic WFM walkthrough (Filip Murár, 2023) (video, ~24 minutes)
Practice project and samples in our full PDF version.