We use this process to narrow the field of ideas pragmatically and roughly, maximizing the chances that the final reports we focus on have high expected value.
The “research round” process is the main way AIM Research Team sources and evaluates the non-profit intervention ideas we identify for the CEIP.
This process can be understood as a funnel; it starts with hundreds of potential ideas and progressively narrows down to the most likely recommended ideas.
The process can be separated into prioritization stages (stages 1 through 4) and evaluation stages (stages 5 and 6). The prioritization stages aim to narrow the field pragmatically and roughly; the evaluation stages seek clarity to equip decision-makers and implementers with the best possible information on how the ideas perform against the evaluative criteria.
To be updated
The note should establish:
A problem or impact statement: a motivational note on why this cause area matters, why we should care, and the goals of the research process.
Scope: A well-defined scope statement that clarifies what is to be researched and what falls within and outside the scope of the round. Even within a broad cause area, we may want to consider some areas to be a focus (e.g., policy/direct/research/high-scale/etc) and some out of scope.
Adaptations required to the evaluative criteria and bar: Adapt the evaluative criteria described here to the research round. Consider:
What is the standard charity in this space that new talent could work for?
What is the bar that funders are looking to beat?
What has AIM done in the past?
A metric to use: consider DALYs, WELLBYs, Suffering-adjusted Days (SADs) (animals), lives saved, BCR, $ moved, growth impact, etc.
Samples: Climate Co-benefits & Income and Growth
Broad reading and research—as you read, list ideas in the idea spreadsheet (see below) and make a note of experts who may be good to reach out to for the brainstorming stage (or later).
5-30-minute presentation to the team on research done (optional)—when writing a presentation for others, focus on conveying your understanding of the topic, your sense of the theory of change for a topic, and your take on critical uncertainties. Focus less on putting across context or key background information and facts.
Consider contacting subject matter experts for an initial conversation on key topic-level uncertainties. Ask them for ideas.
This step aims to develop a long list of potential ideas to investigate. Ideas can be sourced through:
Brainstorming (individual, group)
Stakeholder contributions (AIM staff and community experts)
Remember to check the scratchpad (private) where random ideas await rebirth.
No bad ideas(ish) – if you have an idea you don’t think can win but that is interesting, feel free to list it anyway.
Maybe others will have a better version of that idea. If we get many ideas from experts—even ones that take time to filter—then we can spend less time generating ideas internally.
Categorize ideas:
Make sure idea titles are descriptive and clear, ideally describing the theory of change (ToC) of the idea in a sentence.
Combine, split, and delete ideas to manage the list. Feel free to delete silly ideas.
Aim for as much clarity as possible in each idea to increase the chances that raters aren’t comparing apples to oranges.
Track where/whom each idea came from: This is important as we want to be able to thank and acknowledge contributors (especially those outside of AIM) and to evaluate where the best ideas come from.
Before moving on to stage 2, one researcher should spend ~1 hour excluding duplicates, merging where obvious, and re-categorizing. The researcher should also ensure that all ideas have a clear scope and description.
Unique numbering: Each idea should have a unique number associated with it.
This stage relies on aggregated rapid scores: reviewers score ideas intuitively, possibly doing a few minutes of light desk-based research per idea. Aggregate scores, and the review meetings where those scores are discussed, help narrow the list and exclude underperforming ideas.
Prepare a simple rubric for scoring to help researchers think through the key aspects of each idea and ensure that scores mean roughly the same for all researchers. The standard rubric ranges from 0 to 10:
0 = not worth looking into (veto)
5 = may be worth looking into
10 = is worth looking into (override others' veto).
Fractional scores (such as 5.375) are encouraged.
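The score aggregation that drives this stage can be sketched in a few lines; the idea names, scores, and disagreement threshold below are purely illustrative, not real round data:

```python
# Minimal sketch of aggregating stage 2 rubric scores (0-10 per reviewer).
# All idea names, scores, and the sd > 2 threshold are made-up examples.
from statistics import mean, stdev

scores = {
    "Idea 12: SMS vaccination reminders": [7.5, 6.5, 8.0],
    "Idea 31: Lead paint regulation advocacy": [9.0, 4.0, 8.5],
    "Idea 47: Generic awareness campaign": [2.0, 3.0, 1.5],
}

# Order ideas by average score, highest first, for the review meeting
for idea, ratings in sorted(scores.items(), key=lambda kv: -mean(kv[1])):
    avg, spread = mean(ratings), stdev(ratings)
    # Wide disagreement between reviewers is a red flag worth discussing
    flag = "  <- wide disagreement, discuss" if spread > 2 else ""
    print(f"{avg:4.1f} (sd {spread:3.1f})  {idea}{flag}")
```

Note how two ideas can have near-identical averages but very different spreads; the spread, not just the mean, is what the review meetings are for.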
Run through stage 2 with about 5-20 randomly chosen ideas to see if it works well before running it fully. Set a time budget for how long each researcher should spend on each idea: 1 to 15 minutes per idea per person, depending on the timeline of the round.
The general principle is that several people should score each idea, with a minimum of two reviewers.
Round leads may play around with the number of reviewers (e.g., by inviting guest raters) and the time taken per idea based on project management needs and the specificities of the round.
Reviewers are expected to review the list category by category, rating the ideas. Reviewers can write notes to explain scores and link to resources as needed.
The round lead (or other) chairs a succession of meetings where ideas are ordered by their average score across reviewers and classified into discrete categories based on consensus and scoring relative to the alternatives:
Yes
Probably Yes
Maybe
Probably No
No
Merge
Bump to S4/5
Reviewers present at these meetings can change their scores to reflect opinion changes based on debate.
Interim review meetings should be held to keep the final stage 2 decision-making meeting manageable. That is, the round lead should schedule these to check progress and speed up the stage 2 process.
Some ideas may require short follow-up research to clarify critical uncertainties. Many ideas will have changed from the initial list; they may have been rewritten, combined, and recategorized, there may be new ideas, and so on. Some work might be needed to turn this into a final spreadsheet that can be used in:
Decision making. Each idea should be clearly separate, and top ideas should be distinct enough that future researchers know what they are researching.
Data analysis. We may want to analyse which ideas from which sources were successful, so tracking all data points (source, category, idea number, etc.) is useful.
Sharing. We may want to share this with external stakeholders to get their feedback on the ideas list. (There may be a case for publishing this eventually)
The round lead (or other) chairs a final meeting to select the top-performing ideas for stage 3. The number of ideas selected will depend on the round type and human resource capacity. Usually, the range of selected ideas is between 20% and 35% (~ 50 ideas).
All long-listed ideas are to be considered based on their score and category.
The round lead should look out for red flags, such as wide variation in team members’ views, or views on ideas that regularly change or differ markedly between meetings. Where there are red flags, consider putting in additional time to resolve uncertainties and seek expert input.
The round lead (or others) can share the shortlisted ideas list with others outside the research team, such as other AIM staff and interested external stakeholders. It is helpful to ask which ideas should have made the cut but did not, and whether any ideas that reached the top should not have.
Approaches to scoring for QP differ by researcher; a few options include:
Thinking about Importance, Tractability, and Neglectedness
Searching for evidence directly using Scholar, AI, etc.
Searching Google and typical sources such as the 3iE Development Portal, Cochrane, etc., and checking our prior reviews in our AirTable.
In this stage, a reviewer will examine an idea for one and a half to two hours, focusing on its evidence base, chance of success, and cost-effectiveness. Depending on the number of ideas available and team capacity, this exercise aims to select between 20% and 35% of the ideas for a narrower shortlist.
At this stage, ideas should not be discarded without spending 2-5 minutes considering whether a close alternative idea could work. If an idea does particularly badly on a specific early aspect of this stage (e.g., the strength of the evidence base), it may be worth discarding it without continuing the research.
We have a template [PRIVATE TO AIM STAFF ONLY] that explains much of this process, plus the criteria.
The main reviewer for an idea should spend no more than 50 minutes reviewing its evidence base. They will find evidence both for and against the promise of the idea; all of it should be noted.
A second researcher ensures the quality of the evidence review by spending 5 to 10 minutes providing feedback on scoring, the process followed, alternative views, areas for future research, etc.
Design a CEA/BCR spreadsheet template. To compare interventions of different types, we should hold constant some factors that are highly uncertain and highly likely to affect the BCR.
We can allow some of these factors to change with good reason; for example, scaling costs up by a percentage for international or developed-country interventions. Aim to avoid CEAs that vary by orders of magnitude based on speculative numbers.
We should pre-set options where possible to support easy decision-making.
The 2024 human-focused template is split into two sections: health interventions and (economic) consumption interventions. Each has multiple suboptions:
Health ideas – impact can be modeled using:
Deaths averted: Interventions that reduce mortality
Prevalence: Interventions that are assumed to reduce the prevalence of a condition, especially conditions that cause disability, typically as a result of treatment
Incidence: Interventions that prevent new cases of a disease
Other (e.g., total DALYs)
Consumption ideas – impact can be modeled as:
One-off interventions (usually aiming for a policy change) vs continuous interventions (e.g., continuously delivering a service)
‘Top-down’ (where the beneficiaries are a whole group of people, such as a city or a state) or ‘bottom-up’ (where the number of beneficiaries grows proportionately with the charity’s activities)
Having only immediate effect (while the intervention lasts) vs having a longer-lasting effect (e.g., after a policy is passed or after skills are successfully built).
As a baseline comparison, include an example cost-effectiveness estimate of existing promising interventions, e.g., FWI 2.0 or corporate campaigns for egg-laying hens in HIC.
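As a very rough illustration of a “deaths averted”-style model of the kind listed above, the calculation can be sketched as follows; every input below is a placeholder assumption for illustration, not a real estimate for any intervention:

```python
# Hypothetical one-line CEA in the "deaths averted" style.
# Every number below is a placeholder assumption, not a real estimate.
annual_budget_usd = 500_000            # assumed charity budget per year
cost_per_person_reached_usd = 4.0      # assumed delivery cost per person
deaths_averted_per_1000_reached = 0.5  # assumed effect size
dalys_per_death_averted = 30.0         # rough burden-of-disease assumption

people_reached = annual_budget_usd / cost_per_person_reached_usd
deaths_averted = people_reached * deaths_averted_per_1000_reached / 1000
dalys_averted = deaths_averted * dalys_per_death_averted
cost_per_daly_usd = annual_budget_usd / dalys_averted

print(f"People reached:  {people_reached:,.0f}")
print(f"Deaths averted:  {deaths_averted:,.1f}")
print(f"Cost per DALY:   ${cost_per_daly_usd:,.0f}")
```

Note that the budget cancels out of the cost-per-DALY figure here; with speculative inputs like these, the per-unit assumptions (cost per person, effect size, DALYs per death) are what drive the answer, which is why the template holds the most uncertain factors constant across ideas.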
A second researcher should review the model.
If an idea does significantly worse than the comparison baseline, assess whether there is a key uncertainty or reframing of the idea that would be useful to research and might make it look more promising. Otherwise, cut the idea.
The round lead may hold meetings across stage 3 to advance the timeline and eliminate ideas earlier that clearly fail the bar in one of the three aspects being examined.
The round lead (or others) will chair a final decision-making meeting to reduce the list to between 20% and 35% of its current size. Scores from all three aspects are included in a spreadsheet and discussed to reach conclusions for each idea. Depending on the topic area, scores for each sub-stage may be weighted differently (e.g., you may want to weight the one-line CEAs less than the evidence review, or vice versa).
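The weighted combination of sub-stage scores can be sketched like this; the weights, idea names, and scores are made-up examples, and round leads would set their own per round:

```python
# Hypothetical weighting of stage 3 sub-scores; all numbers are illustrative.
weights = {"evidence": 0.5, "success_chance": 0.25, "cea": 0.25}
assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights should sum to 1

idea_scores = {
    "Idea 12": {"evidence": 8, "success_chance": 6, "cea": 7},
    "Idea 31": {"evidence": 5, "success_chance": 9, "cea": 4},
}

for idea, subs in idea_scores.items():
    total = sum(weights[aspect] * subs[aspect] for aspect in weights)
    print(f"{idea}: weighted score {total:.2f}")
```

Making the weights explicit like this forces the meeting to agree on how much each aspect (evidence review, chance of success, one-line CEA) should count before the per-idea debate starts.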
Additional research: For each “probably yes,” aim to resolve the key uncertainty with <2h of research.
Aim: To resolve critical uncertainties and gain insight into the shortlisted ideas, identifying the top ideas worth a rigorous assessment.
Reach out to experts to set up interviews
Email 1-4 experts. We aim to talk to one expert in the space, but since it is unlikely that all will reply, we reach out to more than we plan to interview; if you think it is very likely that a particular expert will respond and that they are the best person to speak with on a given topic, you can reach out to them alone. There is no need to email all experts – it may be better to talk to some at stage 5 to focus on a specific idea.
Expert interview (1-2 interviews, 1h15min-2h45min). Prepare questions before the interview. Ensure the questions are tailored to the ToC and that you have plans for what to probe or dig deeper into during the interviews. Although this round is focused on critical uncertainties, the interviews should be more general than that: cover a range of topics, ask for other people to talk to, get consent to take and use notes, etc. (Treat S4 interviews as if they were S5.)
Expert view - Speak to 1-2 experts in the area to get broad information (20min to arrange and write questions + 30min-1hour interview each)
Record interviews
Summary write-up – 15min
Aim to list the top 1-6 crucial considerations/key assumptions that need research. You can do some (but not all) of the following to generate crucial considerations:
Look at key considerations (crucial considerations) from the previous stage (10min)
Look at stage 3 CEAs to figure out which numbers are most uncertain and which will make the biggest difference to the final answer.
More detailed ToC and intervention description (15 min)
And identify key assumptions (10 min)
Do a COM-B ToC assumptions analysis (30min)
Some typical critical considerations include:
Neglectedness (E.g., How many people are working in this space? Is there funding for more actors?)
Tractability (E.g., How likely is it that a charity can achieve X? Have there been any attempts to achieve a policy Y? Have those succeeded? Do people have the time and money to do Z?)
Externalities (E.g., Is it possible that promoting behavioral change Y backfires and causes harm X? Is there a chance that an increase in the consumption of Z displaces other purchases?)
ToC (E.g., Is it possible for a CE-style charity to have the technical skills to deliver this intervention?)
Are there high start-up costs?
Any limiting factors (E.g., Does this burden affect a large enough number of people? Would it be possible to conduct this intervention in a large number of countries?), etc.
Harms matter. Make sure you have considered if there are any potential harms to an idea.
What intuitively feels most likely to kill an idea (5min)
Google evidence against (30min)
Choose 2-4 critical uncertainties per idea (min 1, max 6).
Make sure they are truly decision-critical. Sometimes it can be tempting to research implementation questions that are not likely to be decision-critical at this stage.
Resolve these critical uncertainties in whatever way is best. This can mostly be done through desk research and interviews.
Remember: As with all AIM research, you can stop this process early if an idea no longer looks promising after investigating one key uncertainty.
Another reviewer will read the stage 4 reports and leave questions and comments on the analysis, particularly highlighting any crucial considerations that haven’t been addressed.
The round lead (or other) will convene a decision meeting (or several mid-stage meetings) to consider all research done up to this stage and prioritize the shortlist of ideas. At this stage, there is no specific need to select ideas; the decision meetings should give us:
A sense of the priority ideas on which to conduct deep reports.
Decisions on cutting ideas that are outright not worth reviewing.