Writing About Findings with Context
A guide for secondary researchers from dataset acquisition to publication
Written by the CHIRON Project Team
Published on July 8, 2024
Very few scientists wake up in the morning wanting their research to harm communities. Yet communities continue to be harmed by research with surprising frequency.¹ These harms may be social in nature: dignitary² or representational, as when research reinforces painful stereotypes about a community.³ Research may also cause concrete harm; for example, allocation harm occurs when research findings reduce a community’s access to care, as Obermeyer and colleagues demonstrated in a landmark paper.³ Key to mitigating the risk of harm from our work is rigorous contextualization of our findings, which begins with the very first steps of determining an appropriate study design and extends all the way to communicating findings to others. What does this mean in practice?
The first step in properly contextualizing study findings comes at the very beginning of the study: researchers must think critically about exactly who is in their dataset(s). Study population characteristics and potential biases are particularly important for secondary researchers, who use data collected by others in ways that may introduce biases the original study was not designed to handle. Data is valuable and expensive to collect, and the ability to re-use existing data opens up a wide range of possible investigations and can lead to important discoveries that were not part of the original study design. However, any given dataset a researcher re-uses may be poorly suited to the specific scientific question that researcher is asking. Each researcher needs to fully understand the population under study (Who is in the dataset? Who is left out? What scientific question was the data initially collected to answer? Is there a control or comparison population included? What are the potential biases? Is this population representative?) and think carefully about how to design their study to address potential bias. A given hypothesis might not be appropriate for the available data, and the researcher will need to consider carefully how to formulate a new hypothesis that is, or even whether the study is truly feasible given the available data. One last point when considering population characteristics is sample size: when samples are too small, one or a few outliers can skew the results in surprising ways, even under appropriate statistical models.
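The small-sample point can be made concrete with a toy calculation. The numbers below are invented purely for illustration; the sketch uses only Python’s standard-library `statistics` module:

```python
# Toy illustration (invented numbers): a single outlier in a small
# sample can dominate summary statistics.
import statistics

small_sample = [5.1, 4.9, 5.0, 5.2, 4.8]   # n = 5, tightly clustered values
with_outlier = small_sample + [25.0]       # same sample plus one extreme value

print(statistics.mean(small_sample))    # about 5.0
print(statistics.mean(with_outlier))    # about 8.33: one value drags the mean up
print(statistics.median(with_outlier))  # about 5.05: the median barely moves
```

With thousands of observations the same outlier would barely register, but at n = 6 it dominates, which is why outlier checks and robust summaries matter most in small secondary datasets.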
Study design is equally important to properly contextualizing findings. Researchers who lack expertise in statistics and study design should include other scientists who have it when formulating an appropriate hypothesis and developing an analysis plan. The scientific question of interest needs to be carefully considered in light of the available data, the groups being compared in the repurposed study versus the original study, and the analytical approach. Some bias can be accounted for through proper statistical modeling, though other types of bias should be transparently investigated and presented. For instance, many studies simply “adjust for sex” in their analyses, since many variables differ between sexes and genders and many diseases are differentially impacted by sex. However, such adjustment may obscure potentially important variations in results. Furthermore, sex and gender data is often messy, depending on whether it was reported by the participant or the clinician and on what terminology was used, raising the possibility of misclassification.
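The risk of averaging away subgroup differences can be sketched with a toy example. All values below are invented, and the pooled mean stands in for a fully adjusted model; the point is only that a single pooled estimate can mask opposite effects that a stratified presentation would reveal:

```python
# Toy illustration (invented numbers): a pooled summary can hide
# opposite treatment effects in two subgroups.
treatment_effect = {
    "female": [2.0, 1.5, 2.5],     # improvement under treatment
    "male": [-1.5, -2.0, -1.0],    # worsening under treatment
}

# Pooled estimate across everyone: looks like almost no effect.
pooled = [x for effects in treatment_effect.values() for x in effects]
print(round(sum(pooled) / len(pooled), 2))  # 0.25

# Stratified estimates: the effect differs in sign between groups.
for group, effects in treatment_effect.items():
    print(group, round(sum(effects) / len(effects), 2))  # female 2.0, male -1.5
```

Reporting the stratified estimates alongside any pooled or adjusted result is one simple way to investigate and present this kind of heterogeneity transparently.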
An appropriate study design also needs to carefully define the outcome of interest (such as onset of a disease, or a specific treatment effect), as well as the risk factors and other variables under study. Misclassification, while sometimes unavoidable, is an insidious form of error and bias that can lead to erroneous results, or results that do not measure exactly what one might think they measure. Some questions to ask when examining the data: How were these variables collected? Is there a clear definition of each variable? Does it match what might be expected, and is it in accord with other studies? If variables are being used to “lump” or “split” participants, are they defined clearly enough to merit inclusion in or exclusion from various groups?
An extremely important component of study design is understanding what comparisons are being made. Most often, researchers are interested in comparing a group with a condition or risk factor to a group without it. Ideally, these two groups should be equivalent in all other variables (similar in age, sex, other clinical markers of health, socioeconomic makeup, etc.). Dhejne et al. 2011, a study that set out to investigate mortality and criminality in transgender people in Sweden, is an example of how an inappropriate comparison group can lead to group harm. In this study, trans people who had received gender-affirming care were compared to cisgender individuals, yielding a result showing that transgender people who had received care had higher rates of suicidal behavior and psychiatric illness than cis people, and opening the door to highly distorted and transphobic interpretations of the work. The results themselves aren’t surprising, given the high rates of discrimination and abuse toward transgender individuals during the study period, regardless of whether they had sought care. A more appropriate control group would have been transgender individuals who did not receive gender-affirming care; alternatively, the authors could have assessed health outcomes in the same patients before and after care to determine whether gender-affirming care in itself had any benefit. Instead, bad-faith actors interpreted the study to support their claims that gender-affirming care causes suicide and should be banned. While the authors did try to contextualize the findings, the worst distortions of the study would not have occurred had the inappropriate comparison not been made in the first place.
After researchers complete a transparent and carefully considered analysis, they must then work to properly contextualize their findings. Researchers most often present findings through peer-reviewed publications, but today’s communication landscape offers far more opportunities to communicate and publicize findings than ever before, such as social media and professional online groups [For more information, see this piece about popular communication]. While the growing number of communication pathways can be a boon for community outreach and inclusion, it can also pave the way for outside entities to interpret research findings according to their own agendas. Researchers may not control what others say, but they can present their findings thoughtfully and give outside actors as few opportunities as possible to misrepresent their work. Researchers should also interrogate their own biases as they write up their results: Were there any preconceived notions about how the analysis would turn out? Did the findings support the initial hypothesis? Does the way the findings are presented reinforce stereotypes? It may be helpful to enlist individuals from the community under study to read over the results. One example of community input into the analysis and dissemination of complex scientific ideas is the African Ancestry Neuroscience Research Initiative, whose researchers involved community leaders from recruitment all the way through to publication.⁴ The initiative’s first major publication was recently released, detailing how genetic ancestry impacts gene expression and DNA methylation variation in Black Americans. To avoid reinforcing old stereotypes about biology and race, and to shift the focus onto the complex interplay of environment and genetics, the researchers sought feedback from scientists from Black in Neuro to help communicate their findings.
As the example above demonstrates, the social context of research findings should inform how they are presented, especially when investigating typically underrepresented communities. Numerous examples of social harms exist in the scientific literature, perhaps few as infamous as the genetic research performed on blood samples from the Havasupai Tribe, which extended far beyond the initial scope of the study the tribe had agreed to and resulted in a lawsuit for monetary compensation and the return of DNA samples.⁵ Given the widespread impact of this case on AI/AN and First Nations communities, along with other documented harms of scientific research to AI/AN communities, many researchers are now embarking on fully collaborative and transparent research initiatives. One such initiative is the Northwest-Alaska Pharmacogenomics Research Network (NWA-PGRN).⁶ Researchers at the University of Washington with expertise in pharmacogenomics have partnered with community leaders across Alaska, Montana, and Washington to promote pharmacogenomic research that benefits AI/AN people. In this partnership, tribal authorities work with researchers and community groups to set research goals, educate community members, determine the boundaries of research, and establish bidirectional communication and learning throughout the research process, including culturally appropriate ways to disseminate research results. Tribal authorities also have oversight and approval of research processes at each community partner site (and any site can opt out).
Lastly, scientists should consider how their own hopes and expectations, especially those pertaining to their scientific careers, might shape the context of their research. Some research is “hypothesis-free” or exploratory, which can be incredibly valuable but can also lead to false positives, cherry-picking, and confirmation bias. Exploratory research should be transparent, should be replicated whenever possible, and should use methods to control for false positives (such as correcting for multiple comparisons). Many of the contextual pointers above apply to exploratory research, especially those on study design and communication of results. When research has a set analysis plan with a hypothesis to test, researchers should prepare for what they will do if the results aren’t what they expect. Scientists typically discuss how their study fits into the context of prior work and report their study’s strengths and limitations. However, these nuances are frequently lost once a study is reported on by the media (whether science-oriented or general-audience). Researchers must be mindful: if they overemphasize their findings or present them out of context, the results can easily be misinterpreted. The drive to publish something splashy is strong, but so are the potential downstream harms of emphasizing one angle of the findings at the expense of nuance.
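One of the simplest corrections for multiple comparisons, the Bonferroni procedure, can be sketched in a few lines. The p-values below are invented for illustration:

```python
# Toy illustration (invented p-values): the Bonferroni correction divides
# the significance threshold by the number of tests performed.
p_values = [0.001, 0.012, 0.03, 0.04, 0.20]
alpha = 0.05
m = len(p_values)

# Uncorrected: four of five exploratory "findings" appear significant.
uncorrected = [p < alpha for p in p_values]
print(uncorrected)  # [True, True, True, True, False]

# Corrected: each p-value is compared against alpha / m instead.
corrected = [p < alpha / m for p in p_values]
print(corrected)    # [True, False, False, False, False]
```

Bonferroni is deliberately conservative; less strict procedures such as Benjamini–Hochberg control the false discovery rate instead, but the underlying point is the same: the more comparisons an exploratory analysis makes, the more skeptically each individual “finding” should be treated.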