How to Write Science - IV. Introduction

WRITING AN INTRODUCTION SECTION FOR A SCIENTIFIC RESEARCH PAPER

MOTIVATION

Let’s address the elephant in the room: Yes, I really do prefer writing my Introductions last (although, alas, it's not always practical). Yes, I know that’s what readers read first. No, I don’t think it's a common approach, but, yes, I think it should be!

To be clear, I outline my Introduction as soon as I can, often before I ever start my test. No one's saying you should sit down to write your Introduction without giving it any prior thought! But actually writing the Introduction section? Like, for keeps? Yeah, no, I don’t even want to touch the Introduction until I have to!

The reason is simple: the Introduction is the hardest section to write. That’s my earnest professional opinion! If you're reading this because you soon must write an Introduction of your own, you have my condolences! But there's good news here too: the Introduction is often the shortest section by length, and it's also very formulaic.

Wait…does that even make sense?? It might seem paradoxical—how can something be short and formulaic and hard to write? But that’s exactly it: because good Introductions are short and formulaic, but they also have such a massive job to do, they have to do a ton in a tiny space. They’re not hard because they're long, like the Discussion is—it’s the opposite: they're hard because they're so short!

Plus, as the lead-in to your paper’s story, an Introduction has to set the table for that entire story. That means you have to know what your story is (or likely will be)—you need a strong sense of how your story ends to start it meaningfully.

This is how I think Introductions often flummox us writers. Until our Results and Discussion are done, we usually don’t know exactly what our story is or how it ends! As you might imagine, that'll make writing our Introduction so much harder; at best, any Introduction draft we write before that point will be just our “best guess” for what the Introduction ultimately needs to be.

I’ve been wrong about my story enough times—and thrown out enough Introduction drafts because of it—to be leery of going into the Introduction too early. If you don’t mind that possibility, great—write your Introduction as early as you please! Me? I only want to write an Introduction once, so I wait until I have the best odds of success to do it.

MEETING THE PLAYERS

I said above that Introductions are very formulaic—good ones tend to contain specific things in a predictable order. So, let’s peek under the hood of an Introduction to see what these components look like. They are, in (more or less) this order:

1. The “Large Known” (1-2 sentences, rarely longer, very often the first sentence(s)).

2. The “Large Unknown” (Often, the rest of the first paragraph, perhaps some/all of the second).

3. The “Small Unknown” (Often, the first sentence of the second/third paragraph, rarely longer, often a topic sentence and a transition sentence rolled into one).

4. The “Small Knowns” (Usually, the next several paragraphs and the bulk of the Introduction as a whole).

5. “Justification” (Variable in length and placement. Sometimes as short as one sentence, other times a paragraph or more. Sometimes found early, but more often placed near the end).

6. Questions, Hypotheses, and Predictions (Almost always the last paragraph of the Introduction, rarely longer).

If this list of items feels unfamiliar, you're not alone: most scientists I know had to “discover” these components as they developed as writers (myself included). This is one of those "unspoken patterns" in science I'm trying my darnedest to spell out explicitly here.

I’ll explain each part of this "Introduction formula," and I'll also show how I tackle writing them, in a minute. First, notice that an Introduction almost always ends with a paragraph focused on your questions, hypotheses, and predictions (or, sometimes, related concepts like “goals,” “aims,” “objectives,” etc.).

Thus, these are what an Introduction “builds up to.” This is why we must be able to clearly articulate these items before drafting our Introduction. It’s hard to write a good Introduction draft if we can’t confidently state what it’s introducing, right? Even though I skip much of the rest of the Introduction until later, I still draft this last paragraph before writing anything else—it’s that important.

Another thing to note: If you’re keeping score at home, you’ll notice a well-crafted Introduction is only about 8ish paragraphs (sure, there’s a lot of variance, but 8 is a perfect target for a first draft). That’s only about 2 single-spaced pages. That’s not much! Like I said—good introductions are not long; they’re dense. They embody the golden rule of science writing: “Make it as long as it needs to be and no longer.” If you are no longer writing something new, relevant, and interesting, stop!

INTRODUCTION SECTION CHOREOGRAPHY—WHAT GOES WHERE AND WHEN

So, how do we use the formula I summarized above to write an Introduction? To answer that question, I think we first need to answer this one (it might sound familiar): What does an Introduction do? Why must we always have one?

Again, I think there are many valid viewpoints here, but, to me, this is what a good Introduction does. It...

1. Clearly states what we, the scientific community, do and don’t know. It does this by reviewing what previous studies have already demonstrated (“knowns”) and, just as critically, what they have not yet demonstrated (“unknowns”).

2. It brings the reader up to speed on all relevant past work. It summarizes what the reader needs to know about past work to understand and evaluate your current work. Think of the Introduction as a service: you're trying to save the reader the time of finding and reading all the same papers you’ve read! You don’t have to tell them everything, of course, but you should tell them the gist so they can consider your research in light of what has come before it.

3. It argues your study was effort well-spent, and the reader's attention will be too. Think of the Introduction as your “pitch”: why should the reader keep reading? How will doing so be worth it to them? Why should the field care?

4. It prepares the reader for the Methods they'll read next. In most journals (the not-weird ones, IMO!), the Introduction is followed by the Methods. By the time the Introduction is done, then, it should be super clear what your study's motivating questions, hypotheses, and predictions were. As a reader, I should then be able to guess what you might've done, before even starting to read your Methods. That’s the dream!

I think the fact that Introductions must satisfy all four of these demands to be successful is why they're so formulaic—by containing each element of the formula I outlined above, an Introduction almost can’t help but meet these demands successfully. It's an efficient and time-tested way to "check all the boxes."

So, let’s meet these formulaic elements. The first I call the “Large Known.” Science never starts at “zero.” Every project starts from a place of knowing something. So, the best Introductions almost always start with a simple, declarative sentence (or a couple) that confidently states something we are pretty sure is true.

For example: “The survival of a plant species depends on ensuring seed dispersal to microhabitats favorable for germination.” That is a highly defensible statement—I think Darwin himself would've loved it, and a hundred years of research back it up. It’s a firm foothold with which to start our collective climb.

Then, we hit the reader with what I call the “Large Unknown.” Of course, there's so much we don’t know yet. If we knew everything, science itself would be unnecessary! More practically, a paper that presents nothing new is not very interesting.

So, we need to explain to the reader what we don’t know, as a community, underpinning our current inquiry. What is the “grand question” our research is one small step towards answering? Not only does conveying this to the reader orient them to the purpose of your work, it also can capture their curiosity so they keep reading. That's key, as we'll discuss.

For example, the second and third sentences in an Introduction might look something like this: “However, how does a seed ‘know’ it's ‘time’ to germinate, once it's landed in a favorable microhabitat? While we know seeds can sense their environment, how they do so is still poorly understood.”

I call this element the “Large Unknown” because it’s a large knowledge gap that lies directly adjacent to something we do know—it’s the “frontier on the edge of civilization,” so to speak. The reader should read your Large Unknown and think: “Wow, I thought we did know more about that…we should know more about that!” If that’s their internal monologue, you’ve nailed this aspect.

Of course, if your “Large Unknown” is large enough, there's no way you’d be able to resolve it with just one study (or even your whole career!). You’re not a superhero, after all (are you?!).

So, once the "Large Unknown" has been introduced, we need to transition to the “Small Unknown.” This is your personal chunk of the "Large Unknown" that, by filling in, will get us one step closer to filling in the "Large Unknown." It’s your shovelful of dirt to pour in the canyon, so to speak.

To put an even finer point on it, the "Large Unknown" is the big, scientific “Grand Challenge” the field is grappling with—like how seeds sense when to germinate. But no single study can tackle a question that gigantic. That’s where your "Small Unknown" (and your study) comes in: it’s your research’s slice of that massive pie—specific, local, and achievable.

Continuing our example from above, our "Small Unknown" might be: “In Minnesota, bloodroot (Sanguinaria canadensis) is a sparse native woodland wildflower that germinates in early spring, when conditions may range from ideal to inhospitable. How this taxon’s seeds accurately sense that fatal nighttime temperatures have passed is unknown.”

See what we did there? We took our "Large Unknown" and we zoomed into a specific place, time, taxon, and “thing to be sensed." This "Small Unknown" is clearly a part of the "Large Unknown," even though it doesn’t come close to addressing it fully. However, if enough people researched similar questions in enough different settings, we might just put the Large Unknown to rest someday!

You’ll notice I implied some details in my "Small Unknown" too, such as that bloodroot is relatively rare, its seeds germinate in the spring, and that freezing temperatures can kill young plants. These are all examples of what I call “Small Knowns”: things we do know that abut the "Small Unknown." They're things that help us “guess” what the answer to the "Small Unknown" ought to be; that is, they inform our hypotheses. Because one of our goals in the Introduction is to bring the reader up to speed and another is to prepare them for our questions, hypotheses, and predictions, a lot of what an Introduction must do is present the relevant "Small Knowns."

It’s no wonder, then, that doing so very often takes up half or more of the Introduction's volume! While many other formulaic elements of an Introduction take up less than a paragraph, the "Small Knowns" rarely occupy less than several whole paragraphs in total. Moreover, each "Small Known" usually deserves its own paragraph (at least). Packing too many concepts together often leads to shallow summaries, full of soft evidence or no evidence at all, that'll neither prepare nor convince the reader.

Of course, the reader doesn’t need to know everything; they just need to know what’s relevant to our study. It can be hard to know what's relevant at first, which is partly why I like saving the Introduction for last, when I have as much clarity on the issue as possible.

"SMALL KNOWNS," "TOPIC SENTENCES," "SANDWICH PARAGRAPHS," AND "HARD EVIDENCE"

Ok, so how does one present "Small Knowns"? In what I call “sandwich paragraphs.” If you’ve already read the Discussion page of this guide, these shouldn't be new! The structure of a “sandwich paragraph” is so universal you'll find dozens in the Introduction and Discussion of nearly every paper you read, and for good reason. Once you learn to recognize them, you’ll see them all over.

Although “sandwich paragraphs” can be more complicated, they usually have just three parts:

1. A topic sentence (more on this in a second!)

2. “Meat.” By this, I mean relevant (preferably hard) evidence (plus any relevant context!) from other studies, here to establish our "Small Knowns" as, in fact, "knowns."

3. A “so what” or “recap” or “transition” sentence to put a decisive punctuation mark on this key idea and move us to the next.

So, it’s a bunch of facts sandwiched between two structurally essential “slices of bread!” Maybe it has some toppings—like logical transitions and coordinating ideas—to make the whole thing go down easier, but those elements are smaller and secondary.

The first thing we need to unpack, then, is the concept of a topic sentence. What's a topic sentence? Oh, you know, it’s only the most important type of sentence in all of science writing.

Allow me to explain. On the Preface page, I argued creative writing is almost entirely unlike science writing. When writing a fiction story, for example, it’s often desirable to not tell the reader things. You build suspense and intrigue and drama by withholding information, by misdirecting the reader, and by foreshadowing rather than outright stating things.

By contrast, in science writing, our only goal is to communicate knowledge. To do that, we must spoil everything and hide nothing. We embrace spoiler alerts in science!

In fact, that’s the perfect way to think about this: a great topic sentence is a spoiler—it gives away the entire point of the paragraph it starts. It should tell a reader enough that, if they chose to stop reading, they’d still walk away having learned something. And if your reader is getting bored, a great topic sentence might just buy you another paragraph to hook them again.

I can’t over-stress the importance of topic sentences. If there is a single improvement we could all make (veterans and beginners alike) to our science writing, it’s learning to write better topic sentences! In some ways, they're the thematic glue that makes scientific papers palatable to read. In fact, I’ve heard it said that a reader should be able to read only your topic sentences and still summarize your paper accurately. That’s how central to the communication process they are.

Anyway, now that I've outlined the anatomy of a "sandwich paragraph," let’s see one in action. Here's one from another of my own publications (Spero et al. 2018):

Consistent with our predictions, student grades were also marginally correlated with self-reported interest in ES [environmental science] (using plans to pursue an ES-related career as a proxy). This finding is not new, as many studies show that interest in course content is a robust predictor of student performance across both subjects and grade levels (Ferrell, Phillips, and Barbera 2016; Harlow, Harrison, and Meyertholen 2014; Schiefele, Krapp, and Winteler 1992). Taken in total, these results [suggest] that ES course instructors can elevate student performance by employing teaching practices targeted at increasing student interest. These practices could include linking the relevancy of course material to students’ personal lives (Hulleman et al. 2010; Hulleman and Harackiewicz 2009), connecting course content to student interests (Taylor, Mitchell, and Drennan 2009), and transforming course structure from content-centered to learner-centered (Russell et al. 2016). As an example, Russell et al. (2016) successfully used active learning approaches to increase student interest in an introductory college ES course such that students earned final grades that were 0.75 letter grades higher than students in the traditional lecture-based course. Our results add to the ever-growing evidence that student interest plays a consequential role in STEM classroom performance, now extended to ES coursework specifically.

This example is from a Discussion rather than an Introduction, but the idea's the same. I’ve put the topic sentence in italics (it's really a sentence and a quarter: the first sentence plus the opening of the second). Notice how bluntly it states the entire message of the paragraph. The reader almost needn’t read the rest of the paragraph unless they want to.

The “summation sentence,” here taking the form of a “recap sentence,” is underlined (it's the last sentence of the paragraph). It repeats what we’ve just argued (it reiterates the topic sentence using new words) so the message really sticks. Sometimes, you just have to repeat yourself to get the point across! However, it also states what we believe is novel about our study, and it signals to the reader that we’ve said our piece about this topic, so they should expect to see a new idea in the next paragraph.

The rest of this paragraph is “meat.” Here, we bring together results from other studies (plus our own) to lend support and context and "spice" to our argument. Notice that many of these “meat” sentences take the general form of “Person(s) X found Y when doing Z, and that does/doesn’t align with our argument because….”

The meat is essentially an elaborate listing of places the reader could go to find more (or less) support for our idea, plus a little context and “mayo” to make the whole thing palatable. That's the case for a "sandwich paragraph" in the Discussion, anyway. In the Introduction, the same approach is used to flesh out the "Small Knowns" by summarizing how questions related to ours have been answered in the past.

The bolded sentence in the example paragraph above (the one beginning “As an example, Russell et al. (2016)”) is, I think, a prime example of a “meat” sentence. It tells us the gist about a particular study and then presents hard evidence (some specific “numbers” from their Results, not their Introduction or Discussion!) from it to support our argument. Make a mental note of how this sentence works; it's a good model to follow.

I also cannot over-stress the importance of “hard evidence.” As I said passionately on the Discussion page of this guide, hard evidence is what's been shown, not merely argued. In other words, hard evidence comes almost exclusively from Results sections (not Introductions or Discussions) and is often (but not always) numbers or statistics. Anyone can say a thing, but data showing a thing are much harder to dispute!

In the example sentence called out above, we don’t just tell you that "student performance was increased" but by exactly how much (0.75 letter grades), relative to what (traditional lecture courses) and under what specific circumstances (an introductory environmental science course). That's what compelling evidence looks like in science, if I do say so myself!

A good target is one "sandwich paragraph" per critical "Small Known." Sometimes, you can work more than one into a single paragraph if they’re related enough. Other times, you might need more than one paragraph to flesh out a single complicated "Small Known." Think of each "Small Known" as a standalone fact or concept the reader needs to grasp and agree with to understand your question, hypothesis, prediction, and/or methods. Don't force unrelated ideas into the same "sandwich."

Another good guardrail is to not go past four "Small Knowns" paragraphs unless your study is really complicated and needs that much setup. If you feel you need more than that, I'd write the most critical four, stop, and solicit feedback. Maybe you’ve misjudged what’s essential and someone else can re-orient you.

Another consideration is: How much hard evidence do you have to fill your "sandwiches" with? A sandwich with no meat isn't very satisfying (an odd statement coming from a vegetarian)! The more sources and hard evidence you have, the more sandwich paragraphs you could have. That isn’t an excuse for not being judicious about what you present, of course, but it does establish the minimum length of this portion of the Introduction.

All this talk about “backup” reminds me: You have sources…don’t you?? I really wouldn’t recommend starting an Introduction draft without having completed your literature review! You can’t write good "sandwich paragraphs" without sources, so that means you can’t write at least half your Introduction without sources. In that case, what'd even be the point? Since I like to hold off (read: procrastinate) on conducting my literature reviews, this is another reason I tend to wait as long as possible on writing my Introductions.

CLOSING OUT THE INTRODUCTION

Whew—that’s the "Small Knowns" addressed (finally). We have just two more goals our Introduction must satisfy, and these usually take only about a paragraph each to nail.

The first is what I call your “Justification.” You’ve identified a “Small Unknown”—so what? Why was resolving this particular "Small Unknown" worth your time? Actually, more importantly, why is investigating this "Small Unknown" worth my time, as a reader? Or the field's time? Or your funder's money?

Again, imagine your reader is going to stop reading after this paragraph unless and until you convince them not to. Imagine no one will fund your next study unless you convince them, right here, this one was worth the money. Imagine your reader will only keep reading if there is something they can do, when they're done, to act on what you’ve taught them.

In our bloodroot example from (way) earlier, you might argue, at this point in your Introduction, that your study can be used to issue recommendations for when to plant bloodroot seeds in residential gardens. For some readers, at least, this'll be enough to keep them hooked; it’s an application of what your study has found.

For others, and in other contexts, you may have to articulate the theoretical benefits of your work—what it tells us broadly about how many plant species operate, for example. For still others and in still other contexts, you may need to explain how solving this "Small Unknown" creates a solution to a problem. Maybe it’ll revolutionize bloodroot conservation, which has to date been limited by poor seed germination in captivity. This part of your Introduction is where you really need to know your audience and provide the specific motivation they need to invest emotionally in your work.

The "Justification" portion of an Introduction can be brief (often just one paragraph), but there needs to be one. It should also be grounded: don’t oversell your project! It’s especially tempting for junior scientists to do this; I know because I was one and I totally did this. Don't make your Justification sound like a Nobel Prize acceptance speech! Resist the temptation to go much beyond the "Small Unknown" you're seeking to conquer.

After all the rest is sorted, we arrive at the last paragraph of the Introduction. It's both the most important one and also the easiest to write. In it, you state, as simply as you can:

1. Your question(s), often as more or less a straight list. “In this study, we sought to answer these three questions: 1) A, 2) B, and 3) C.”

2. Your hypotheses, also often just a list. “We hypothesized sensing of nighttime temperatures occurs via…”

3. Your predictions. You may need a few sentences each to do these justice. “We expect that, under abnormally warm spring conditions of 18 degrees C, seed germination will occur at significantly higher rates, resulting in abnormally high seedling mortality as measured by…”

4. Optionally, a “sneak preview” of your test. If this is included, it often precedes the predictions. Here, you'd say just enough that your predictions and methods don't feel entirely untelegraphed. “We subjected seeds to a range of nighttime cold temperatures under controlled greenhouse conditions and measured germination rates and seedling mortality over a course of six weeks…”

This order isn't totally hard and fast, but, if you deviate from it, it should be with good reason.

The key is that nothing presented in this paragraph should feel totally unexpected to the reader. The rest of the Introduction should have logically built to this moment, so your audience is ready. If, for example, there’s a fancy term you’ll use in this paragraph, it should have already been defined. If there's a core piece of evidence supporting a hypothesis, it should already have been unpacked.

You’ll know this paragraph is succeeding in its mission when a reader feels ready to jump into your Methods after reading it, confident they can already guess a lot of the things you’re about to tell them.

Table showing, in column one, the six core "ingredients" of an Introduction section. The second column houses shortened descriptions of these "ingredients." The third column lists how long these "ingredients" tend to be and where they are typically found within an Introduction.

SECOND OPINIONS (INTRODUCTION SECTIONS)

Here are some other great resources about writing Introduction sections.
