The truth is that most published graphs are valiant efforts, falling somewhere between “flawless” and “terrible.” That means most graphs succeed in many respects! However, it also means that most graphs have room for improvement.
On this page, I critique some graphs I have pulled haphazardly from ecology papers recently published in Open Access journals that I think prove the point above well. For each graph, I ask: What does this graph do well? Where would I have made changes, if it were my graph? Where might I do one thing while another reasonable person might do something different?
Then, I remake each graph the way I’d make it, using simulated data (my graphs will not match the originals perfectly for this reason!). These remade versions are not objectively “correct;” instead, they reflect my interpretation of the graph’s intended purpose as well as my view of how to best apply the principles laid out on the Principles page.
You might disagree with some of my choices! I’ll be the first to admit that I am apt to take design risks and break conventions compared to the average graph designer. However, I try to be upfront whenever I think I’ve done something “radical.” If you do disagree with some of my decisions, that’s ok—it means you’re thinking intentionally about graph design, and that’s what we want! Plus, there is room for subjectivity, personality, and experimentation within the data visualization space. Not everything I choose to do may "work," in your opinion, and you'd be entitled to that opinion! However, I lay out my reasons for every decision so that you can trace my choices back to the (more objective) principle they reflect.
I’ll also be the first to concede that I am probably more skilled with ggplot2 than the average user. There’s, of course, a big difference between knowing the graph you want to design and actually being able to make it! However, none of the graphs on this page took more than ~3 hours to produce from start to finish, even with multiple feedback cycles! Also, with a few optional exceptions, none of them use features found in a package other than ggplot2. So, my versions are not “highly advanced” in an implementation sense; anyone could learn to make graphs just like these in a day or two.
Plus, I want to stress that each of these graphs was as much an exercise in trial and error for me as it was an exercise of knowledge—all great graphs are, no matter your skill level! I frequently had to ask ChatGPT how to manipulate specific features or achieve a desired effect. It's not like I knew how to achieve everything I wanted without having to look it up. So, recognize that you may struggle at times when designing your graphs, you may have to ask for help, and you may have to sink a few hours into each new graph you want to make, but I’d argue that those are fair prices to pay for producing top-notch graphs you can be proud of!
All code needed to create the data and graphs on this page is available through my GitHub account.
The original figure caption is as follows:
"Proportions of different nest material types in the entire nests of (a) the four smaller bodied bird species, and (b) the three larger bodied species, collected from Montague, MA, USA in 2017 and 2018. Bars represent means and error bars are ±1 SE. AMRO, American robin; CHSP, chipping sparrow; CSWA, chestnut-sided warbler; EATO, eastern towhee; FISP, field sparrow; GRCA, gray catbird; PRAW, prairie warbler."
This graph seems to me to be well-connected to its purpose—the purpose seems to be to show that material types varied widely across species and a wide variety of nest materials were used in general, and that is the vibe I pull from this graph.
The caption differentiates between the subpanels and provides location, methodology, and time context about the data shown. The caption also alludes to the fact that “MA” refers to a US state (though the state should be spelled out fully), and it clarifies the species abbreviations, the types of data shown (subgroup means rather than raw data), and the statistics shown (standard error bars).
The use of subpanels here partitions the data into more manageable servings, using a key grouping variable (species size) to do it.
Capitalization is consistent throughout, always using “Sentence case.”
Subpanels possess shared design features.
The axis titles are understandable.
The x-axis labels are sorted alphabetically, making it easier to locate elements of interest (though this could be explicitly highlighted by the caption, as I did not realize it at first).
Axis labels and legend key labels are a readable size (though all text could be larger still, and titles could be bolded to increase contrast).
To be honest, this graph made me do some soul-searching! What I finally arrived at is this: In my opinion, this graph violates a cardinal rule of great graph design: it overwhelms. There are 17 materials * 4 species = 68 bars on this graph. Comparing all 68 bars to each other would take 2,278 pairwise comparisons. That's far more comparisons than anyone could keep straight in their mind. If each comparison takes about a third of a second (so, about 3 per second), that's ~760 seconds spent just comparing bars. That's over 12 minutes! And this is just comparing means; we haven't even added consideration of the error bars into the mix yet.
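The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the rate of about three comparisons per second is the rough assumption used above, not a measured value.

```python
from math import comb

n_bars = 17 * 4                      # 17 materials x 4 species
n_pairs = comb(n_bars, 2)            # number of unordered pairs of bars
comparisons_per_second = 3           # rough assumption, not a measured rate
seconds = n_pairs / comparisons_per_second

print(n_bars, n_pairs)               # 68 2278
print(round(seconds), "seconds, or about", round(seconds / 60, 1), "minutes")
```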
I assume this graph contains so much data in an effort to be transparent and complete. However, a graph is not an efficient nor effective way to convey this volume of data; graphs should primarily be about vibes. We can communicate the vibe of these data (i.e., that different species have different patterns of nest material usage) through use of a curated subset of the available data, making the rest available in some other format. In this case, I think it'd suffice to show one material that had the highest inter-species variation, one with the lowest such variation (for transparency/perspective), and ~2 materials with in-between variation. This would give the reader a sense of how much (and also sometimes how little) the species vary without presenting a lot of the same patterns over and over and inundating the reader in the process.
Alternatively, to reduce cognitive load, you could bin related material types together, or you could divide the plot into more subplots to make the volume in any one panel more "bite-sized." But, here, I think being more curated is the best approach.
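One way to make that curation reproducible, rather than hand-picked, is to rank the materials by how much the species differ and then take the lowest, the highest, and the ones nearest the quartiles. Below is a minimal Python sketch of that idea. The function names, the material names, and the numbers are all hypothetical, and measuring "variation" as the gap between the largest and smallest species mean is my assumption for illustration; other measures would be just as defensible.

```python
def spread(species_means):
    """Gap between the largest and smallest species mean for one material."""
    return max(species_means) - min(species_means)

def pick_representatives(means_by_material):
    """Return the materials with the lowest, ~first-quartile,
    ~third-quartile, and highest spread, in that order."""
    ranked = sorted(means_by_material, key=lambda m: spread(means_by_material[m]))
    last = len(ranked) - 1
    ranks = [0, round(0.25 * last), round(0.75 * last), last]
    return [ranked[r] for r in ranks]

# Hypothetical species means (percent of nest mass) for nine made-up materials
example = {
    "material_a": [1, 1, 2, 1],
    "material_b": [0, 5, 2, 9],
    "material_c": [3, 3, 6, 4],
    "material_d": [2, 4, 2, 2],
    "material_e": [0, 7, 1, 3],
    "material_f": [5, 5, 9, 5],
    "material_g": [1, 6, 3, 2],
    "material_h": [0, 8, 4, 4],
    "material_i": [2, 8, 2, 5],
}

print(pick_representatives(example))
```

With 17 materials, the same function would pick the materials ranked 1st, 5th, 13th, and 17th by spread.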
A second significant issue with this graph is that it's the wrong type of graph for the data shown. Bar graphs are inappropriate for continuous data (i.e., if we're plotting means and standard errors instead of frequencies or counts, it's the wrong graph). This is because bar graphs hide the shape of the underlying data; many different data sets can yield the exact same mean and standard error.
One could, in this case, question whether or not that matters. Perhaps the means (the foci of the accompanying analysis) are all that's really necessary for the reader to get the "vibe." However, since there seems to be such a significant effort made here to be transparent, I'd like to keep that spirit and try to get the raw data on here too, if I can do so without overwhelming the viewer. I attempted this by making the raw data thin, light gray vertical bars in contrast to the prominent dots used to plot the means. So, my version is more of a "dot plot/bar code plot hybrid."
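To make concrete the point that a mean and standard error can hide the shape of the data, here is a small Python sketch with two made-up data sets (not the paper's data). One has a single peak with a long right tail; the other is two separate clumps with nothing in the middle. Both produce exactly the same bar and error bar.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    return mean(values), stdev(values) / sqrt(len(values))

skewed = [2, 4, 4, 4, 5, 5, 7, 9]      # one peak, long right tail
two_clumps = [3, 3, 3, 3, 7, 7, 7, 7]  # two clusters, no values near the mean

print(mean_and_se(skewed))
print(mean_and_se(two_clumps))
```

Plotting the raw values alongside the means is what lets a reader tell these two situations apart.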
Even if you disagree with the previous bullet and believe a bar plot is permissible here, I'd argue you should still consider eliminating the "bar" part of the bar plot. The only part of a bar plot that actually conveys information is the tip/end. So, unless we are representing length-based data, which makes assessing the length of the bar more intuitive, the bar itself is wasted ink, and marking the mean with just a line or point ("dot plot") is preferable.
The original graph has symmetrical standard error bars. I suspect this was an (understandable) oversight; variability is usually not symmetric for proportional data because they have "walls" at 0 and 1, and all these data are close to 0. Calculating an asymmetric equivalent is possible, but, here, I'd argue it's unnecessary because I don't believe the error bars are adding enough context to justify the complexity they add. I think we are better off plotting the raw data to show the variance instead.
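For the curious, here is one common way to get an asymmetric interval for proportions near 0: compute the mean and standard error on the logit scale, then back-transform. This Python sketch is an illustration of that general approach under my own assumptions, not the method used in the paper or in my remake, and the proportions are made up. Note that it requires every value to be strictly between 0 and 1, so real data containing zeros would need extra handling.

```python
from math import exp, log, sqrt
from statistics import mean, stdev

def logit(p):
    return log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + exp(-x))

def se(values):
    return stdev(values) / sqrt(len(values))

props = [0.01, 0.02, 0.02, 0.03, 0.05, 0.10]  # made-up proportions, all in (0, 1)

# Symmetric: mean +/- 1 SE on the raw scale
m = mean(props)
sym_low, sym_high = m - se(props), m + se(props)

# Asymmetric: mean +/- 1 SE on the logit scale, then back-transformed
z = [logit(p) for p in props]
center = inv_logit(mean(z))
asym_low = inv_logit(mean(z) - se(z))
asym_high = inv_logit(mean(z) + se(z))

print("symmetric: ", sym_low, m, sym_high)
print("asymmetric:", asym_low, center, asym_high)
```

The back-transformed interval has a longer upper arm than lower arm and can never cross 0. Its center is the back-transformed logit mean, which is not the same number as the arithmetic mean.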
Reading the x-axis labels in the original plot is difficult because they are long and rotated 90 degrees. Since there is not a strong independent-dependent variable relationship between our x- and y-variables here (the real independent variable of interest here is probably species, which is mapped to color hue), I'd flip the axes to put the material types on the y-axis instead. This'll make it easier for us to orient these labels horizontally for cleaner reading.
I'd expand font sizes and bold axis titles a bit for greater readability.
Color hue is an efficient channel to use to separate groups here, but the palette used in the original graph is not colorblind-friendly. I’d switch to the viridis palette, though I’d avoid that palette’s yellow tone because it would lack contrast against the white background.
The use of species abbreviations in the legend is unnecessary and will lead to “eye-darting” between the figure and the caption. I’d use the full names instead, and I'd switch these to the scientific names to respect a potentially global audience who may not recognize the American common names for these species. Lastly, I think the legend's original position was space-inefficient and put the legend too far out of view, so I'd move it to be a horizontal stripe above the graph so that it draws the reader's eye first rather than last.
Vertical gridlines in the original graph both increase cognitive load and interfere with reading individual bars while providing no essential benefit. The horizontal gridlines, meanwhile, added some value but I'd de-emphasize them via desaturation to make them easier to ignore.
The y-axis labels were too crowded in the original to be easily read. There's also a missing label at the upper end, making the scale feel "truncated." I’d expand the axis limits and adjust the break points so that a label appears at the end of the axis and the labels aren't too frequent. I’d keep the new labels equally spaced and at round numbers. I’d also switch the units for this axis to percentages, which can be whole numbers instead of decimals and thus less bulky.
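The break-point logic described above (round, evenly spaced values, with a label at each end of the axis) can be sketched in a few lines of Python. The function name and the example values are hypothetical; plotting libraries have their own helpers for this, so treat this only as an illustration of the idea.

```python
from math import ceil, floor

def padded_breaks(low, high, step):
    """Evenly spaced breaks at multiples of `step` that cover [low, high],
    so the axis starts and ends on a labeled value."""
    start = floor(low / step) * step
    stop = ceil(high / step) * step
    n_steps = round((stop - start) / step)
    return [start + i * step for i in range(n_steps + 1)]

proportions = [0.02, 0.11, 0.23]           # made-up values on a 0-1 scale
percents = [100 * p for p in proportions]  # whole-number percentages are less bulky

breaks = padded_breaks(0, max(percents), step=10)
labels = [f"{b}%" for b in breaks]

print(breaks)
print(labels)
```

Here the largest value (23%) falls between round numbers, so the axis is extended to 30% rather than stopping short of the data.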
In my opinion, the x-axis title ("Material type") is redundant here. Given that the other axis title already notes that what is shown is fractions of a nest's total mass, we can deduce that the x-axis shows things that could be used in a nest. Thus, I think this title could be safely omitted here. The same is true for the legend title; if we use full scientific names, it'll be obvious enough that the labels are species that we needn't use a title to clarify this further.
The caption lacks a call to attention, an encapsulation of the message this graph is trying to convey and where to look for it. Here, just a brief repeating of a part of the surrounding text could tie the graph to the argument this graph is supporting (see my version of the caption below).
This figure features different aesthetic choices than Figure 4 in the same paper. The authors could align their design choices more across figures so readers need to learn (more or less) only one set of visual “rules.”
The bounding box around the plotting area in the original has two lines (on the top and right) that don’t convey information even though they have the same visual weight as the two lines that do (the x- and y-axis lines). I’d eliminate those two unnecessary lines to reduce cognitive load.
The channels used (what visual aspects convey which information) could be stated explicitly in the caption via parentheticals. This is technically “repeating information,” but it can help some readers who parse that type of information more quickly from a text description than by interpreting that information visually.
It'll be important, as we make our version of this plot, that we pick an appropriate final width. Here, I'll size everything such that the figure will be a full-page width.
My revised figure caption (assumes this is a standalone graph and not a subpanel):
FIGURE XX. Percent of total nest mass (x-axis) accounted for by four representative nest materials (y-axis) used by four smaller-bodied bird species (colors). These four materials were selected to span the range (including the third and first quartiles) of interspecific mean differences among all 17 materials measured. Circles are species-specific means; vertical gray bars show observed values from individual nests. Horizontal gray lines link all observations from the same species and material type. Data were collected from Montague, Massachusetts, USA (2017-18). In a multivariate analysis (ANOSIM) of all 17 material types, mean proportional masses differed significantly between species (r = 0.93, p < 0.001), indicating high interspecific variation in nest material usage.
The original figure caption is as follows:
"Community comparisons between closed (dark grey) and open ecosystems (light grey) at the archipelago scale. (a) Species accumulation curves, (b) bar plots showing species frequencies as a function of the number of sites present, in the closed (left) and the open ecosystems (right). Boxplots showing (c) differences in the estimated number of species per site, (d) differences in the proportion of estimated endemic species per site and (e) differences in the percentage of ballooning species per site, for the open ecosystem and the closed ecosystem. "
This graph connects well to its apparent purpose—I think the purpose is to show that endemic species made up a greater percentage of all species within closed ecosystems, and that is the sense I get from this graph (although it could be emphasized somehow).
The caption clarifies differences between the subpanels and notes that this particular subpanel contains proportions of estimated endemic species per site type, not raw data.
This graph has a similar design ethos to other figures in the paper, such that readers can take what they've learned about how to read one graph into all the other graphs, giving them a head start towards understanding.
Boxplots provide more detailed distributional information than a bar plot of means and error bars would, providing greater transparency, so they're a good choice here. However, given that the purpose is to show a mean difference, and given that boxplots do not show means by default, it could be argued that a plot of means and error bars would have been a simpler and thus better choice here. Alternatively, we could simply add the means to this plot to facilitate their comparison.
The text is a readable size (though it could be larger still, in my opinion).
The y-axis labels extend to both ends of the axis, and the y-axis limits were not distorted to make the difference seem more substantial than it is (i.e., 0, the null hypothesis value, is still the low-end anchor instead of a value like 25%, which would have reduced the amount of void space within the plotting area but would have also distorted the results).
There is adequate white space between the axis labels and the axis titles and between group elements within the plotting area for easy parsing.
In the abstract, I like that the authors used color saturation instead of color hue to differentiate between groups, but, in practice, it comes with some baggage...see below.
Technically, the statistical comparison that accompanies this graph is one of a mean difference, but the means are not shown (just the medians), and the reader’s attention cannot thus be drawn to that difference here. I'd keep the plot a boxplot for transparency regarding the underlying distributions, but I'd desaturate them to de-emphasize them and, instead, add and emphasize the means so these can be quickly and easily compared.
The x-axis labels are vague in the original, and the caption does not clarify their meaning enough for the figure to stand alone. From the text, what’s meant by these terms, I think, is “non-forests” and “forests,” so these less-encoded terms could be used in the labels for greater clarity.
Several statistics appear on the graph (whisker length on the boxplots, outliers, and a significance code [asterisks]), but none of these are clarified in the figure or caption.
There’s substantial void space within the plotting area. Some of this is good; it's the result of choosing an appropriate lower y-axis limit. However, I'd seek to use or eliminate the rest, within reason. One might fairly argue I've gone too far in my version, though I’d defend all my choices. For example, I'd add significance data from the accompanying analysis as well as sample size info, both of which are currently missing from the figure. More radically, I'd eliminate the original x-axis line and place the labels within the plotting area to increase information density and reduce "eye-jumping."
There are two issues with the use of saturation as a channel for differentiating between the two groups here. First, there's not enough contrast between the fill color of the right-hand boxplot and its median lines. Second, the use of saturation as a channel here is redundant; we've double-mapped ecosystem type to both x-axis position and saturation here, which needlessly complicates the presentation. Eliminating saturation and using black and white would provide the greatest contrast between all elements.
Bounding lines (strokes) on the boxplots, as well as their whisker lines, could be thickened to increase contrast between them and their backgrounds, and the outlier points could be enlarged and filled transparently for easier interpretation.
The original y-axis title is difficult for me to interpret, and the caption does not clarify it well enough. It shouldn’t start with a symbol because it cannot be in Sentence case then. Lastly, the meaning of “Chao” (short for “Chao index”) is not sufficiently clarified in the caption (or in the text, for that matter!).
Speaking of the caption...
It lacks location and time context information as well as a brief description of the methodologies used to yield the data shown.
It says the graphs shown are boxplots and bar graphs, which is self-evident and should be removed.
It lacks a call to attention. Here, just referencing the statistical significance of the comparison suffices, I think.
This subpanel has a title, which is unnecessary given that that title conveys the same information as the y-axis title (actually, it does so inaccurately!).
The upper and right bounding box lines around the plotting area do not convey information, but they carry the same visual weight as the x- and y-axis lines, which do carry information. The former two lines could be removed to reduce cognitive load and “ink” usage.
In my opinion, at least, tick marks should appear on the y-axis to more clearly link specific values with specific break points (even though graphs are more about “vibes” than exact values). It just looks like something is missing without them, and their absence makes it harder to read the values of outliers, if that is of importance (and, if not, why plot them at all?).
Alternatively/in addition, some de-emphasized horizontal grid lines could be included, which would increase the reader’s ability to discern the values of outliers, median lines, etc., if we thought that our readers would want to do that. I've chosen to add both elements in my version, but I could see a valid case for using just one or the other (or even neither).
This is a little radical (though absolutely not without precedent!), but I would move the y-axis title to the top of the graph and turn it to read horizontally so that it is easier to read, given that it's a little long. This aligns with the graph design principle that, in general, we read graphs in Z-shapes from top-left to bottom-right, so we'd be providing key information in the upper-left corner where readers will encounter it quickly. After all, in some ways, the y-axis data are what our graph is "about!" I suspect the authors would like this option, too, as the original featured a plot title and a y-axis title conveying much the same information. Now, we have just one element fulfilling both roles.
My revised figure caption (assumes this is a standalone graph and not a subpanel):
Figure XX. The percentage of spider species that are endemic (y-axis) differed significantly between open and closed Canary Islands ecosystems (x-axis). Values were calculated by dividing estimated endemic spider species richness (Chao indices; cf. Oksanen et al. 2022) by total estimated richness based on specimens collected and genetically sequenced in March-April of 2019-21. Diamonds are group means; thick bars are medians; boxes show the interquartile range (IQR); whiskers extend to non-outlier minima/maxima; and circles are outliers (> |1.5 * IQR| beyond the quartiles). Statistics are from a two-sample t-test.
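Since the caption spells out the boxplot statistics, here is a minimal Python sketch of how those pieces relate to one another. The function name and the values are made up for illustration (they are not the paper's data), and quartile conventions differ slightly between software packages, so the exact numbers from a given plotting tool may not match this sketch.

```python
from statistics import quantiles

def boxplot_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),  # most extreme non-outlier values
        "outliers": outliers,
    }

percent_endemic = [10, 20, 22, 25, 27, 30, 31, 35, 80]  # made-up site values
summary = boxplot_summary(percent_endemic)
print(summary)
```

In this example, the value of 80 sits more than 1.5 * IQR above the third quartile, so it is drawn as an outlier point and the upper whisker stops at 35.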
The original figure caption is as follows:
"Relationships between community weighted mean values for N – Ellenberg's nitrogen indicator, LDMC – leaf dry matter content (mg g−1), and SLA – specific leaf area (mm2 mg−1) and the linear regression slopes from the time series analysis of NDVI changes through time. Vegetation type abbreviations: AFM Agrostis–Festuca–Molinia, Cl Cliff, DH Dry heath, Ea Eriophorum, Fr Festuca rubra, Hl Holcus, HA Holcus–Agrostis, HAS Holcus–Agrostis–Sphagnum, HP Holcus–Poa, La Lair, Lu Luzula, Mc Molinia, NJ Nardus–Juncus, NR Nardus–Racomitrium, Pl Plantago, PAF species-poor Agrostis–Festuca, RAF species-rich Agrostis–Festuca, Ru Rumex, Sp Sphagnum, WH Wet heath."
A scatterplot here is an appropriate graph choice; both the number of points and the amount of overplotting seem modest enough.
A black-on-white design ensures good contrast and is conventional for journal articles--there's no need to invoke color here.
Suppressing gridlines keeps cognitive load low (scatterplots, in particular, benefit from omitting gridlines because information density tends to be high in the plotting area already).
Points are reasonably large (though they could be larger still). No obvious instances of overplotting are occurring, so use of a solid fill color for the points is fair.
The x- and y-axis titles and labels are a readable size.
The x- and y-axis lines are sufficiently striking; the other border lines have been removed from around the plotting area to decrease cognitive load.
Axis tick marks are large enough to be noticeable without being distracting.
Use of plant community codes here for point labels instead of full names is appropriate, given the large number of communities and the lengths of their full names.
At first glance, the y-axis labels feel cumbersome. However, here, they are probably as good as possible, given the complex and partly unitless nature of the data being plotted (slopes from regressions).
The trendline does not extend past the range of the data, which communicates to the reader that extrapolating the trend beyond that range may be inappropriate. That's a nice touch.
The caption defines all the community abbreviations in a logical order (alphabetically).
The subpanels of this figure all feature the same y-axis scale, making comparisons between graphs easier. Their design choices also seem similar.
The purpose of this figure seems to be to emphasize a trend: As a plant community’s leaf dry matter content (LDMC) increases, the rate of increase in another trend decreases. This is a complex message for a single plot to convey, but I think this graph nearly does it. However…
The trendline feels visually de-emphasized relative to the points (thanks to the dashed style), which feels backwards.
The y-axis title is vague enough that I had to consult the caption several times to remind myself what it referred to.
The message is muddled by the addition of point labels, which clarify which plant community is represented by which points. This inclusion suggests that I should additionally consider how different communities vary in their leaf dry matter contents, and focusing on that source of variation conceptually de-emphasizes that there is a cross-cutting trend that applies reasonably well to all communities. It raises the question: What are the point labels for?
As noted above, point labels clutter the graph here, and it isn’t always clear which labels go with which points (a typical problem). They are also not large enough to be easily read. My read of the paper is that these labels do serve an important (though secondary) purpose in this graph, so we should try to improve them rather than eliminate them. One approach I'd try is to use the abbreviations (the current point labels) as the points themselves. This'd halve the number of elements plotted. For this, it’d help if the labels were all the same length, so we could reduce them all to exactly two letters. Also, mixing upper- and lowercase letters can sometimes be confusing (e.g., lowercase l looks like capital I in many fonts), so use of all caps would be clearer and more consistent in length. We can even slap a stroke (border) around the labels so they look a little like conventional points.
In my opinion, the authors have distorted the data somewhat by not including a y-axis value of 0 within their limits. 0, in regression analyses, is the null hypothesis value, and it's appropriate for that value to be visible in most cases. While extending the axis limits in this way adds considerable void space, it also communicates that the observed slope values are all some distance away from 0, which may support other arguments the authors are making. Plus, it makes the plotted trend seem only as dramatic as it actually is and not more so.
Statistics (a slope, an associated p-value, an R-squared, etc.) are implied by the inclusion of a trendline, but these are not presented in the figure or caption. At a minimum, the significance of the trendline should be presented somewhere. Given the ample void space available in the plotting area, it could be added there, where it would reduce the need for “eye-jumping.”
The associated text actually draws readers' attention to the amount of variance explained by the trendline in this graph, but that isn't reflected in the graph's design. An R-squared or CV value could be added to the ample void space in the plotting area to remedy this. Also, an uncertainty band around the trendline could be added to emphasize this message. One could argue that plotting the raw data is enough and the uncertainty band is overkill, but I’ve included it in my version to show you what’s possible, even though I’ve de-emphasized it by making it light gray.
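For readers who want to see where an R-squared comes from, here is a minimal sketch in plain Python of an ordinary least-squares fit. The function name is hypothetical, and the LDMC and slope values are made up for illustration; they are not the paper's data.

```python
from statistics import mean

def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares.
    Returns (slope, intercept, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variation in y that the fitted line accounts for
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

ldmc = [200, 250, 300, 350, 400]                       # made-up LDMC values
ndvi_slope = [0.0060, 0.0052, 0.0049, 0.0038, 0.0031]  # made-up NDVI slopes

slope, intercept, r_squared = simple_ols(ldmc, ndvi_slope)
print(slope, intercept, r_squared)
```

An R-squared near 1 means the points hug the trendline; a value near 0 means the line explains little of the scatter, which is exactly the kind of context an annotation inside the plotting area can supply.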
In the caption, the formatting for the plant community codes makes them difficult to parse. Equal signs could be used to separate codes from names, and semicolons could be used in place of commas to better mark separate items in such a long list.
Relatedly, the caption clarifies three community abbreviations that do not actually appear in the figure; these should be removed.
Both the x- and y-axes are truncated; neither has labels at its ends. The axis limits should be expanded and new break points assigned so that both axes have ends anchored by labels.
There should be more space between the x- and y-axis titles and their labels, especially given how bulky the current labels are on the y-axis.
Given that there’s ample room for a longer x-axis title, use of the abbreviation LDMC is unnecessary encoding. Using the whole term and providing the units would reduce the need for “eye-jumping” between the figure and caption.
The caption lacks quite a bit of context info (location, time, methodology, etc.) and also lacks a call to attention. NDVI is also mentioned in the caption but is not defined.
If we were keeping the original points, they could be larger to be highly visible, especially given that there isn't substantial overplotting occurring.
I didn't do it here, but I think turning the y-axis title horizontal and placing it beside or above the graph is always a valid design consideration. Here, the title felt brief enough to me that keeping it turned 90 degrees felt like a small burden to place on the reader. Notice, however, that I did use multiple lines for the x-axis title, allowing me to increase information density.
My revised figure caption (assumes this is a standalone graph and not a subpanel):
Figure XX. Slopes of Normalized Difference Vegetation Index (NDVI) over time (1985-2020; y-axis) for plant communities (points) on the island of Hirta, Scotland as a function of their diet quality for sheep (x-axis). Each slope is from a community-specific linear mixed-effect regression. Communities with higher leaf dry matter content (LDMC) exhibited significantly lower slopes (black best-fit line from a linear regression; gray region is the 95% confidence interval). Points have been modestly jittered to avoid overplotting. Community abbreviations are according to dominant taxa or ecosystem type: AF = Agrostis–Festuca–Molinia; CL = Cliff; DH = Dry heath; ER = Eriophorum; FR = Festuca rubra; HA = Holcus–Agrostis; HO = Holcus; HP = Holcus–Poa; LA = Lair; LU = Luzula; MO = Molinia; NR = Nardus–Racomitrium; PA = species-poor Agrostis–Festuca; PL = Plantago; RA = species-rich Agrostis–Festuca; RU = Rumex; SP = Sphagnum; and WH = Wet heath.