Notched Box Plots
Last update April 28, 2025
Notched box plots (aka notched box-and-whisker plots) are a very useful way to present and view the data of a sample group or groups. They are a modification of box plots, aka box-and-whisker plots. The range-bar method was first introduced by Mary Eleanor Spear in her book "Charting Statistics" in 1952 (Spear 1952) and later in the book "Practical Charting Techniques" in 1969 (Spear 1969). The box-and-whisker plot was introduced in 1970 by John Tukey, who later published on the subject in his book "Exploratory Data Analysis" in 1977 (Wickham and Stryjewski 2011). The notch was introduced by Robert McGill, John Tukey, and William A. Larsen in their 1978 paper, "Variations of Box Plots" (McGill et al. 1978).
The whiskers aka Tukey's fences
Upper - Maximum value excluding outliers with outlier defined as a value greater than 3rd Quartile + (1.5 x IQR)
Lower - Minimum value excluding outliers with outlier defined as a value less than 1st Quartile - (1.5 x IQR).
Possible Outliers - Data that are greater than the 3rd quartile plus 1.5 x the interquartile range or less than the 1st quartile minus 1.5 x the interquartile range. This is the Tukey fence method. For normally distributed data the fences sit approximately 2.7 standard deviations from the mean and should capture about 99.3% of the data. Note, this assumes a normal distribution. For more information on outliers see Outliers - The Problem Children of Data.
The Line - Is the median of the data.
The Notch - Displays a confidence interval around the median, which is normally based on the median +/- 1.57 x IQR / sqrt(n), where n is the number of data points. According to Graphical Methods for Data Analysis (Chambers et al., 1983), although this is not a formal test, if two boxes' notches do not overlap there is 'strong evidence' (95% confidence) that their medians differ. Note: Here are the pages from the book.
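To make the pieces above concrete, here is a minimal sketch in Python (standard library only) that computes them for one sample. The function name box_stats and the sample values are made up for illustration. The quartile method ("inclusive") is an assumption, since different software packages use slightly different quartile conventions and may give slightly different boxes.

```python
import math
import statistics


def box_stats(values):
    """Return the parts of a notched box plot for one sample."""
    data = sorted(values)
    n = len(data)
    # Quartiles: "inclusive" is one of several conventions (an assumption here).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Tukey's fences: anything outside them is flagged as a possible outlier.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Notch: median +/- 1.57 x IQR / sqrt(n).
    half_notch = 1.57 * iqr / math.sqrt(n)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": min(inside),  # smallest value that is not an outlier
        "upper_whisker": max(inside),  # largest value that is not an outlier
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
        "notch": (median - half_notch, median + half_notch),
    }


# Made-up sample with one obviously high value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
stats = box_stats(sample)
print(stats)
```

With this sample the value 30 lands above the upper fence, so the upper whisker stops at 8 and 30 is reported as a possible outlier.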
The graph to the right shows a box plot in relationship to a normal distribution.
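The relationship between the fences and a normal distribution can be checked with a few lines of Python. This is a sketch that assumes a perfectly normal population (mean 0, standard deviation 1); real data will rarely match it exactly.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Quartiles of a standard normal distribution, in standard deviations.
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)
iqr = q3 - q1

# Tukey's fences for that distribution.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Fraction of a normal population that falls between the fences.
coverage = std_normal.cdf(upper_fence) - std_normal.cdf(lower_fence)

print(f"fences at +/- {upper_fence:.3f} standard deviations")
print(f"fraction of data inside the fences: {coverage:.4f}")
```

The fences come out at roughly 2.7 standard deviations from the mean, with a little over 99% of a normal population inside them.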
How to Read / Interpret Notched Box Plots
One important use of notched box plots is to compare groups of data. In the example below we have some groundwater data for barium. MW01 is the background well and MW02, MW03, and MW04 are the downgradient aka compliance wells. We can gather the following information from the graph.
Populations:
MW01 and MW02 have IQRs that are similar in range, and their notches overlap. This indicates they may be from similar populations.
The box for MW03 does not overlap at all with the box for MW01; its IQR lies entirely above the IQR for MW01. This indicates they are different populations and MW03 is higher.
MW04 has an IQR that starts near MW01 but extends well above MW01. Also, the notch for MW04 does not overlap the notch for MW01 and is higher. This indicates the MW04 median is different (higher) than the MW01 median with a confidence of 95%.
Based on this:
MW01 and MW02 are most likely from similar populations and
MW03 and MW04 are from populations greater than MW01.
Outliers - We see a possible outlier in MW02 at 0.15 mg/L.
Variance - For MW01 and MW02 the max/min and the IQR are similar, suggesting they have similar variances.
Skewness - MW01, MW02, and MW04 have medians near the center of the IQR and the max/min. MW03 has a median much closer to the 25th percentile than the 75th percentile. This indicates that MW03 is skewed: most of its values sit on the low side, with a longer tail toward higher values.
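The notch and skewness checks described above can also be done numerically. Below is a minimal Python sketch. The data are made-up values, NOT the barium results from the graph, and the function names are hypothetical. The skewness measure is the quartile-based (Bowley) coefficient, which is my choice for illustration; it is positive when the median sits closer to the 25th percentile. As with the graph, non-overlapping notches are evidence, not a formal test.

```python
import math
import statistics


def quartiles(values):
    # "inclusive" is one of several quartile conventions (an assumption here).
    return statistics.quantiles(values, n=4, method="inclusive")


def notch_interval(values):
    """Median +/- 1.57 x IQR / sqrt(n)."""
    q1, median, q3 = quartiles(values)
    half_width = 1.57 * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median + half_width


def notches_overlap(a, b):
    """True if the notch intervals of the two samples overlap."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


def quartile_skew(values):
    """Bowley skewness: > 0 when the median is closer to Q1 than to Q3."""
    q1, median, q3 = quartiles(values)
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Made-up concentrations in mg/L for three hypothetical wells.
background = [0.05, 0.06, 0.06, 0.07, 0.07, 0.08, 0.08, 0.09, 0.10]
similar = [0.05, 0.06, 0.07, 0.07, 0.08, 0.08, 0.09, 0.09, 0.10]
elevated = [0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22]
# Made-up sample whose median sits close to the 25th percentile.
lopsided = [1, 2, 2, 3, 3, 4, 8, 12, 20]

print("background vs similar overlap: ", notches_overlap(background, similar))
print("background vs elevated overlap:", notches_overlap(background, elevated))
print("skew of lopsided sample:", round(quartile_skew(lopsided), 2))
```

In this sketch the first pair of notches overlaps and the second does not, which mirrors the kind of reading done by eye on the graph.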
One thing that can provide even more information in the graph is the addition of violin plots on top of the notched box plots. A violin plot adds a kernel density plot, which gives a graphical representation of the distribution of the data. Now what can we see:
Wells MW01 and MW02 appear to have similar distributions, and MW02 has one point that appears to be an outlier.
MW03 appears to have a bimodal distribution. This can be caused by a population that changes with the seasons.
MW04 has a wide distribution.
How Can I Make Notched Box Plots?
R and RStudio - Most of my examples were created in R, which is free but has a steep learning curve. RStudio is a user interface that makes R a little more user friendly. Here is an excellent video on doing notched box and violin plots in R.
ProUCL is a free statistical package by the U.S. EPA which can easily generate notched box plots.
Online Generators - Just search for them or here is one by Free Statistics Software
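Python is not one of the tools listed above, but for readers who already use it, here is a hedged sketch of the same kind of graph with the matplotlib library (which must be installed separately). The well names and concentrations are made up, and the function name plot_notched_boxes and the output file name are hypothetical.

```python
# Made-up concentrations (mg/L) for hypothetical wells; not real monitoring data.
wells = {
    "Well A": [0.05, 0.06, 0.06, 0.07, 0.07, 0.08, 0.08, 0.09, 0.10],
    "Well B": [0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22],
}


def plot_notched_boxes(groups, path="notched_boxplot.png"):
    """Draw violin outlines with notched box plots on top and save to a file."""
    # Imported here so the rest of the script loads even without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display window needed
    import matplotlib.pyplot as plt

    names = list(groups)
    data = [groups[name] for name in names]

    fig, ax = plt.subplots()
    ax.violinplot(data, showextrema=False)    # kernel density outline
    ax.boxplot(data, notch=True, widths=0.2)  # notched boxes at the same positions
    ax.set_xticks(range(1, len(names) + 1))
    ax.set_xticklabels(names)
    ax.set_ylabel("Concentration (mg/L)")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling plot_notched_boxes(wells) writes the image file and returns its path. With small samples the notch can extend past the edge of the box, which looks odd but is expected.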
References:
Chambers, John M., William S. Cleveland, Beat Kleiner, and Paul A. Tukey. "Graphical Methods for Data Analysis", 62. Belmont, California: Wadsworth International Group, 1983. ISBN 0-87150-413-8, International ISBN 0-534-98052-X
McGill, R., Tukey, J. W., & Larsen, W. A. (1978). Variations of Box Plots. The American Statistician, 32(1), 12–16. https://doi.org/10.1080/00031305.1978.10479236
Spear, Mary Eleanor (1952). Charting Statistics. McGraw Hill. p. 166.
Spear, Mary Eleanor. (1969). Practical charting techniques. New York: McGraw-Hill. ISBN 0070600104. OCLC 924909765
Wickham, Hadley; Stryjewski, Lisa. "40 years of boxplots" (PDF) 2011. Retrieved April 26, 2025.