An application-based guide to choosing visual components for plots and charts

Bar chart:

Bar charts are useful for comparing data across categories or over time, and for comparing a part of a dataset to the whole. There are two basic types of bar chart:

  1. Vertical (column chart): suitable for chronological data such as time series, and for data with negative values, which are drawn below the x-axis.

  2. Horizontal: more convenient for datasets with long category labels.

There are two additional types of bar chart:

  1. Stacked bar chart: best used to compare multiple parts of the dataset to the whole. It works for continuous or discrete data and can be combined with either the vertical or the horizontal type.

  2. 100% stacked bar chart: useful when the percentage distribution of the subcategories is the main information to communicate to users.
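
As a minimal sketch of the data preparation behind a 100% stacked bar chart, the Python snippet below converts the raw subcategory values of each bar into percentage shares. The function name `to_percent_shares` and the sample figures are hypothetical and used only for illustration. The resulting shares could be passed to any plotting library.

```python
def to_percent_shares(values):
    """Convert the raw subcategory values of one bar into percentage shares."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize a bar whose values sum to zero")
    return [100.0 * v / total for v in values]


# Hypothetical sample data: units sold per product line in two quarters.
raw = {
    "Q1": [30, 50, 20],
    "Q2": [10, 30, 60],
}

# Each bar now sums to 100, so only the distribution is compared.
shares = {bar: to_percent_shares(vals) for bar, vals in raw.items()}
```

Normalizing each bar separately discards the absolute totals, which is why this type suits cases where only the distribution matters.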

There are several best practices for choosing the visual components of a bar chart:

  • Use horizontal labels: horizontal labels are much easier to read than vertical ones.

  • Space between bars: a gap of about half a bar width between bars usually gives the best visual result.

  • Start the y-axis value at zero: treat this as a rule of thumb. For more skewed datasets, it might be more convenient to start the y-axis from a non-zero value, bearing in mind that a truncated axis exaggerates the differences between bars. Overall, the choice depends on the dataset at hand and its numerical values.

  • Use consistent colors: use one consistent color for all the bars. If you want to emphasize a certain data point, change the color of that particular bar only.

  • Order data appropriately: a bar chart is easier to follow and more visually pleasing if the bars are ordered sequentially for numerical variables or sorted alphabetically for categorical variables.
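
The spacing, color, and ordering guidelines above can be sketched in a few lines of Python. This is an illustrative helper under stated assumptions, not a definitive implementation: the function name `bar_layout`, its parameters, and the color names are hypothetical placeholders, and the output is a plain list of tuples that any plotting library could consume.

```python
def bar_layout(categories, values, highlight=None, bar_width=1.0,
               base_color="steelblue", highlight_color="darkorange"):
    """Return (label, value, x_center, color) tuples for a bar chart.

    Bars are sorted alphabetically by label, separated by a gap of half a
    bar width, and share one color except for the highlighted bar.
    """
    gap = bar_width / 2.0
    step = bar_width + gap  # distance between centers of adjacent bars
    ordered = sorted(zip(categories, values), key=lambda pair: pair[0])
    layout = []
    for i, (label, value) in enumerate(ordered):
        color = highlight_color if label == highlight else base_color
        layout.append((label, value, i * step, color))
    return layout


# Hypothetical sample data; "kiwis" is the bar to emphasize.
bars = bar_layout(["pears", "apples", "kiwis"], [4, 9, 6], highlight="kiwis")
```

For a numerical variable, the sort key would be the value itself instead of the label.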

Pie chart:

Pie charts are primarily designed for part-to-whole comparison of discrete or continuous data. They are more informative when the number of categories is kept small. Also, according to some critics, pie charts are only useful when the data is expressed as percentages, so that each slice takes a consistent share of the circle's angle; otherwise, comparing non-normalized data with a pie chart might be misleading. There are two common types of pie chart:

  1. Standard pie chart: used primarily to show part-to-whole relationships in the dataset.

  2. Donut pie chart: a more stylistic and visually appealing version of the standard pie chart that lets users place numerical values on the categories or build the design around the center.

There are several best practices for creating pie charts and choosing their visual components:

  • Keep the number of categories per pie chart below 5: a small number of categories allows a more careful comparison. Where possible, combine the smaller groups into a “miscellaneous” group to reduce the number of categories.

  • Don't use multiple pie charts for comparison: if you need to compare several pie charts, it is better to use a stacked bar chart, because comparing pie charts might be difficult or even misleading.

  • Make sure all the data points in a pie chart add up to 100%: as noted above, using pie charts for non-normalized data might be misleading, so make sure all the pieces of a pie chart add up to 100%.

  • Sort pie chart pieces correctly: ordering the pieces in ascending or descending order might help users get better insight from your visualization.
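
The pie chart guidelines above (few categories, a merged “miscellaneous” group, slices summing to 100%, sorted order) can be combined into one preparation step. The sketch below is only one possible way to do it: the function name `prepare_pie_slices`, its parameters, and the sample data are hypothetical, and placing the merged slice last is an assumption rather than something the guidelines prescribe.

```python
def prepare_pie_slices(data, max_slices=4, other_label="miscellaneous"):
    """Sort, merge small categories, and normalize to percentages.

    data maps a category label to a non-negative raw value. The largest
    (max_slices - 1) categories are kept and the rest are merged into one
    slice, so the result has at most max_slices slices summing to 100%.
    """
    total = sum(data.values())
    if total <= 0:
        raise ValueError("pie chart data must have a positive total")
    # Descending order, largest slice first.
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    if len(ordered) > max_slices:
        kept = ordered[:max_slices - 1]
        rest = ordered[max_slices - 1:]
        kept.append((other_label, sum(value for _, value in rest)))
    else:
        kept = ordered
    return [(label, 100.0 * value / total) for label, value in kept]


# Hypothetical sample data with six categories.
slices = prepare_pie_slices(
    {"A": 40, "B": 25, "C": 15, "D": 10, "E": 6, "F": 4}
)
```

The default of 4 slices reflects the "below 5" guideline; the merged slice can end up larger than the smallest kept slice, as it does with this sample data.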

Line chart:

Generally, line charts are suitable for visualizing continuous data and showing its trend, acceleration, deceleration, and volatility.

Several visual components help make a line chart informative and compelling:

  • Include a zero baseline if possible: this depends heavily on your dataset, but where possible a zero baseline makes comparison easier. Sometimes a zero baseline hides small fluctuations in the data; in that case you might omit it for the sake of clarity and to show more detail about the dynamics of your data points.

  • Don't plot more than 4 lines: lines are easier to compare when there are 4 or fewer of them. If you want to compare more lines, it is better to split them across separate line charts.

  • Use solid lines only: dashed or dotted lines might be misleading, so use solid lines wherever possible.

  • Label the lines directly: attaching labels to the lines helps users identify the different categories more easily.

  • Use the right height: choosing correct ranges for the x and y axes is important. As a rule of thumb, choosing the ranges so that the plotted data takes up about two-thirds of the plot area gives a more appealing line chart.
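
As a rough illustration of the two-thirds rule of thumb, the Python sketch below computes y-axis limits so that the data spans a chosen fraction of the axis height. The function name `two_thirds_ylim` is hypothetical, and centering the data vertically is an assumption made here; it deliberately ignores the zero baseline, so it applies to the case where the baseline is omitted.

```python
def two_thirds_ylim(values, fill_fraction=2.0 / 3.0):
    """Return (y_min, y_max) so the data spans fill_fraction of the axis.

    The data range is centered vertically; the remaining height is split
    evenly as padding above and below the line.
    """
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        raise ValueError("all values are identical; choose limits manually")
    axis_span = span / fill_fraction
    padding = (axis_span - span) / 2.0
    return low - padding, high + padding


# Hypothetical sample series with a data range of 12 (from 10 to 22).
y_min, y_max = two_thirds_ylim([10, 14, 12, 18, 22])
```

With this sample series the axis spans 18 units, of which the data occupies 12.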

Area chart:

Area charts, like line charts, are useful for showing time-series relationships. In addition to the dynamics of the data points, they show the volume, or area under the line, for better comparison. There are three types of area chart:

  1. Area chart: useful for showing or comparing a progression over time for continuous data.

  2. Stacked area: useful for showing a part-to-whole relationship as well as a progression over time.

  3. 100% stacked area: useful for showing the distribution of categories in a part-to-whole relationship.

Some points for a compelling area chart visualization:

  • Arrange stacked area charts appropriately: place the most variable series at the top of the stack and the least variable one at the bottom for a more appealing and readable visualization.

  • Start the y-axis value at zero: the filled area represents volume, so start the y-axis at zero and choose an upper limit that makes sure all the details are captured.

  • Don't display more than 4 categories: keep the number of categories in an area chart to 4 or fewer so that the visualization is easy to understand.

  • Use transparent colors: to make sure that details of one area are not masked by other areas, make some of the colors transparent so that the whole chart stays visible.

  • Don't use area charts for discrete data: area charts are designed for visualizing continuous data.
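
The stacking-order guideline above can be sketched as a small sorting step. The snippet below is an illustration under one assumption: it measures "variability" by the population standard deviation, which the guideline does not specify. The function name `order_stack_by_variability` and the sample data are hypothetical.

```python
from statistics import pstdev


def order_stack_by_variability(series):
    """Return series labels ordered bottom-to-top for a stacked area chart.

    series maps a label to its list of values over time. The least variable
    series (smallest population standard deviation) goes at the bottom.
    """
    return sorted(series, key=lambda label: pstdev(series[label]))


# Hypothetical sample data: visits per traffic source over four periods.
traffic = {
    "search": [50, 80, 40, 90],
    "direct": [30, 31, 29, 30],
    "email": [10, 20, 15, 25],
}
stack_order = order_stack_by_variability(traffic)
```

Putting the steadiest series at the bottom gives the layers above it a nearly flat base, so their own fluctuations are easier to read.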

Scatter plot:

Scatter plots are suitable for showing large amounts of data and visualizing the correlation between two variables.

Some practical points for scatter plots:

  • Start the y-axis at zero or use an appropriate range: choose appropriate ranges for the x and y axes to make sure all the details in the visualization are captured correctly.

  • Include more variables: use marker size and color to encode additional information about your dataset.

  • Use trend lines: a trend line helps reveal the correlation between the plotted variables.

  • Don't compare more than 2 trend lines: more than 2 trend lines make comparison and finding the correlation more difficult or even misleading.
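
A straight trend line can be computed with ordinary least squares. The sketch below assumes a linear trend, which is only one of several possible trend models and is not prescribed by the points above; the function name `linear_trend` and the sample points are hypothetical.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = linear_trend(xs, ys)
```

The fitted line is drawn by evaluating `slope * x + intercept` at the smallest and largest x values of the scatter plot.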

Bubble chart:

Bubble charts are suitable for visualizing nominal comparisons and ranking relationships. There are two types of bubble chart:

  1. Bubble plot: can be regarded as a variant of the scatter plot that displays an additional variable.

  2. Bubble map: best suited to visualizing variables on geographical maps.

Practical points related to bubble charts:

  • Make sure labels are visible: visible labels help users identify groups or categories more conveniently.

  • Size bubbles appropriately: size and color can be used to encode additional variables in a bubble chart, so choosing appropriate sizes and colors helps produce a more compelling visualization.

  • Don't use odd shapes: adding too much detail or using non-circular shapes might be misleading.
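
One common reading of "appropriate" bubble sizing, which is an assumption here and not stated in the points above, is to make the bubble area (not the radius) proportional to the encoded value, so that large values are not visually exaggerated. The Python sketch below uses a hypothetical function name `bubble_radii` and hypothetical sample values.

```python
import math


def bubble_radii(values, max_radius=30.0):
    """Scale values to radii so that bubble AREA is proportional to value.

    Area grows with the square of the radius, so the radius is scaled by
    the square root of each value relative to the largest value.
    """
    largest = max(values)
    if largest <= 0 or min(values) < 0:
        raise ValueError("values must be non-negative with a positive maximum")
    return [max_radius * math.sqrt(v / largest) for v in values]


# Hypothetical sample values: each one is 4 times the previous one.
radii = bubble_radii([25, 100, 400])
```

With this scaling a value 4 times larger gets a radius only 2 times larger, which keeps the areas in the same ratio as the values.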

Choropleth map:

Choropleth maps are useful for visualizing multi-dimensional datasets or encoding data on geographical maps.

Practical points for choropleth maps:

  • Use a simple map outline: the outline of a choropleth or geographical map should be visually appealing and not distracting.

  • Select colors appropriately: colors that show trends intuitively, such as red for high values and blue for low values, make for a better visualization. Odd colormaps might undermine the message of the visualization and mislead the user.

  • Use patterns sparingly: use patterns to encode a second variable in your choropleth map, but using more than 2 patterns might be distracting.

  • Choose an appropriate data range: finding the minimum and maximum of the data, and choosing between a linear and a logarithmic distribution of values, makes the visualization more informative.
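
To illustrate the choice between a linear and a logarithmic data range, the sketch below computes class break points between the minimum and maximum of a dataset on either scale. It assumes equal-width classes, which is only one of several classification schemes; the function name `class_breaks`, its parameters, and the sample values are hypothetical.

```python
import math


def class_breaks(values, n_classes=5, scale="linear"):
    """Return n_classes + 1 break points spanning min(values)..max(values)."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    low, high = min(values), max(values)
    if scale == "linear":
        step = (high - low) / n_classes
        return [low + i * step for i in range(n_classes + 1)]
    if scale == "log":
        if low <= 0:
            raise ValueError("a logarithmic scale needs strictly positive values")
        # Equal-width classes in log10 space, converted back to data units.
        log_low, log_high = math.log10(low), math.log10(high)
        step = (log_high - log_low) / n_classes
        return [10 ** (log_low + i * step) for i in range(n_classes + 1)]
    raise ValueError("scale must be 'linear' or 'log'")


# Hypothetical sample values spanning four orders of magnitude.
population = [1, 10, 100, 1000, 10000]
linear = class_breaks(population, n_classes=4, scale="linear")
logarithmic = class_breaks(population, n_classes=4, scale="log")
```

For data like this sample, linear breaks place four of the five values in the lowest class, while logarithmic breaks spread them across the classes.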

The next section provides some general practical tips on best practices for data visualization.