The Need for Data Journalists

We have discussed data visualization, data-driven infographics, and data storytelling. For the most part, we have highlighted how incorporating data into a piece of writing can form stronger connections with the reader when done properly. But just how prevalent is data visualization in the field of journalism? And what does it look like when data visualization is used to deceive rather than inform?

Perceptions of Data Journalism

Google conducted a two-phase study on the role of data visualization in journalism as of 2017 (Rogers). The first phase consisted of interviews with 56 journalists from the U.S., U.K., France, and Germany. The second phase was an online poll of over 900 media writers from the U.S. and the U.K. The researchers were interested in journalists’ perceptions of data visualization, including how often data is used in media writing and how important data representation is to writing stories. Responses from the interview phase indicated that data skills have become mainstream in the field of journalism. The online poll, in turn, indicated that journalists increasingly rely on data when writing stories in their everyday work. The poll found that 42% of reporters use data visualization at least twice per week and that 51% of news outlets in the U.S. and Europe have at least one data journalist on staff. Politics, finance, and investigative reporting were the primary subjects in which data journalism was used. Finally, 41% of respondents viewed data visualization as a “mainstream” skill that most journalists should have, while 53% viewed data journalism as a “specialized” skill that is challenging to learn.

It felt appropriate to include a study conducted by Google in this part of the e-text, considering just how important the company is for journalism in 2021. The results make clear how vital data visualization has become in an age where data is available around every corner. They also make clear that data visualization is not a skill held by a small minority of journalists. Although many still view data journalism as a “specialized” skill, the results indicate that almost half of journalists (42%) use data visualization at least twice per week (Rogers). Even in 2017, so-called “data journalism” was on the rise in major media companies. Yet the increasing use of data representation in journalism does not imply that all data representations are accurate. Although more and more journalists are incorporating data into their stories, there is no guarantee that what you are seeing is a valid interpretation of the numbers.

Within the Google survey study, “politics” was found to be the subject with the most data visualization in journalism coverage: journalists used data to enhance their political stories more than stories on any other topic in 2017. Just three years later, however, a new subject would take the lead in most media outlets: a novel viral outbreak that would quickly become politicized and ultimately give rise to a global health crisis (Miller and Jarvis). The spread of COVID-19 was understandably a hot topic throughout 2020 and into 2021. It was also the perfect breeding ground for inaccurate and outright deceptive data representations. Although the COVID-19 outbreak yielded some truly inspiring data visualizations that sparked substantial social change (see Data Storytelling), the pandemic also gave rise to horrendous data representations that likely caused more harm than good. Below are just a few examples of COVID-19-related data representations that mislead the viewer.

The Dark Side of Data Visualization

The first graphic above was created by the Georgia Department of Public Health during the early months of the pandemic (Brown). The bar chart was heavily criticized for the seemingly random ordering of dates on its x-axis. Notice that the dates do not appear in chronological order. For example, the graph lists April 28, April 27, and April 29 in that order (can you see the problem here?). Failing to order the dates correctly on the x-axis is what makes this graph so misleading. It makes it seem as though the number of COVID-19 cases, shown on the y-axis, decreases over time in a relatively steady fashion. In reality, the number of cases did not decrease in this way.
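To make the ordering problem concrete, here is a minimal sketch in Python. The case counts are made up for illustration (they are not the Georgia figures), and the helper names `chronological` and `is_chronological` are our own. The sketch shows how sorting bars by their height, rather than by their date, scrambles the time axis, and how a simple check can catch the problem before a chart is published.

```python
from datetime import date

# Hypothetical daily case counts (illustrative only, not the Georgia data).
daily_cases = {
    date(2020, 4, 28): 120,
    date(2020, 4, 27): 135,
    date(2020, 4, 29): 110,
    date(2020, 4, 26): 90,
}

def chronological(series):
    """Return (date, count) pairs ordered by date, as a time axis should be."""
    return sorted(series.items())

def is_chronological(dates):
    """Return True if each date is strictly later than the one before it."""
    return all(earlier < later for earlier, later in zip(dates, dates[1:]))

# Ordering by count (largest first) makes the bars slope downward,
# but the dates underneath them are no longer in order.
by_count = sorted(daily_cases.items(), key=lambda pair: pair[1], reverse=True)
by_date = chronological(daily_cases)

print("Ordered by count:", [d.isoformat() for d, _ in by_count])
print("Ordered by date: ", [d.isoformat() for d, _ in by_date])
print("Count ordering is chronological?",
      is_chronological([d for d, _ in by_count]))
```

With these made-up numbers, the count-ordered version places April 26 last even though it is the earliest date, which is exactly the kind of ordering that can suggest a decline that never happened.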

The second graphic, a screenshot from a Fox 31 news report (located in Denver, CO), demonstrates another way that slight changes to a graph can alter how we interpret data (Brown). In this case, it is important to analyze not the x-axis but the y-axis. Notice how the interval between tick marks varies as you ascend the axis. For example, the first two tick marks differ by 10, going from 50 to 60. Now look at the upper portion of the y-axis. The intervals no longer increase by 10; some increase by 50 cases. Manipulating the intervals of the y-axis in this way makes the tail end of the plot look like a less drastic increase than it actually was. In reality, the number of cases in Denver was rapidly increasing by the end of March.
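The same kind of check can be sketched for the y-axis. The tick values below are hypothetical (they only mimic the pattern described above, not the actual Fox 31 chart), and the function names are our own. The idea is simply to measure the gap between neighboring tick labels and confirm that every gap is the same size.

```python
def tick_steps(ticks):
    """Return the difference between each pair of neighboring tick labels."""
    return [upper - lower for lower, upper in zip(ticks, ticks[1:])]

def has_uniform_steps(ticks):
    """Return True if every gap between neighboring ticks is the same size."""
    return len(set(tick_steps(ticks))) <= 1

def even_ticks(lowest, highest, step):
    """Build evenly spaced integer ticks that cover lowest..highest.

    Assumes lowest <= highest and step > 0.
    """
    ticks = list(range(lowest, highest + 1, step))
    if ticks[-1] < highest:
        # Add one more tick so the largest value still fits on the axis.
        ticks.append(ticks[-1] + step)
    return ticks

# Hypothetical tick labels: gaps of 10 at the bottom, 30 and 50 at the top.
uneven = [50, 60, 70, 100, 150, 200]

print("Gaps between ticks:", tick_steps(uneven))
print("Evenly spaced?", has_uniform_steps(uneven))
print("Evenly spaced alternative:", even_ticks(50, 200, 50))
```

When the gaps are unequal, the same vertical distance on the chart stands for different numbers of cases, so a steep rise near the top of the axis is visually compressed.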

The above graphs demonstrate how slight tweaks to the way data is displayed can completely change how we interpret it.

Of course, inaccurate representations of data existed long before COVID-19. Yet the onset of the global health crisis, in combination with an already polarized political landscape in the U.S., made for a perfect storm of horrendous data representations within the realm of data journalism (Miller and Jarvis). The real lesson is that effective data representation has two fundamental components: presentation and accuracy. Methods like data storytelling and the use of infographics can strengthen the presentation quality of your data. However, a true understanding of the data is unattainable if the data is inaccurately represented. In short, avoid approaching the data as though you are trying to find the story within the numbers. Let the story find you first.

Writers are obligated both to make the data understandable to the viewer and to ensure that the data is represented in a statistically valid manner. In a sense, writers must develop both visual literacy and data literacy in order to incorporate data into their work successfully.
