Data Journalism: How to Unlock Value in Government

When you first think of journalism and governments, scandals and exposés come to mind. With increased access to data and tools that find connections, data journalism makes governments wary. Here’s why governments should tell their own stories.

The value of government data

A government is one of the largest repositories of data on its citizens, partners and other stakeholders including its own agencies. Yet we often do not recognise data as an asset. It is something:

  • to be kept in triplicate in those ever-present file cabinets.
  • that means scandals and resignations if it attracts too many questions or too much attention.
  • that, when reports are due, we turn into pie and bar charts and then describe with many words.

Due to the changing technology landscape, our citizens now expect us to engage online and on their terms. For example, governments’ use of social media is a direct response to how their citizens want to engage and access information.

Based on the number of stories other people tell with our data, it is obvious that government data is in high demand. Yet analyzing our own data is not standard practice for us, and when we do analyze it, we do not do it the way others do.

Government data can be used for:

  • creating economic opportunities
  • furthering the organization’s core mission
  • increasing accountability and responsiveness
  • improving public knowledge of the organization and its operations
  • responding to needs and demands as identified through public consultation

What is data journalism?

There is no standard definition for data journalism. It has been described as journalism using data. It has also been defined using the skills required for data journalism. Despite different points of view, the data journalism process begins either with:

  1. a question that needs data; or
  2. a dataset that needs questioning

The data journalism process


First, we need the data. This means the data that is in high demand, rather than the data in which we ourselves are interested. We can collect data ourselves, because governments love forms and surveys. However, because we don’t like to share our data, we can also:

  1. Convert documents into something that can be analyzed using tools like Cometdocs, Instant Data Scraper, and Tabula.
  2. Pull information from application programming interfaces (APIs).
  3. Search other government sites using advanced search techniques.

It is best to store the data in a tabular format as this is ideal for use in spreadsheets. This way, we strip away the unnecessary data like report headings and descriptions.
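
As a minimal sketch of that step, the Python snippet below takes a made-up API-style response and keeps only the records as tabular CSV text, leaving out the report title and description. The payload, its field names and the `to_csv` helper are assumptions for illustration, not any real government API.

```python
import csv
import io
import json

# Hypothetical API response; the structure and field names are invented.
payload = json.loads("""
{
  "report_title": "Road Licences Issued",
  "description": "Quarterly summary",
  "records": [
    {"district": "North", "quarter": "Q1", "licences": 120},
    {"district": "South", "quarter": "Q1", "licences": 95}
  ]
}
""")


def to_csv(records, fieldnames):
    """Write a list of dicts to CSV text, one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Keep only the records; the title and description stay out of the table.
csv_text = to_csv(payload["records"], ["district", "quarter", "licences"])
print(csv_text)
```

The resulting text can be saved with a .csv extension and opened in any spreadsheet.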

Currently, more governments are making their data accessible in open formats. While open government data is typically published for others to use, there is nothing stopping us from using it to tell our own stories.


Next, to ensure the quality of the data, it needs to be cleaned. We can do this by:

  • removing/fixing human error; this could include duplicate and empty entries, incorrect formatting, and corrupted entries
  • converting the data into a format that is consistent with other data that is being used; e.g. converting all datasets for use in spreadsheets
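
The cleaning steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the sample rows, the `district` and `licences` column names and the `clean_rows` helper are all hypothetical, and real datasets usually need rules of their own.

```python
def clean_rows(rows):
    """Trim whitespace, tidy district names, drop empty and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Fix formatting: strip stray spaces from every value.
        tidy = {key: (value or "").strip() for key, value in row.items()}
        # Make district names consistent, e.g. " north " and "NORTH" -> "North".
        tidy["district"] = tidy["district"].title()
        # Skip rows where every field is empty.
        if not any(tidy.values()):
            continue
        # Skip rows that duplicate an earlier row once tidied.
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned


raw = [
    {"district": " north ", "licences": "120"},
    {"district": "North", "licences": "120"},  # duplicate once tidied
    {"district": "", "licences": ""},          # empty entry
    {"district": "SOUTH", "licences": " 95"},
]
cleaned = clean_rows(raw)
print(cleaned)
```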

In addition, not all data we collect is meant to be shared. An example is personally identifiable information (PII) in medical records. Therefore, it is recommended that the data be anonymized. Even so, anonymized datasets can sometimes still be used to reveal personal details, so we must ensure that we don’t inadvertently reveal private details about individuals or organizations without good cause.
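
One way to picture this is the rough Python sketch below. It drops direct identifiers, coarsens ages into ten-year bands, and then flags any combination of remaining details shared by very few people, since those records may still point to an individual. The column names, sample records and threshold are assumptions for illustration, and a check like this is a starting point rather than a guarantee of anonymity.

```python
from collections import Counter

# Hypothetical column names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "national_id"}


def anonymize(records):
    """Drop direct identifiers and coarsen age into ten-year bands."""
    result = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        decade = (kept.pop("age") // 10) * 10
        kept["age_band"] = f"{decade}-{decade + 9}"
        result.append(kept)
    return result


def small_groups(records, keys, minimum):
    """Return combinations of `keys` shared by fewer than `minimum` records."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < minimum}


patients = [
    {"name": "A. Person", "national_id": "001", "age": 34, "district": "North"},
    {"name": "B. Person", "national_id": "002", "age": 37, "district": "North"},
    {"name": "C. Person", "national_id": "003", "age": 81, "district": "South"},
]
shared = anonymize(patients)
# Groups of one are risky: a single person in their eighties in one district
# may be recognisable even without a name.
risky = small_groups(shared, ["age_band", "district"], minimum=2)
print(risky)
```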

Putting it in context

Furthermore, we must keep in mind that we cannot always trust data. This is because it comes with its own histories, biases and objectives. Therefore, before using the data we should ask:

  • Who gathered it? This may speak to who owns data or is responsible for maintaining it.
  • When was it gathered, and for what purpose? This speaks to the relevance, timeliness and intended use of the data.
  • How was it gathered? This may reference the usefulness and quality of the data.
  • Who can explain the data? This is to ensure we understand the jargon, as we use a large number of acronyms and initialisms. It also helps us stay factual, rather than relying on our own subjective interpretation.

Ultimately, when considering telling a story with data, it should answer the question “So what?” For example, from some data, we know that 1000 persons died in road accidents last year. So what? If the population is 5 million, there may not be much interest in that story. The story changes if say, 100 of those deaths happened in a particular neighborhood, with a smaller population of 1500. What if more than half of those deaths were children or the elderly?
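
To make the comparison concrete, here is a small Python sketch that turns the figures above into deaths per 100,000 residents, a common way of comparing areas of different sizes. The `deaths_per_100k` helper is a name made up for this illustration.

```python
def deaths_per_100k(deaths, population):
    """Express a count as a rate per 100,000 residents."""
    return deaths * 100_000 / population


national = deaths_per_100k(1000, 5_000_000)
neighborhood = deaths_per_100k(100, 1500)

print(f"National rate: {national:.1f} per 100,000")
print(f"Neighborhood rate: {neighborhood:.1f} per 100,000")
print(f"The neighborhood rate is about {neighborhood / national:.0f} times the national rate")
```

Put this way, the national figure works out to 20 deaths per 100,000 people, while the neighborhood figure is several hundred times higher. That difference is the story.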


We already know that a single data source can make a good story. We normally use one source when creating graphs for reports. And it is usually a pie or bar chart. So what if, for example, we can show on a map how the national education budget is distributed instead of on a pie chart?

We get even better stories when we can make linkages between data sources and show the impact. Ultimately, this can lead to deeper analysis for better planning and decision making.


Finally, we publish the story. With data journalism tools we have powerful new ways of telling the stories we find in data, through charts, maps and visualizations. When combined with the new narrative, they can generate new knowledge, better engage our stakeholders and improve decision making in government.

A major challenge for governments is overcoming the general distrust and dissatisfaction of their citizens. It is important to note that the modern tools used to analyze data will reveal the good, the bad and the ugly. Therefore, as in traditional journalism, we must stick to the facts and the truth. Besides, we can be more accountable and transparent when we ask questions of ourselves before others do.

Some practical uses of data journalism in government

  • Aiding investigations: This would provide more effective analysis for auditors, law enforcement, case and social workers, etc.
  • Enriching reports: We can create interactive visualizations that can be accessed from many devices. Imagine presenting a budget where all stakeholders can visualize the answers to their own questions from the same dataset. What if you were able to accurately benchmark your organization’s progress toward national goals against that of others?
  • Explaining data: A picture is worth a thousand words. What about a visualization? When combined with story-telling, visualizations improve how we engage our citizens and other stakeholders.

What are some of the stories you could tell with the data you collect and make publicly available? I’d be interested to hear your thoughts and ideas in the comments below.

Erica Harris is part of the GovLoop Featured Contributor program, where we feature articles by government voices from all across the country (and world!).

Nazli Cem

It’s amazing how a simple data analysis can explain phenomena and even inform possible solutions to problems. My personal favorite when it comes to data journalism is Mona Chalabi, who visualizes the data in a very clever and playful way, thus engaging the audience in a great way as opposed to just throwing numbers at them that don’t mean much. Thank you so much for sharing, I find this very valuable.

Erica P. Harris

Thank you for the encouragement. I took a peek at some of Mona Chalabi’s visualisations and have added her to my Pocket list. *thumbup*