When Open Data is Not Enough


Open data portals around the world have made a great deal of data available to anybody with the skill to download a dataset. If you don’t have a particular analysis in mind, but want to develop an app to make the available data more useful, you are in luck. If one of the available datasets hits the nail on the head for a problem you want to work on, even better. However, the data that sits in open portals is there for a number of reasons, and Opendatahandbook.org lays out many good ones. Many datasets are the result of Freedom of Information Act (FOIA) requests that some government leaders see as an opportunity to put up additional data.

However, if it’s not in the open data portals, the data one might need to address a particular problem may not be readily available. For some of the most important problems that states and cities have, open data may not have the richness needed to identify the most important factors. Crime data is a good example of this. While the crime incident data that many jurisdictions are putting on their data portals tells us when, what and where a particular crime was reported, it does not tell us anything about the who: the perpetrator or the victim. In many cases, we don’t know who the perpetrator is until they are arrested, so that data may be missing anyway. However, we should have good data on the victim. Learning who is victimized, in which places and in what ways, is critical to addressing the crime problem. Having historical information on crime incidents at a particular place and knowing whether similar types of individuals (the elderly, youth) are being victimized could be valuable for better addressing crime.

Police departments do have this type of data, and some large city police departments make good use of it to fight crime (see CompStat). However, it’s not only police departments that can make good use of this data. Civic hackers and social or data scientists could be working with local community leaders, community and business groups, health and mental health professionals, schools and human service agencies to identify strategies to reduce crime and make our neighborhoods safer. Public safety is a collective effort. Therefore, putting more data into the hands of skilled individuals working with neighborhood stakeholders (and the police!) may allow us to make more progress.

Of course, politics and bureaucracy are always involved. The 2014 State CIO survey reported that 68% of states were characterized as “fairly protective and risk averse,” which makes this the norm. The same is likely true of government agencies at other levels. It’s normal to want to prevent others from using your data to innovate and potentially make you look bad. Also, much of this data is private, and the privacy of individuals needs to be protected.

There are ways, however, to address both the risk aversion and the privacy issues in order to get access to the data that is needed for a particular problem. Data can be de-identified or grouped in ways that protect against the disclosure of individual identities. Also, working with the data providers (in this case, the police) and not having the goal of “gotcha” is clearly in everybody’s interest. Trust between data providers and data receivers is very important. In a problem-solving strategy that involves a public-private partnership, running to the media every time you learn something is rarely the best move. That’s probably the quickest way to break the trust and lose access to the data.
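To make the idea of de-identifying and grouping a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names (`victim_name`, `victim_age`, `beat`, `offense`), the sample records, the age bands, and the minimum cell size of 3 are all invented and do not come from any real police dataset or disclosure standard.

```python
from collections import Counter

# Hypothetical incident records; names, fields, and values are invented for illustration.
incidents = [
    {"victim_name": "Person A", "victim_age": 71, "beat": "0111", "offense": "robbery"},
    {"victim_name": "Person B", "victim_age": 68, "beat": "0111", "offense": "robbery"},
    {"victim_name": "Person C", "victim_age": 80, "beat": "0111", "offense": "robbery"},
    {"victim_name": "Person D", "victim_age": 15, "beat": "0112", "offense": "assault"},
]


def age_group(age):
    """Coarsen an exact age into a broad band (band cutoffs are an assumption)."""
    if age < 18:
        return "youth"
    if age >= 65:
        return "elderly"
    return "adult"


def deidentify(record):
    """Drop the direct identifier (name) and replace exact age with an age band."""
    return {
        "beat": record["beat"],
        "offense": record["offense"],
        "victim_group": age_group(record["victim_age"]),
    }


def aggregate(records, min_cell_size=3):
    """Count incidents per (beat, offense, victim group).

    Counts below min_cell_size are replaced with None (suppressed) so that
    very small groups are not published. The threshold is an assumption.
    """
    counts = Counter(
        (r["beat"], r["offense"], r["victim_group"])
        for r in map(deidentify, records)
    )
    return {key: (n if n >= min_cell_size else None) for key, n in counts.items()}


table = aggregate(incidents)
for key, count in sorted(table.items()):
    print(key, "suppressed" if count is None else count)
```

The grouped table still shows that elderly residents in one area are repeatedly being robbed, which is the kind of pattern neighborhood stakeholders need, without naming anyone. A real release would need more care than this; for example, a suppressed cell can sometimes be recovered from published totals, so the data provider should be involved in deciding what is safe to share.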

Robert Goerge is part of the GovLoop Featured Blogger program, where we feature blog posts by government voices from all across the country (and world!).

