As many readers are likely aware, two weeks ago The Journal News, a newspaper just outside of New York City, published a map showing the names and addresses of handgun owners in Westchester and Rockland counties. The map, which was part of a story responding to the tragic shooting in nearby Newtown, Connecticut, was constructed with data the paper acquired by submitting Freedom of Information requests to both counties. Since its publication the story has generated enormous public interest, and the newspaper and its staff have received death threats and had their home addresses, along with details of where their kids attend school, published – all using publicly available data.
However, I don’t think this is a debate about open data. This is a debate about privacy and policy.
Let me clarify.
There is lots of information governments collect about people – the vast majority of which is not, and should not be, available. As both an open data advocate and a gov 2.0 advocate I’m strongly interested in ensuring that – around any given data set – people’s sense of privacy is preserved. There are of course interests that benefit from information being made inaccessible, just as there are interests that benefit from it being made accessible, but when it comes to individually identifying pieces of information, I prefer to be cautious.
So it is critical that this debate not get sloppy. It is not about open data. It is about personally identifiable data – and what governments should do with it. Obviously “open” and “personally identifiable” data can overlap, but they are not the same. A great deal of open data has nothing to do with individuals. However, if we allow the two to become synonymous… well… expect a backlash against open data. No one ever gave anyone a blank check to make anything and everything open. I don’t expect my personal healthcare or student record to be downloadable by anyone – I suspect you don’t either.
This is why – when I advise governments – I try to focus on data that is the least contentious (e.g. not even at risk of being personally identifying) since this gives public servants, politicians and the public some time to build knowledge and capacity around understanding the issues. And this is, in part, what I believe triggered gun owners.
Of course, while I believe in privacy, I do believe data should be made available in aggregate – I have no problem with researchers using large data sets to try to find out how age or other factors might affect a terrible medical condition, or looking at large aggregate student records to identify how graduation rates of risk groups might be improved. Nor is this to say that no personally identifiable data should be made available – the question is, to what end? And the question matters. I suspect privacy played a big part in the outcry and reaction to the Journal’s gun map. But I suspect that for many – particularly strong pro-gun advocates – there was a recognition that this data was being used as a device (of VERY unclear efficacy) to accelerate public support for stricter gun laws.
In the case of guns, I don’t know. But here is an example I feel more confident about. Personally I (and many others) believe business license data should be open, including personally identifiable data. But again, these are issues that need to be hammered out and debated, with the public given choices. This is not where the open data discussion needs to start, and this is certainly not how open data should be defined in the public mind, since far more of it lies in areas that are far, far less contentious. But we need to be building the capacity – in the public, among politicians and among public servants – to have these conversations, because disclosure, or the lack thereof, will increasingly be a political and policy choice.
In the meantime, if you are an open data advocate – please do not let people confuse Open Data with Personal Data. The two can and almost certainly will overlap at times. But that does not make them the same thing. If these two terms become synonymous in the public’s mind in ANY way, it could take years to recover. So educate yourself on privacy issues, and be sure to educate the people you work with. But above all, help them get ready for these debates. More are coming.
Some Additional Thoughts
Of course, when it comes to data, if you are really worried about personally identifiable stuff, there is a lot more to fear that isn’t maintained by governments. The world of purchasable data contains a lot of innocuous stuff (think maps, stock prices, etc…) but there is also plenty of overlap with personal data.
Indeed, much of the retaliatory data about the employees of The Journal News was data that was personally identifiable and readily available. A simple look at who I follow on Twitter would likely reveal a fair bit about my social graph to anyone. And this isn’t even the juicy stuff. One wonders how many people realize just how much about them can be purchased. Indeed, accessing some information has become so commonplace that people don’t even think about it: you know that pretty much anyone can get a copy of your credit score, right?
I, for one, recognize the difference between data the state forces you to disclose and that which you “voluntarily” submit and cede control over so as to take part in a service. I don’t always like the latter, but I recognize it is different from the government – it is one thing to have a monopoly on violence, it is another to have a terrible EULA. That said, I suspect that many people would be disturbed if they saw exactly how many people were tracking all of the things they do online. Mozilla’s Collusion project is a fun – if ultimately fruitless – tool for getting a sense of this. It is worth running for a day just to see who is watching what you watch and do online.
I share all this not because I want to scare anyone – indeed I suspect these additional notes are old hat to anyone still reading – but to recognize how complex the public’s relationship with data is. And as much as it will upset my privacy advocacy friends to hear me say this: my sense is that the public is actually still quite comfortable with vast amounts of data about them being collected (Facebook seems to be able to do whatever it wants with almost no impact on usage). Where people get finicky is around how that data gets used. Try to use it to take away their guns… and some of them will get very angry. It has all the makings of an epic public policy and corporate policy nightmare.