The Value of Open Data – Don’t Measure Growth, Measure Destruction

Alexander Howard – who, in my mind, is the best guy covering the Gov 2.0 space – pinged me the other night to ask “What’s the best evidence of open data leading to economic outcomes that you’ve seen?”

I’d like to hack the question because – I suspect – many people will be looking to measure “economic outcomes” in ways that I think are too narrow to be helpful. For example, if you are wondering what big companies are going to come out of the open data movement and/or what big savings governments are going to find by sifting through the data, I think you are probably looking at the wrong indicators.

Why? Part of it is because the number of “big” examples is going to be small.

It’s not that I think there won’t be any. For example, several years ago I blogged about how FOIed (or, in Canada, ATIPed) data that should have been open helped find $3.2B in evaded tax revenues channeled through illegal charities. It’s just that this is probably not where the wins will initially take place.

This is in part because most data for which there was likely to be an obvious and large economic impact (e.g. spawning a big company or saving a government millions) will have already been analyzed or sold by governments before the open data movement came along. On the analysis side of the question – if you were very confident a data set could yield tens or hundreds of millions in savings… well… you were probably willing to pay SAS or some other analytics firm 30-100K to analyze it. And you were probably willing to pay SAP a couple of million (a year?) to set up the infrastructure just to gather the data.

Meanwhile, on the “private sector company” side of the equation – if that data had value, there were probably eager buyers. In Canada, for example, census data – which helps with planning where to locate stores or how to engage in marketing and advertising effectively – was sold because the private sector made it clear it was willing to pay to gain access to it. (Sadly, this was bad news for academics, non-profits and everybody else, for whom it should have been free, as it was in the US.)

So my point is that a great deal of the (again) obvious low-hanging fruit has probably been picked long before the open data movement showed up, because governments – or companies – were willing to invest some modest amounts to create the benefits that picking that fruit would yield.

This is not to say I don’t think there are diamonds in the rough out there – data sets that will reveal significant savings – but I doubt they will be obvious or easy finds. Nor do I think that billion dollar companies are going to spring up around open datasets overnight since – by definition – open data presents low barriers to entry for any company that adds value to it. One should remember it took Red Hat two decades to become a billion dollar company. Impressive, but it is still tiny compared to many of its rivals.

And that is my main point.

The real impact of open data will likely not be in the economic wealth it generates, but rather in its destructive power: in the value it destroys, and so in the capital it frees up to do other things. Much like Red Hat is a fraction of the size of Microsoft, open data is going to enable new players to disrupt established data players.

What do I mean by this?

Take SeeClickFix. Here is a company that, leveraging the Open311 standard, is able to provide many cities with a 311 solution that works pretty much out of the box. 20 years ago, this was a $10 million+ problem for a major city to solve, and wasn’t even something a small city could consider adopting – it was just prohibitively expensive. Today, SeeClickFix takes what was a 7 or 8 digit problem, and makes it a 5 or 6 digit problem. Indeed, I suspect SeeClickFix almost works better in a small to mid-sized government that doesn’t have complex work order software and so can just use SeeClickFix as a general solution. For this part of the market, it has crushed the cost out of implementing a solution.

Another example. And the one I’m most excited about. Look at CKAN and Socrata. Most people believe these are open data portal solutions. That is a mistake. These are data management companies that happen to have made “sharing” (or “open”) a core design feature. You know who does data management? SAP. What Socrata and CKAN offer is a way to store, access, share and engage with data previously gathered and held by companies like SAP, at a fraction of the cost. A SAP implementation is a 7 or 8 (or god forbid, 9) digit problem. And many city IT managers complain that doing anything with data stored in SAP takes time and it takes money. CKAN and Socrata may have only a fraction of the features, but they are dead simple to use, and make it dead simple to extract and share data. More importantly, they can turn these costly 7 and 8 digit problems into cheap 5 or 6 digit problems.

On the analysis side, again, I do hope there will be big wins – but what I really think open data is going to do is lower the costs of creating lots of small wins – crazy numbers of tiny efficiencies. If SAP and SAS were about solving the 5 problems that could create 10s of millions in operational savings for governments and companies, then Socrata, CKAN and the open data movement are about finding the 1000 problems for which you can save between $20,000 and $1M. For example, when you look at the work that Michael Flowers is doing in NYC, his analytics team is going to transform New York City’s budget. They aren’t finding $30 million in operational savings, but they are generating a steady stream of very solid 6 to low 7 digit savings, project after project. (This is to say nothing of the lives they help save with their work on ambulances and fire safety inspections.) Cumulatively over time, these savings are going to add up to a lot. But there probably isn’t going to be a big bang. Rather, we are getting into the long tail of savings. Lots and lots of small stuff… that is going to add up to a very big number, while no one is looking.
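
For anyone who wants to see the long tail arithmetic spelled out, here is a rough back-of-the-envelope sketch. The counts (5 big problems, 1000 small ones) and the $20,000 to $1M range are the figures from the paragraph above; reading “10s of millions” as somewhere between $10M and $90M per problem is purely my assumption for illustration, as is the helper name `total_range`. These are bounds, not a forecast.

```python
# Back-of-the-envelope comparison: a few big wins vs. a long tail of small wins.
# The counts and the $20,000-$1M range come from the post; treating
# "10s of millions" as $10M-$90M per problem is an illustrative assumption.

def total_range(count, low, high):
    """Return the (minimum, maximum) cumulative savings across `count` projects."""
    return count * low, count * high

# 5 big problems, each worth "10s of millions" (assumed $10M-$90M)
big_wins = total_range(5, 10_000_000, 90_000_000)

# 1000 small problems, each worth between $20,000 and $1M
long_tail = total_range(1000, 20_000, 1_000_000)

print(f"Big wins:  ${big_wins[0]:,} to ${big_wins[1]:,}")
print(f"Long tail: ${long_tail[0]:,} to ${long_tail[1]:,}")
```

Under those assumptions the two ranges overlap heavily, which is the point: the thousand small wins can plausibly match or exceed the handful of big ones, even though no single one of them would make headlines.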

So when I look at open data, yes, I think there is economic value. Lots and lots of economic value. Hell, tons of it.

But it isn’t necessarily going to happen in a big bang, and it may take place in the creative destruction it fosters and so the capital it frees up to spend on other things. That may make it harder to measure (I’m hoping some economist much smarter than me is going to tell me I’m wrong about that) but that’s what I think the change will look like.

Don’t look for the big bang, and don’t measure the growth in spending or new jobs. Rather, let’s try to measure the destruction and the cumulative impact of a thousand tiny wins. Because that is where I think we’ll see it most.

Postscript: Apologies again for any typos – it’s late and I’m just desperate to get this out while it is burning in my brain. And thank you Alex for forcing me to put into words something I’ve been thinking about saying for months.

