Cap Gemini estimates that open data was worth 32 billion euros to Europe in 2010, growing at 7% per year; McKinsey estimates its global value at US$3 trillion per year; and the UK estimated earlier this year (PDF) that releasing its geospatial data alone as open data would be worth 13 million pounds per year by 2016.
There’s been a range of similar reports across the world (as aggregated by the Open Data Institute) – all of which point to a similar conclusion.
However, realising this economic value – in productivity, efficiencies and direct revenue – depends on governments doing one thing that they’ve so far failed to do: releasing open data in a planned, consistent and managed way.
Thus far most governments have followed a haphazard route to open data, releasing the ‘low hanging fruit’ first (data already in releasable form, with few privacy concerns and considered ‘low risk’ as it doesn’t call into question government decisions), and then progressively releasing esoteric and almost random data sets at inconsistent intervals.
Many governments have clear processes for individuals and organisations to request the release of specific data sets – however, a clear process that doesn’t actually lead to releases is of little value.
These requests have little influence on agency decisions on releasing data and I have yet to see any government mandate that these requests need to be officially considered and actioned or responded to within a set timeframe.
Without any real weight or structure, processes for requesting data sets can’t be relied on by people seeking to build a business on open data.
Data consistency is an even bigger issue. In nations like Australia the federal and state governments each have their own open data sites. However there’s no agreed national strategy on data release. Every jurisdiction releases different sets of data, with few attempts to aggregate state-level data into national datasets covering all jurisdictions.
Even when similar data sets are released by different states this is complicated at the back-end by different laws, different collection techniques and frequencies and differences in analysis and measurement approaches – not to mention differences in formats and naming conventions. This can make it costly, if not impossible, for businesses or individuals to aggregate data from different states and use it for a national goal.
On top of this, many agencies still resist calls to release data. Some due to a closed culture or a hope that ‘open data’ is a passing fad, others due to the costs of reviewing and releasing data (without any ability to offset them in fees or additional funding) and some due to concerns around data quality, political impact or reputational damage to the agencies themselves.
My fear is that we’re reaching a chicken and egg impasse – agencies and governments are reluctant to do the work and spend the money required to develop consistent data release approaches and mandates without seeing some of the economic value from open data realised. Meanwhile individuals and organisations are reluctant to build business models on a resource that is not reliably available or of a consistent quality.
There’s no commercial model for open data if governments can turn off specific data sets, or entire open data initiatives, at a whim (as we saw when data.gov was shut down recently in the US Government shutdown). Businesses need to be able to count on regular publication of the data they use to build and inform their enterprise.
There’s also a lot less value for governments in releasing their data if companies are reluctant to use it (due to a concern over the above situation).
So how should countries avoid the chicken and egg issue in open data?
There are two approaches I have considered that are likely to work, if used in tandem.
Firstly, governments must mandate open data release and take appropriate steps to develop ongoing data release approaches, which clearly and publicly state what data will be released, at what frequency and quality level. This should include a data audit establishing what an agency owns (and may release) and what it doesn’t own, as well as the collection costs and frequency of specific datasets.
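To make the idea concrete, a published release schedule of this kind could be machine-readable, loosely along the lines of catalogue standards such as DCAT. The sketch below is a minimal illustration only – the field names, dataset name and agency are hypothetical assumptions, not an established government schema – showing how a stated frequency and due date would let anyone check whether an agency is keeping its commitments.

```python
# A hypothetical machine-readable release schedule entry. All field names
# and values here are illustrative assumptions, not a real schema.
release_schedule = [
    {
        "dataset": "road-traffic-counts",    # hypothetical dataset name
        "owner": "Department of Transport",  # agency that owns the data
        "frequency": "quarterly",            # committed publication cadence
        "quality_level": "validated",        # stated quality commitment
        "next_release": "2014-01-31",        # publicly stated due date (ISO 8601)
    },
]

def overdue(entries, today):
    """Return the datasets whose committed release date has passed.

    ISO 8601 date strings compare correctly as plain strings, so no
    date parsing is needed for this sketch.
    """
    return [e["dataset"] for e in entries if e["next_release"] < today]

print(overdue(release_schedule, "2014-02-01"))
```

A public, auditable schedule like this is what would give data users the predictability discussed above: a missed `next_release` becomes a visible, checkable fact rather than a silent lapse.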
To maximise the value of this approach for states within a nation there needs to be a national accord on data, with states (or as many as possible) developing and agreeing on a consistent framework for data release which works towards normalising the collection, analysis and release of data so that it can be aggregated into national datasets.
Secondly, thought needs to be put into the difference between open and free data. Individuals and organisations who use government open data for personal, educational or not-for-profit purposes should be able to access and reuse the data for free. However, where they are using open data for profit (above an appropriate threshold), there should be scope for financial contracts to be put in place, just as there is for most other resources used to generate profits.
This approach would provide a revenue stream to the government agencies releasing the data, helping offset the collection and publication costs. Contracts should also be structured to provide insurance for the data users that the data will be released on a set timetable and to a defined quality level throughout the life of the contract.
Significant thought would need to go into how these financial contracts are structured, with significant flexibility built in – for example, allowing cost-recovery for developers, who may spend many hours developing and maintaining the services they build with government open data, and avoiding upfront fee models that become a barrier to new entrants seeking to make profitable use of open data. These contracts would also need to be nationally consistent for state data – potentially a major challenge in Australia.
However if implemented thoughtfully and with significant consultation and ongoing review, a combination of rigour in data release and cost-recovery for profitable use of government open data would avoid the emerging chicken and egg issue and provide a solid and sustainable foundation for realising economic value from open data – value that would help support Australia’s economy, social equity, education and scientific research into the future.