GovLoop - Knowledge Network for Government

Recently Andrew Krzmarzick started a discussion about the key takeaways from the last CityCamp in Chicago. After reading it, I got two themes out of the discussion: one, that open datasets, just by themselves, aren't enough; the other, that there are questions about who is responsible for publishing that data. Very interesting discussion. It brought out a few things that got my attention.

One of them comes from a comment by Josh Kalov where he suggested that 
"maybe one area to look at instead of just developing applications that make use of the data- help create applications that make it easy for govt to put the data out there. Start researching the software/hardware/database setups that various departments of a govt organization use. Find out what features the govt employees need/want in order to do their job properly"

So, here I am... asking you to give a rough idea of whether you're publishing data today and, if so, how? And, if you're not publishing data, how do you expect you would do it?

I am not looking for technical answers; although, I won't mind them :) ... 

Tags: citycamp, data, gov20, government, local, municipal, open, open government, opendata, opengov, publishing, webservices


Replies to This Discussion

Eager to see the responses...
Our GIS section in the City of Tacoma is publishing GIS data in shapefile format. These are the most commonly requested data, so it's a big time saver to point customers to the site. I'm pushing to provide the data in KML format as well, and eventually as services that customers could consume in desktop GIS and web applications.

Michael Stoddard
I love to see cities opening up their GIS.

Adding KML will make the data easier to read for users without the software or know-how to deal with shapefiles. Even though there are many free tools (for all platforms), they're not always easy to use or intuitive.

Re: Tacoma, that site could use some improvements, like sorting the table, expanding the descriptions (perhaps a preview of the metadata or attribute table), and adding more datasets. I've added the page to my growing directory, Find urban data.
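Since shapefile-to-KML conversion came up, here's a minimal sketch of the kind of KML a city could publish for point data, using only the Python standard library. The place name and coordinates are invented for illustration, and a real pipeline would read geometries out of the shapefiles with a GIS library rather than from hand-written tuples.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def points_to_kml(points):
    """Build a KML document from (name, longitude, latitude) tuples."""
    ET.register_namespace("", KML_NS)  # serialize without a namespace prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in points:
        pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude[,altitude]
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

kml_text = points_to_kml([("City Hall", -122.4443, 47.2529)])
```

The resulting string opens directly in Google Earth or any KML-aware viewer, which is exactly why it lowers the bar compared to shapefiles.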
The Chicago Bicycle Program publishes its data on our website in an XLS file, which means it is compatible with Microsoft Excel and Google Docs. In the near future, you will see an XLS file with geographic coordinates, and then XML and KML formats.

You can perform your own search of the data, and download your results:
Thanks for your replies guys...and thanks for the links.

How do you deal with updates? Do you replace what's "out there" with the new data, or do you archive previous versions?

Steven, so you get the XLS file from the Chicago Bicycle Program and post it on your website? And then you make it searchable, or is it that you have the data in a database and you allow people to export it as an Excel file? Very interesting. Btw, I liked how you used Google Maps... well done.

Michael, you said that the GIS data is the most commonly requested... what other data is being requested?

I was actually expecting to see other data being published - like budget data. I say this because I recently started learning about "participatory budgeting" and find it quite interesting.
All the data on the Bike Parking website is LIVE.

All of the XLS or KML or other files are generated in real-time using the most current data.
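To illustrate the "generated in real-time" approach Steven describes, here's a rough sketch in Python: the export is rebuilt from the current data on every request instead of serving a stale file. The `fetch_bike_racks` function and its fields are hypothetical stand-ins for whatever the live database query actually is, and CSV stands in for the XLS/KML outputs.

```python
import csv
import io

# Hypothetical "live" data source; in practice this would be a database query.
def fetch_bike_racks():
    return [
        {"address": "123 N State St", "racks": 4},
        {"address": "500 W Madison St", "racks": 8},
    ]

def export_csv():
    """Regenerate the download from the current data on every call."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["address", "racks"])
    writer.writeheader()
    writer.writerows(fetch_bike_racks())
    return buf.getvalue()

out = export_csv()
```

Because nothing is cached to disk, the "replace vs. archive" question above largely goes away: every download reflects the data as of that moment.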
cool...thanks Steven.
Just saw this on Twitter:

Here's the link from the tweet
Nice...Thanks Andrew...very interesting.
Edward, thanks for posting the discussion. This is a really good question to ask and it’s interesting to read the responses. In terms of “how” data is published, I think you’re getting at what kind of process people are using rather than just what output formats are useful. At least, that’s my read of it and what I am responding to here.

I think that the demand for “Raw Data Now”, which is trumpeted by the Open Data community, ignores the seriousness and diligence that should go into deciding what data is actually safe for release. Without the right processes in place, lack of data quality greatly reduces the usefulness of the data, and raises the risks of misuse and privacy breaches. This creates a lot more work for govt depts and could seriously delay progress of open data within govt.

Even if you set aside the risks, depending on how you define it, I think that making "raw data" available would in many cases be fairly useless or even counter-productive. I suspect that many agencies will find that the sweet spot for a sustainable open data program is to produce some kind of statistics from their data.

In terms of how to do a good job with that, one place to start is the official statistics entry on Wikipedia, which links to process models that support the production of statistics. Another place to look is the tools that support these models. DDI is an XML format with excellent support for producing official statistics from various kinds of data, from administrative data sets through to surveys. SDMX is another format with strong credentials as a mechanism for exchanging statistical data, as well as supporting various key parts of a statistical production process. These formats are extremely comprehensive and not easy to start using from scratch, but with the right set of tools (some of which exist, many more coming), they can help an organization put in place repeatable processes to safely and reliably produce quality statistics.
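As a concrete (if toy) example of the "statistics rather than raw data" sweet spot, here's a sketch in Python that aggregates record-level data into counts that are safer to release. The field names and records are invented for illustration; this is a stand-in for the much richer processes DDI and SDMX support.

```python
from collections import Counter

# Hypothetical record-level data that would be risky to release raw.
records = [
    {"neighborhood": "Loop", "service": "pothole"},
    {"neighborhood": "Loop", "service": "graffiti"},
    {"neighborhood": "Pilsen", "service": "pothole"},
]

def summarize(records, key):
    """Aggregate raw records into release-ready counts by one field."""
    return dict(Counter(r[key] for r in records))

stats = summarize(records, "neighborhood")
# stats == {"Loop": 2, "Pilsen": 1}
```

The released artifact is the aggregate, not the individual records, which is the core of what a statistical production process guarantees at scale.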

I’ll note that I haven’t mentioned RDF, the format most heavily promoted by the Open Data community. The reason why is that this is only useful when you want to find a needle (say, some of your organization’s data) in a haystack (the Web). However, if you want to implement a process to produce the statistics in the first place, RDF isn’t a good fit. It’s just one of the output formats you may want to pump out in order to make your data more accessible through the Semantic Web (and get better support from search engines, which are implementing RDF support).
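To make the point about RDF being just one output format concrete, here's a tiny sketch that serializes already-produced data as Turtle triples. The subject and predicate URIs are illustrative (Dublin Core terms on an example.org subject), and a real publisher would likely use a proper RDF library rather than string formatting.

```python
def to_turtle(subject, pairs):
    """Serialize (predicate URI, literal) pairs as a simple Turtle snippet."""
    lines = [f"<{subject}>"]
    for pred, obj in pairs:
        lines.append(f'    <{pred}> "{obj}" ;')
    # The final predicate-object pair ends with "." instead of ";"
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)

ttl = to_turtle(
    "http://example.org/dataset/bike-racks",
    [("http://purl.org/dc/terms/title", "Bike Rack Locations"),
     ("http://purl.org/dc/terms/format", "text/csv")],
)
```

Note that this step comes last: the statistics already exist, and RDF is simply one more serialization pumped out to make them discoverable on the Web.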
I'm responding to "I think that making “raw data” available would in many cases be fairly useless or even counter-productive."

Making the raw data available would be useless or counter-productive for whom? The data owner or provider (e.g. city government)?

The data owner shouldn't decide what's useful.
Hi Steve,

I'll try to clarify. One of the reasons it would be counter-productive is privacy. Few people would want everybody to access their tax details, personal health records, social security payments etc., but that makes up a great deal of what raw govt data is. Even when you strip away personal details, data matching and even incredibly simple inference techniques can end up revealing a lot of sensitive personal information. So that's one reason: it damages the reputation of the data provider, which makes it much more difficult for them to fulfil their mission. There are also issues around putting enough context around the data to ensure appropriate use; when I mention that it might be counter-productive, that can also be due to the costs of people misinterpreting the data.
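The point about "incredibly simple inference techniques" can be shown with a toy example: joining a "de-identified" dataset to a public record on quasi-identifiers (here ZIP code and birth year) re-links names to sensitive attributes. All data below is invented for illustration.

```python
# Invented data: names were stripped, but quasi-identifiers were left in.
deidentified = [
    {"zip": "60601", "birth_year": 1970, "condition": "diabetes"},
]
# A hypothetical public record, e.g. a voter roll.
public_roll = [
    {"name": "A. Smith", "zip": "60601", "birth_year": 1970},
    {"name": "B. Jones", "zip": "60602", "birth_year": 1985},
]

def reidentify(deid, public):
    """Link records that share every quasi-identifier value."""
    return [
        {"name": p["name"], "condition": d["condition"]}
        for d in deid
        for p in public
        if (d["zip"], d["birth_year"]) == (p["zip"], p["birth_year"])
    ]

matches = reidentify(deidentified, public_roll)
```

A plain nested loop is all it takes, which is why "we removed the names" is not, by itself, a privacy guarantee for raw data releases.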

In terms of who decides, I agree that it shouldn't be just the govt, but there are significant costs involved in getting quality data out and, ironically, I would say that for any data with potential privacy issues, the more "raw" the data is, the more costly it is to get out. I'd be interested to hear any thoughts, from you or others, about what would be a good way to determine what data should be released.

I'm curious about how you refer to both the data provider and the data owner. I'd take the data provider to be the govt agency responsible for releasing the data. As for the owner, there is a notion out there, which I agree with quite strongly, that the owner is you, me and everyone else. The govt agencies are simply custodians of our data.


© 2014   Created by GovLoop.
