No Mention of PDF in Executive Order Making Open and Machine Readable the New Default

The first thing I did was look for a mention of “.pdf” in the document, but I didn’t find one. That made me wonder how effective the order will be if it doesn’t specifically require data formats that are more amenable to manipulation.

Yes, I know that many tools exist for scraping and extracting structure from .pdf documents, but that adds another, potentially costly, step. While the .pdf source, or the document it is based on, could be treated as the “golden” original, relying on extraction also means there might always be a question of whether the extracted data agrees with the source data.
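
To make that agreement question concrete, here is a minimal sketch of the kind of check someone would have to run after every extraction. Everything in it is hypothetical: the function name, the row layout, and the sample values are mine for illustration and come from no real dataset or extraction tool.

```python
from itertools import zip_longest


def find_disagreements(source_rows, extracted_rows):
    """Return (row_index, source_row, extracted_row) for every row that differs.

    A row present on only one side is reported with None on the other side.
    """
    mismatches = []
    for index, (src, ext) in enumerate(zip_longest(source_rows, extracted_rows)):
        if src != ext:
            mismatches.append((index, src, ext))
    return mismatches


# Hypothetical data: the "golden" source rows and the rows recovered from a .pdf.
source = [("Agency A", "1200"), ("Agency B", "950")]
extracted = [("Agency A", "1200"), ("Agency B", "9S0")]  # "5" misread as "S"

for index, src, ext in find_disagreements(source, extracted):
    print(f"row {index}: source={src!r} extracted={ext!r}")
```

A check like this only works when the source data is available to compare against, which is exactly what is missing when the .pdf is all that gets published.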

Such data conversion issues have always been with us, of course, and there will be those who look at this new Executive Order as opening up opportunities for outside groups to “add value” to data being made available by the Government.

I’m all for that, and I applaud the language of the order for requiring accessibility. But the devil is in the details and, given the precarious state of Government finances, we’re bound to see some confusion in the coming months. On balance, though, making accessibility the default condition has to be considered a good thing; “Better light than darkness,” as they say.

