The LII crew has been looking pretty closely at the world of regulatory information lately. We’re trying to build applications that make it easier for people to find and understand the regs, like:
searching the CFR by product or brand name. Want to know what regulations affect Nyquil?
developing “legal-360” views of real world objects. Want to know how many places the legal system touches pseudoephedrine?
linking science to regulations. Want to know who’s studied arsenic contamination levels in drinking water?
linking agency guidance and interpretations to the regs. Want to know how the agency thinks about whether or not you’re affected?
All of that stuff would be vastly easier to build if the issuing agencies would do a few simple things. As always with agencies, the “simple” is in scare quotes, because, well, very little is actually simple, but bear with me. I know you have no budget for this. I know you have no mandate for this. I know you have no time for this. But I am pretty sure that if you do it your compliance and enforcement costs will go down and the quality of the commentary that you get in notice-and-comment periods will go up. And it ain’t that expensive, and what the hell, let’s go wild here, public understanding is really part of your mission.
Here’s another thing: the perception of excessive regulatory burden is most often about costs, but for political purposes it is just as often expressed in terms of the difficulty of finding and understanding obligations. Two minas, a shekel, and two parts, if you want to talk about where the costs really are. Ask the folks at the Obamacare site.
Everything that follows is based on a single idea: the stuff you write and put on your websites is now reaching your regulated communities almost exclusively through an intervening technology layer that is only possible if your information is consumed and arranged by machines. Google is a sophisticated example of such a layer — one that functions pretty well without much work on your part. But you and I both know that for some purposes it is a blunt instrument. We can build better stuff, more helpful and more aware of context and the substance of what you do, if you help us. Machines don’t do language all that well, and while they’re getting better at it, they need your help. Your audience needs your help even more.
Here are some concrete suggestions.
Make machine-consumable site maps and clearly identify guidance documents
Most agencies have site maps of some kind. They help human readers discover what’s on the site and navigate to it. There’s a standard for doing the very same thing in ways that help machines find stuff and index it. My suggestion: make two of these — one for the whole site, and one that identifies guidance documents specifically. And that means the documents, not some program description that leads to a narrative essay that leads to an index page on which a PDF link has been concealed. That means a map of the documents themselves. Some folks already do a good, concise job of this in various kinds of listing pages meant for humans but easily scraped by machines — IRS is good at it, for example. Many don’t — common problems include hiding the documents behind “searchable” databases, as Commerce does, scattering them through a welter of program descriptions and disconnected stuff, as EPA does, and so on.
All of those sins can be more than atoned for by providing separate, organized maps using the Sitemaps XML protocol (sitemaps.org). Make two: one describing the whole site, called sitemap.xml, and another listing only the guidance documents, called guidancemap.xml. Place both in the root document directory for your whole site. The whole site. The one identified with the agency, the one with the agency acronym in the domain name, not some subsite associated with the Office for the Regulation of Left-handed, Red-headed Inheritors of Somebody Else’s Problem.
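To show how little machinery this takes, here is a minimal sketch in Python that writes a guidance-only map in the Sitemaps XML format. The function name build_sitemap, the agency.example domain, the document URLs, and the dates are all made up for illustration; only the namespace and the urlset/url/loc/lastmod element names come from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for (url, last_modified) pairs."""
    # Register the namespace as the default so tags serialize without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical guidance documents: links to the documents themselves,
# not to program pages that eventually lead to them.
guidance_docs = [
    ("https://www.agency.example/guidance/compliance-guide.pdf", "2013-10-01"),
    ("https://www.agency.example/guidance/interpretive-letter-12.html", "2013-09-15"),
]

guidance_xml = build_sitemap(guidance_docs)

# Write the result where a crawler would look for it, at the site root.
with open("guidancemap.xml", "w", encoding="utf-8") as fh:
    fh.write('<?xml version="1.0" encoding="UTF-8"?>\n' + guidance_xml)
```

The same function would produce sitemap.xml from a list of every page on the site. The point of the second file is that a harvester can ask for the guidance alone.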
PS: APIs are not necessarily the answer to harvesting problems like this, but they help. Federal Register 2.0 is a pretty good example.
Use titling to identify guidance and put in links to help find it
A significant amount of interpretive information appears in the Federal Register, either on its own hook or as preambles to final rules or as notices of the availability of various kinds of publications, including print. Final rules are easy enough to find, and separating out the preambles for indexing should not be that hard (that’s next on our list of things to try here). Other interpretive information is titled using the word “Guidance” somewhere. Maybe all of it is, but we can’t be sure.
One thing we do know is that offline interpretive materials — pamphlets, for example — are often mentioned in a “notice of availability” or some such. These need to be clearly titled as interpretive material. And if — as is often the case — the material is intended for print distribution, but a digital version is available somewhere on the web, please put a URL for the digital version in the notice. Please, please, please.
Put your goddamned ALJ opinions up
Res ipsa, homies. The following fun-size reenactment illustrates the experience so far:
Gummint official: What would the legal community like to see in our open data collection?
Legal community: Everybody wants ALJ opinions. More than life itself, they want ALJ opinions.
Gummint official: Squirrel!
Use citation instead of insider dialects and references to the Act
A man from Mars reading guidance documents would wonder if we actually have codifications — of either statutes or regs. References like “section 101 of the Act” or “Regulation M” are not easy for outsiders to follow. We have citations for such things, expressed as references to the US Code or to the Code of Federal Regulations. Where possible, use them. If it’s too hard to scatter them through the text where they appear, add a header to the document called “CFR Parts Interpreted” or some such. FR provides this as “parts affected” information. I know that some of you are using references to the popular name of the legislation to send a your-elected-officials-did-this-to-you-not-us message in your guidance. But really, somebody who is trying to find out what your regulations require her to do is not going to be helped much by repeated references to her “obligations under RCRA”, in toto — but approaches like this one help a lot.
Incidentally, the same is true of other kinds of information, such as enforcement data. For example, EPA categorizes all of its information about enforcement actions using the name of the program under which enforcement is taking place. It does not specifically say which rule(s) are being enforced. That information would be nice to have, in the form of citation to a CFR Part. If that practice risks a short shelf life, or is otherwise too inflexible to be used, providing a cross reference between program names and the chunks of CFR for which the program is responsible would be a good compromise.
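The cross reference suggested above could be as plain as a lookup table. The following Python sketch assumes a hand-maintained mapping from program labels to CFR parts. The labels and part lists are illustrative placeholders, not an official EPA crosswalk, and the names PROGRAM_TO_CFR_PARTS, invert, and cite are invented for the example.

```python
from collections import defaultdict

# Illustrative placeholders only: each program label maps to the
# (title, part) pairs of the CFR it is assumed to be responsible for.
PROGRAM_TO_CFR_PARTS = {
    "Drinking Water": [(40, 141), (40, 142)],
    "Hazardous Waste": [(40, 261), (40, 262)],
}


def invert(crosswalk):
    """Build the reverse lookup: CFR part -> list of program labels."""
    parts_to_programs = defaultdict(list)
    for program, parts in crosswalk.items():
        for part in parts:
            parts_to_programs[part].append(program)
    return dict(parts_to_programs)


def cite(title, part):
    """Format a (title, part) pair as a conventional CFR citation."""
    return f"{title} CFR Part {part}"


CFR_PART_TO_PROGRAMS = invert(PROGRAM_TO_CFR_PARTS)

# Enforcement data tagged only with a program name can now be tied to rules.
for title, part in PROGRAM_TO_CFR_PARTS["Hazardous Waste"]:
    print(cite(title, part))
```

Published once and kept current, a table like this lets outsiders join enforcement records to the regulations without the agency having to re-tag every record.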
I see a hand at the back of the room with an objection having to do with the accuracy or (snif!) appropriateness of codifications. Hell, if you want, cite to Stat. L. or the Federal Register, or whatever uncodified version you think is authoritative. But cite to something that is both specific and machine-retrievable with a reference that can be understood outside your immediate community of practice.
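Here is a rough illustration of what "machine-retrievable" buys you. The Python sketch below pulls CFR and US Code citations out of a sentence with one regular expression. The pattern is an assumption about a few common citation forms, not a complete citation grammar, and the sample sentence is invented. Notice that the insider reference at the end of the sample gives the machine nothing to hold on to.

```python
import re

# Matches forms such as "40 CFR Part 262", "40 C.F.R. § 261.4", "42 U.S.C. 6901".
# A deliberately small pattern; real citation formats vary far more than this.
CITATION = re.compile(
    r"\b(?P<title>\d+)\s+"
    r"(?P<code>C\.?F\.?R\.?|U\.?S\.?C\.?)\s+"
    r"(?:(?:Parts?|§+|Sec\.)\s*)?"
    r"(?P<section>\d+(?:\.\d+)?)"
)


def find_citations(text):
    """Return (title, code, section) tuples for each citation found."""
    results = []
    for match in CITATION.finditer(text):
        # Normalize "C.F.R." and "CFR" to the same code.
        code = match.group("code").replace(".", "").upper()
        results.append((int(match.group("title")), code, match.group("section")))
    return results


sample = (
    "Generators must comply with 40 CFR Part 262 and the exclusions in "
    "40 C.F.R. § 261.4; see also 42 U.S.C. 6901 and section 101 of the Act."
)

citations = find_citations(sample)
print(citations)
# [(40, 'CFR', '262'), (40, 'CFR', '261.4'), (42, 'USC', '6901')]
```

The three real citations come out as structured data that can be linked to the text they name. "Section 101 of the Act" does not, because nothing in the string says which Act.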
A final word
It’s not my intention to cause heartburn here, and if the witticisms are a little over the top, it’s because they’re intended to make the substance memorable but not bruising. I know all too well that anything that can be negatively interpreted in any way by the most bitter enemy of a federal agency will be seen by that agency as causing more harm than good. That’s simply incorrect, and in any case it’s hard to make suggestions about how people can do their jobs better without opening them to charges that they’re not doing their jobs. But we’ve reached the point where a certain amount of boldness is called for. It’s not a bad time for Federal agencies to show some forward-looking mastery of technology. Anybody who’s read a newspaper or a blog in the last three weeks knows what the alternative looks like.