
A Visual Solution to a (Public) Big Data Problem

Data is very valuable to public agencies. Unfortunately, it can also cause problems. And when those problems interfere with your ability to process information in a meaningful way – either because the data is too large, coming in too quickly, or captured in a multitude of different formats – you have what is known as a ‘big data problem.’

To help agencies use their data more effectively, GovLoop and industry leaders Software AG, MarkLogic and Cloudera hosted an in-person training event on big data on March 26th. Officially titled, “Big Data: Examining the Big Data Frontier,” the training featured a number of talks and panel discussions from leaders inside and outside of the government. You can read a recap of the top lessons from the event, as well as a sample of big data challenges from attendees.

The closing keynote of the day presented a unique big data question: What do you do when you are faced with a public big data problem? In other words, how do agencies publish their large and unwieldy public datasets in a way that is functional and accessible?

Michelle Tepper, from the Design & Technology office at the Consumer Financial Protection Bureau (CFPB), discussed one such problem.

The Problem: HMDA Data, A Story Without a Narrative

The Home Mortgage Disclosure Act (HMDA, 1975) was designed to provide more transparency in the home loan market. HMDA (pronounced hum-dah) requires lenders to report on all mortgage or loan refinance applications – the type of loan, the type of property, the loan decision, as well as some basic demographic information about the applicant. The primary goal was to identify trends in the market and to inform policy decisions. A secondary goal, however, was to release the information to the public so communities could use the data as evidence in cases of redlining and housing discrimination.

The problem was that the data, released once a year in September, contained tens of millions of records – and was traditionally released in hard-to-process PDF files.

Until now. Tepper’s team at the CFPB adopted a user-centric design process to ‘tell the story’ of HMDA data beyond simply presenting the numbers.

The Old Site: A Tangle of PDFs

The main issue with the previous site was that it was not intuitive for people who weren’t deeply embedded in the organization. (Note that public data is buried down at the bottom of the page.) Tepper noted that during the project she kept a cheat sheet by her computer to help navigate the sea of datasets embedded in the site. Unsurprisingly, the data was only used by HMDA researchers, bankers and regulators.

The Solution: A User-Centric Approach

The most significant change is that the new site is designed for the user, rather than the agency. Perhaps equally important, the new site takes a layered approach that accommodates a number of different users. It serves the entire spectrum of users, from those who want to see the data visualized, to those who want to download the data, to those who want to develop their own tools and apps. Tepper described the continuum in these terms:

I want the story —> <— I want to use the data to tell my own story

The CFPB approach provides a few key lessons. They are summarized below.

1. Tell the Story (Literally)

On the new HMDA website, there is a link to a three-minute video clip describing HMDA, the data variables available and how the general public can get involved and start using the data. “Government is very good at putting out public data, but not always as good at telling you why you should care about it,” said Tepper. She noted that, during their user experience research, her team found that even data experts wanted to know where the data originated and what its context was.

2. Adopt A Layered Approach

The site provides three access portals for the data, represented by the three circles above. Those who want visualizations can use the site’s API to browse through key variables (such as the example below).

However, even when users decide to download the data for their own purposes (represented by the ‘get the data’ option), they only receive more complicated choices after they’ve made certain initial selections. The goal is to lead people forward, said Tepper, rather than overwhelm them with the entire universe of options from the beginning. In this way, people can stop once they’ve achieved their goal or once they’ve ventured beyond their individual comfort levels.

This approach also helped CFPB roll out products in batches, starting with those that are relatively easy (visualizations), and then following with those that are more difficult (the tools). “By breaking it up into small pieces, that allows us not to rise or fall on a given guideline,” said Tepper. “It also allows us to serve different user groups effectively.”

3. Use an Iterative Development Process

Part of the user design approach was to use an Agile development model, which is becoming much more common in government. The team also integrated user experience feedback into the iterative model, which allowed them to learn from user experiences and make alterations accordingly.

The HMDA model provides an excellent case of using good visual design to solve a big data problem. For other examples of public agencies adopting innovative approaches to solve big data problems, download GovLoop’s latest guide: “Big Data: Exploring the Big Data Frontier.”

Presenter Slides from the Event

Social Insurance in the Age of Big Data, Herb Strauss, Assistant Deputy Commissioner for Systems and Deputy Chief Information Officer, Social Security Administration

Key Findings from the GovLoop Guide, Pat Fiorenza, Senior Research Analyst, GovLoop

Industry Insight I, Michael Doane, Technical Director, MarkLogic

Industry Insight II, Michael Ho, Vice President, Professional Services, Software AG
