
Census Bureau Uses AI to Curtail Data Imputations

This article is an excerpt from GovLoop’s recent e-book, “How Artificial Intelligence Combats Fraud and Cyberattacks.” Download the full e-book here.

Securing good, clean data to make decisions at the federal level is increasingly challenging, given the manual labor and human error that contribute to faulty calculations. Randall Knol, an IT specialist in the Application Support and Innovation Branch of the Census Bureau’s Demographic Systems Division, shared with GovLoop how his agency is implementing artificial intelligence (AI) to improve the data necessary for government decision-making.

GOVLOOP: How does your team use AI to meet the Census Bureau’s mission?

KNOL: We look at datasets, which can come in a large number of different formats, and we profile them to evaluate whether they contain data that would make them useful to join to our existing datasets. A join like that could extend our existing data, allow us to fill in missing data, or let us check other data for reliability. With AI and machine learning (ML), we can change the weights in our evaluation depending on how we assess the accuracy and the value of the data.
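As a rough illustration of the profiling Knol describes, the sketch below scores a candidate dataset on three questions: how many existing records it can be joined to, how many missing values it could fill, and how often it agrees with values already on file. This is not the Census Bureau's method. The function name `profile_join`, the field names, the sample records and the hand-set weights are all assumptions made for the example, and a real system would presumably learn or tune the weights rather than hard-code them.

```python
# Hypothetical sketch: score how useful a candidate dataset would be to join
# onto an existing one. Names, records and weights are illustrative assumptions.

def profile_join(existing, candidate, key, weights=None):
    """Return join-usefulness metrics and a weighted score between 0 and 1."""
    weights = weights or {"key_overlap": 0.5, "fills_missing": 0.3, "agreement": 0.2}
    existing_by_key = {row[key]: row for row in existing}
    candidate_by_key = {row[key]: row for row in candidate}
    shared = existing_by_key.keys() & candidate_by_key.keys()

    missing = fillable = compared = agreed = 0
    for k in shared:
        ours, theirs = existing_by_key[k], candidate_by_key[k]
        for field, value in ours.items():
            if field == key or field not in theirs:
                continue
            if value is None:
                missing += 1                      # a gap in our data...
                if theirs[field] is not None:
                    fillable += 1                 # ...that the candidate could fill
            elif theirs[field] is not None:
                compared += 1                     # both sides have a value
                if theirs[field] == value:
                    agreed += 1                   # and the values match

    metrics = {
        "key_overlap": len(shared) / len(existing_by_key) if existing_by_key else 0.0,
        "fills_missing": fillable / missing if missing else 0.0,
        "agreement": agreed / compared if compared else 0.0,
    }
    metrics["score"] = sum(weights[name] * metrics[name] for name in weights)
    return metrics


existing = [
    {"id": 1, "age": 34, "state": "MD"},
    {"id": 2, "age": None, "state": "VA"},
    {"id": 3, "age": 51, "state": None},
    {"id": 4, "age": 28, "state": "DC"},
]
candidate = [
    {"id": 1, "age": 34, "state": "MD"},
    {"id": 2, "age": 45, "state": "VA"},
    {"id": 3, "age": 50, "state": None},
    {"id": 9, "age": 19, "state": "PA"},
]

result = profile_join(existing, candidate, key="id")
print(result)
```

Changing the `weights` argument shifts the score toward whichever property matters most for a given join, which is the kind of adjustment the answer above attributes to AI and ML.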

And why is having such clean, correct data important to the Census Bureau’s mission?

It’s the heart of the census. There are the political implications: the census determines how voting in Congress is distributed and how government aid is distributed across the country. That’s why it’s important for everybody to respond to the census. It’s how the government makes decisions about the economy. If the data’s bad, then the government could be wasting time and money on policies that are not productive.

We pride ourselves on having the best data. That makes it a little slower and a little more expensive to accumulate, but our most important attribute is the quality of our data. If AI can help us do that, and not only make it faster and cheaper, but actually improve the quality of the data, then that’s really important to us.

How recently was AI implemented into the way that your team does things?

We’re just getting started using machine learning and AI. Where it’s first being implemented is on the research side. The census is the repository of the administrative data for many government agencies, not just the census data. It’s made available under very strict guidelines to researchers. That’s where the real usage is being made now. People are using machine learning to speed up their research, with the goal that once this research is done and validated, it will move into the actual census surveys.

How does your agency manage large datasets, both from the analytical and storage standpoint?

If you want good data, you have to have good metadata. Metadata is data about data. It does no good for me to have a dataset if I don’t know what’s in it. We’re developing methodologies that, using AI, allow us to generate the metadata from our many datasets. In the past, generating metadata has been a manual process. It’s very slow when it’s manual and it’s very, very expensive. AI is helping make all these things faster and cheaper. Metadata can also change as the data changes, and AI lets us update our metadata more quickly and make it more accurate.
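To make "metadata generated from the dataset" concrete, here is a minimal sketch that derives basic column-level metadata (an inferred type, a count of missing values and a count of distinct values) directly from a CSV file's contents. It uses simple rules rather than AI, and it is not the Bureau's methodology. The function names, the column names and the sample rows are all made up for the example.

```python
# Hypothetical sketch: derive simple column-level metadata from CSV text.
# Rule-based, standard library only; all names and data are illustrative.
import csv
import io


def infer_type(values):
    """Guess a column type from its non-empty string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    for name, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)          # raises ValueError if any value does not fit
            return name
        except ValueError:
            continue
    return "string"


def generate_metadata(csv_text):
    """Return {column: {"type", "missing", "distinct"}} for the given CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    metadata = {}
    for column in columns:
        values = [(row[column] or "").strip() for row in rows]
        present = [v for v in values if v != ""]
        metadata[column] = {
            "type": infer_type(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return metadata


SAMPLE = (
    "tract,population,median_income,county\n"
    "1001,4200,58000.50,Autauga\n"
    "1002,,61250.00,Autauga\n"
    "1003,3100,,Baldwin\n"
)

metadata = generate_metadata(SAMPLE)
for column, info in metadata.items():
    print(column, info)
```

Because the metadata is computed from the data itself, rerunning the function after the data changes refreshes it, which is the property the answer above highlights.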

Why is AI’s ability to speed up the data analytics process important to the work that you do with the census?

Data needs to be good, but it also needs to be timely. If it takes me a year to produce the information you need to make a decision at the end of this month, that’s not very valuable for policymaking. I not only have to give you good data, but I have to give it to you in the window where it’s viable for actually influencing decision-making.

How is AI helping your agency produce valuable, error-free data?

Most errors come from data entry. The more questions you have people answer manually, the higher your error rate is going to be. The more we can reduce the number of questions we have to ask people, the lower our error rate is and the faster the process is.

What do you see for the future of AI in your department?

Aside from reducing costs and increasing accuracy, I’m hoping that we’ll be able to make use of classic big data that comes from the web. It’s really not a lot of use to us right now because the data quality is so low, the accuracy is so low and you have to do so much work to clean it up. If we can get AI to do more of this work for us, we can make this much more accurate.

It will never be a completely automated system. But we will be able to focus our trained analyst resources on the really challenging questions. At Census, there’s a real big concern about people seeing other people’s data. If you’ve got the AI doing it, then you don’t have a person looking at your stuff. It provides a level of privacy and security for people.

To find out more about how AI and ML can improve cybersecurity, download GovLoop’s recent e-book, “How Artificial Intelligence Combats Fraud and Cyberattacks,” here.

 
