GovLoop is hosting its Fifth Annual Government Innovators Virtual Summit, an all-day virtual event with five online trainings, networking opportunities, and resources to help government employees do their jobs better, and we’re recapping each session for you. Head here to read write-ups from the other trainings.
With the explosion of data in the government, increased collaboration and information sharing are important goals for any agency. Recent legislation, such as the DATA Act and the Cybersecurity Information Sharing Bill, provides extra incentive for agencies to achieve those goals. However, data often resides in disconnected silos, making that collaboration and sharing difficult.
Avi Bender, the Chief Technology Officer at the U.S. Census Bureau, and Kevin Morgan, the Senior Director of Sales Engineering at MarkLogic, sat down for GovLoop’s Annual Government Innovators Virtual Summit to discuss how agencies can overcome the hurdle of silos and achieve data’s full potential.
Data is central to the Census Bureau. It is the largest statistical agency in the federal government and serves as a survey bureau for other agencies. Accessing, searching, and using data from internal and external sources is key to the Bureau’s role, both in terms of the work it does and for the people who use its data.
To maintain internal access to data, the Census Bureau has established a data management system and a data governance process that allow agency officials to access the data they need for their mission goals. Externally, it releases its public data to citizens through application programming interfaces (APIs), letting people interact with and use the data as they wish.
Bender said the Census Bureau wants to extend the innovative use of its data outside of the agency, bringing it to citizens. The agency has achieved this through its City Software Development Kit (SDK), which allows people to develop programs and applications based on that data. This information sharing has been especially important for the current Smart Cities Initiative, which combines Census data with transportation, energy, and other agency information to develop smart city insights.
Compiling data from disparate sources can lead to actionable insights that can improve government services and citizen wellbeing, but this is not easy to do. Talking about the current state of data, Morgan noted that it can be difficult to get an answer for what seems like a simple question. According to a New York Times article, data scientists spend up to eighty percent of their time wrangling data before they can garner any useful insights from it.
One of the biggest challenges in eliminating data silos is also one of the biggest challenges in the public sector overall: doing more with less. Recent legislation and increased demand for new services push agencies toward greater collaboration and information sharing, but without any increase in resources. These pressures outpace the speed of data integration and development. Adapting legacy infrastructures to modern needs and expectations is difficult and requires innovative solutions.
Solutions exist today, but they often create their own problems. One is an enterprise data warehouse, a central repository for different data sets. This option is time-consuming to build and focuses on downstream analysis, relying on requirements defined in advance. As a result, there is little agility or flexibility for incorporating new data models.
Another approach is enterprise application integration, where a central framework connects systems and applications across an organization to create a business flow. This has been successful to an extent but does not address the fundamental problem of how to integrate data, as the data itself still often exists in incompatible forms.
With all of these options, the data system grows more complex and the gap between analysis and operation continues to grow. As Bender mentioned, a data governance plan is essential to understanding and managing shared data, but data governance is increasingly difficult with more data sources and individual integration solutions. The complexity leads to rigidity, and organizations have trouble knowing what and where the authoritative source is.
Morgan described an ideal user wish list for a data integration solution. It should be agile and flexible, accommodate all types of data, enrich the data without duplication and redundancies, maintain security, and support scalability.
One option that works to meet these goals is an operational data hub, which implements a data-first enterprise architecture for the organizational data. This structure brings applications to the data, freeing agencies from the task of moving and copying data between applications, and integrates at the level of the data, not just the function.
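The data-first idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MarkLogic's product or the Census Bureau's system: the `Envelope` and `DataHub` names, the field mappings, and the sample records are all hypothetical. The sketch keeps each source record intact, adds harmonized fields alongside it, and lets an application query across sources without copying data into a separate store.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Envelope:
    """One stored record: the original payload is kept as-is,
    with harmonized (canonical) fields added alongside it."""
    source: str
    original: Dict[str, Any]
    canonical: Dict[str, Any] = field(default_factory=dict)


class DataHub:
    """Hypothetical in-memory hub; a real one would be a database."""

    def __init__(self) -> None:
        self._records: List[Envelope] = []
        self._mappings: Dict[str, Dict[str, str]] = {}

    def register_source(self, source: str, mapping: Dict[str, str]) -> None:
        # mapping: source field name -> canonical field name
        self._mappings[source] = mapping

    def ingest(self, source: str, record: Dict[str, Any]) -> Envelope:
        # No fixed schema is enforced: unmapped fields stay in `original`.
        mapping = self._mappings.get(source, {})
        canonical = {mapping[k]: v for k, v in record.items() if k in mapping}
        envelope = Envelope(source, dict(record), canonical)
        self._records.append(envelope)
        return envelope

    def query(self, **criteria: Any) -> List[Envelope]:
        # Match on canonical fields, regardless of the source's own naming.
        return [
            e for e in self._records
            if all(e.canonical.get(k) == v for k, v in criteria.items())
        ]


# Illustrative, made-up records from two differently shaped sources.
hub = DataHub()
hub.register_source("transit", {"city_name": "city", "riders": "daily_riders"})
hub.register_source("energy", {"municipality": "city", "kwh": "energy_kwh"})
hub.ingest("transit", {"city_name": "Springfield", "riders": 1200})
hub.ingest("energy", {"municipality": "Springfield", "kwh": 98000, "meter": "A-17"})
hub.ingest("energy", {"municipality": "Shelbyville", "kwh": 45000})

matches = hub.query(city="Springfield")
print([m.source for m in matches])  # ['transit', 'energy']
```

Because the original payloads are never rewritten, a new source with a different shape only needs a new mapping, which is the flexibility the wish list above asks for.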