
How the Cloud Powers Consolidation With Data Lakes, Data Hubs


This article is an excerpt from GovLoop’s recent guide, “7 State & Local Tech Trends to Watch.”

State and local governments often function like an engine made of multiple departments and programs. Data benefits the whole organization by making its parts work more efficiently together.

Unfortunately, government data is typically siloed by organization or initiative. Silos prevent higher-value analysis: insights are lost, and opportunities for new citizen services are missed. Data warehouses have traditionally addressed this problem by giving analysts organized, processed and structured data. State and local governments are increasingly realizing, however, that this infrastructure can be too expensive and inflexible for their needs.

Expanding their data storage capacities enables governments to better perform their missions and serve the public. For a growing number of agencies, cloud computing provides the elasticity and scale needed to consolidate data. It also provides the foundation for data lakes and data hubs, two options that are more cost-effective and flexible than data warehouses. Both models are inspiring innovation across state and local governments.

In a recent interview with GovLoop, Jeremiah Dunham, Manager of Solutions Architecture at Amazon Web Services (AWS), explained how the possibilities are endless with cloud-enabled data consolidation. AWS is a leading provider of on-demand cloud computing infrastructure.

“What we’ve done is shifted from a place where data storage technology made analyzing and using data for innovation hard and moved to a place where your data’s centralized and cataloged,” Dunham said. “The only limit to the applications that governments can develop is the imaginations of the data scientists, business analysts and developers working with those systems.”

Governments have long used data warehouses to store data from a source in an orderly format. This approach costs significant money and time, and it’s also limiting.

“It’s not a great setup for data scientists who need more flexibility,” Dunham said. “Those databases will definitely serve the needs of one program or service delivery, but not having that data consolidated with other data is a miss.”

Data lakes and data hubs solve these issues by housing both structured and unstructured data for quick analysis. This skips the step of organizing and structuring the data for a data warehouse.

“In order to make sense of data lakes, you must define an architecture for the data that you’ve consolidated for an analysis you’re doing,” Dunham said. “They’re good for data discovery, machine learning and predictive analytics.”
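To illustrate what defining a structure at analysis time might look like, here is a minimal sketch in Python. It assumes a lake holding two raw objects in their native formats (a CSV file and a JSON Lines file); the object names, fields and records are hypothetical and are not taken from any real government dataset or AWS service.

```python
import csv
import io
import json

# Hypothetical raw objects, kept exactly as they arrived (no upfront modeling).
RAW_OBJECTS = {
    "permits/2019.csv": (
        "permit_id,zoning,issued\n"
        "101,R-1,2019-03-02\n"
        "102,C-2,2019-04-11\n"
    ),
    "service_requests/2019.jsonl": (
        '{"id": 7, "type": "pothole", "opened": "2019-03-05"}\n'
        '{"id": 8, "type": "streetlight", "opened": "2019-04-12"}\n'
    ),
}


def read_csv(key):
    """Parse a CSV object into a list of dict rows at read time."""
    return list(csv.DictReader(io.StringIO(RAW_OBJECTS[key])))


def read_jsonl(key):
    """Parse a JSON Lines object into a list of dicts at read time."""
    lines = RAW_OBJECTS[key].splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def monthly_activity():
    """Count records per month across both sources for one analysis.

    The only structure imposed is the one this analysis needs:
    a date string from each source, truncated to YYYY-MM.
    """
    dates = [row["issued"] for row in read_csv("permits/2019.csv")]
    dates += [row["opened"] for row in read_jsonl("service_requests/2019.jsonl")]
    counts = {}
    for date in dates:
        month = date[:7]
        counts[month] = counts.get(month, 0) + 1
    return counts


print(monthly_activity())
```

The raw objects are never rewritten. A different analysis could read the same objects and impose a different structure, which is the flexibility Dunham describes.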

Data hubs go further by collecting data from multiple sources and adding an index to standardize access. They combine the volume of data lakes with the structure of data warehouses.

“They offer a data warehouse’s analytics and unified interface, but with the flexibility of data lakes,” Dunham said. “Data hubs have the unique advantage of being a hybrid between data lakes and data warehouses. To users, it looks like they’re talking to one source of data, but behind the scenes, it’s all of the original data sources in their native formats.”
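The idea of one interface in front of many native-format sources can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how any particular data hub product works: the `DataHub` class, the dataset names and the sample records are all hypothetical.

```python
import csv
import io
import json

# Hypothetical sources, each left in its native format.
ZONING_CSV = "parcel,zoning\n12-001,R-1\n12-002,C-2\n"
INCIDENTS_JSON = '[{"kind": "fire", "street": "Main St"}, {"kind": "police", "street": "Oak Ave"}]'


class DataHub:
    """Toy hub: an index maps dataset names to loaders for the original sources."""

    def __init__(self):
        self._index = {}  # dataset name -> function returning a list of dict rows

    def register(self, name, loader):
        self._index[name] = loader

    def query(self, name, **filters):
        """Return rows from the named dataset whose fields match every filter."""
        rows = self._index[name]()
        return [
            row for row in rows
            if all(str(row.get(field)) == str(value) for field, value in filters.items())
        ]


hub = DataHub()
hub.register("zoning", lambda: list(csv.DictReader(io.StringIO(ZONING_CSV))))
hub.register("incidents", lambda: json.loads(INCIDENTS_JSON))

# Callers use one interface and never see the underlying formats.
print(hub.query("zoning", parcel="12-001"))
print(hub.query("incidents", kind="fire"))
```

The index is the only new artifact. The CSV and JSON sources stay as they are, and each is parsed only when a query asks for it.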

The cloud provides the underlying infrastructure that powers data lakes and data hubs. It also offers the flexibility and scalability both models need, at lower cost. The cloud additionally reduces the IT maintenance burden on government workforces, letting them focus on improving public services.

“The most important enabler is having a large, diverse dataset that can be analyzed,” Dunham said. “Having such a large volume of data allows you to have a comprehensive view of something. For governments, that comprehensive view is the citizens that they serve.”

Cloud services like those AWS provides are necessary for data lakes and data hubs. Both models deploy easily in the cloud because they work with governments’ data as is. The cloud holds nearly limitless potential for how each model ultimately handles data for state and local governments.

Take Johns Creek, Georgia. The city’s government utilized a data hub to develop a municipal Amazon Alexa skill. Citizens can ask Alexa questions about key information on the city’s operations and receive answers instantaneously. Alexa now addresses 200 common citizen questions for Johns Creek’s government. Examples include where fire and police activity has occurred, what current traffic conditions are and what the zoning of any property in the city is.

“This not only democratizes access to the city’s data, but it helps save valuable time for government staff,” Dunham said.
