This blog post is an excerpt from GovLoop’s recent guide Embracing Data Analytics: Common Challenges & How to Overcome Them. Download the full guide here.
One of the challenges agencies face in their quest to adopt analytics is managing the volume of data stored in various silos. Locked away in those disparate systems are datasets that could help explain trends or enable agencies to plan for future events, if only people could access them.
But even the data that agencies can account for may be outdated or inaccurate, a condition often referred to as “dirty data.” For the data that is usable, agencies must decide how best to store, manage and analyze it.
Ben Snively, Senior Solutions Architect on Amazon Web Services’ (AWS) public sector team, recommended that agencies start their analytics journey with a set of business questions that they want to answer with data analytics. “That may not be a complete set of questions, but a smaller set that can lead to quick wins and buy-in from multiple stakeholders,” Snively said. “Agencies should involve engineers, analysts and other business stakeholders who can benefit from agency system data.”
From there, having a game plan and knowing which cost-effective solutions are available to support data analytics is key. For example, NASA’s Jet Propulsion Laboratory and the Food and Drug Administration are among the agencies using AWS, a leading cloud platform provider, to simplify workloads and ensure they pay only for the resources they use. AWS offers tools and services that enable agencies to quickly migrate data into its secure environment and to benefit from AWS handling what Snively and his colleagues call the “undifferentiated heavy lifting” of technology infrastructure.
“With AWS, agencies are able to use various services that will perform a lot of the common tasks for analytics and data science, such as run Hadoop as a service,” Snively said. “This approach allows them to really focus on what matters for the mission, rather than the heavy lifting of standing up the infrastructure and analytical services.”
The exponential growth of existing data sources makes it challenging to predict storage requirements even a year or two down the road. The same is true for projecting the need for computing resources.
If agencies get that projection wrong, they’re either paying for resources they don’t need or they have unhappy employees who don’t have access to the analytics capabilities to do their jobs. With AWS, agencies don’t have to worry about guessing how many servers or software licenses they will need for the duration of a project.
“You really get to grow into what you need based on the demand that you have at a given time,” Snively said. “We see agencies coming to AWS to simplify big data in their organizations by leveraging the tools and techniques they use to run analytics.”
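To make the projection problem concrete, here is a rough back-of-the-envelope sketch in Python. Every number and function name in it is hypothetical and chosen only for illustration. It also assumes, for simplicity, that a unit of capacity costs the same whether it is provisioned up front or consumed on demand, which will not match real pricing. It compares guessing too low, guessing too high, and paying only for what is used as demand grows month over month.

```python
# Illustrative only: all figures below are hypothetical, not AWS pricing.

def fixed_cost(provisioned_units, unit_price, months):
    """Cost of paying for a fixed amount of capacity every month."""
    return provisioned_units * unit_price * months


def on_demand_cost(monthly_demand, unit_price):
    """Cost of paying only for the capacity actually used each month."""
    return sum(units * unit_price for units in monthly_demand)


def unmet_demand(monthly_demand, provisioned_units):
    """Total capacity users wanted but could not get under a fixed plan."""
    return sum(max(0, units - provisioned_units) for units in monthly_demand)


# Hypothetical demand that grows faster than planners expected.
demand = [10, 12, 15, 20, 28, 40]   # capacity units needed per month
unit_price = 100                    # assumed price per unit per month

low_guess, high_guess = 20, 40

print("Under-provisioned cost:", fixed_cost(low_guess, unit_price, len(demand)))
print("Under-provisioned unmet demand:", unmet_demand(demand, low_guess))
print("Over-provisioned cost:", fixed_cost(high_guess, unit_price, len(demand)))
print("Pay-for-use cost:", on_demand_cost(demand, unit_price))
```

Under these assumptions, the low guess leaves analysts short of capacity in the later months, while the high guess pays for idle capacity in the early ones. Paying for use tracks demand in both directions.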
Government agencies can now easily stand up projects and experiment using tools from various AWS partners or managed services. “By using Amazon Kinesis, customers are able to focus on real-time analytical needs, rather than focusing on standing up a real-time stream,” Snively said.
On the data storage front, NASA shifted to a new infrastructure model that uses AWS for cloud-based enterprise infrastructure. This model supports a variety of web applications and websites in a secure environment, while providing nearly $1 million in annual cost savings for NASA.
Amazon Redshift, a petabyte-scale data warehouse, is another service that’s growing in popularity. Think of a data warehouse as a store of processed, structured data that analysts can query with business intelligence tools to gain mission insight. This is different from a big data lake, which often includes a variety of datasets that can be structured or unstructured and can contain raw, unprocessed data.
Federal agencies aren’t the only ones benefiting from easy access to their data. The Financial Industry Regulatory Authority (FINRA), an independent nonprofit organization authorized by Congress to protect America’s investors, migrated about 75 percent of its operations to AWS. The move enabled FINRA to create a flexible platform that adapts to changing market dynamics and provides its analysts with tools to interactively query multi-petabyte datasets. The organization estimates it will save up to $20 million annually by using AWS instead of a physical data center infrastructure.
“Once you have the cloud computing environment, you’re able to run new analytics with a simple click,” Snively said. “You can choose from the AWS Marketplace or managed services and experiment with many tools quickly. These services are transforming the way agencies are able to run analytics.”