Imagine creating twice the amount of data of the entire printed collection of the Library of Congress – every single day. That’s 19 terabytes of information – and it’s the amount that the National Oceanic and Atmospheric Administration (NOAA) collects and produces every 24 hours. With so much data at its disposal, the agency has an unprecedented opportunity to use data as a means to drive better public sector outcomes.
NOAA is certainly not alone in the massive amounts of data it supplies, manages, and collects. Many other government agencies are also collecting, storing, and organizing huge amounts of data. To capitalize on these important data assets, agencies must start by rethinking the way they store, manage, and analyze information.
For many agencies, the enterprise data hub (EDH) architecture has become a prime solution. With an EDH, organizations have a single place to store data, regardless of form, volume, or retention, and the data is stored in its original fidelity. The EDH also brings multiple computing engines – like interactive SQL, search, machine learning, and more – right to the data itself, which is shared among these computing capabilities. With an EDH, agencies can release data long sequestered in standalone, single-purpose data silos and create a continuum of analytics that analysts can employ with big data.
As a foundation for big data, the EDH presents many opportunities for use as well as other considerations, like compliance, security, and management. To help you explore the potential of an EDH, Cloudera is hosting a monthly breakfast series at two locations in the DC metro area.
- The Tower Club: Tyson’s Corner, VA – The first Wednesday of every month from 7:30am–9:00am. Please register to confirm your spot today.
- The Courtyard Fort Meade, Annapolis, MD – The second Thursday of every month from 7:30am–9:00am. Please register to confirm your spot today.
As a preview of what you’ll learn at the series, GovLoop spoke with Rob Morrow, senior systems engineer at Cloudera. He discussed the challenges of adopting a converged analytics strategy for your agency, and some best practices for getting started.
“Many of the challenges have to do with policy, and security is also a big source of concern,” said Morrow. “Several of my customers are in the intelligence community and they follow security boundaries in terms of what they are allowed to look at legally and one population of employees versus another agency’s population of employees.”
Overcoming security and policy challenges is essential, as leaders need a way to ensure they comply with the rules governing who can view the data. Another challenge is being able to trace back to the raw source materials to show how a decision was reached or how data was modified.
Morrow said the first step to get started with analytics is to get as much organizational data as possible consolidated in an EDH architecture, and to then use professional services to help implement and instrument advanced analytics strategies. After that, organizations should focus on making sure that a foundation has been set to allow the data and information to be accessible across the agency.
The final step is to be sure to provide continuous training to staff so that skills will match new workforce demands that will emerge around analytics.
“Cloudera helps with setting up the platform, making sure the platform works right and making sure we can get staff transitioned in a way that supports them, whether that be platform integration, data science topics or traditional IT topics like SQL,” said Morrow.
Morrow’s insights are just the start of what you will learn at the breakfast series, including how Cloudera can help you meet your data needs. The series will also host discussions on:
- Next Generation Information Management – Why information-driven organizations are embracing the technical strategy and components of an enterprise data hub for their data management needs.
- Data and Resource Management – Which building blocks of an enterprise data hub, from YARN to Apache Spark, are necessary to ensure data visibility, analytics, and resource management for agency IT solutions.
- Compliance and Security – How an enterprise data hub fulfills the security, governance, data protection, and operations needs of today’s missions.
- Operational Efficiency – How an enterprise data hub offers these capabilities at a far superior price relative to traditional data management approaches.
Cloudera is revolutionizing enterprise data management with the first unified Platform for Big Data: The Enterprise Data Hub. Cloudera offers enterprises one place to store, process and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Learn more here.