Big data continues to be a hot topic in government. Yesterday, GovLoop, EMC, and Carahsoft took the discussion to the next level by hosting a live webinar, "Government Big Data: What's Next?" Topics ranged from how agencies are using analytics to manage data to making data more accessible and innovative approaches to storing massive quantities of data.
You can access an archive of the presentation here.
Attendees heard insights from Marina Martin, Head of the Education Data Initiative at the Education Department; Shawn Kingsberry, CIO of the Recovery Accountability and Transparency Board; and Gary Newgaard, Director of Federal Solutions for EMC Isilon.
Marina Martin gave an overview of the difference between data that is machine readable and data that is human readable. Machine-readable data can be used by developers and innovators, usually in the form of CSV (spreadsheet) files or an API. The benefit of an API is that developers can build on it and agencies can have their computers talk to one another. Human-readable data is data in a format that people can access and actually understand. She described an instance at the Department of Education where metadata did not match the search queries people were using: a search for "list of schools" did not actually bring up the department's list of schools, because internally it wasn't referred to as such. Changing the format to be human readable is one step the agency took to open up access to the available data.
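To make the machine-readable side of that distinction concrete, here is a minimal sketch of a program consuming a CSV export. The file contents and column names are invented for illustration (a hypothetical "list of schools" dataset); the point is that once data is in a structured format, a program can filter it rather than a person having to read it.

```python
import csv
import io

# Hypothetical machine-readable export: a "list of schools" as CSV.
# The column names and rows are invented for illustration.
csv_text = """school_name,city,state
Lincoln High School,Springfield,IL
Roosevelt Elementary,Austin,TX
"""

# csv.DictReader turns each row into a dict keyed by the header row,
# so a program (not just a person) can filter and aggregate the data.
reader = csv.DictReader(io.StringIO(csv_text))
schools = list(reader)

# Programmatic filtering: pull out only the Texas schools.
texas_schools = [row["school_name"] for row in schools if row["state"] == "TX"]
print(texas_schools)  # ['Roosevelt Elementary']
```

An API takes this a step further: instead of downloading a file, a developer's program requests exactly the records it needs and gets structured data back.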
Today, data.gov is moving in a direction that asks all agencies to standardize their data sets with common descriptive metadata. Doing so will make data.gov far more powerful, because both humans and machines will be able to read and manipulate the data.
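As a sketch of what common descriptive metadata might look like, here is a hypothetical catalog entry modeled loosely on the common-core fields of the Project Open Data metadata schema that data.gov catalogs use. The dataset, publisher details, and URL are invented for illustration.

```python
# Hypothetical catalog entry, modeled loosely on common-core fields of
# the Project Open Data metadata schema used by data.gov catalogs.
# The dataset itself and the URL are invented for illustration.
dataset_metadata = {
    "title": "List of Accredited Schools",
    "description": "A machine-readable list of accredited schools, "
                   "published as CSV.",
    "keyword": ["schools", "education", "accreditation"],
    "publisher": {"name": "Department of Education"},
    "accessLevel": "public",
    "distribution": [
        {"mediaType": "text/csv",
         "downloadURL": "https://example.gov/data/schools.csv"},
    ],
}

# With consistent, descriptive field names like "keyword", a person can
# search the catalog in plain language ("list of schools") and a machine
# can filter it programmatically.
print("schools" in dataset_metadata["keyword"])  # True
```

When every agency fills in the same fields the same way, one search interface works across the whole catalog.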
Shawn Kingsberry discussed how the Recovery Accountability and Transparency Board looks across the entire federal government to see how agencies are using data and how it is consumed. The board provides public- and private-sector partners with distinct websites running on common software that can support millions of concurrent users. Data governance remains a tricky issue, raising questions such as: Who owns the data? Who analyzes it? Who reviews it? Who uses it? And who can access it?
The site FederalAccountability.gov gives federal agencies and Inspectors General the ability to review and evaluate risk assessments of entities, companies, and universities receiving federal funds. Giving outside organizations secure, app-centric access to the data also streamlines the process of meeting customer needs.
EMC’s Director of Federal Solutions, Gary Newgaard, spoke about EMC Isilon Scale-Out NAS for big data storage in the federal sector. Different organizations define “big data” differently: to some it means 100 GB, while to others it is measured in petabytes. Most big data projects run into trouble when initial storage capacity proves inadequate, forcing expansion projects later as more space is needed. EMC Isilon is used broadly across government, from healthcare and life sciences to surveillance, physical security, defense and intelligence, and cyber. The driving force behind the wide adoption is the way it is architected: Isilon dramatically reduces the operational expense of managing large data sets, speeds up workflows, and unlocks new capabilities. On average, storage costs can be reduced by 40% per terabyte annually, and storage can be scaled up to 20 petabytes in a single file system. The “pay as you go, pay as you grow” model gives organizations a clear advantage in adapting to changing IT infrastructure needs.
It’s clear that government is using big data in ways never experienced before. Innovative techniques and usage will allow the information to be harnessed by the public sector and citizens alike.