Government agencies today recognize the benefits of a modern data analytics platform: a single, central system that stores data regardless of type or volume and offers users a wide array of tools to work with that data. Improving process efficiency, tracking performance through IoT sensors, and recognizing trends in citizen behavior are just a few of those benefits, but all of this is moot without strong security features in place to safeguard the data. In the public sector, data can be highly sensitive and subject to heavy regulation and governance controls. Agencies should be able to choose the platform that best suits their need for flexibility and scalability, whether that is on premises, in the cloud or a hybrid model, but data security must be a priority in every case.
Early developers of data analytics did not prioritize data security, because the initial focus was on workloads and data sets that were not considered sensitive or subject to regulation. Initially, security controls addressed user error, such as accidental deletion, rather than malicious use. The data management environment and its needs are different today, with data-driven insights enabling a more efficient and engaged government than ever before. Mature data analytics platforms have many data paths, and because data ecosystems are loosely coupled, each component often has its own channels for getting data into and out of the platform. To ensure that all data remains secure at all times, agencies should take four common security factors into account.
- Authentication – Authentication guards access to the system, its data and its various services. It confirms that a user or service is who they claim to be, which mitigates the risk of unauthorized use of services. Agencies typically manage identities, profiles and credentials through a single distributed system. A strong data analytics platform should require robust authentication before a cluster can operate securely. There are several approaches. Perimeter authentication secures access to the data cluster from users and services that are external to, or on the perimeter of, the data analytics platform. Cluster authentication uses internal checks to prevent rogue services from impersonating legitimate ones, joining cluster activities and capturing potentially critical data as a member of a distributed job or service.
- Authorization – Authorization manages who or what has access to or control over a given resource or service. A robust data analytics platform merges the capabilities of multiple, varied and previously separate IT systems into an enterprise data hub that stores and works on all data within an agency, so it needs multiple authorization controls with varying granularities. Each control should be modeled after controls that the IT team already knows, making it easy to select the right tool for the right task. Common data flows often require different authorization controls at each stage of processing.
- Data Protection – The goal of data protection is to ensure that only authorized users can view, use and contribute to a data set. Such security controls add another layer of protection against potential threats from end users and also thwart attacks from administrators and from malicious actors on the network or in the data center. This type of protection is especially critical for the government, where insider threats pose arguably the greatest risk to data security. Agencies must protect data at all times, both when it is persisted to disk or other storage media, known as “data-at-rest,” and when it moves from one process or system to another, known as “data in transit.” The form that data protection takes can vary depending on the task at hand and the data type. For example, data-at-rest can be protected through encryption and key management at the application level.
- Auditing – Auditing aims to capture a complete and immutable historical record of all activity within a system, and it plays a central role in three key activities within the government. First, auditing is part of a system’s security regime and can explain what happened, when, and to whom or what in the event of a breach or other malicious activity. For instance, if a rogue administrator deletes a user’s data set, the audit record provides the details of this action, and the correct data can be retrieved from backup. Second, auditing helps with compliance by satisfying the core requirements of regulations associated with sensitive or personally identifiable information, including HIPAA and FISMA. It provides the touchpoints needed to construct the trail of who produced, viewed or manipulated data, and how, when and how often they did so. Finally, auditing offers the historical data and context needed for data forensics. Audit information shows how various populations use different data sets and can help establish the access patterns of those data sets.
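To make the idea of a complete, hard-to-alter audit record more concrete, here is a minimal sketch of a hash-chained audit log in Python. It is an illustration under stated assumptions, not a description of how any particular platform implements auditing: the function names (`append_event`, `verify_chain`), the record fields and the in-memory list are all hypothetical. Note that chaining makes tampering detectable rather than impossible, so a production system would also need protected, append-only storage.

```python
import hashlib
import json

# Hypothetical starting value for the chain; any fixed constant would do.
GENESIS_HASH = "0" * 64


def _digest(record, prev_hash):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log, who, action, resource, when):
    """Append an audit entry that is cryptographically linked to its predecessor."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    record = {"who": who, "action": action, "resource": resource, "when": when}
    log.append(
        {"record": record, "prev_hash": prev_hash, "hash": _digest(record, prev_hash)}
    )
    return log


def verify_chain(log):
    """Return True only if no entry has been altered, removed or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

In this sketch, a caller would record each view, change or deletion with `append_event` and run `verify_chain` periodically or during an investigation. If someone later edits an earlier entry, for example to hide a deletion, the recomputed hashes no longer match and `verify_chain` returns `False`.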
Next-generation data management requires authentication, authorization, auditing and data protection controls in order to establish a single place to store and operate on all data within the government. The rigorous security requirements that keep sensitive government information safe make these controls even more critical, which is why agencies need a complete data management solution. As the volume and variety of data used for analytics continue to grow, agencies must take the proper precautionary steps to establish this next generation of data management through an integrated and comprehensive security strategy.