The State of Disaster Recovery and Cloud in Government

This blog post is an excerpt from GovLoop’s recent industry perspective, “How the Cloud Powers Disaster Recovery for Government.” Download the full report here.

There’s a growing demand across government to manage massive amounts of data, whether it’s citizen data for processing benefits claims, workforce data to make hiring decisions, or intelligence information for sensitive missions. The bottom line is federal, state, and local agencies increasingly rely on the availability of this data to make decisions that have far-reaching impacts.

Take state and local law enforcement agencies, for example. They rely on federal databases, such as sex offender registries and criminal background check systems, to accurately identify individuals who may have carried out serious offenses. But what if law enforcement officers weren’t able to properly identify a suspect because they couldn’t access centralized databases? When lives and safety are at stake, there is no room for error or incomplete information.

To ensure data is accessible when and where they need it, a growing number of agencies have invested in IT disaster recovery (DR) solutions. In the event of a natural or manmade disaster, agencies still need to recover quickly. That’s where disaster recovery planning and execution come into play.

Think of DR as the documented processes and procedures for how your agency will recover from a disaster that impacts IT infrastructure. This includes networks, servers, desktops, laptops, wireless devices, data, and connectivity.

“Disaster recovery falls in line with the concept of business continuity planning, or the ability to continue to run, operate, and do business in the event your agency’s data center is no longer operational for a variety of reasons,” said Jerimiah Cox, a cloud solutions architect with NetApp. Think of DR as a subset of business continuity.

The two key drivers for any DR plan are recovery time objective (RTO) and recovery point objective (RPO). RTO is the time it takes after a disruption to restore a business process to its service level, as defined by the operational level agreement. RPO is the acceptable amount of data loss measured in time.
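As a rough illustration of how the two objectives are used together, the sketch below checks a recovery exercise against both. The class name, function name, and every duration in it are hypothetical and chosen only for illustration; they do not come from the report.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets for one service (hypothetical structure, not from the report)."""
    rto: timedelta  # longest tolerated time to restore the service
    rpo: timedelta  # largest tolerated data loss, measured in time


def meets_objectives(objectives: RecoveryObjectives,
                     downtime: timedelta,
                     data_loss: timedelta) -> bool:
    """Return True only if both the RTO and the RPO were met."""
    return downtime <= objectives.rto and data_loss <= objectives.rpo


# Illustrative values: a 4-hour RTO and a 1-hour RPO.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))

# A drill that restored service in 3 hours and lost 20 minutes of data.
print(meets_objectives(objectives, timedelta(hours=3), timedelta(minutes=20)))  # True
```

A recovery that misses either target fails the check, which reflects the point that RTO and RPO are separate constraints.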

A thorough DR solution also enables agencies to maintain data compliance and protect against negative events that may result in data loss.

To better manage costs and move away from building and owning underutilized disaster recovery sites, agencies are exploring cloud computing as a viable option. With the cloud, agencies don’t have to worry about paying for infrastructure that they rarely use.

But from a cost standpoint, DR tends to be very expensive for agencies, said Lori Barber, a hybrid cloud business development manager at NetApp. Traditionally, the government’s approach to DR has involved maintaining a separate facility with infrastructure that mirrors what’s in an agency’s primary data center. The concern is that agencies are paying to maintain such facilities even though they sit idle most of the time.

Considering the critical nature of disaster recovery planning, it’s vital that agencies hash out the DR capabilities they need long before an incident occurs.

Mihail Sadeanu June 27th, 2018

Some additional, constructive considerations for a DR/BC solution that uses a “cloud computing model” and a “cloud infrastructure” (as defined by NIST SP 800-145).
The RTO, as its full name implies, is a goal or an ideal time in which you need a specific function or service to be available following an interruption. It must be considered in conjunction with RPO to have a full picture of the total time a business may lose due to a disaster. Both indicators together are very important requirements when designing Disaster Recovery and Business Continuity (DR/BC) plans for cloud-based solutions.
Also, there are two (sub)types of RTO when the BC plan is activated: the “RTO of transaction integrity recovery” (which includes the “RPO for data recreation”) and the “RTO of hardware data integrity”.

RPO is often thought of as the time between the last backup and the moment the outage occurred, which indicates the amount of data lost. In many major outages or disasters, some work is unfortunately lost. Thus, RPO = time since the last backup of complete transactions, representing data that must be re-acquired or re-entered, or RPO = T_outage – T_backup.
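The subtraction above can be sketched with timestamps. This is a minimal example, not a tool from the post: the function name and the two sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(t_backup: datetime, t_outage: datetime) -> timedelta:
    """Time between the last complete backup and the outage (T_outage - T_backup)."""
    if t_outage < t_backup:
        raise ValueError("the outage cannot precede the last completed backup")
    return t_outage - t_backup


# Hypothetical timestamps: last backup at 02:00, outage at 02:25.
t_backup = datetime(2018, 6, 27, 2, 0)
t_outage = datetime(2018, 6, 27, 2, 25)

print(data_loss_window(t_backup, t_outage))  # 0:25:00
```

Comparing this measured window against the agreed RPO shows whether the backup frequency was sufficient for that incident.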

As applications and business processes are not all equal, RTOs and RPOs must be categorized and segmented by recovery time. For example, a 2-hour RTO is the SLA for having everything back online and operational, while a 30-minute RPO is the window of acceptable data loss.
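One possible way to segment applications by recovery time is sketched below. The tier names, the thresholds, and the application inventory are all hypothetical; only the 2-hour RTO and 30-minute RPO pair is taken from the example above.

```python
from datetime import timedelta

# Hypothetical tiers: (name, upper bound on RTO). Thresholds are illustrative only.
TIER_LIMITS = [
    ("tier-1", timedelta(hours=2)),
    ("tier-2", timedelta(hours=24)),
]
DEFAULT_TIER = "tier-3"  # everything with a longer RTO


def assign_tier(rto: timedelta) -> str:
    """Return the first tier whose RTO bound covers the requested RTO."""
    for name, limit in TIER_LIMITS:
        if rto <= limit:
            return name
    return DEFAULT_TIER


# Hypothetical application inventory: name -> (RTO, RPO).
applications = {
    "background-check-portal": (timedelta(hours=2), timedelta(minutes=30)),
    "hr-reporting": (timedelta(hours=12), timedelta(hours=4)),
    "records-archive": (timedelta(days=3), timedelta(hours=24)),
}


def group_by_tier(apps: dict) -> dict:
    """Group application names by the tier implied by their RTO."""
    groups: dict = {}
    for app_name, (rto, _rpo) in apps.items():
        groups.setdefault(assign_tier(rto), []).append(app_name)
    return groups


print(group_by_tier(applications))
```

Grouping this way lets each tier get its own replication and backup design instead of sizing everything for the strictest application.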
Other key drivers for a DR/BC plan are:
“Recovery Distance Objective” (RDO) – how far away copies of data need to be located.
“Service Recovery Time” (SRT) – the time until full IT operations, with all protection and redundancies in place, are available again.
“Lost Business Time” (LBT) = RTO – T_backup
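One way to read the LBT definition is as the span from the last good backup to the moment service is restored, that is, the data-loss window plus the downtime. That reading is an assumption on top of the formula above, which treats the recovery moment as a point in time. The function name and timestamps below are hypothetical.

```python
from datetime import datetime, timedelta


def lost_business_time(t_backup: datetime,
                       t_outage: datetime,
                       t_restored: datetime) -> timedelta:
    """Span from the last backup to restoration (assumed reading of LBT)."""
    if not (t_backup <= t_outage <= t_restored):
        raise ValueError("expected t_backup <= t_outage <= t_restored")
    return t_restored - t_backup


# Hypothetical incident: backup at 02:00, outage at 02:25, restored at 04:10.
t_backup = datetime(2018, 6, 27, 2, 0)
t_outage = datetime(2018, 6, 27, 2, 25)
t_restored = datetime(2018, 6, 27, 4, 10)

print(lost_business_time(t_backup, t_outage, t_restored))  # 2:10:00
```

Under this reading the result equals the data-loss window (25 minutes) plus the downtime (1 hour 45 minutes).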

From a cost standpoint, improving RTO and/or RPO generally increases the cost of a solution. This is why an agency needs to define its minimum RPO and RTO requirements up front, and why it needs to know the value of its data before it can do so.
Finally, in order to maintain data compliance and protect against negative events that may result in data loss, the strategic DR/BC cloud solution offered should meet one or more of the following compliance requirements: GLBA, HIPAA, SOX, FDIC, etc.
