
Disaster recovery plans explained

Develop a disaster recovery plan that boosts your cyber resilience and recovery capability

How does a disaster recovery plan work?

A disaster recovery plan (DRP) is a formal document created by an organization that contains detailed instructions on how to respond to unplanned incidents such as natural disasters, power outages, cyber attacks and any other disruptive events. The plan contains strategies for minimizing the effects of a disaster, helping an organization to quickly resume key operations or continue to operate as if there were no disruption.

Disruptions can lead to lost revenue, brand damage and dissatisfied customers. The longer the recovery time, the greater the potential adverse business impact. A good DRP enables rapid recovery from disruptions, regardless of the source of the disruption.


A DR plan is more focused than a business continuity plan and does not necessarily cover all contingencies for business processes, assets, human resources and business partners.

A successful DR solution typically addresses all types of operational disruption. These disruptions can include power outages, telephone system outages, and temporary loss of access to a facility due to a bomb threat, a "possible fire", or a low-impact, non-destructive fire, flood or other event. A DR plan should be organized by type of disaster and location, and must contain scripts or instructions that anyone can follow.

Before the 1970s, most organizations only had to concern themselves with making copies of their paper-based records. Disaster recovery planning gained prominence during the 1970s as businesses began to rely more heavily on computer-based operations. At that time, most systems were batch-oriented mainframes. Another offsite mainframe could be loaded from backup tapes, pending recovery of the primary site.

In 1983, the U.S. government mandated that national banks must have a testable backup plan. Many other industries followed as they understood the significant financial losses associated with long-term outages.

By the 2000s, businesses had become even more dependent on digital online services. With the introduction of big data, cloud, mobile and social media, companies had to cope with capturing and storing massive amounts of data at an exponential rate. DR plans had to become much more complex to account for much larger amounts of data storage from a myriad of devices. The advent of cloud computing in the 2010s helped to alleviate this disaster recovery complexity by allowing organizations to outsource their DRPs and solutions, also known as disaster recovery as a service (DRaaS).

Another current trend that emphasizes the importance of a detailed DRP is the increasing sophistication of cyber attacks. Industry statistics show that many attacks stay undetected for well over 200 days. With so much time to hide in a network, attackers can plant malware that finds its way into the backup sets, infecting even recovery data. Attacks may stay dormant for weeks or months, allowing malware to propagate throughout the system. Even after an attack is detected, it can be extremely difficult to remove malware that is so prevalent throughout an organization.

Business disruption due to a cyber attack can have a devastating impact on an organization. For example, a cyber outage at a package delivery company can disrupt operations across its supply chain, leading to financial and reputational loss. And in today’s digitally dependent world, every second of that disruption counts.

Why is a DR plan important?

The need to deliver superior customer experience and business outcomes is fueling enterprise adoption of hybrid multicloud. Hybrid multicloud, however, creates infrastructure complexity and potential risks that require specialized skills and tools to manage. As a result of this complexity, organizations are suffering frequent outages and system breakdowns, compounded by cyber attacks, skills shortages and supplier failures. The business impact of outages or unplanned downtime is extremely high, and more so in a hybrid multicloud environment. Delivering resiliency in a hybrid multicloud environment requires a DRP that includes specialized skills, an integrated strategy and advanced technologies, including orchestration for data protection and recovery. Organizations need comprehensive enterprise resiliency with orchestration technology to help mitigate business continuity risks in hybrid multicloud, so that they can achieve their digital transformation goals.

Other key reasons why a business would want a detailed and tested DRP include:

  • To minimize interruptions to normal operations
  • To limit the extent of disruption and damage
  • To minimize the economic impact of the interruption
  • To establish alternative means of operation in advance
  • To train personnel with emergency procedures
  • To provide for smooth and rapid restoration of service

To meet today's expectation of continuous business operations, organizations must be able to restore critical systems within minutes, if not seconds, of a disruption.

How are organizations using DRPs?

Many organizations struggle to evolve their DRP strategies quickly enough to address today’s hybrid-IT environments and complex business operations. In an always-on world, an organization can gain a competitive advantage or lose market share depending on how quickly it can recover from a disaster and restore core business services.

Some organizations use external disaster recovery and business continuity consulting services to address their needs for assessments, planning and design, implementation, testing and full resiliency program management.

There are proactive services, such as Kyndryl IT Infrastructure Recovery Services, that help businesses overcome disruptions with flexible, cost-effective IT DR solutions.

With the growth of cyber attacks, companies are moving from a traditional or manual recovery approach to an automated and software-defined resiliency approach. The Kyndryl Cyber Resilience Services approach uses advanced technologies and best practices to help assess risks, and to prioritize and protect business-critical applications and data. These disaster recovery solutions can help businesses rapidly recover IT during and after a cyber attack.

Other companies turn to cloud-based backup services, such as Kyndryl Disaster Recovery as a Service (DRaaS), to provide continuous replication of critical applications, infrastructure, data and systems for rapid recovery after an IT outage. There are also virtual server options, such as Kyndryl Cloud Virtualized Server Recovery, that protect critical servers in real time. This enables rapid recovery of applications at a Kyndryl Resiliency Center to keep businesses operational during periods of maintenance or unexpected downtime.

For a growing number of organizations, the solution is with resiliency orchestration, a cloud-based approach that uses disaster recovery automation and a suite of continuity-management tools designed specifically for hybrid-IT environments. For example, Kyndryl Resiliency Orchestration helps protect business process dependencies across applications, data and infrastructure components. It increases the availability of business applications and helps companies to access necessary high-level or in-depth intelligence regarding Recovery Point Objective (RPO), Recovery Time Objective (RTO) and the overall health of IT continuity from a centralized dashboard.
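RPO and RTO can be made concrete with a small calculation. RPO is the maximum window of data an organization can tolerate losing, and RTO is the maximum time it can tolerate before service is restored. The following is a minimal sketch, not a depiction of any Kyndryl product: the `RecoveryObjectives` class, the `evaluate_recovery` function and the sample timestamps and targets are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Targets for one application (hypothetical example values)."""
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable time to restore service


def evaluate_recovery(objectives, last_good_copy, outage_start, service_restored):
    """Compare one outage (or DR test) against the RPO and RTO targets."""
    # Data written after the last good copy and before the outage is lost.
    data_loss_window = outage_start - last_good_copy
    # Downtime runs from the start of the outage until service is back.
    downtime = service_restored - outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= objectives.rpo,
        "rto_met": downtime <= objectives.rto,
    }


# Illustrative targets: lose at most 15 minutes of data, be back within 1 hour.
targets = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=1))
result = evaluate_recovery(
    targets,
    last_good_copy=datetime(2024, 1, 10, 9, 50),
    outage_start=datetime(2024, 1, 10, 10, 0),
    service_restored=datetime(2024, 1, 10, 11, 30),
)
print(result["rpo_met"], result["rto_met"])  # True False
```

In this example the last replica was 10 minutes old when the outage began, so the RPO target is met, but the 90-minute restoration misses the 1-hour RTO target.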

Downtime can cost your business revenue, damage its reputation and lead to regulatory penalties. Learn how Kyndryl Cloud Resiliency Orchestration can help transform your IT recovery management through automation to simplify the disaster recovery process, increase workflow efficiency, and reduce risk, cost and system testing time.

How is a DRP used in industry?

Hyundai Heavy Industries (HHI) faced the harsh reality of an unplanned disruption when a 5.8 magnitude earthquake struck in September 2016. Because the company’s backup center was located near its headquarters in Ulsan City, Korea, the earthquake served as a wake-up call for HHI to examine its disaster recovery systems and determine its preparedness for a full range of potential disruptions, including those affecting its mission-critical IT infrastructure. After the earthquake, HHI's IT leadership responded quickly, working with Kyndryl Business Resiliency Services to implement a robust disaster recovery solution with a remote data center.

What are the key steps of a DRP?

The objective of a DRP is to ensure that an organization can respond to a disaster or other emergency that affects information systems and minimize the effect on business operations. Kyndryl has created a template to produce a basic DRP. The following are the suggested steps as found in the DR template. Once you have prepared the information, it is recommended that you store the document in a safe, accessible location off site.

Step 1: Major goals: The first step is to broadly outline the major goals of a DRP.

Step 2: Personnel: Record your data processing personnel. Include a copy of the organization chart with your plan.

Step 3: Application profile: List your applications, and note for each one whether it is critical and whether it is a fixed asset.

Step 4: Inventory profile: List the manufacturer, model, serial number, cost and whether each item is owned or leased.

Step 5: Information services backup procedures: Include information such as “Journal receivers are changed at ________ and at ________” and “Changed objects in the following libraries and directories are saved at ____.”

Step 6: Disaster recovery procedures: For any DRP, these three elements should be addressed:

  • Emergency response procedures to document the appropriate emergency response to a fire, natural disaster or any other event, in order to protect lives and limit damage.
  • Backup operations procedures to ensure that essential data processing operational tasks can be conducted after the disruption.
  • Recovery actions procedures to facilitate the rapid restoration of a data processing system following a disaster.

Step 7: DRP for mobile site: The plan should include a mobile site setup plan, a communication disaster plan (including the wiring diagrams) and an electrical service diagram.

Step 8: DRP for hot site: An alternate hot site plan should provide for an alternative (backup) site. The alternate site has a backup system for temporary use while the home site is being reestablished.

Step 9: Restoring the entire system: To get your system back to the way it was before the disaster, use the procedures on recovering after a complete system loss in Systems management: Backup and recovery.

Step 10: Rebuilding process: The management team must assess the damage and begin the reconstruction of a new data center.

Step 11: Testing the disaster recovery and cyber recovery plan: In successful contingency planning, it is important to test and evaluate the DRP regularly. Data processing operations are volatile in nature, resulting in frequent changes to equipment, programs and documentation. These changes make it critical to treat the plan as a living document.

Step 12: Disaster site rebuilding: This step should include a floor plan of the data center, the current hardware needs and possible alternatives, as well as the data center square footage, power requirements and security requirements.

Step 13: Record of plan changes: Keep your DRP current. Keep records of changes to your configuration, your applications and your backup schedules and procedures.
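Some of the steps above describe structured records that can be kept in machine-readable form alongside the written plan. The following is a minimal sketch of how the application profile (Step 3), the inventory profile (Step 4) and the record of plan changes (Step 13) might be modeled. It is not the template's own format: every class name, field name and sample value here is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ApplicationProfile:  # Step 3
    name: str
    critical: bool
    fixed_asset: bool


@dataclass
class InventoryItem:  # Step 4
    manufacturer: str
    model: str
    serial_number: str
    cost: float
    owned: bool  # False means the item is leased


@dataclass
class PlanChange:  # Step 13
    changed_on: date
    area: str  # e.g. "configuration", "applications" or "backup"
    description: str


@dataclass
class DisasterRecoveryPlan:
    applications: list = field(default_factory=list)
    inventory: list = field(default_factory=list)
    changes: list = field(default_factory=list)

    def critical_applications(self):
        """Names of the applications flagged as critical."""
        return [app.name for app in self.applications if app.critical]

    def record_change(self, changed_on, area, description):
        """Append an entry to the record of plan changes."""
        self.changes.append(PlanChange(changed_on, area, description))


# Hypothetical sample data.
plan = DisasterRecoveryPlan()
plan.applications.append(ApplicationProfile("Payroll", critical=True, fixed_asset=False))
plan.applications.append(ApplicationProfile("Intranet wiki", critical=False, fixed_asset=False))
plan.inventory.append(InventoryItem("ExampleCorp", "X100", "SN-0001", 12000.0, owned=False))
plan.record_change(date(2024, 3, 1), "backup", "Moved nightly save to 01:00")
print(plan.critical_applications())  # ['Payroll']
```

Keeping these records as data rather than free text makes it easier to check, for example, that every critical application is covered by a backup procedure, and to see when the plan was last updated.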