2015
RETHINKING BUSINESS CONTINUITY AND DISASTER RECOVERY
Prepared for: Mr. Terry Linkletter, Professor, Central Washington University
Submitted by: Chris Schwab
IT486: Critical Issues in Information Technology
March 13, 2015
Contents
Executive Summary
Introduction
    Problem Statement
    Purpose Statement
    Research Questions
    Methodology
Discussion of Findings
    Companies Fail to Test Their BC/DR Plan
    Some Companies Choose Not to Have a BC/DR Plan
    Stakeholders at all Levels Should be Involved in BC/DR
    Test BC/DR Plans Regularly
    Leverage Cloud Technologies for BC/DR
Conclusions
    Conclusions
    Recommendations
References
Appendices
    Appendix A: Research Questions Asked to CWU Students in IT486

Figures
1. Primary Causes of Data Loss
2. Company Sizes Adopting Cloud Computing
Executive Summary
Many issues exist in today's Business Continuity and Disaster Recovery (BC/DR) plans. Too often, businesses are unable to return after a disaster. This report identifies the main reasons BC/DR plans fail and provides recommendations to address those issues. Research began as a discussion with Central Washington University (CWU) students in the IT486: Critical Issues in Information Technology course. Many BC/DR plans fail due to a lack of planning and testing. Many businesses have the "it won't happen to us" mindset. Some businesses incorrectly assume their third-party vendors are taking care of BC/DR plans for them simply because they pay those vendors for various services. Proper BC/DR plans need to be developed with input from internal and external stakeholders at all levels of the business. New technologies such as cloud computing are available to businesses of all sizes and can be leveraged to revolutionize the entire BC/DR plan.

The key recommendation is for businesses to update their BC/DR plans with input from a variety of key stakeholders. Once the plans have been updated, a documented testing procedure and schedule must be created. The tests must be run on a regular basis using a variety of testing methods. Lastly, businesses need to invest in cloud computing to enhance the BC/DR plan.
Whether or not a business can survive a disaster depends on how well it has planned for the unthinkable. While every business should create and maintain a Business Continuity (BC) and Disaster Recovery (DR) plan, it is widely known that many do not. Twenty-five percent of businesses never reopen their doors after suffering a major disaster ("Disaster planning," n.d.). Business Continuity and Disaster Recovery, typically referred to as BC/DR, defines how a business continues operating through a crisis and how it recovers after the event (Slater, 2012). Preparing for both of these processes is the difference between staying in business during a disaster and closing to become another statistic. Valuable lessons can be learned from companies whose BC/DR plans have failed and from those whose plans have succeeded. Those lessons can be used to improve existing processes and to consider alternative methods for executing the plan during a disaster.

Problem Statement

How can companies better prepare for business continuity and disaster recovery?

Purpose Statement

The purpose of this report is to discuss how companies can improve their plans for business continuity and disaster recovery.

Research Questions

This report answers five research questions. The first two are cause-oriented questions, while the last three are solution-oriented questions.

1. How does failing to plan for BC/DR affect a company?
2. What issues might a company have by focusing too much on BC or too much on DR?
3. Who are the key business stakeholders in developing, testing, and maintaining the BC/DR plan?
4. How does the pace of IT affect BC/DR planning?
5. What impact has cloud computing had on BC/DR planning?

Methodology

Data was collected through primary and secondary sources. Six questions were posed to Central Washington University (CWU) students in the IT486: Critical Issues in Information Technology course (see Appendix A for the exact wording of the questions presented to the students). Some of the CWU students responded with first-hand accounts from current or previous places of employment. Other students responded with secondary sources describing current, real-world examples that provided insight into the question. Most of the data in this report is from secondary sources.
Discussion of Findings

Responses from CWU students to the questions were varied, but common themes could be found. This helped to focus the research on specific areas of BC/DR. The next five sections discuss the problems companies currently have with BC/DR and explore solutions to those problems.

Companies Fail to Test Their BC/DR Plan

The research from CWU students provided numerous examples of how businesses are affected by issues in their BC/DR plans when a disaster strikes. Companies tend to assume that having a plan is good enough because a real disaster will never strike. When something does happen to these companies, their BC/DR plan is executed, but issues are immediately discovered. One example is a 2012 disaster at Allied Building Products' datacenter in New Jersey. The datacenter was completely wiped out (Rubens, 2013). Operations were quickly transferred to a recovery site in Philadelphia. Some critical applications had been overlooked in the BC/DR plan, so they did not properly make the transition to the recovery site. The lesson learned by Allied Building Products was to fully test the BC/DR plan before it is needed in the next disaster. Waiting for a disaster to test the BC/DR plan is not a recommended practice, but it is something many companies seem to do.

Pixar, the company behind the "Toy Story" movies, also learned the cost of failing to test BC/DR plans. During the development of Toy Story 2, an accidental deletion erased all the work for the movie (Troxel, 2012). When attempting to restore from a backup, the staff found that the backup system had been malfunctioning for over a month. The BC/DR plan did not cover this scenario. A separate copy was eventually found and the movie was completed. This shows the importance of testing a BC/DR plan regularly. The proper schedule for how often to test the plan depends on the needs of the business. For Pixar, the need was great, so testing should have been done more often.

Some Companies Choose Not to Have a BC/DR Plan

Finding research on companies that focused too much on just business continuity, or just disaster recovery, proved to be difficult. Among companies found to have had a problem with their BC/DR plan, a lack of testing was cited in the majority of the examples. However, some cases involved companies that failed to maintain a BC/DR plan at all. Companies without BC/DR plans are highly vulnerable to critical data loss, which can be caused by any of the reasons seen in Figure 1 below ("Statistics reinforce", n.d.).
Figure 1. Primary Causes of Data Loss
For businesses operating without a BC/DR plan, data loss can force them to close and become another statistic. However, if a company somehow survives a disaster without a BC/DR plan, it is likely to rethink that practice post-disaster. In the aftermath of Hurricane Sandy in 2012, a datacenter run by Datagram flooded and was shut down (Stern, 2012). Datagram hosted major websites including Gawker, The Huffington Post, and BuzzFeed. Datagram only had the one datacenter, so it did not have a BC plan in the event of a total outage. BuzzFeed and The Huffington Post had BC plans, including the use of backup datacenters, that allowed them to stay online, but Gawker did not. Gawker relied on Datagram to manage its entire BC plan and had no BC plan of its own to execute if its one and only datacenter went offline. According to Stern (2012), Gawker was reviewing its policies for future disasters.
Stakeholders at all Levels Should be Involved in BC/DR

The problems with businesses developing and testing BC/DR plans are widespread. Developing a better BC/DR plan requires stakeholders at all levels of the business to be engaged in the planning and design. Departments such as IT and building facilities are commonly involved in these plans. Other groups that aren't thought of at first, but who are critical, include customers, vendors, shareholders, and community members (Estall, 2010). The Federal Emergency Management Agency (FEMA) also recommends including "local, state, and federal government agencies" as key stakeholders as part of its Holistic Disaster Recovery training ("Stakeholders", 2014). Employees at all levels of the business, from regular staff to executive-level, should be represented to make sure all parts of the business are covered in detail in the BC/DR plans. It takes many people to properly document all of the critical business processes.

Test BC/DR Plans Regularly

According to Musaji (2002), the greatest risk to an organization's BC/DR plan is complacency. Organizations state that disasters won't affect them, that they have very complex systems that can't easily be tested, or that their staff is trained well enough to handle anything that happens anyway. There are multiple methods for testing BC/DR plans. Each business will have unique situations and capabilities that dictate how and when testing can be performed. Four standard testing methodologies are walkthroughs, simulations, parallel tests, and full outage tests (Salama, 2014). Walkthroughs are the easiest since they only require reading through the plan. Simulations require testing of notification systems and verifying backups. The parallel test uses old data to verify old backups at an alternate site. The full outage test covers an array of possible disasters and must be planned to limit the impact on normal operations. By performing these tests, holes in the BC/DR plan may be discovered. The holes can be filled and the plans can be updated without the added pressure of a real disaster.

Leverage Cloud Technologies for BC/DR

Cloud computing is a much-talked-about technology for the IT department, but it can also change the way a business of any size develops its BC/DR plans. The mindset used to be that only large enterprises could afford cloud computing, but today even small businesses can afford similar services (LaChapelle, 2014). Figure 2 shows the types of businesses investing in cloud computing (Pham, 2011).

Figure 2. Company Sizes Adopting Cloud Computing
The largest growth area in cloud computing is among small- and medium-sized businesses. The cost depends on how much data is used, so smaller companies will pay less than larger companies. Cloud computing also allows for speedier recovery, which translates to less revenue lost due to a disaster. According to Rubens (2013), one of the top ten reasons DR plans fail is the inability to access the BC/DR plan during a disaster. A benefit of cloud technologies is the ability for employees to access the BC/DR plans during a disaster from any Internet-connected device. Cloud computing is a relatively new technology, so this area will continue to grow in the coming years.
Conclusions and Recommendations

Based on the research presented in this report, the following conclusions and recommendations were developed.

Conclusions

Businesses claim to have BC/DR plans that are ready to be executed on a moment's notice, but in reality many businesses never test their plans. During a disaster, problems quickly surface, making the BC/DR plans essentially useless. Many testing methodologies can be used on a regular basis to find gaps and fill them. Other businesses have been found not to have a BC/DR plan at all. Perhaps they are relying on a third party, but this can also lead to a disaster that quickly escalates out of hand. Every business needs to have a well-thought-out BC/DR plan that involves critical stakeholders from various levels inside and outside the organization. Finally, new technologies, such as cloud computing, should be investigated to see how existing BC/DR plans could be updated to take advantage of them before the next disaster strikes.

Recommendations

Businesses operating without an up-to-date BC/DR plan must immediately work to change that. Involve key stakeholders from all levels within the company as well as external stakeholders. Include key vendors, partners, community members, and government agencies. Gather input from all parties, and then update the plans accordingly. The following additional recommendations should also be followed.
Test BC/DR plans regularly. Businesses must test their BC/DR plans on a regular basis. There is no single correct answer for how often a business should test; each business has specific needs that dictate the frequency. Because there are multiple ways to test, testing should occur on a frequent basis. Utilize all the testing methods, from simple walkthroughs up to planned, full-scale outages.

Investigate cloud computing. Business leaders agree that the "[c]loud is here to stay and more and more of our customers are going towards it" (Dix, 2013). Large businesses have already taken advantage of the cost savings cloud computing has to offer. Today, all businesses should investigate how to integrate cloud computing into existing BC/DR plans. In addition to cost savings, the benefits include speedier recoveries after disasters and possibly being able to maintain business operations during the event.
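As one illustration of the first recommendation, part of a simulation-level test (verifying backups) could be automated so that a backup system that has quietly stopped working, as in the Pixar example, is noticed within days rather than after a month. The following is a minimal sketch in Python, not a method taken from any of the sources cited in this report. The function name check_backup, the max_age threshold, and the file path in the usage note are all hypothetical and would need to be adapted to a real backup system.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import List


def check_backup(backup: Path, max_age: timedelta) -> List[str]:
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper: it only confirms that a backup file exists, is not
    empty, and was written recently. It does not prove the backup can be
    restored, which still requires a parallel or full outage test.
    """
    if not backup.exists():
        return [f"backup missing: {backup}"]

    problems = []

    # A backup that has not been written recently suggests the backup job
    # has stopped running or is failing silently.
    age = datetime.now() - datetime.fromtimestamp(backup.stat().st_mtime)
    if age > max_age:
        problems.append(f"backup is stale: last written {age.days} day(s) ago")

    # A zero-byte file usually means the job ran but captured nothing.
    if backup.stat().st_size == 0:
        problems.append("backup is empty")

    return problems
```

A business might run such a check daily, for example check_backup(Path("/backups/latest.tar"), timedelta(days=1)) with a path of its own choosing, and notify the staff responsible for the BC/DR plan whenever the returned list is not empty. The one-day threshold is an assumption; as noted above, the right frequency depends on the needs of the business.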
References

Disaster planning. (n.d.). Retrieved March 8, 2015, from https://www.sba.gov/content/disasterplanning

Dix, J. (2013, July 30). Cloud computing causing rethinking of disaster recovery. Retrieved March 8, 2015, from http://www.networkworld.com/article/2168624/cloudcomputing/cloud-computing-causing-rethinking-of-disaster-recovery.html

Estall, H. (2010, March 25). Business continuity and your stakeholders. Who has a stake in your business? Retrieved March 8, 2015, from http://www.pslinfo.co.uk/assets/pdfs/Business continuity and your stakeholders.pdf

LaChapelle, C. (2014, February 5). Disaster recovery options for smaller companies. Retrieved March 8, 2015, from http://www.networkworld.com/article/2174112/techprimers/disaster-recovery-options-for-smaller-companies.html

Musaji, Y. (2002, January 1). Disaster recovery and business continuity planning: Testing an organization's plans. Retrieved March 8, 2015, from http://www.isaca.org/Journal/archives/2002/Volume-1/Pages/Disaster-Recovery-andBusiness-Continuity-Planning.aspx

Pham, T. (2011, August 23). 2011 cloud & IT disaster recovery statistics. Retrieved March 8, 2015, from http://resource.onlinetech.com/2011-cloud-it-disaster-recovery-statistics/

Rubens, P. (2013, September 27). 10 reasons why disaster recovery plans fail. Retrieved March 8, 2015, from http://www.datamation.com/security/10-reasons-why-disaster-recoveryplans-fail.html
Salama, A. (2014, April 29). Best practices for testing your disaster recovery plan. Retrieved March 8, 2015, from http://www.latisys.com/blog-post/best-practices-for-testing-yourdisaster-recovery-plan

Slater, D. (2012, December 13). Business continuity and disaster recovery planning: The basics. Retrieved March 8, 2015, from http://www.csoonline.com/article/2118605/pandemicpreparedness/business-continuity-and-disaster-recovery-planning-the-basics.html

Stakeholders and their roles in recovery. (2014, May 23). Retrieved March 8, 2015, from http://training.fema.gov/hiedu/downloads/hdr/session 4 powerpoint.pdf

Statistics reinforce the importance of a strong company backup and disaster recovery plan. (n.d.). Retrieved March 12, 2015, from http://www.coreipsystems.com/statistics.html

Stern, J. (2012, October 30). Hurricane Sandy takes down Gawker, Huffington Post and other websites. Retrieved March 8, 2015, from http://abcnews.go.com/Technology/hurricanesandy-takes-york-city-data-center-gawker/story?id=17601425

Troxel, C. (2012, May 17). When Pixar deleted Toy Story 2 - Cloud disaster recovery case study. Retrieved March 8, 2015, from http://www.zetta.net/blog/pixar-deleted-toy-story-2cloud-disaster-recovery-hero/
Appendix A

Research Questions Asked to CWU Students in IT486

1. Find a company that failed to plan (or plan adequately) for business continuity and disaster recovery. What did they fail to do and how did it affect the company?
2. What issues might a company have with focusing too much on just business continuity or just disaster recovery, instead of focusing on both?
3. Who are the key stakeholders in developing and maintaining a business continuity and disaster recovery plan? What impact do they have on the process?
4. How has cloud computing helped companies with their business continuity and disaster recovery plans?
5. Since technology changes very frequently, what can businesses do to maintain up-to-date business continuity and disaster recovery plans?
6. Who should be responsible for developing, testing and maintaining BC and DR plans? Should they all be the same person or people?