What Is the Difference Between Hot, Warm and Cold Disaster Recovery?

December 17th, 2011

When you implement your business continuity plan, what strategy do you adopt for the disaster recovery element? (For a description of the difference between disaster recovery and business continuity, please see my article "Disaster Recovery or Business Continuity?")

You may have heard the terms hot, cold and warm recovery, but what do they mean, and what are the advantages and disadvantages of each service?

  • Hot Standby

Hot standby is normally available to the users within minutes of a disaster situation. This level of service is achieved by total duplication of the computer systems covered (hardware, software and data). There will also be a requirement for a resilient network connection into the Hot Site.

Benefits – Available immediately; dedicated to a single customer.

Disadvantages – Cost; complexity; management overhead.

  • Warm Standby

Warm standby is normally available to the users within hours of a disaster situation. This is by far the most common type of service used for I.T. disaster recovery, and typical recovery times range from 8 to 24 hours (dependent on complexity, location and data volumes).

The service can be delivered from a remote recovery centre, or alternatively, delivered to site in the event of a disaster. Depending on the equipment involved the configuration may be installed within an existing facility or a mobile recovery unit.

Whilst the Hot standby option is normally dedicated to one customer, Warm standby is delivered on a subscription basis. Industry norms are between ten and twenty-five subscribers per configuration, so availability is not guaranteed in the event of a disaster. Testing is also normally limited to a predefined number of days per annum.

Benefits – Lower cost; reasonable availability.

Disadvantages – Availability is not guaranteed; recovery timescales are longer; limited testing available; only available for a limited period following a disaster.

  • Cold Standby

Cold standby is the provision of computer and people facilities that are made available to the client within a few hours of the incident. Unless the service is backed up by a contract to supply the necessary computer equipment, the recovery period is likely to be several days. It is not unusual for Warm and Cold standby services to be combined, giving a very flexible approach to recovery.

Fully serviced office space is also available on a subscription basis. These are usually equipped with PCs, servers, printing facility and a network infrastructure. These would be described as Business Recovery Centres, and could also incorporate Cold space for central systems.

Benefits – Lower cost; large amount of available space (can accommodate large systems). Business Recovery Centres can accommodate several hundred people.

Disadvantages – Availability; recovery timescales are longer; limited testing available; only available for a limited period following a disaster; additional recovery services needed.
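As a rough illustration of how the three options compare, the sketch below picks the slowest standby tier that still recovers within a target time. The target (often called a recovery time objective) is a term I am introducing here, and the figures of 0.5 hours for hot and 72 hours for cold are my own assumptions standing in for "within minutes" and "several days"; only the 24-hour warm figure comes from the ranges quoted above. It also assumes that slower tiers generally cost less, which you should confirm with your own supplier quotes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StandbyTier:
    name: str
    recovery_hours: float  # worst-case recovery time used for comparison
    dedicated: bool        # True if the configuration serves one customer


# Illustrative figures only (see the assumptions stated in the text).
TIERS = [
    StandbyTier("hot", 0.5, True),     # "within minutes"; 0.5 h is assumed
    StandbyTier("warm", 24.0, False),  # upper end of the 8 to 24 hour range
    StandbyTier("cold", 72.0, False),  # "several days"; 72 h is assumed
]


def slowest_tier_within(target_hours: float) -> Optional[StandbyTier]:
    """Return the slowest tier that still recovers within target_hours.

    Returns None when no tier is fast enough.
    """
    candidates = [t for t in TIERS if t.recovery_hours <= target_hours]
    return max(candidates, key=lambda t: t.recovery_hours, default=None)


if __name__ == "__main__":
    for target in (4, 24, 100):
        tier = slowest_tier_within(target)
        print(target, "hours ->", tier.name if tier else "no tier fits")
```

A real decision would also weigh cost, testing allowances and whether the configuration is shared with other subscribers.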

Disaster Recovery Invocation Procedures

December 16th, 2011

The following procedure illustrates, at a high level, the first 24 hours following disaster invocation. This procedure is based on a “warm” recovery service.

Following a disaster, clearly defined steps/actions need to be taken to enable business continuity. During the first 24 hours these steps will fall into the following categories.

Initial Assessment

Timescales – Immediately (T + 0)

Following a disaster situation, the first step is to assess the current situation. This will be carried out by the Disaster Co-ordinator, who will decide if the Disaster Management Team needs to be assembled. The team will need access to a Disaster Command Facility if the primary location is not accessible for any reason. The Disaster Management Team and Command Centre should be detailed in the Business Continuity Plan, along with relevant phone/mobile numbers and directions.

The relevant emergency services should have already been notified of the situation. The Disaster Management Team would act as the main focal point for the emergency services.

It may be necessary to make a pre-invocation call to put the Disaster Recovery service on standby, thereby reducing the response time should the service be formally invoked.

Disaster Management Meeting

Timescales – within 1 hour (T + 1 hour)

If it is necessary to call a formal Disaster meeting, this should happen within 1 hour of the event. It may not be possible to get all members of the team together in these timescales, so the essential members should be agreed upon and documented in the plan.

The Disaster Management Team’s main role would be to:

  • Define the problem
  • Define the extent of the disruption
  • Determine the likely impact on your business
  • Estimate outage length (where possible)
  • Invoke Disaster Recovery service if applicable
  • Formally set up Disaster Command centre
  • Agree team’s objectives for next three hours
  • Agree formal verbal report for senior management
  • Agree on staffing levels needed at the present time
  • Send non-essential staff home (if during office hours)
  • Contact non-essential staff at home (if out of hours)
  • Call in additional staff (if out of hours)
  • Set up next meeting for T + 4 hours

Disaster Review Meeting

Timescales – within 2 hours (T + 2 hours)

At this stage you should have a much more detailed understanding of the situation. This will enable a full written report to be produced for senior management.

The Disaster Management Team will have by this time:

  • Invoked the Disaster Recovery service (if applicable)
  • Set up a temporary Disaster Command centre
  • Mobilised essential staff members

If applicable, the warm standby (Disaster Recovery) services should be available by this time so that configuration of the standby systems can start.

Configuration of Standby Equipment

Timescales – within 2 hours of invocation (T + 4 hours)

Warm Disaster Recovery configurations are normally scheduled to be available within 2 hours of invocation. By this time the site should be ready to receive the equipment: power and communications should be enabled, and facilities for the essential staff should be available. Any additional equipment that needs to be purchased may arrive some time after this. The backup media should also have arrived on site.

Restoration of Data and Testing

Timescales – within 20 hours of invocation (T + 22 hours)

Up to 8 hours may be required to restore and test the system. Comprehensive user acceptance test (UAT) procedures should be documented in your Disaster Recovery Plan to ensure the systems are fully operational before they are announced to be live to the end user.
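A quick way to sanity-check the restore step is to estimate how long your own data would take to restore and test, and compare it with the 8-hour figure above. The sketch below is only an illustration: the function names, the data volume and the restore throughput are hypothetical inputs that you would replace with measurements from your own restore tests.

```python
def restore_and_test_hours(data_gb: float,
                           restore_gb_per_hour: float,
                           test_hours: float) -> float:
    """Estimate total hours to restore data_gb and then run acceptance tests.

    All three inputs are assumptions supplied by the planner; none of them
    come from a supplier's specification.
    """
    if restore_gb_per_hour <= 0:
        raise ValueError("restore_gb_per_hour must be positive")
    return data_gb / restore_gb_per_hour + test_hours


def fits_allowance(total_hours: float, allowance_hours: float = 8.0) -> bool:
    """Check the estimate against the allowance (8 hours in this procedure)."""
    return total_hours <= allowance_hours


if __name__ == "__main__":
    # Hypothetical example: 500 GB restored at 100 GB/hour, plus 2 hours of UAT.
    estimate = restore_and_test_hours(500, 100, 2)
    print(estimate, fits_allowance(estimate))
```

If the estimate does not fit, the options are to restore only the critical systems first, to improve restore throughput, or to agree a longer recovery window in the plan.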

Systems available to end users

Timescales – within 22 hours of invocation (T + 24 hours)

At this stage you should be able to resume some (or all) of your business activities (depending on the scope of the disaster). It is critical at this stage to plan for full business restoration. The plan should cover:

  • Interim requirements such as larger temporary accommodation
  • Refurbishment of damaged offices (if applicable)
  • Identification of new premises (if applicable)
  • Replacement of damaged equipment

A full Business Resumption plan should also be produced, detailing the transition from the standby facility to permanent offices.
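The "T + n" timescales in this procedure can be turned into clock times once the time of the disaster is known. The sketch below does this with Python's standard datetime module. The milestone names and offsets are taken from the headings in this article; the function name and the example start time are my own and purely illustrative.

```python
from datetime import datetime, timedelta

# (milestone, hours after the disaster), following the "T + n" labels above.
MILESTONES = [
    ("Initial assessment", 0),
    ("Disaster management meeting", 1),
    ("Disaster review meeting", 2),
    ("Configuration of standby equipment", 4),
    ("Restoration of data and testing", 22),
    ("Systems available to end users", 24),
]


def build_schedule(disaster_time: datetime) -> list:
    """Return (milestone, deadline) pairs measured from disaster_time."""
    return [(name, disaster_time + timedelta(hours=hours))
            for name, hours in MILESTONES]


if __name__ == "__main__":
    # Hypothetical incident time, used only to show the output format.
    for name, deadline in build_schedule(datetime(2011, 12, 16, 9, 0)):
        print(f"{deadline:%d %b %H:%M}  {name}")
```

Printing the schedule at the first Disaster Management meeting gives the team concrete deadlines rather than relative offsets.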

Disaster Planning For Small Business

November 6th, 2011

No one knows just how many small business owners lost everything in Hurricane Katrina. No one knows how many will be able to come back from disaster. But the odds are that the ones who successfully rebuild their businesses will be the ones who had a disaster plan in place before the hurricane struck. A solid small business disaster plan has three components: protecting human resources, protecting physical resources and planning for business continuity.

If you’re a sole proprietor, your plans to protect your human resources probably dovetail easily with your plans to protect your home and family. However, if you have employees, you need a more detailed plan to estimate how long employees will be unable to get to work, what your policy will be for compensation while employees are out of work after a disaster, and how you will make payroll if computer systems and banks are inaccessible.

You will also need an immediate disaster response plan to cover what you and your employees will do in an emergency and during its aftermath to protect life and limb. This plan should cover first aid, food and water storage, a company-wide meeting place and the safety precautions that you and your employees should take during and after a disaster.

How you protect your physical resources depends on whether you own or lease the building that houses your business. If your building is leased, you will need to work with your landlord to develop a solid property protection plan for the building. You and your employees will need to develop a separate plan to protect assets in the leased space, such as furniture and computers. If you own your building, consulting an architect or engineer about how the building is likely to perform in a disaster can help you plan what measures need to be taken to protect it. Your local chamber of commerce or Small Business Administration can provide you with a property protection checklist to incorporate into your disaster plan.

Having employees present in an undamaged building after a disaster won’t do any good if you don’t have the critical records you need to run your business. A business continuity plan will ensure that you have procedures in place to protect your vital paper and electronic records. A business continuity plan also needs to address issues like interruption of deliveries from upstream suppliers and estimates of your company’s ability to deliver to your customers after a disaster.