Disaster Recovery Overview

Disaster Recovery (DR) focuses on the continuity of IT operations
in the event of disaster scenarios, and is a logical subset
of the Business Continuity Planning (BCP) process. The
design of any IT infrastructure should be compatible with
the organisation's Disaster Recovery Planning
(DR Planning) procedure. As most modern businesses place
a heavy reliance upon electronic data to perform essential
tasks, it is critical that key IT systems and components
are given adequate protection from preventable loss.

Although the information here primarily focuses on critical data
storage and backup systems, the importance of individual
IT and network components (e.g. specialised transaction
servers, bespoke development platforms, access etc.) must
also be assessed and addressed.

However, regardless of business model, the most critical component
of any IT structure is the business data
it contains and processes. Without the availability of
critical business data, the business cannot operate properly,
and, in cases of total and irretrievable loss, full
business recovery is unlikely. Invariably, the loss of
critical business data carries higher financial penalties
than the loss of the systems that stored and processed
it.

DR Planning needs to address any integral flaws in the existing
data storage infrastructure that present barriers to fundamental
Business Continuity. The primary objective of any proposed
storage network is to move the business IT environment
to a more stable and reliable platform that is capable
of actively participating in DR and BCP.

DR Objectives: Getting the Basic Data Protection Strategy Right

The primary objectives of a data storage strategy are complex. At the
basic level, it needs to address capacity management and
availability, ensuring that it is devised in a manner which
guarantees flexible headroom (storage capacity will be able
to scale in order to meet increasing data growth demands),
provides a level of protection against hardware failure
(fault tolerance), and also permits capacity to be added without
disruption or downtime (transparent dynamic expansion).
Similarly, the backup sub-system must be devised in a manner
that is able to cope and scale alongside the primary storage
systems.
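
As a rough illustration of capacity headroom, the following minimal Python sketch estimates how many whole months an array can absorb data growth before it fills. The function name, the terabyte units, and the assumption of a constant compound monthly growth rate are all hypothetical choices made for this example; real growth is rarely this regular.

```python
import math


def months_until_full(capacity_tb: float, used_tb: float, monthly_growth: float) -> int:
    """Return the number of whole months before used capacity exceeds the total.

    Assumes (hypothetically) that stored data grows by a constant
    compound rate each month, e.g. monthly_growth=0.05 for 5%.
    """
    if capacity_tb <= 0 or used_tb <= 0 or monthly_growth <= 0:
        raise ValueError("capacity, usage and growth rate must all be positive")
    if used_tb >= capacity_tb:
        return 0  # no headroom left
    # used * (1 + g)^n <= capacity  =>  n <= log(capacity / used) / log(1 + g)
    return math.floor(math.log(capacity_tb / used_tb) / math.log(1 + monthly_growth))


if __name__ == "__main__":
    # Example figures are invented: a 10 TB array, half full, growing 5% a month.
    print(months_until_full(capacity_tb=10, used_tb=5, monthly_growth=0.05))
```

A figure like this only indicates when expansion must already be in place; the backup sub-system would need the same calculation so that it scales alongside primary storage.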

At the highest level, a data storage strategy needs to comply
with the rigorous demands set by disaster recovery and
business continuity planning, offering options for off-site
storage, remote replication, automated backup to increase
reliability, and restoration procedures that encompass
every scenario - from single file recoveries triggered
by user requests, to the ability to retrieve and provide
access to critical data in disaster situations where primary
storage systems are temporarily or permanently lost.

DR strategies and solutions are themselves also very complex.
To help categorise the various solutions and their
characteristics, the varying levels and their required
components can be formally defined. The model used is
a standard industry-wide tier structure that breaks down
the varying strategies into a series of seven defined
and escalating tiers. Each component of the strategy
can be implemented individually through a series of phases.
However, to ensure that equipment procurement, software
compliance, active procedures, and strategy designs are
all integrated seamlessly, an overarching organisational
data strategy is essential, and each component should be
designed to comply with it.

The first step is to focus on ensuring the basic structure
meets a certain minimum level of requirements. An example
of these could be as follows:
- A reliable internal data storage platform providing scalable,
  fault-tolerant storage systems
- A shared data network to reduce LAN congestion, and provide
  fast, reliable access to servers, users, and data backup
  services
- Automated backup copying services with off-site vaulting
- Scalability, with the system providing adequate capacity headroom
- Flexibility to provide support for future high level data
  management/disaster recovery such as remote replication,
  hotsite or multi-site data storage centres, High Availability,
  and real-time imaging and backup that provides the ability
  to roll back systems to an exact point in time.

Disaster Recovery & Integrated Backup

The storage models presented for consideration do not constitute
a DR process in themselves; however, they do provide the
base upon which initial DR techniques can be employed.

By introducing processes such as automated backup (through
LAN-Free or Serverless backup), the reliability, performance,
quality of service, and regularity of data backups can
be improved dramatically. Storage devices on the network
communicate and transfer data directly between themselves
with no server involvement or processor overhead.

The speed and quality of connection ensure quick, efficient
backups, and copies may be made at any time - even during
peak operating hours. This also allows the Recovery Point
Objective (RPO) to be reduced, so that the data you are able
to restore in the event of a disaster is more recent. The
automated backup process also allows us to initiate a direct
tape-to-tape copy of the backed-up data onto a second
single-slot drive (the existing LTO device if appropriate)
to produce a physical copy suitable for daily offsite vaulting
in a secured storage facility. This process is known as PTAM
(Pick-up Truck Access Method), as it usually involves a secure
courier service.
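
To make the link between backup frequency and RPO concrete, here is a minimal Python sketch. It assumes, as a simplification, that each backup captures the data as it stood when the backup started, so a failure just before the next backup completes loses roughly one interval plus one backup duration. The function names and example figures are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Upper bound on the age of the newest restorable data.

    Assumption: a backup reflects the data at the moment it started, and
    only completed backups are restorable. A failure just before the next
    backup completes therefore loses one interval plus one duration.
    """
    return backup_interval + backup_duration


def meets_rpo(backup_interval: timedelta, backup_duration: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


if __name__ == "__main__":
    # Invented figures: a nightly backup taking 4 hours against a 24-hour RPO...
    print(meets_rpo(timedelta(hours=24), timedelta(hours=4), timedelta(hours=24)))
    # ...versus an hourly automated backup taking 10 minutes against a 4-hour RPO.
    print(meets_rpo(timedelta(hours=1), timedelta(minutes=10), timedelta(hours=4)))
```

Under these assumptions, faster and more frequent automated backups are what allow a tighter RPO; the off-site tape copy adds its own courier delay, which this sketch does not model.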

The introduction of these backup procedures forms the basic
initial steps towards developing a DR environment by placing
critical company information on secure systems that can
be replicated, rolled back, or otherwise restored in the
event of major failures or losses.

From an IT-centric perspective, outages are classified as planned
or unplanned disruptions to operations. The list below
shows the types of outages commonly experienced in enterprise
computing environments. The majority of outages are familiar
risks that apply across all business centres,
and do not just affect the IT division. From the perspective
of a business, these risks should already be well defined
and understood.

IT Outages & Risk Assessment

An unplanned IT outage can equate to an IT disaster, depending
on the scope and severity of the problem. Many Disaster
Recovery plans focus solely on risks within the data centre.
However, it is essential to look beyond data centre
operations by implementing the BCP process in addition
to traditional IT Disaster Recovery Planning.
Beyond the immediate control of the data centre, IT operations
face a variety of risks such as:

The quality of the DR procedures in place is directly related
to the number of scenarios or disasters that the strategy
is designed to compensate for and recover from.

In-house backup and security policies should be built to handle
the majority of basic situations, such as restoration of
corrupt data, virus infection, hacking and external access,
network failure etc. DR policies will need to cover external
risks that the business cannot predict or
prevent. The possible risks do not each have to be addressed
individually.

For example, if a business has a DR process that provides
highly accessible, up-to-date copies of business critical
data at a secure remote location, and also
provides a method of automatic retrieval and restoration
of that data to redundant systems at a secondary site (a
hotsite), most building-centric disasters can be eliminated
(or at least greatly reduced) from the IT DR risk assessment.
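
The idea that one control can retire a whole class of risks can be sketched as a simple set calculation. The control names, risk names, and the mapping between them below are hypothetical examples chosen for illustration, not a definitive risk catalogue.

```python
# Hypothetical mapping of DR controls to the risks each one is assumed to cover.
MITIGATIONS = {
    "remote_replica_with_hotsite": {"fire", "flood", "power_loss", "building_access_denied"},
    "in_house_backup_policy": {"data_corruption", "virus_infection"},
}


def residual_risks(risks, controls):
    """Return the risks that none of the given controls covers."""
    covered = set()
    for control in controls:
        # Unknown controls contribute no coverage.
        covered |= MITIGATIONS.get(control, set())
    return set(risks) - covered


if __name__ == "__main__":
    assessed = {"fire", "flood", "power_loss", "hacking", "data_corruption"}
    # With only the hotsite control in place, the building-centric risks
    # drop out of the assessment and the remainder still needs attention.
    print(sorted(residual_risks(assessed, ["remote_replica_with_hotsite"])))
```

Whatever remains in the result is the shortlist that still needs its own policy, which is usually far shorter than the original list of individual risks.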

To provide a basis for testing the resilience of your DR
procedures, there are seven defined levels or tiers of
DR. These tiers are accepted as de facto standards for
DR Planning. Every
DR Plan can therefore be given a classification based
upon which DR criteria it satisfies. Classifications,
from lowest to highest DR ability, run from Tier 0 (the
lowest classification, denoting the lack of any
DR ability) to Tier 6 (the highest classification, which
provides for zero data loss in multiple disaster scenarios).

Important information on StorageWorld

StorageWorld is a reference for my clients, colleagues in the
data storage industry, and end-users (hopefully
potential clients) to provide an overview of the
range of data storage services offered. There is
a wide and diverse network of independent data
professionals in the UK providing consultancy and
engineering services on all aspects of data storage,
data network design, project management of data
system implementations, and offering vendor-independent
advice through either a direct relationship with
end clients or through third-party suppliers.

The objective of this site was to use the term 'StorageWorld'
as a name to describe an independent group of professional
storage colleagues who offer their services directly,
or through contracting, to clients, and to provide
a platform to promote and explain the type of services
we provide. It also serves as a contact point for
services currently rendered to clients, with restricted
sections that we may use as a central reference library.

Suggestions, comments, contributions, error corrections, or other
interest welcome.