Hotswap Functionality
The ability to quickly and easily exchange failed components in a RAID
system is vital. Most, if not all, commercial RAID systems offer
modular components in quick release modules that allow you to simply
release and remove the failed component and insert the new one.
This is a fundamental design that no serious RAID system should
be built without.
However, a RAID system has objectives beyond this type of simplicity.
The replacement of components when the system is powered down or
off-line is not enough for a mission critical storage system that
is under constant use and access. Bringing down an Internet server
on a busy web site just to exchange a power unit - no matter how
quickly it may be done - simply cannot be tolerated. As more and
more organisations rely on their IT systems and, as an obvious consequence,
their data storage, the provision of continuous and uninterrupted
access is the main motivation behind the implementation of protected
storage systems. Protected storage systems that need to be taken
off-line for the replacement of basic consumer components such as
PSUs, fans, hard drives, etc. defeat the entire purpose of their
installation.
As one of the main objectives of a RAID array is the provision of
non-stop data access, the ability of the system to allow critical
components to be exchanged without any disruption to user access is
extremely important. The two main ways in which this can be implemented
are Hot-Swap and Warm-Swap functionality. Normal powering down
of a system for maintenance (replacing memory in a server, for example)
is usually termed Cold-Swap in comparison.
Hot-swap is the ability to exchange a component with no disruption
to I/O requests and transactions and no powering down of any part of
the system except the failed component, with the new component brought
on-line and integrated as part of the array without any further action
required on the remaining working components. The most common hot-swap
functionality that the majority of higher-end RAID systems offer is the
ability to replace hard drives.
In a true hot-swap system, the failed hard drive may be released
from its bay in the enclosure without first stopping any I/O requests
to the array that the drive is a member of. All user access and
transactions must continue as normal. Once the drive is removed,
a new drive is added to the array and brought on-line by the controller.
If the drive contained part of the logical drive data, the controller
should then begin the rebuild or reconstruction process. Depending
on the controller, this may be a manual rebuild or an automatic
rebuild. At no point should any disruption occur to the
I/O processing of the array. All good professional or enterprise
level RAID systems should include hot-swap ability for as many
components as possible, including power supplies, cooling units, and
hard drives, as a fundamental part of the basic design.
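
The hot-swap sequence described above (pull the failed drive, insert a new one, rebuild, and keep serving I/O throughout) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it models a two-drive mirror in memory, and the class MirroredArray and its methods are hypothetical names chosen for illustration, not the interface of any real controller.

```python
class MirroredArray:
    """Toy two-drive mirror: every block is stored on each healthy drive."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # Each drive is a list of blocks; None marks an empty bay.
        self.drives = [[0] * num_blocks, [0] * num_blocks]
        # Number of leading blocks known to be valid on each drive.
        self.valid = [num_blocks, num_blocks]

    def write(self, block, value):
        # Writes go to every drive that is physically present.
        for drive in self.drives:
            if drive is not None:
                drive[block] = value

    def read(self, block):
        # Reads are served from any drive holding a valid copy.
        for bay, drive in enumerate(self.drives):
            if drive is not None and block < self.valid[bay]:
                return drive[block]
        raise IOError("no healthy copy of block %d" % block)

    def pull_drive(self, bay):
        # The failed drive is released from its bay; I/O is not stopped.
        self.drives[bay] = None
        self.valid[bay] = 0

    def insert_drive(self, bay):
        # A blank replacement is added; nothing on it is valid yet.
        self.drives[bay] = [0] * self.num_blocks
        self.valid[bay] = 0

    def rebuild_step(self, bay):
        """Copy one block from the surviving drive; return True when done."""
        if self.valid[bay] < self.num_blocks:
            block = self.valid[bay]
            self.drives[bay][block] = self.read(block)
            self.valid[bay] += 1
        return self.valid[bay] == self.num_blocks


array = MirroredArray(4)
for b in range(4):
    array.write(b, b * 10)

array.pull_drive(0)                          # drive 0 fails and is removed
served = [array.read(b) for b in range(4)]   # reads still served by drive 1

array.insert_drive(0)                        # replacement drive inserted
while not array.rebuild_step(0):             # rebuild one block at a time
    array.write(3, 99)                       # user I/O continues mid-rebuild
    assert array.read(1) == 10
```

The rebuild is deliberately split into single-block steps so that reads and writes can be interleaved with it, which is the property that distinguishes a hot-swap from the other swap types.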
Warm-swap is a compromise between hot-swap and cold-swap. In a typical
warm-swap of a hard drive the array may require I/O transactions
to be halted whilst the failed drive is exchanged, but the system
does not have to be powered down. This eliminates the delays incurred
by the drive spin-up, controller boot-up, and negotiation with the
host. In a warm-swap the controller simply places I/O requests on
hold until the component is ready, then resumes operation. This is
normally the only type of component exchange offered by PCI-based
RAID systems.
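
The warm-swap behaviour described above, where I/O requests are placed on hold and then resumed, can be sketched as a simple queue. This is a minimal illustration only: WarmSwapController and its method names are hypothetical, and a real controller would hold requests in firmware rather than in a Python list.

```python
from collections import deque


class WarmSwapController:
    """Toy controller that queues I/O while a component is being exchanged."""

    def __init__(self):
        self.holding = False     # True while the swap is in progress
        self.pending = deque()   # requests placed on hold
        self.completed = []      # requests that have been serviced

    def submit(self, request):
        # During a swap, requests wait; otherwise they are serviced at once.
        if self.holding:
            self.pending.append(request)
        else:
            self.completed.append(request)

    def begin_swap(self):
        # The system stays powered; only I/O processing is paused.
        self.holding = True

    def component_ready(self):
        # Resume operation and drain held requests in arrival order.
        self.holding = False
        while self.pending:
            self.completed.append(self.pending.popleft())


ctrl = WarmSwapController()
ctrl.submit("read A")
ctrl.begin_swap()
ctrl.submit("write B")
ctrl.submit("read C")
during_swap = list(ctrl.completed)   # only the pre-swap request was serviced
ctrl.component_ready()
```

Held requests are serviced in the order they arrived, so callers see a delay but no lost or reordered I/O.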
The ability to provide a hot-swap function depends on two major
components of an array: the RAID Controller and the Enclosure.
The physical handling of power and data disconnection and reconnection
without disruption must be built into the drive enclosure. Once
this is available the controller must have the ability to recognise
and use this function.
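
The dependency on both components can be expressed as a simple check. This is a sketch only: the Enclosure and Controller classes and their fields are hypothetical names used for illustration and do not correspond to any vendor's interface.

```python
from dataclasses import dataclass


@dataclass
class Enclosure:
    # True if the enclosure handles power and data disconnection and
    # reconnection without disruption (assumed flag for this sketch).
    live_disconnect: bool


@dataclass
class Controller:
    # True if the controller can recognise and use that capability
    # (assumed flag for this sketch).
    recognises_live_swap: bool


def hot_swap_available(enclosure, controller):
    """Hot-swap needs support in BOTH the enclosure and the controller."""
    return enclosure.live_disconnect and controller.recognises_live_swap


both = hot_swap_available(Enclosure(True), Controller(True))
enclosure_only = hot_swap_available(Enclosure(True), Controller(False))
controller_only = hot_swap_available(Enclosure(False), Controller(True))
```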