RAID is the term used for systems that employ multiple hard disk drives to form what the host computer sees as a single storage volume. RAID was initially introduced when larger capacity drives were particularly expensive; it used a controller with an array of multiple cheaper, smaller capacity drives to form a large volume. This gave rise to the acronym RAID, standing for Redundant Array of Inexpensive Drives.
As well as boosting overall capacity, RAID introduced the possibility of redundancy, where the use of data ‘mirroring’ or ‘parity’ processes means that the failure of an individual drive need not lead to a permanent loss of data. It also allowed the speed at which data could be written to and read from the array to be increased by ‘striping’ data across more than one drive simultaneously.
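The parity idea can be illustrated with a short Python sketch. It assumes simple XOR parity over equal-sized blocks (the scheme commonly associated with parity-based RAID levels); the helper name `xor_blocks` and the sample data are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per drive in a single stripe.
data = [b"ABCD", b"EFGH", b"IJKL"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the surviving data blocks
# with the parity block to regenerate the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, any single missing block in a stripe can be regenerated from the remaining blocks. Two missing blocks in the same stripe cannot be recovered this way, which is why multiple drive failures are so much more serious than a single one.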
The vastly reduced ‘cost per GB’ of today’s high capacity drives has meant that RAID systems are now less about the cost of overall capacity, and more about increasing performance, maintaining system availability and securing data through redundancy. This has led to the meaning of RAID now becoming accepted as a Redundant Array of Independent Drives.
A multiplicity of different RAID types has emerged, indicated by numbers, e.g. RAID 0 or RAID 5. The various types each have differing attributes aimed at increasing performance or data security (or, more commonly now, a combination of the two), and each is a compromise between these advantages and the resultant complexity and increased hardware costs. Each type has a wide range of independently configurable parameters, meaning that the overall range of possible configurations can be bewildering.
RAID system failures can stem from a range of differing causes. Hardware failures of individual drives would normally be within the scope of the system to handle, but multiple drive failures, or failures of the controller can often lead to a system ‘crash’. Even the loss of a single drive, if not responded to in the correct manner by experienced personnel, can lead to a ‘catastrophic’ failure of the entire system. This illustrates that despite the concept of RAID having great strategic benefits for storage performance and data security, these will only be achieved where the system is understood, implemented and managed correctly.
Where a RAID system has failed for whatever reason, our recovery procedure follows an established process:
o On-Site or Remote Consultation and Technical Support – The first step is to gather information about the system and its configuration, the nature and cause of the failure, and the steps necessary to limit further data loss and initiate the recovery process. Prompt and effective support at this stage can make the recovery process easier and quicker and may even be sufficient to reinstate the system without the need for further intervention.
o In-Lab Diagnosis – The components of the system will be diagnosed for individual failure, and the data may be copied to a recovery server for analysis to protect the original source. The next key stage is to ascertain the original configuration. This involves analysis to obtain such information as RAID Type, Disk Order, any Hot Spare Disks, Stripe Size, Parity Type and Rotation. Correct identification of these parameters is vital to recover data and may require the use of the latest applications and algorithms.
o RAID Reconstruction and Commission – With the parameters established, the system can be returned to its original configuration and tested to confirm the integrity of the resulting data.
o Data Retrieval and Repair – The data can now be recovered and checked with the client to confirm that a full recovery has been achieved. Arrangements at this stage will be made with the client to return the data in their preferred manner, either by recreating the original RAID system, or in any other form that suits their individual requirements.
o On-Site Data and System Restoration – To complete the total RAID recovery service, the system can be reinstalled on-site by our technicians. As well as testing the system to confirm full restoration, clients can be advised as to the correct system management processes and procedures to prevent any further instances of data loss.
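The configuration parameters named in the In-Lab Diagnosis step (disk order, stripe size, parity rotation) determine how blocks on the member drives map back to the original volume. The Python sketch below illustrates this for a toy RAID 5 array. It assumes a left-symmetric parity rotation, and the function names (`raid5_layout`, `build_array`, `reassemble`) and the 4-drive, 4-byte-stripe array are hypothetical choices made for illustration, not a description of any particular controller or recovery tool.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def raid5_layout(row, n_disks):
    """Return (parity_disk, data_disks) for one stripe row.

    Assumes left-symmetric rotation: parity moves one disk to the left
    on each row, and data starts on the disk after parity, wrapping round.
    """
    parity = n_disks - 1 - (row % n_disks)
    data_disks = [(parity + 1 + k) % n_disks for k in range(n_disks - 1)]
    return parity, data_disks

def build_array(volume, n_disks, stripe_size):
    """Distribute a volume across n_disks as data and parity blocks."""
    disks = [bytearray() for _ in range(n_disks)]
    row_bytes = stripe_size * (n_disks - 1)
    for row, start in enumerate(range(0, len(volume), row_bytes)):
        # Pad the final row with zeros so every block is full-sized.
        chunk = volume[start:start + row_bytes].ljust(row_bytes, b"\0")
        blocks = [chunk[i * stripe_size:(i + 1) * stripe_size]
                  for i in range(n_disks - 1)]
        parity, data_disks = raid5_layout(row, n_disks)
        for disk, block in zip(data_disks, blocks):
            disks[disk] += block
        disks[parity] += xor_blocks(blocks)
    return [bytes(d) for d in disks]

def reassemble(disks, stripe_size):
    """Read data blocks back in logical order, skipping parity blocks."""
    n_disks = len(disks)
    rows = len(disks[0]) // stripe_size
    out = bytearray()
    for row in range(rows):
        _, data_disks = raid5_layout(row, n_disks)
        for d in data_disks:
            out += disks[d][row * stripe_size:(row + 1) * stripe_size]
    return bytes(out)

# Toy volume: 48 bytes across 4 drives with a 4-byte stripe size.
volume = bytes(range(48))
disks = build_array(volume, n_disks=4, stripe_size=4)

# Correct disk order and stripe size recover the volume.
print(reassemble(disks, 4) == volume)

# Swapping two drives (a wrong disk order) yields scrambled data.
wrong_order = [disks[1], disks[0], disks[2], disks[3]]
print(reassemble(wrong_order, 4) == volume)
```

In this sketch only the correct combination of disk order, stripe size and rotation reproduces the original bytes, which is why identifying these parameters precedes any reconstruction. Real arrays add further variables, such as controller metadata, start offsets and hot spares, that are not modelled here.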