Objective
Understanding the definition of RAID and the different RAID levels.
Overview
RAID (redundant array of independent disks) is a storage virtualization technology that stores data across multiple hard disk drives (HDDs) or solid-state drives (SSDs) to improve performance, protect data in case of drive failure, or both. There are several RAID levels, each with different characteristics.
RAID operates by distributing data across multiple disks so that input/output (I/O) operations can overlap in a balanced way, which increases performance. RAID levels that store redundant data also improve fault tolerance, because redundancy across multiple disks can increase the array's mean time between failures (MTBF).
A RAID array appears to the operating system as a single logical drive. RAID relies on two basic techniques: disk mirroring and disk striping. Disk mirroring copies identical data to another drive, while disk striping spreads data over multiple drives.
RAID Levels
RAID is implemented through different levels, each with its own functionality and characteristics. Each level is identified by a number so that it is easy to refer to. RAID levels fall into three categories: standard, nested, and nonstandard.
RAID 0
RAID 0 is based entirely on data striping. Data is divided into blocks, and the blocks are distributed across multiple disks. When the system reads the data, it fetches the blocks from all the disks and reassembles them. As a result, the usable capacity of a RAID 0 array is the sum of the capacities of all its disks.
Because RAID 0 provides no redundancy or fault tolerance, it is not recommended for use in critical systems.
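The striping described above can be illustrated with a minimal sketch. It assumes fixed-size blocks dealt out round-robin, which is one common layout; the function names stripe and read_back, the block size, and the sample data are hypothetical and chosen only for illustration.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list:
    """Split data into fixed-size blocks and deal them round-robin across disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Block k goes to disk k mod num_disks (assumed layout).
        disks[(offset // block_size) % num_disks].append(block)
    return disks


def read_back(disks: list) -> bytes:
    """Reassemble the data by visiting the disks in the same round-robin order."""
    out = bytearray()
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                out += disk[row]
    return bytes(out)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, block_size=2)
restored = read_back(disks)
```

Here every disk holds a different part of the data and nothing is stored twice, which is why the loss of any one disk makes the whole array unreadable.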
Advantages
- Increases read and write speed significantly.
- Uses the full capacity of every disk, with no space spent on redundancy.
- Lower cost to storage ratio.
Disadvantages
- There is no redundancy.
- No fault tolerance: if one disk fails, all data in the array is lost.
RAID 1
RAID 1 uses disk mirroring. Data is copied identically to two or more disks, so if one copy fails, another can take its place. No disk striping is involved. A RAID 1 setup therefore requires at least two hard disks.
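As a rough illustration of mirroring, the sketch below models each disk as an in-memory dictionary. The class MirroredArray and its methods are hypothetical and not part of any real RAID tool; the model also ignores rebuilding a replaced disk.

```python
class MirroredArray:
    """Toy RAID 1: every write goes to all healthy disks; a read uses any healthy disk."""

    def __init__(self, num_disks: int = 2) -> None:
        if num_disks < 2:
            raise ValueError("RAID 1 needs at least two disks")
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()

    def write(self, block_no: int, data: bytes) -> None:
        # Every healthy disk must be updated, which is what slows writes down.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def fail(self, disk_no: int) -> None:
        """Simulate a drive failure by discarding that disk's contents."""
        self.failed.add(disk_no)
        self.disks[disk_no].clear()

    def read(self, block_no: int) -> bytes:
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block_no]
        raise IOError("all mirrors have failed")


array = MirroredArray(num_disks=2)
array.write(0, b"payroll")
array.fail(0)                 # the first disk dies
recovered = array.read(0)     # the data is still served by the second disk
```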
Advantages
- Data survives a disk failure and can still be read from the remaining copy.
- Increases read performance.
Disadvantages
- Reduced write performance, because every drive must be updated each time new data is written.
- Duplicating the data takes up more storage space, which means a higher cost to storage ratio.
RAID 5
RAID 5 uses block-level striping with distributed parity information. If a drive in the array fails, the data remaining on the other drives can be combined with the parity data to rebuild the missing data. A minimum of three hard disks is required for a RAID 5 setup.
As shown in the diagram below, a RAID 5 configuration distributes the parity blocks (Ap, Bp, Cp, and Dp) among the four disks. If one disk fails, its data can be rebuilt using the data and parity information stored on the other disks.
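The rebuild step can be sketched with XOR parity, the operation commonly used for RAID 5 (the text above does not name it, so treat that as an assumption). The helper xor_blocks and the sample blocks are hypothetical, and the sketch covers a single stripe rather than rotating parity across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block (four disks in total).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second data block.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR of everything that survived yields the missing block.
rebuilt = xor_blocks(survivors)
```

Because XOR can recover only one unknown per stripe, this scheme tolerates exactly one failed disk.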
Advantages
- Cost-efficient data redundancy (lower cost to storage ratio than mirroring).
- Increased performance for read operations because of data striping.
Disadvantages
- Can only handle up to one disk failure. If more than one disk fails, then data cannot be restored.
RAID 6
RAID 6 is an enhanced version of RAID 5 that uses double parity to improve data redundancy. In a RAID 6 configuration, each stripe has two parity blocks, which are distributed across the disks in the array. Because of this, the array can still recover after up to two disk failures. A RAID 6 configuration requires at least four hard disks.
Advantages
- Improved fault tolerance and redundancy: the array survives up to two disk failures.
Disadvantages
- Slower write performance compared to RAID 5 due to additional parity data that needs to be calculated.
RAID 10 (RAID 1+0)
RAID 10 is a combination of RAID 1 and RAID 0, where data is first mirrored and then striped. It offers both the fast performance of RAID 0 and the fault tolerance of RAID 1. A RAID 10 configuration requires at least four hard disks to operate.
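A minimal sketch of this layout follows, assuming disks are grouped into two-way mirrored pairs (disks 0 and 1, disks 2 and 3, and so on) with blocks striped across the pairs. The function names raid10_layout and survives, and the block labels, are hypothetical.

```python
def raid10_layout(blocks, num_disks=4):
    """Stripe blocks across mirrored pairs; both disks of a pair get the same block."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch needs an even number of disks, at least four")
    pairs = num_disks // 2
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        pair = i % pairs                     # striping: pick the next pair
        disks[2 * pair].append(block)        # mirroring: first copy
        disks[2 * pair + 1].append(block)    # mirroring: second copy
    return disks


def survives(failed, num_disks=4):
    """The array survives as long as no mirrored pair has lost both of its disks."""
    failed = set(failed)
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_disks // 2))


layout = raid10_layout(["A1", "A2", "A3", "A4"])
```

Under this assumed pairing, losing disks 0 and 2 is survivable because each pair keeps one copy, while losing disks 0 and 1 destroys both copies of the same blocks.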
Advantages
- Fast performance, comparable to that of RAID 0.
- Provides redundancy and fault tolerance.
Disadvantages
- High cost to storage ratio because data is mirrored.
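The cost to storage trade-offs of the levels above can be compared with a small calculation. The sketch assumes all disks are the same size, that RAID 1 mirrors one disk's worth of data across every disk, and that RAID 10 uses two-way mirrors. The minimum of two disks for RAID 0 is also an assumption, since the text above does not state one. The function name usable_capacity is hypothetical.

```python
# Minimum disk counts; the RAID 0 value is an assumption, the rest come from the text.
MIN_DISKS = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level: str, num_disks: int, disk_size_tb: float) -> float:
    """Return usable capacity in TB for an array of equally sized disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return num_disks * disk_size_tb          # no redundancy
    if level == "1":
        return disk_size_tb                      # every disk holds the same data
    if level == "5":
        return (num_disks - 1) * disk_size_tb    # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size_tb    # two disks' worth of parity
    if num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return num_disks / 2 * disk_size_tb          # half the disks hold mirror copies


# Four 2 TB disks under each level.
capacities = {level: usable_capacity(level, 4, 2.0) for level in MIN_DISKS}
```

With four 2 TB disks, RAID 0 exposes all 8 TB, RAID 5 exposes 6 TB, RAID 6 and RAID 10 expose 4 TB each, and RAID 1 exposes 2 TB.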