RAID is short for redundant array of independent disks.
Originally, the term RAID was defined as redundant array of inexpensive disks, but now it usually refers to a redundant array of independent disks. RAID storage uses multiple disks in order to provide fault tolerance, to improve overall performance, and to increase storage capacity in a system. This is in contrast with older storage devices that used only a single disk drive to store data.
RAID allows you to store the same data redundantly (in multiple places) in a balanced way to improve overall performance. RAID disk drives are used frequently on servers but aren’t generally necessary for personal computers.
How RAID Works
With RAID technology, data can be mirrored on one or more disks in the same array, so that if one disk fails, the data is preserved. RAID can also use striping, a technique that spreads data over multiple disk drives so that more than one disk can be read or written at the same time, which improves performance.
In this arrangement, sequential data is broken into segments which are sent to the various disks in the array, speeding up throughput. A typical RAID array uses multiple disks that appear to be a single device so it can provide more storage capacity than a single disk.
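The striping idea described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "disks" are in-memory lists, chunks are dealt out round-robin, and the names `stripe`, `unstripe`, and `chunk_size` are hypothetical rather than part of any real RAID implementation.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin to disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        disks[chunk_index % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one chunk from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


# Ten bytes in two-byte chunks across three simulated disks.
disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
for number, chunks in enumerate(disks):
    print(f"disk {number}: {chunks}")
print(unstripe(disks))
```

Because consecutive chunks land on different disks, a real array can service them in parallel; the sketch shows only the placement, not the timing.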
Standard RAID Levels
RAID devices use many different architectures, called levels, depending on the desired balance between performance and fault tolerance. RAID levels describe how data is distributed across the drives. Standard RAID levels include the following:
- Level 0: Striped disk array without fault tolerance. Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails then all data in the array is lost.
- Level 1: Mirroring and duplexing. Provides disk mirroring, in which every disk holds a full copy of the data. Level 1 provides up to twice the read transaction rate of a single disk and the same write transaction rate as a single disk.
- Level 2: Error-correcting coding. Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.
- Level 3: Bit-interleaved parity. Provides byte-level striping with a dedicated parity disk. Level 3, which cannot service multiple requests simultaneously, is also rarely used.
- Level 4: Dedicated parity drive. Level 4 provides block-level striping (like Level 0) with a dedicated parity disk. If a data disk fails, the parity data is used to rebuild its contents on a replacement disk. A disadvantage of Level 4 is that the parity disk can become a write bottleneck, because every write must also update it.
- Level 5: Block-interleaved distributed parity. Provides data striping at the block level and also stripes the parity (error correction) information across all disks. This results in good read performance and good fault tolerance, since the array survives the failure of any one disk. Level 5 is one of the most popular implementations of RAID.
- Level 6: Independent data disks with double parity. Provides block-level striping with two sets of parity data distributed across all disks, so the array can survive the failure of two disks.
- Level 10: A stripe of mirrors. Not one of the original RAID levels, Level 10 creates multiple RAID 1 mirrors and then a RAID 0 stripe over them.
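The parity used by the single-parity levels above is conventionally a bytewise XOR of the data blocks in a stripe, which is what allows a lost block to be rebuilt. The following is a minimal sketch under that assumption. The helper names `xor_blocks` and `parity_disk` are hypothetical, and the rotation shown is just one possible way to distribute parity; real controllers use several different layouts.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_index: int, num_disks: int) -> int:
    """Pick which disk holds parity for a stripe (one possible rotation).

    A dedicated-parity layout (Level 4 style) would return a constant instead.
    """
    return (num_disks - 1 - stripe_index) % num_disks


# One stripe: three data blocks plus the parity computed from them.
data_blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data_blocks)

# Simulate losing the second data block, then rebuild it from the
# surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])

# Parity location for the first five stripes on a four-disk array.
print([parity_disk(s, 4) for s in range(5)])
```

XOR works here because any value XORed with itself cancels out, so XORing the survivors with the parity leaves exactly the missing block. Rotating the parity location spreads parity writes over all disks rather than concentrating them on one.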
Non-Standard RAID Levels
Some devices use more than one level in a hybrid or nested arrangement, and some vendors also offer non-standard proprietary RAID levels. Examples of non-standard RAID levels include the following:
- Level 0+1: A mirror of stripes. Not one of the original RAID levels, Level 0+1 creates two RAID 0 stripes and then a RAID 1 mirror over them. Used for both replicating and sharing data among disks.
- Level 7: A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.
- RAID 1E: A RAID 1 implementation with more than two disks. Data striping is combined with mirroring each written stripe to one of the remaining disks in the array.
- RAID S: Also called Parity RAID, this is EMC Corporation’s proprietary striped parity RAID system used in its Symmetrix storage systems.
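Level 10 (a stripe of mirrors) and Level 0+1 (a mirror of stripes) use the same disks but tolerate failures differently. The sketch below enumerates every two-disk failure on a hypothetical four-disk array and counts which ones each layout survives. The disk numbering, the grouping into two pairs, and the function names are assumptions made for illustration only.

```python
from itertools import combinations

DISKS = [0, 1, 2, 3]
GROUPS = [{0, 1}, {2, 3}]  # hypothetical layout: two groups of two disks


def raid10_survives(failed: set[int]) -> bool:
    """Groups are mirrors: data survives if every mirror keeps one disk."""
    return all(group - failed for group in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """Groups are stripes: data survives if one stripe is fully intact."""
    return any(not (group & failed) for group in GROUPS)


two_disk_failures = [set(pair) for pair in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)

print(f"RAID 10 survives {raid10_ok} of {len(two_disk_failures)} two-disk failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_disk_failures)} two-disk failures")
```

In this small model, the stripe of mirrors survives more of the double failures, because it is lost only when both disks of the same mirror fail.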
RAID History and Alternative Storage Options
Before RAID devices became popular, most systems used a single drive to store data. This arrangement is sometimes referred to as a SLED (single large expensive disk). However, SLEDs have some drawbacks. First, they can create I/O bottlenecks because the data cannot be read from the disk quickly enough to keep up with the other components in a system, particularly the processor. Second, if a SLED fails, all the data is lost unless it has been recently backed up onto another disk or tape.
In 1987, three University of California, Berkeley, researchers — David Patterson, Garth A. Gibson, and Randy Katz — first defined the term RAID in a paper titled A Case for Redundant Arrays of Inexpensive Disks (RAID). They theorized that spreading data across multiple drives could improve system performance, lower costs and reduce power consumption while avoiding the potential reliability problems inherent in using inexpensive, and less reliable, disks. The paper also described the five original RAID levels.
Today, RAID technology is nearly ubiquitous among enterprise storage devices and is also found in many high-capacity consumer storage devices. However, some non-RAID storage options do exist. One alternative is JBOD (Just a Bunch of Drives). JBOD architecture utilizes multiple disks, but each disk in the device is addressed separately. JBOD provides increased storage capacity versus a single disk, but doesn’t offer the same fault tolerance and performance benefits as RAID devices.
Another RAID alternative is concatenation or spanning. This is the practice of combining multiple disk drives so that they appear to be a single drive. Spanning makes the resulting logical drive larger than any one of its member disks; however, as with JBOD, spanning does not provide reliability or speed benefits.
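To make the contrast with striping concrete, here is a minimal sketch of how a spanned volume might translate a logical offset into a position on one of its member disks. The function name `locate` and the disk sizes are hypothetical. The point is that each disk is filled completely before the next one is used.

```python
def locate(offset: int, disk_sizes: list[int]) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, offset within that disk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        offset -= size  # skip past this disk and try the next one
    raise ValueError("offset is beyond the end of the spanned volume")


# Three hypothetical disks of different sizes, presented as one 350-byte volume.
sizes = [100, 50, 200]
for logical in (0, 99, 100, 149, 150, 349):
    print(logical, "->", locate(logical, sizes))
```

Because neighboring offsets almost always fall on the same disk, a sequential read is served by one drive at a time, which is why spanning adds capacity without adding speed.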
RAID Is Not Data Backup
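RAID should not be confused with data backup. Although some RAID levels do provide redundancy, that redundancy protects only against disk failure, not against accidental deletion, data corruption, or the loss of the entire array. For that reason, experts advise utilizing a separate storage system for backup and disaster recovery purposes.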
Setting Up a RAID Array
In order to set up a RAID array, you’ll need a group of disk drives and either a software or a hardware controller. Software RAID runs directly on a server, utilizing server resources. As a result, it may cause some applications to run more slowly. Most server operating systems include some built-in RAID management capabilities.
You can also set up your own RAID array by adding a RAID controller to a server or a desktop PC. The RAID controller runs essentially the same software, but it uses its own processor instead of the system’s CPU. Some less expensive “fake RAID” controllers provide RAID management software but don’t have a separate processor.
Alternatively, you can purchase a pre-built RAID array from a storage vendor. These appliances generally include two RAID controllers and a group of disks in their own housing.
Using a RAID array is usually no different than using any other kind of primary storage. The RAID management will be handled by the hardware or software controller and is generally invisible to the end user.
RAID Technology Standards
The Storage Networking Industry Association has established the Common RAID Disk Data Format (DDF) specification. In an effort to promote interoperability among different RAID vendors, it defines how data should be distributed across the disks in a RAID device.
Another industry group called the RAID Advisory Board worked during the 1990s to promote RAID technology, but the group is no longer active.