How RAID systems work
Here, RAID does not refer to the well-known French police unit. It is an English acronym for a storage system made up of multiple physical drives.
RAID, which stands for "Redundant Array of Independent Disks", is a form of storage virtualization. In this article, I will detail the various RAID systems commonly used. For simplicity, I will refer to RAID systems based on hard drives (SATA, SCSI, or SAS). However, it is entirely possible to set up RAID systems using SSDs or even USB flash drives.
Storage Virtualization
Storage virtualization is a computing concept that emerged in 1987. It involves combining several physical storage devices into one large virtual storage volume that uses all available drives. At the time, and throughout the 1990s, storage devices were significantly more expensive than today. The original goal of RAID was therefore to provide more storage capacity at a lower cost.
The Downside of Simple Virtualization
Initially, virtualization systems split data into small “chunks” spread across two hard drives. Because every file was distributed over both drives, reliability was very poor: if either drive failed, all data was lost.
Redundancy
A team soon began working on techniques to increase the reliability of RAID storage. The main objective was to reduce the likelihood of data loss. This is when true RAID systems began to emerge, meaning systems that tolerate the failure of one or more drives. This is what the word “Redundant” in the acronym refers to. In short, a proper RAID system can handle hardware failures without data loss.
Creating a RAID
RAID systems can be managed in three ways: through a dedicated hardware RAID controller (hardware RAID), through the computer’s motherboard (pseudo-hardware RAID), or through the operating system (software RAID).
In software RAID, the RAID configuration is managed by a software layer within the operating system (e.g., Windows). Depending on the implementation, reinstalling the OS can mean losing the RAID configuration or having to re-import the array.
In pseudo-hardware RAID, the RAID is managed by the motherboard, typically through the SATA controller, which has some additional features to create RAID arrays. It’s not a dedicated RAID controller.
In hardware RAID, the system is managed by a dedicated controller card, often with its own processor — similar to a graphics card but designed for RAID management. This is the best but also the most expensive option.
Different RAID Systems
There are several types of RAID, and this article covers JBOD, RAID 0, RAID 1, RAID 5, and RAID 6. As you will see, some of these are not actually redundant. They should be used cautiously and only in specific cases.
Block Size
This concept, expressed in kilobytes, is crucial to understanding RAID. It refers to how data is "sliced" into blocks. For example, in a RAID system made of two drives with a 64 KB block size, the first 64 KB block is stored on the first drive, the next 64 KB on the second, and so on. A 192 KB file would be split into 3 blocks written alternately across both drives: two on the first drive and one on the second. If one drive fails, you lose at least one of the file's blocks and thus the entire file.
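To make the block distribution concrete, here is a minimal sketch of the round-robin placement described above. The function name `stripe_layout` and its parameters are hypothetical, chosen for illustration; real controllers work on disk sectors rather than files, so treat this as a simplified model.

```python
def stripe_layout(file_size_kb, block_size_kb=64, num_drives=2):
    """Return (block_index, drive_index) pairs for a file striped across drives."""
    # Ceiling division: a partial final block still occupies one block.
    num_blocks = -(-file_size_kb // block_size_kb)
    # Blocks are dealt out in turn: block 0 -> drive 0, block 1 -> drive 1, ...
    return [(block, block % num_drives) for block in range(num_blocks)]


# The 192 KB example from the text: 3 blocks of 64 KB on 2 drives.
for block, drive in stripe_layout(192):
    print(f"block {block} -> drive {drive}")
```

With the article's numbers, blocks 0 and 2 land on the first drive and block 1 on the second, so losing either drive leaves the file incomplete.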
JBOD
JBOD, which stands for "Just a Bunch Of Disks", is not one of the redundant RAID systems. However, it can be found in some storage systems, so I decided to include it here. JBOD is essentially a virtual stacking of drives, regardless of their size. When the first drive is full, the second begins to fill, and so on. If one of the drives fails, you only lose the data stored on that particular drive. JBOD is simple but offers no redundancy or speed benefits, unlike the RAID systems we'll explore next.
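The "stacking" behaviour can be sketched as a lookup from a position in the virtual volume to a physical drive. This is only an illustrative model under my own assumptions: the function name `jbod_locate` and the drive sizes are hypothetical, and offsets are expressed in GB for readability.

```python
def jbod_locate(offset_gb, drive_sizes_gb):
    """Map a logical offset to (drive_index, offset_within_drive) in a JBOD span."""
    if offset_gb < 0:
        raise ValueError("offset must be non-negative")
    start = 0  # logical offset at which the current drive begins
    for index, size in enumerate(drive_sizes_gb):
        if offset_gb < start + size:
            return index, offset_gb - start
        start += size
    raise ValueError("offset is beyond the end of the volume")


sizes = [500, 1000, 2000]  # drives of different sizes are fine in JBOD
print(jbod_locate(400, sizes))   # still on the first drive
print(jbod_locate(600, sizes))   # spilled over onto the second drive
```

Because each logical position maps to exactly one drive, a failed drive only takes with it the range of the volume that it was holding.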
RAID 0
RAID 0 is not a redundant system and doesn't truly deserve the RAID label. A RAID 0 storage volume is made up of multiple drives, at least two. Data is distributed across these drives using the block concept explained earlier. This type of RAID is fast because the system reads from and writes to all drives simultaneously. In the ideal case, with two drives in a RAID 0 setup, writing or reading a large file takes roughly half the time it would on a single drive. When using drives of equal capacity, the total storage in RAID 0 equals the sum of all the drives, so no storage is lost. The main drawback, as mentioned before, is that the failure of any one drive results in the loss of all data, since files are split into blocks across all drives.
RAID 1
RAID 1 is essentially a mirror. The same data is written to every drive in the RAID 1 array, typically two. If one drive fails, the other remains as a complete copy. Since every write goes in full to each drive, there is no write-speed advantage compared to a single hard drive. Also, with two drives in RAID 1, you effectively lose 50% of total capacity due to mirroring. It is the most secure system, but also the least efficient in terms of capacity, cost, and speed.
Parity
As we’ve seen, JBOD and RAID 0 offer no data protection. RAID 1, while secure, sacrifices 50% of total capacity when using two drives. To overcome these drawbacks, it’s possible to use RAID systems consisting of at least three drives and incorporating parity.
Parity is the result of a simple XOR calculation over the data blocks stored on the other drives. You can think of it as the result of an equation (e.g., 5 + 8 = 13, where 13 plays the role of the parity). For each row of blocks written across the drives, the parity block is stored on one of the drives, and that drive “rotates” from one row to the next according to specific algorithms.
Thanks to parity, if one drive fails, the missing blocks can be recalculated using the data from the other drives and the parity — similar to solving an equation with one unknown (e.g., 5 + X = 13, X = 8).
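The equation analogy carries over directly to XOR. Below is a minimal sketch, assuming two data blocks of equal length; the helper name `xor_blocks` and the sample data are hypothetical. It computes a parity block and then rebuilds a "lost" block from the survivor and the parity.

```python
def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


data_a = b"RAID"  # block stored on drive 1
data_b = b"demo"  # block stored on drive 2
parity = xor_blocks([data_a, data_b])  # block stored on drive 3

# Simulate losing drive 2: XOR the surviving block with the parity.
recovered = xor_blocks([data_a, parity])
print(recovered == data_b)  # True

# The numbers from the text happen to work with XOR as well.
print(5 ^ 8)  # 13
```

This works because XOR is its own inverse: applying it twice with the same value gives back the original, which is what makes "solving for the unknown" possible.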
RAID 5
RAID 5 combines the block distribution of RAID 0 with parity. It requires a minimum of 3 drives, ideally of the same capacity. Total usable storage is equal to ("number of drives" – 1) × "smallest drive size". For example, with 3 × 1000 GB drives, usable capacity is 2000 GB. With 16 × 1000 GB drives, you get 15,000 GB of usable space. This RAID type addresses two major limitations of RAID 1, storage capacity and speed, as data blocks can be written to all drives simultaneously. RAID 5 offers a compelling balance of performance and security, which is why it is one of the most commonly used RAID levels in servers and NAS devices. However, it can only tolerate the failure of a single drive. If a second drive fails before the first has been replaced and rebuilt, all data is lost.
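To visualise how data and parity share the drives, here is a small sketch that prints one possible RAID 5 layout. The rotation used (parity starting on the last drive and moving one drive to the left on each row) is an assumption for illustration; actual controllers and software implementations use their own layouts. The function name `raid5_layout` is hypothetical.

```python
def raid5_layout(num_rows, num_drives=3):
    """Return one list per row of blocks, showing what each drive holds."""
    rows = []
    next_block = 0
    for row_index in range(num_rows):
        # Assumed rotation: parity starts on the last drive and moves left.
        parity_drive = (num_drives - 1 - row_index) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{next_block}")
                next_block += 1
        rows.append(row)
    return rows


for row in raid5_layout(3):
    print(" ".join(row))
```

Each row holds one parity block and ("number of drives" – 1) data blocks, which is where the usable-capacity formula comes from.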
RAID 6
RAID 5 only tolerates the failure of one drive, which isn't always sufficient, especially as the number of drives grows. At the same time, with many drives in an array, the impact of parity on total capacity is minimal. For instance, with 16 × 1000 GB drives in RAID 5, you get 15,000 GB of usable space, losing just 1000 GB to parity. To increase resilience, RAID 6 builds upon RAID 5 by adding a second layer of parity, generally calculated using Reed-Solomon coding. The two parities, XOR and Reed-Solomon, are stored across the drives using a precise algorithm. Total usable storage is ("number of drives" – 2) × "smallest drive size". For 16 × 1000 GB drives, you get 14,000 GB of usable space. RAID 6 can therefore tolerate the failure of two drives.
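The capacity figures quoted for the different levels can be checked with a short calculation. This is a sketch under a few assumptions of mine: the function name `usable_capacity_gb` and the level labels are hypothetical, RAID 0 with unequal drives is assumed to use only the smallest drive's size on each drive, and the four-drive minimum for RAID 6 is the commonly cited figure rather than something stated in this article.

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Estimate usable capacity in GB for the levels covered in this article."""
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == "JBOD":
        return sum(drive_sizes_gb)   # drives are simply stacked
    if level == "RAID0":
        return n * smallest          # assumption: limited by the smallest drive
    if level == "RAID1":
        return smallest              # every drive holds the same copy
    if level == "RAID5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n - 1) * smallest    # one drive's worth of parity
    if level == "RAID6":
        if n < 4:                    # assumption: commonly cited minimum
            raise ValueError("RAID 6 needs at least 4 drives")
        return (n - 2) * smallest    # two drives' worth of parity
    raise ValueError(f"unknown level: {level}")


drives = [1000] * 16
for level in ("RAID0", "RAID5", "RAID6"):
    print(level, usable_capacity_gb(level, drives), "GB")
```

For sixteen 1000 GB drives this gives 16,000 GB, 15,000 GB, and 14,000 GB respectively, matching the figures in the text for RAID 5 and RAID 6.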
Conclusion
This brings us to the end of this introductory article on RAID systems. As you've seen, I cannot recommend using RAID 0, a configuration found in some high-capacity external drives, because a single drive failure destroys the whole volume and the risk of data loss is therefore extremely high. Prefer a system like RAID 5, which offers both high storage capacity and faster speeds than RAID 1 mirroring.