Parity is a mathematical "summary" of data that allows for the reconstruction of lost information without requiring a full 1:1 copy of every file.
Traditional mirroring (copying) requires 100% extra space. Parity provides protection with significantly less overhead by storing a mathematical relationship between the data on the disks instead.
"Filling in the Blanks"
Imagine three boxes: two for data (numbers) and one for Parity (the sum of the first two). For example, Disk 1 holds 10, Disk 2 holds 15, and Parity holds 10 + 15 = 25.
If Disk 1 is destroyed, the system solves the remaining equation:
[?] + 15 = 25.
The system identifies the missing value (10) and restores it to a new disk.
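The box example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how a real array stores data; the function name recover_missing is a hypothetical helper introduced here.

```python
def recover_missing(parity, surviving):
    """Solve missing + sum(surviving) = parity for the missing value."""
    return parity - sum(surviving)


disk1, disk2 = 10, 15
parity = disk1 + disk2  # 25, stored in the Parity box

# Disk 1 is destroyed: only disk2 and parity remain.
restored = recover_missing(parity, [disk2])
print(restored)  # 10
```

The same subtraction works for any number of data boxes, as long as exactly one value is missing.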
RAID 5 Logic
RAID 5 is one of the most common forms of parity protection. It allows an array to survive the loss of any single disk.
Mechanism: It uses the XOR (Exclusive OR) operation. XOR combines the bits stored at the same position on every data disk into a single parity bit. XOR-ing that parity bit with the bits from the surviving disks reproduces the bit that was on the missing disk.
Limitation: Single parity can only solve for one unknown variable. If two disks fail simultaneously, the mathematical equation becomes unsolvable (e.g., X + Y = 25 has too many possibilities).
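A minimal sketch of XOR parity, assuming each disk is represented by a short byte string of equal length. The helper name xor_blocks and the sample contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"ABCD", b"wxyz", b"1234"]  # three hypothetical data disks
parity = xor_blocks(data)           # contents of the parity disk

# The second disk is lost: XOR the parity with the survivors to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

If two disks were missing, the same XOR would only yield the combination of the two lost blocks, not either block on its own, which is the "one unknown" limit described above.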
Multiple parity is used to survive multiple simultaneous disk failures by employing more complex mathematical formulas.
Dual parity (as in RAID 6) survives 2 simultaneous failures.
Requires two independent equations: a simple XOR sum and a polynomial-based formula (Reed-Solomon) where disks are treated as coordinates on a geometric curve. Because each disk contributes differently to the second equation, the pair can be solved for two unknowns.
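The two-equation idea can be sketched as follows. This assumes a common "P + Q" construction over the finite field GF(2^8) with generator 2 and reduction polynomial 0x11D; it is one illustrative choice, not necessarily the exact scheme any particular product uses. Each disk is reduced to a single byte, and all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Multiplicative inverse: a^254 equals a^-1 for any non-zero byte."""
    return gf_pow(a, 254)


def dual_parity(data):
    """P is the plain XOR; Q weights disk i by the coefficient 2^i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, x, y, p, q):
    """Solve both equations for the two missing data bytes at x and y."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):          # fold the surviving disks out of P and Q
            pxy ^= d
            qxy ^= gf_mul(gf_pow(2, i), d)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dy = gf_mul(gf_mul(gx, pxy) ^ qxy, gf_inv(gx ^ gy))
    dx = pxy ^ dy
    return dx, dy


data = [10, 15, 200, 7]              # one byte per hypothetical data disk
p, q = dual_parity(data)

# Disks 0 and 2 fail at the same time.
print(recover_two([None, 15, None, 7], 0, 2, p, q))  # (10, 200)
```

With only P available, the two missing bytes would collapse into one unsolvable equation. Q supplies the second, independent equation because every disk carries a different coefficient.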
Higher parity levels (triple parity and beyond) survive 3 to 6 failures.
Most valuable for large arrays (20+ disks), where the statistical risk of additional failures during a rebuild is higher.
The Parity Disk must be at least as large as the largest Data Disk in the array.
Reason: Parity is calculated on a "sector-by-sector" basis. To protect the 1,000th gigabyte of a data drive, the parity drive must have a 1,000th gigabyte available to store that specific calculation. Without that space, the data is left unprotected.
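A small sketch of why the parity has to cover the largest data disk, assuming simple XOR parity and treating the space beyond the end of a shorter disk as zeros. The function name parity_for and the byte values are illustrative only.

```python
def parity_for(disks):
    """XOR parity over disks of different sizes; shorter disks count as zeros."""
    length = max(len(d) for d in disks)
    parity = bytearray(length)
    for d in disks:
        for position, byte in enumerate(d):
            parity[position] ^= byte
    return bytes(parity)


large = b"\x01\x02\x03\x04"  # hypothetical 4-"sector" disk
small = b"\x10\x20"          # hypothetical 2-"sector" disk

parity = parity_for([large, small])
print(len(parity))  # 4
```

The parity needs one position for every position on the largest disk. A parity disk with only two positions would have nowhere to store the calculation for the last two sectors of the large disk.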
Physical disk failure is obvious: the drive disappears. However, "Silent Corruption" (bit-rot) is invisible. A single bit flips from a 0 to a 1, but the disk still appears "healthy" to the computer.
"Parity can fix a hole, but it cannot find a lie."
Recall our simple math: Disk A (5) + Disk B (3) = Parity (8).
Scenario: Bit-Rot on Disk A
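The system sees that the values no longer add up, but faces a dilemma: the mismatch alone does not reveal whether Disk A, Disk B, or the Parity holds the wrong value.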
Without knowing the culprit, parity is useless. If you try to "fix" the array blindly, you might overwrite good data with bad data.
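The dilemma can be made concrete with a short sketch. It assumes, purely for illustration, that bit-rot has changed Disk A from 5 to 4; the function name candidate_repairs is hypothetical.

```python
def candidate_repairs(disk_a, disk_b, parity):
    """List every single-value change that would make a + b == parity again."""
    if disk_a + disk_b == parity:
        return []  # nothing to repair
    return [
        ("Disk A", parity - disk_b),
        ("Disk B", parity - disk_a),
        ("Parity", disk_a + disk_b),
    ]


# Disk A was 5 but now reads 4 (assumed corruption); Disk B is 3, Parity is 8.
print(candidate_repairs(4, 3, 8))
# [('Disk A', 5), ('Disk B', 4), ('Parity', 7)]
```

All three "repairs" make the equation balance again, but only the first restores the original data. Parity by itself gives no way to choose between them.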
"Digital Fingerprints"
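A checksum is a mathematical fingerprint for a specific file or block of data. It is recorded while the data is known to be good; if a fingerprint recomputed later differs from the recorded one, that data has changed.
Result: Only Disk A fails its checksum, so we know with near certainty that Disk A is the liar (an accidental checksum match is possible but extremely unlikely). We can now safely use the math (Parity 8 - Disk B 3) to recalculate the original 5 and repair the corruption.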
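Checksum-guided repair might look like the sketch below. It uses SHA-256 from Python's standard hashlib module as the fingerprint and again assumes Disk A has rotted from 5 to 4. The names checksum and find_and_repair, and the dictionary layout, are illustrative assumptions rather than any real tool's interface.

```python
import hashlib


def checksum(value):
    """Fingerprint a value with SHA-256."""
    return hashlib.sha256(str(value).encode()).hexdigest()


# Fingerprints recorded while the data was known to be good.
stored = {"A": checksum(5), "B": checksum(3)}
parity = 8


def find_and_repair(current, stored, parity):
    """Use checksums to find the bad disk, then parity to rebuild its value."""
    bad = [name for name, value in current.items()
           if checksum(value) != stored[name]]
    if len(bad) != 1:
        return current  # nothing wrong, or too many unknowns for one parity
    name = bad[0]
    others = sum(v for n, v in current.items() if n != name)
    repaired = dict(current)
    repaired[name] = parity - others
    return repaired


# Disk A now reads 4 instead of 5 (assumed corruption).
print(find_and_repair({"A": 4, "B": 3}, stored, parity))
# {'A': 5, 'B': 3}
```

The checksum answers "which disk is wrong", and the parity answers "what should it contain". Neither step can do the other's job.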
"Integrity checksums provide the Identity of the error, while Parity provides the Medicine. You need both to perform a successful surgery on your data."