RAID-10: Production-Grade Setup Guide

⚠️ CRITICAL WARNING
THIS TUTORIAL DESTROYS DATA ON SPECIFIED DISKS
Use virtual disks or dedicated physical disks only
Never use system disks
Verify with lsblk before every command
Test in VM first
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
Understanding RAID-10: Two Different Approaches
What Does "RAID-10" Actually Mean?
RAID-10 literally means RAID 1+0: mirror first, then stripe.
Traditional RAID-1+0 (Nested):
Step 1: Create RAID-1 mirrors → Step 2: Stripe those mirrors
mdadm --level=10 (Single Array):
Step 1: Create one array → Step 2: mdadm handles mirroring via layout
The two are built and managed differently, even though a 4-disk near=2 array ends up with the same pairing and failure tolerance as two striped mirrors. Here is how each one works:
Method 1: mdadm --level=10 (What This Tutorial Uses)
Single array with automatic mirror placement.
How Mirroring Actually Works with near=2
With 4 disks and near=2 layout:
┌─────────┬─────────┬─────────┬─────────┐
│ Disk 0  │ Disk 1  │ Disk 2  │ Disk 3  │
│ (sda1)  │ (sdb1)  │ (sdc1)  │ (sdd1)  │
├─────────┼─────────┼─────────┼─────────┤
│   A1    │   A1    │   B1    │   B1    │  ← Mirrors horizontally (0↔1, 2↔3)
│   A2    │   A2    │   B2    │   B2    │
│   A3    │   A3    │   B3    │   B3    │
│   A4    │   A4    │   B4    │   B4    │
└─────────┴─────────┴─────────┴─────────┘
How mdadm decides mirror pairs:
Mirror Pair 1: Disk 0 ↔ Disk 1 (/dev/sda1 ↔ /dev/sdb1)
Mirror Pair 2: Disk 2 ↔ Disk 3 (/dev/sdc1 ↔ /dev/sdd1)
Explanation:
near=2 means each data block is stored twice ("2 copies")
mdadm places the two copies on adjacent disk positions (0↔1, 2↔3)
The "A" chunks are mirrored on positions 0 and 1, the "B" chunks on positions 2 and 3
Consecutive chunks alternate between the two pairs, so both pairs share the I/O load
Understanding set-A and set-B Labels:
When you see this in mdadm --detail:
Number Major Minor RaidDevice State
0 8 1 0 active sync set-A /dev/sda1
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
What set-A and set-B actually mean:
set-A = First position in each mirror pair (positions 0 and 2)
set-B = Second position in each mirror pair (positions 1 and 3)
Each set holds one complete copy of the data, so the labels tell you which copy a disk belongs to, NOT which disk is its mirror partner
The actual mirrors are: 0↔1 and 2↔3
Failure Tolerance:
✅ Can survive:
Loss of Disk 0 OR Disk 1 (not both) → Mirror Pair 1 survives
Loss of Disk 2 OR Disk 3 (not both) → Mirror Pair 2 survives
Loss of Disk 0 AND Disk 2 → ✅ Both pairs still have one disk
Loss of Disk 0 AND Disk 3 → ✅ Both pairs still have one disk
Loss of Disk 1 AND Disk 2 → ✅ Both pairs still have one disk
Loss of Disk 1 AND Disk 3 → ✅ Both pairs still have one disk
❌ Will fail:
Loss of Disk 0 AND Disk 1 → ❌ Mirror Pair 1 completely lost
Loss of Disk 2 AND Disk 3 → ❌ Mirror Pair 2 completely lost
Summary: You can lose up to 2 disks, provided they come from different mirror pairs (one from 0↔1 and one from 2↔3). That covers 4 of the 6 possible two-disk failures.
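The pairing rule can be checked mechanically. The sketch below is only an illustration: it assumes the near=2 pairing described above (position 0 with 1, position 2 with 3), and the function names survives_pair and list_two_disk_failures are made up for this example.

```shell
#!/bin/sh
# Sketch only. Assumption: positions 0<->1 and 2<->3 are the mirror pairs.

# survives_pair A B: succeeds if the array survives losing positions A and B,
# i.e. the two positions belong to different mirror pairs.
survives_pair() {
    [ $(( $1 / 2 )) -ne $(( $2 / 2 )) ]
}

# Print the outcome of every distinct two-disk failure on a 4-disk array.
list_two_disk_failures() {
    for a in 0 1 2; do
        for b in 1 2 3; do
            [ "$b" -gt "$a" ] || continue
            if survives_pair "$a" "$b"; then
                echo "disks $a+$b: survives"
            else
                echo "disks $a+$b: ARRAY LOST"
            fi
        done
    done
}

list_two_disk_failures
```

Running it lists all six combinations; only 0+1 and 2+3 are reported as lost.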
Method 2: True RAID-1+0 (Nested Arrays)
User-controlled mirror pairs + explicit striping.
Step 1: Create Two RAID-1 Mirrors
┌─────────┬─────────┐      ┌─────────┬─────────┐
│  sda1   │  sdc1   │      │  sdb1   │  sdd1   │
│   A1    │   A1    │      │   B1    │   B1    │
│   A2    │   A2    │      │   B2    │   B2    │
│   A3    │   A3    │      │   B3    │   B3    │
└─────────┴─────────┘      └─────────┴─────────┘
    md1 (RAID-1)               md2 (RAID-1)
Step 2: Stripe the Mirrors
┌──────────────┐
│ md0 (RAID-0) │
│ Stripes:     │
│ md1 + md2    │
└──────────────┘
Commands (nested approach):
# Create two RAID-1 mirrors
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdc1
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdd1
# Stripe them
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2
Failure tolerance:
✅ Survives the loss of one disk from each mirror
❌ Fails if both disks of the same mirror are lost (sda1 AND sdc1, or sdb1 AND sdd1)
You chose which disks mirror each other
Lose sda1 AND sdb1? → Still works (different mirrors)
More control, more complexity
Method Comparison Table
| Feature | mdadm --level=10 | True RAID-1+0 (Nested) |
| Command Complexity | Simple (one command) | Complex (three commands) |
| Mirror Control | Set by device order (adjacent pairs 0↔1, 2↔3) | Explicit (you name each pair) |
| Best Case | Survive 2 disk failures (different pairs) | Survive 2 disk failures (different mirrors) |
| Worst Case | ❌ Fail with 2 disk losses (same pair) | ❌ Fail with 2 disk losses (same mirror) |
| Two-Disk Survival | 4 of 6 combinations (about 67%) | 4 of 6 combinations (about 67%) |
| Performance | Excellent | Excellent |
| Management | Single array | Multiple arrays |
| Use Case | Testing, general purpose, most production | Setups that need each mirror as its own named array |
Which Method Does This Tutorial Use?
This tutorial uses mdadm --level=10 because:
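✅ Simpler to learn and demonstrate
✅ Good for understanding RAID-10 concepts
✅ Sufficient for most use cases
✅ Single array management
⚠️ When to consider nested RAID-1+0 instead:
You want each mirror to be a separate, individually named array (for example, one disk per controller or enclosure in every pair)
Your tooling or runbooks expect plain RAID-1 and RAID-0 arrays
Nesting does not add fault tolerance: losing both disks of one mirror destroys either kind of array. If any two disk failures must be survivable (financial transaction databases, VM storage with strict uptime targets), 4-disk RAID-10 of either kind is not sufficient.
With same-batch disks (higher correlated failure risk), split each mirror pair across batches, whichever method you use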
Math: Usable Capacity
Raw capacity = 4 disks × 2GB = 8GB
Mirror overhead = 50% (everything duplicated)
Usable capacity = 8GB ÷ 2 = 4GB
Formula: (Total Disks ÷ 2) × Disk Size
This applies to BOTH methods: you always lose 50% to mirroring.
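The capacity formula is easy to script when sizing other arrays. This is a minimal sketch using shell integer arithmetic; usable_gb is a hypothetical helper name, and the 6 × 4000 GB case is an invented example, not part of this tutorial's setup.

```shell
#!/bin/sh
# usable_gb DISKS SIZE_GB [COPIES]
# Prints usable capacity in GB: (disks * size) / copies. COPIES defaults to 2,
# which matches the near=2 layout and classic RAID-1+0.
usable_gb() {
    echo $(( $1 * $2 / ${3:-2} ))
}

usable_gb 4 2       # the tutorial array: 4 disks of 2 GB
usable_gb 6 4000    # hypothetical larger array: 6 disks of 4000 GB
```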
Step 1: Verify Available Disks
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE
Output:
NAME SIZE TYPE MOUNTPOINT
sda 2G disk
sdb 2G disk
sdc 2G disk
sdd 2G disk
sde 2G disk
sdf 2G disk
vda 20G disk
Requirements:
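4 disks for this tutorial (mdadm's raid10 level also accepts other counts, but the pair analysis above assumes an even number)
Same size strongly recommended
Not mounted anywhere
Not your system disk!
Example for this tutorial:
/dev/sda through /dev/sdd → RAID-10 array (4 disks)
/dev/sde, /dev/sdf → Replacement disk (Step 11) and hot spare (Step 12)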
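The "not mounted anywhere" requirement can be turned into a guard that runs before any destructive command. The sketch below is an assumption about how such a guard might look: mounted_parts is a hypothetical helper that filters the raw output of lsblk -rn -o NAME,MOUNTPOINT, and the sample input stands in for a real system disk.

```shell
#!/bin/sh
# mounted_parts: reads `lsblk -rn -o NAME,MOUNTPOINT` output on stdin and
# prints the name of every entry that has a mount point (second column set).
mounted_parts() {
    awk 'NF >= 2 { print $1 }'
}

# Sample input standing in for: lsblk -rn -o NAME,MOUNTPOINT /dev/vda
printf 'vda \nvda1 /boot\nvda2 /\n' | mounted_parts

# Possible use on a real disk (prints a warning if anything is mounted):
#   [ -z "$(lsblk -rn -o NAME,MOUNTPOINT /dev/sda | mounted_parts)" ] \
#       || echo "/dev/sda is in use, do not touch it"
```

An empty result means nothing on that disk is mounted; it does not prove the disk is unused (it could still belong to LVM or another array).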
Step 2: Partition Disks Properly
Why Partitioning Matters
Don't skip this! Using raw disks (/dev/sda) instead of partitions (/dev/sda1) causes:
Boot loader conflicts
Disk identification issues
Problems with disk replacement
Create Partition on First Disk
sudo fdisk /dev/sda
Inside fdisk (type exactly):
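Command: n → Create new partition
Type: p → Primary partition
Number: 1 → Partition number 1
First sector: [ENTER] → Use default (starts at beginning)
Last sector: [ENTER] → Use default (uses entire disk)
Command: t → Change partition type
Hex code: fd → Linux RAID autodetect (legacy but works)
Command: w → Write changes and exit
Note: With the default 1.2 metadata, mdadm finds members by their superblock rather than by partition type, so type 83 (Linux) also works. fd remains the conventional label for RAID members. Avoid 8e (Linux LVM) unless the partition really holds an LVM physical volume, because the label would mislead other tools and administrators.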
Critical: Force Kernel to Update
This step prevents "partition doesn't exist" errors:
sudo partprobe /dev/sda 2>/dev/null || true
sudo sync
sleep 1
What this does:
partprobe → Tells kernel to re-read partition table
sync → Flushes disk caches
sleep 1 → Gives kernel time to process
Repeat for All Disks
# Disk sdb
sudo fdisk /dev/sdb
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdb 2>/dev/null || true
sudo sync && sleep 1
# Disk sdc
sudo fdisk /dev/sdc
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdc 2>/dev/null || true
sudo sync && sleep 1
# Disk sdd
sudo fdisk /dev/sdd
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdd 2>/dev/null || true
sudo sync && sleep 1
# Spare disk sde
sudo fdisk /dev/sde
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sde 2>/dev/null || true
sudo sync && sleep 1
# Spare disk sdf
sudo fdisk /dev/sdf
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdf 2>/dev/null || true
sudo sync && sleep 1
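The same five repetitions can be expressed as a loop that feeds the keystroke sequence to fdisk, in the same way the cleanup step later pipes "d" and "w" into it. This is a sketch under assumptions: FDISK_KEYS, partition_disk and the RUN switch are names invented here, and the keystrokes only match fdisk's prompts on blank disks (a disk with an old signature may trigger an extra question). By default it only prints what it would do.

```shell
#!/bin/sh
# Keystrokes from the walkthrough above: n, p, 1, ENTER, ENTER, t, fd, w
FDISK_KEYS='n\np\n1\n\n\nt\nfd\nw\n'

# partition_disk DEV: dry run by default; set RUN=1 to really execute.
partition_disk() {
    if [ "${RUN:-0}" = "1" ]; then
        printf "$FDISK_KEYS" | sudo fdisk "$1"
        sudo partprobe "$1" 2>/dev/null || true
        sudo sync && sleep 1
    else
        printf 'would run: printf %s | sudo fdisk %s\n' "'$FDISK_KEYS'" "$1"
    fi
}

for disk in sdb sdc sdd sde sdf; do
    partition_disk "/dev/$disk"
done
```

Check the dry-run output against lsblk first, then rerun with RUN=1 only when every listed device is one you intend to wipe.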
Verify Partitions Exist
lsblk -o NAME,SIZE,TYPE,FSTYPE
ls -la /dev/sd{a,b,c,d,e,f}1
Expected output:
root@rhel:~# lsblk -o NAME,SIZE,TYPE,FSTYPE
NAME SIZE TYPE FSTYPE
sda 2G disk
└─sda1 2G part
sdb 2G disk
└─sdb1 2G part
sdc 2G disk
└─sdc1 2G part
sdd 2G disk
└─sdd1 2G part
sde 2G disk
└─sde1 2G part
sdf 2G disk
└─sdf1 2G part
root@rhel:~# ls -la /dev/sd{a,b,c,d,e,f}1
brw-rw----. 1 root disk 8, 1 Oct 30 11:21 /dev/sda1
brw-rw----. 1 root disk 8, 17 Oct 30 11:21 /dev/sdb1
brw-rw----. 1 root disk 8, 33 Oct 30 11:21 /dev/sdc1
brw-rw----. 1 root disk 8, 49 Oct 30 11:21 /dev/sdd1
brw-rw----. 1 root disk 8, 65 Oct 30 11:21 /dev/sde1
brw-rw----. 1 root disk 8, 81 Oct 30 11:21 /dev/sdf1
root@rhel:~#
If partitions don't appear: Run partprobe and sync again.
Step 3: Install mdadm
Debian/Ubuntu:
sudo apt update
sudo apt install mdadm -y
RHEL/CentOS/Rocky/AlmaLinux:
sudo dnf install mdadm -y
Verify installation:
mdadm --version
Expected: mdadm - v4.x - ...
Step 4: Create RAID-10 Array
The Critical Command
sudo mdadm --create --verbose /dev/md0 \
--level=10 \
--raid-devices=4 \
--bitmap=internal \
/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
Understanding Each Parameter
| Parameter | Meaning | Why It Matters |
| --create /dev/md0 | Create new array named md0 | Standard naming convention |
| --level=10 | RAID-10 (striped mirrors) | Balance of speed + safety |
| --raid-devices=4 | Use exactly 4 disks | Minimum for a classic RAID-10 |
| --bitmap=internal | Track changed blocks | Avoids a full resync after a crash |
| Device order | sda1 sdb1 sdc1 sdd1 | Creates pairs: 0↔1 and 2↔3 |
Why --bitmap=internal Is Strongly Recommended
Without bitmap:
Power loss → Unclean shutdown → md doesn't know which blocks changed
Result: Full array resync (hours or days on large disks)
With bitmap:
Power loss → md checks bitmap → Only the regions marked dirty are resynced
Result: Typically minutes of recovery time
Trade-offs:
Overhead: roughly 1MB per 256GB of array size
Slight write penalty: roughly 1-3%, depending on workload (usually negligible)
In production, use --bitmap=internal unless you have measured a reason not to
What Happens Next
Prompt:
mdadm: layout defaults to n2
Continue creating array?
Type: y then press ENTER
What n2 means:
n = "near" layout (the copies of a block sit on adjacent positions)
2 = 2 copies of each block
Result: Positions (0,1) mirror each other, (2,3) mirror each other
Monitor Initial Synchronization
watch -n 2 cat /proc/mdstat
During sync:
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
4190208 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
[===>.................] resync = 15.3% (642560/4190208) finish=1.2min speed=45000K/sec
bitmap: 1/1 pages [4KB], 65536KB chunk
What to look for:
[4/4] [UUUU] → All 4 disks active
bitmap: 1/1 pages → Write-intent bitmap working
resync = 15.3% → Initial sync in progress
Press Ctrl+C once the resync line disappears (sync finished)
Or wait automatically:
while grep -q resync /proc/mdstat; do sleep 2; done
echo "✅ Sync complete"
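For scripts that need the progress figure rather than a simple wait, the percentage can be pulled out of the /proc/mdstat text. A minimal sketch, assuming the progress line keeps the "resync = 15.3%" shape shown above; mdstat_progress is a hypothetical helper name.

```shell
#!/bin/sh
# mdstat_progress: reads /proc/mdstat-style text on stdin and prints
# "<action> <percent>" for every line reporting resync, recovery or check.
mdstat_progress() {
    sed -nE 's/.*(resync|recovery|check) *= *([0-9.]+)%.*/\1 \2/p'
}

# Sample line copied from the output above
echo '[===>.................] resync = 15.3% (642560/4190208) finish=1.2min speed=45000K/sec' \
    | mdstat_progress

# On a live system: mdstat_progress < /proc/mdstat
```

No output means no array is currently syncing.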
Verify Array Configuration
sudo mdadm --detail /dev/md0
Expected output:
root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Oct 30 11:23:43 2025
Raid Level : raid10
Array Size : 4188160 (3.99 GiB 4.29 GB)
Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Oct 30 11:23:47 2025
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Consistency Policy : bitmap
Name : rhel:0 (local to host rhel)
UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
Events : 17
Number Major Minor RaidDevice State
0 8 1 0 active sync set-A /dev/sda1
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
Understanding the Mirror Pairs
| RaidDevice Position | Disk | Mirror Partner |
| 0 | /dev/sda1 | /dev/sdb1 (Pair 1) |
| 1 | /dev/sdb1 | /dev/sda1 (Pair 1) |
| 2 | /dev/sdc1 | /dev/sdd1 (Pair 2) |
| 3 | /dev/sdd1 | /dev/sdc1 (Pair 2) |
Note: The set-A / set-B labels indicate striping positions, not mirror partners.
What matters: Position mapping (0↔1, 2↔3). That is how mirroring works in RAID-10 near=2.
Quick Mirror Pair Verification (Run Anytime)
Want to see mirror pairs instantly? Use this:
#!/bin/bash
echo "Mirror Pairs (RAID-10 near=2):"
sudo mdadm --detail /dev/md0 2>/dev/null | awk '
# Member rows: Number Major Minor RaidDevice State... Device
# Use RaidDevice ($4), not Number ($1): they differ after a disk is replaced.
/active sync/ {
    slot = $4; dev = $NF; pair = int(slot / 2)
    role = (slot % 2 == 0) ? "set-A" : "set-B"
    printf " %-8s → Pair %d (position %d, %s)\n", dev, pair, slot, role
}
' | sort -n -k6
Expected output:
Mirror Pairs (RAID-10 near=2):
 /dev/sda1 → Pair 0 (position 0, set-A)
 /dev/sdb1 → Pair 0 (position 1, set-B)
 /dev/sdc1 → Pair 1 (position 2, set-A)
 /dev/sdd1 → Pair 1 (position 3, set-B)
What this shows:
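Pair 0 = /dev/sda1 and /dev/sdb1 mirror each other
Pair 1 = /dev/sdc1 and /dev/sdd1 mirror each other
Positions 0 and 1 are one mirror group, positions 2 and 3 are another
✅ This lists the horizontal (0↔1, 2↔3) pairing that near=2 implies, not a vertical (0↔2, 1↔3) one. The script derives pairs from the position numbers; it does not read them from the disks.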
Step 5: Make Configuration Persistent
Why This Step Matters
Without saving the configuration:
The array may not come up under the name you expect after reboot (it often appears as /dev/md127 instead of /dev/md0)
You may have to reassemble it manually with mdadm --assemble
The system might not boot cleanly if it expects the array
Save Array Configuration
Debian/Ubuntu:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
RHEL/CentOS/Rocky/AlmaLinux:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
Verify it saved correctly
# Debian/Ubuntu
sudo grep md0 /etc/mdadm/mdadm.conf
# RHEL/CentOS
sudo grep md0 /etc/mdadm.conf
Expected output:
root@rhel:~# sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
ARRAY /dev/md0 metadata=1.2 UUID=eec6fc91:3e2b911f:37dd1dda:0b661777
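Because tee -a appends unconditionally, running the save command twice leaves two ARRAY lines for the same array. One way to avoid that is to append only when the UUID is not yet present. The sketch below works on a temporary file so it is safe to try; add_array_line is a hypothetical helper, and on a real system the target would be the distribution's mdadm.conf, written as root.

```shell
#!/bin/sh
# add_array_line CONF LINE: appends LINE to CONF unless CONF already
# contains an entry with the same UUID.
add_array_line() {
    uuid=$(printf '%s\n' "$2" | sed -n 's/.*UUID=\([0-9a-f:]*\).*/\1/p')
    if [ -n "$uuid" ] && grep -q "UUID=$uuid" "$1" 2>/dev/null; then
        echo "already present: $uuid"
    else
        printf '%s\n' "$2" >> "$1"
        echo "added: $uuid"
    fi
}

conf=$(mktemp)   # stand-in for /etc/mdadm.conf
line='ARRAY /dev/md0 metadata=1.2 UUID=eec6fc91:3e2b911f:37dd1dda:0b661777'
add_array_line "$conf" "$line"   # first call appends the line
add_array_line "$conf" "$line"   # second call changes nothing
```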
Update Boot System (Critical!)
Debian/Ubuntu:
sudo update-initramfs -u
RHEL/CentOS:
sudo dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
What this does:
Embeds mdadm configuration into boot image
Ensures array assembles before root filesystem mounts
Required for arrays that contain system files
Step 6: Create Optimized Filesystem
Why Alignment Matters
A misaligned filesystem can cost performance (the often-quoted 20-30% depends heavily on the workload).
Without Alignment:
Write 1MB file → Crosses chunk boundaries → Extra I/O requests → Slower
With Alignment:
Write 1MB file → Starts on a chunk boundary → Fewer I/O requests → Faster
Calculate Alignment Parameters
For RAID-10 with 512K chunks:
Filesystem block size = 4KB (4096 bytes)
RAID chunk size = 512KB (524288 bytes)
Stride = Chunk Size ÷ Block Size
= 524288 ÷ 4096
= 128 blocks
Stripe-width = Stride × Number of Data Disks
= 128 × 2
= 256 blocks
Why × 2?
RAID-10 with 4 disks stores 2 disks' worth of unique data (the other 2 hold the mirror copies), so it has 2 "data disks".
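The stride and stripe-width arithmetic above can be wrapped in a small function for other chunk sizes or disk counts. A minimal sketch; ext4_alignment is a hypothetical helper name, and it assumes the number of data disks is simply disks divided by copies.

```shell
#!/bin/sh
# ext4_alignment CHUNK_KB BLOCK_BYTES DISKS COPIES
# Prints the -E argument for mkfs.ext4.
ext4_alignment() {
    stride=$(( $1 * 1024 / $2 ))      # chunk size / filesystem block size
    data_disks=$(( $3 / $4 ))         # disks holding unique data
    echo "stride=$stride,stripe-width=$(( stride * data_disks ))"
}

ext4_alignment 512 4096 4 2   # this tutorial's array
```

For the tutorial's values this reproduces stride=128,stripe-width=256.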
Create Filesystem with Proper Alignment
sudo mkfs.ext4 \
-L SPEED_RAID10 \
-b 4096 \
-E stride=128,stripe-width=256 \
/dev/md0
Parameter breakdown:
| Parameter | Value | Purpose |
| -L SPEED_RAID10 | Label | Easy to identify in df and mount |
| -b 4096 | 4K blocks | Matches modern 4K sector disks |
| -E stride=128 | 128 blocks | Aligns writes to chunk boundaries |
| -E stripe-width=256 | 256 blocks | Aligns to full stripe across data disks |
Expected output:
root@rhel:~# sudo mkfs.ext4 \
-L SPEED_RAID10 \
-b 4096 \
-E stride=128,stripe-width=256 \
/dev/md0
mke2fs 1.47.1 (20-May-2024)
/dev/md0 contains a ext4 file system labelled 'SPEED_RAID10'
last mounted on /mnt/raid10 on Wed Oct 29 18:03:04 2025
Proceed anyway? (y,N) y
Discarding device blocks: done
Creating filesystem with 1047040 4k blocks and 262144 inodes
Filesystem UUID: 5979f165-f4e5-45c5-8603-f07f6810a62c
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
root@rhel:~#
Mount the Array
sudo mkdir -p /mnt/raid10
sudo mount -o noatime,nodiratime /dev/md0 /mnt/raid10
Mount options explained:
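noatime → Don't update file access times (reduces writes)
nodiratime → Don't update directory access times (already implied by noatime on Linux, so listing it is harmless but redundant)
Performance impact: up to roughly 5-15% in read-heavy workloads; measure on your own system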
Verify Mount
df -hT /mnt/raid10
mount | grep raid10
Output:
root@rhel:~# df -hT /mnt/raid10
Filesystem Type Size Used Avail Use% Mounted on
/dev/md0 ext4 3.9G 24K 3.7G 1% /mnt/raid10
root@rhel:~# mount | grep raid10
/dev/md0 on /mnt/raid10 type ext4 (rw,noatime,nodiratime,seclabel,stripe=256)
root@rhel:~#
✅ Check for stripe=256 in the mount options: it confirms alignment is active.
Step 7: Optimize Rebuild Performance
Why This Matters
The default rebuild speed cap can be too low for modern hardware:
Default: speed_limit_max = 200000 KB/s (about 200 MB/s)
Modern SSD: Can handle 500+ MB/s
Result: Rebuild can take up to 2.5× longer than necessary
During rebuild the array is vulnerable, so a faster rebuild is a safer one.
Set Permanent Rebuild Speeds
sudo tee /etc/sysctl.d/99-raid.conf > /dev/null <<EOF
# RAID rebuild speed optimization
dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 500000
EOF
Apply immediately:
sudo sysctl -p /etc/sysctl.d/99-raid.conf
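Expected output:
dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 500000
Output from the author's run, where speed_limit_min had apparently been set to 100000 (the SATA SSD value from the table below) instead of 50000:
root@rhel:~# sudo sysctl -p /etc/sysctl.d/99-raid.conf
dev.raid.speed_limit_min = 100000
dev.raid.speed_limit_max = 500000
root@rhel:~#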
Verify:
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max
Understanding the Values
| Hardware | Min (KB/s) | Max (KB/s) | Why |
| HDD | 50000 | 200000 | Avoid starving apps during rebuild |
| SATA SSD | 100000 | 500000 | Can handle full speed safely |
| NVMe SSD | 200000 | 1000000 | Only if system is mostly idle |
Explanation:
speed_limit_min = Rebuild rate md tries to maintain even when other I/O is active
speed_limit_max = Cap to prevent I/O starvation
Higher values = faster rebuild BUT less responsive system
⚠️ Don't set too high: Rebuild will starve normal I/O operations.
Note: Values like 1000000-2000000 (1-2 GB/s) shown in some examples are too aggressive for most systems.
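To see what a speed limit means in practice, the best-case rebuild time can be estimated as member size divided by rebuild speed. The sketch below is a rough lower bound only: it assumes the rebuild runs at the configured limit the whole time, rebuild_minutes is a hypothetical helper, and the 4000 GiB member size is an invented example.

```shell
#!/bin/sh
# rebuild_minutes MEMBER_GIB SPEED_KB_PER_S
# Prints the best-case rebuild time in whole minutes (integer arithmetic).
rebuild_minutes() {
    kib=$(( $1 * 1024 * 1024 ))            # member size in KiB
    secs=$(( (kib + $2 - 1) / $2 ))        # seconds, rounded up
    echo $(( (secs + 59) / 60 ))
}

rebuild_minutes 4000 200000   # hypothetical 4000 GiB member at the default cap
rebuild_minutes 4000 500000   # same member at the raised cap
```

Real rebuilds are usually slower than this estimate because the array keeps serving normal I/O.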
Step 8: Verify Array Health
sudo mdadm --detail /dev/md0
Healthy array checklist:
root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Oct 30 11:23:43 2025
Raid Level : raid10
Array Size : 4188160 (3.99 GiB 4.29 GB)
Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Oct 30 12:45:21 2025
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Consistency Policy : bitmap
Name : rhel:0 (local to host rhel)
UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
Events : 17
Number Major Minor RaidDevice State
0 8 1 0 active sync set-A /dev/sda1
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
root@rhel:~#
Check bitmap location:
cat /sys/block/md0/md/bitmap/location
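Expected: an offset such as +8 or +1024 (not none)
If it shows none: the bitmap is disabled. Add one to the existing array with sudo mdadm --grow --bitmap=internal /dev/md0 (there is no need to recreate the array).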
Step 9: Test With Data
Create Test Files
# Small text file
echo "RAID-10 Performance Test" | sudo tee /mnt/raid10/test.txt
# Large file (100MB with progress)
sudo dd if=/dev/zero of=/mnt/raid10/speedtest.dat \
bs=1M count=100 oflag=direct status=progress
What oflag=direct does:
Bypasses OS cache
Forces direct writes to disk
Shows true RAID performance
Expected output:
root@rhel:~# echo "RAID-10 Performance Test" | sudo tee /mnt/raid10/test.txt
RAID-10 Performance Test
root@rhel:~# sudo dd if=/dev/zero of=/mnt/raid10/speedtest.dat \
bs=1M count=100 oflag=direct status=progress
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.300927 s, 348 MB/s
root@rhel:~#
Verify Files
ls -lh /mnt/raid10/
cat /mnt/raid10/test.txt
Output:
root@rhel:~# ls -lh /mnt/raid10/
cat /mnt/raid10/test.txt
total 101M
drwx------. 2 root root 16K Oct 30 12:44 lost+found
-rw-r--r--. 1 root root 100M Oct 30 12:50 speedtest.dat
-rw-r--r--. 1 root root 25 Oct 30 12:49 test.txt
RAID-10 Performance Test
root@rhel:~#
Measure Performance
# Write speed
sudo dd if=/dev/zero of=/mnt/raid10/write_test \
bs=1M count=500 oflag=direct status=progress
# Read speed
sudo dd if=/mnt/raid10/write_test of=/dev/null \
bs=1M iflag=direct status=progress
Expected (RAID-10 with 4 disks):
Write: 1.5-2ร single disk speed
Read: 2-3ร single disk speed
Cleanup
sudo rm /mnt/raid10/speedtest.dat /mnt/raid10/write_test
Step 10: Simulate Disk Failure
Mark Disk as Failed
sudo mdadm --manage /dev/md0 --fail /dev/sda1
What happens:
mdadm marks disk as failed immediately
Mirror partner (/dev/sdb1) continues serving data
Array enters "degraded" state
Check Array Status
cat /proc/mdstat
Output:
root@rhel:~# cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)
4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
bitmap: 1/1 pages [4KB], 65536KB chunk
unused devices: <none>
root@rhel:~#
Indicators:
sda1[0](F) → Failed disk
[4/3] → 4 total, 3 working
[_UUU] → Position 0 failed, others OK
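The status map is convenient for scripting because each underscore marks one missing position. The following sketch reads /proc/mdstat-style text and reports which positions are down; failed_positions is a hypothetical helper, and the sample input is copied from the degraded output above.

```shell
#!/bin/sh
# failed_positions: reads /proc/mdstat-style text on stdin and prints
# "mdX: <positions>" for each array whose status map contains "_".
failed_positions() {
    awk '
    /^md[0-9]+ :/ { name = $1 }
    match($0, /\[[U_]+\]/) {
        map = substr($0, RSTART + 1, RLENGTH - 2); out = ""
        for (i = 1; i <= length(map); i++)
            if (substr(map, i, 1) == "_") out = out " " (i - 1)
        if (out != "") print name ":" out
    }'
}

# Sample copied from the degraded array above
printf '%s\n' \
    'md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)' \
    '4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]' \
    | failed_positions

# On a live system: failed_positions < /proc/mdstat
```

A healthy array ([UUUU]) produces no output.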
Verify Data Is Still Accessible
cat /mnt/raid10/test.txt
ls -la /mnt/raid10/
Output:
root@rhel:~# cat /mnt/raid10/test.txt
RAID-10 Performance Test
root@rhel:~# ls -la /mnt/raid10/
total 24
drwxr-xr-x. 3 root root 4096 Oct 30 12:51 .
drwxr-xr-x. 3 root root 20 Oct 29 18:02 ..
drwx------. 2 root root 16384 Oct 30 12:44 lost+found
-rw-r--r--. 1 root root 25 Oct 30 12:49 test.txt
root@rhel:~#
✅ Still works! Data served from mirror (/dev/sdb1).
Check Detailed Status
sudo mdadm --detail /dev/md0
Shows:
root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Oct 30 11:23:43 2025
Raid Level : raid10
Array Size : 4188160 (3.99 GiB 4.29 GB)
Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Oct 30 12:51:39 2025
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Consistency Policy : bitmap
Name : rhel:0 (local to host rhel)
UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
Events : 21
Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
0 8 1 - faulty /dev/sda1
Remove Failed Disk
sudo mdadm --manage /dev/md0 --remove /dev/sda1
Output:
mdadm: hot removed /dev/sda1 from /dev/md0
Verify removal:
sudo mdadm --detail /dev/md0 | grep State
Shows:
State : clean, degraded
Step 11: Replace Failed Disk
Add Replacement Disk
sudo mdadm --manage /dev/md0 --add /dev/sde1
Output:
mdadm: added /dev/sde1
What happens:
mdadm detects array is degraded
Automatically starts rebuilding to /dev/sde1
/dev/sde1 becomes active member after rebuild
Monitor Rebuild Progress
watch -n 2 'cat /proc/mdstat'
During rebuild:
md0 : active raid10 sde1[4] sdd1[3] sdc1[2] sdb1[1]
4190208 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
[====>................] recovery = 23.5% (986112/4190208) finish=0.8min speed=45000K/sec
bitmap: 1/1 pages [4KB], 65536KB chunk
Progress indicators:
sde1[4] → New disk (device number 4, rebuilding into position 0)
[4/3] → 4 total, 3 fully synced (rebuild in progress)
recovery = 23.5% → Current progress
finish=0.8min → Estimated time remaining
speed=45000K/sec → Current rebuild speed
Press Ctrl+C when complete.
Verify Recovery Complete
sudo mdadm --detail /dev/md0
Should show:
root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Oct 30 11:23:43 2025
Raid Level : raid10
Array Size : 4188160 (3.99 GiB 4.29 GB)
Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Oct 30 12:53:47 2025
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Consistency Policy : bitmap
Name : rhel:0 (local to host rhel)
UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
Events : 41
Number Major Minor RaidDevice State
4 8 65 0 active sync set-A /dev/sde1
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
✅ Note: /dev/sde1 took position 0 (where /dev/sda1 was).
Step 12: Add Hot Spare
What Is a Hot Spare?
Hot spare = standby disk that automatically activates on failure
Normal operation:      Disk fails:            Auto-rebuild:
┌───────┐              ┌───────┐              ┌───────┐
│ sde1  │              │ sde1  │ (failed)     │ spare │ ← activated, rebuilding
│ sdb1  │              │ sdb1  │              │ sdb1  │
│ sdc1  │              │ sdc1  │              │ sdc1  │
│ sdd1  │              │ sdd1  │              │ sdd1  │
│ spare │ (idle)       │ spare │ ← activates  └───────┘
└───────┘              └───────┘
Benefits:
✅ No downtime while waiting for a replacement disk
✅ Rebuild starts immediately (no human intervention)
✅ Array spends as little time degraded as possible
Add Spare Disk
sudo mdadm --manage /dev/md0 --add-spare /dev/sdf1
Output:
mdadm: added /dev/sdf1
Verify Spare Added
sudo mdadm --detail /dev/md0 | tail -10
Output:
root@rhel:~# sudo mdadm --detail /dev/md0 | tail -10
UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
Events : 42
Number Major Minor RaidDevice State
4 8 65 0 active sync set-A /dev/sde1
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
5 8 81 - spare /dev/sdf1
root@rhel:~#
Look for: spare in State column
Test Automatic Failover
Simulate another failure:
sudo mdadm --manage /dev/md0 --fail /dev/sde1
Check immediately:
cat /proc/mdstat
Output (progressing rapidly):
md0 : active raid10 sdf1[5] sde1[4](F) sdd1[3] sdc1[2] sdb1[1]
4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
[======>..............] recovery = 32.3% (677056/2094080) finish=0.0min speed=677056K/sec
bitmap: 0/1 pages [0KB], 65536KB chunk
What happened:
⚠️ /dev/sde1 marked as failed
/dev/sdf1 (spare) automatically activated
Rebuild started immediately (no manual intervention!)
After rebuild completes:
md0 : active raid10 sdf1[5] sde1[4](F) sdd1[3] sdc1[2] sdb1[1]
4188160 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
bitmap: 0/1 pages [0KB], 65536KB chunk
Verify Spare Activation
sudo mdadm --detail /dev/md0 | tail -10
Output:
UUID : de5845c1:f2b6a4ab:87bc4816:6ba93b9d
Events : 59
Number Major Minor RaidDevice State
5 8 81 0 active sync set-A /dev/sdf1 ← Spare became active!
1 8 17 1 active sync set-B /dev/sdb1
2 8 33 2 active sync set-A /dev/sdc1
3 8 49 3 active sync set-B /dev/sdd1
4 8 65 - faulty /dev/sde1 ← Old disk failed
Remove failed disk
sudo mdadm --manage /dev/md0 --remove /dev/sde1
✅ This is why hot spares are critical in production.
Step 13: Set Up Monitoring
Why Monitoring Matters
RAID arrays fail silently:
Disk starts having errors → No immediate notification
Second disk fails → Data loss
Bitrot corrupts data gradually → Undetected until too late
Proper monitoring prevents disasters.
In production systems, you need to know when disks fail BEFORE you lose data.
Check RAID Health Regularly
# Quick status check
cat /proc/mdstat
# Detailed health report
sudo mdadm --detail /dev/md0
# List all RAID arrays
sudo mdadm --detail --scan
Set Up Automated Daily Checks (Optional)
# Edit crontab
sudo crontab -e
# Add this line (records the array state every day at 2 AM)
0 2 * * * /usr/sbin/mdadm --detail /dev/md0 > /var/log/raid-check.log 2>&1
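A log file only helps if someone reads it, so a daily job is more useful when it turns the mdadm --detail text into a pass or fail result. The sketch below is one possible shape, not the tutorial's own tooling: array_health is a hypothetical function that reads the report on stdin, and the sample text stands in for a degraded array.

```shell
#!/bin/sh
# array_health: reads `mdadm --detail` text on stdin, prints a one-line
# verdict, and returns non-zero when the array is degraded or has failed
# devices.
array_health() {
    awk -F' : ' '
    /^ *State :/          { state = $2 }
    /^ *Failed Devices :/ { failed = $2 + 0 }
    END {
        if (state ~ /degraded|FAILED/ || failed > 0) {
            print "ALERT state=" state " failed=" failed
            exit 1
        }
        print "OK state=" state
    }'
}

# Sample text standing in for: sudo mdadm --detail /dev/md0
sample='State : clean, degraded
Failed Devices : 1'
printf '%s\n' "$sample" | array_health || echo "would send an alert here"
```

In a cron job the failure branch could call mail or logger instead of echo.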
Check Disk Health with SMART
# Install smartmontools if not already installed
sudo apt install smartmontools -y # Debian/Ubuntu
sudo dnf install smartmontools -y # RHEL/CentOS
# Check individual disk health
sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdb
sudo smartctl -a /dev/sdc
sudo smartctl -a /dev/sdd
sudo smartctl -a /dev/sde
sudo smartctl -a /dev/sdf
Look for:
SMART Health Status: OK (SAS/SCSI disks) or "SMART overall-health self-assessment test result: PASSED" (SATA disks) = Good
Reallocated_Sector_Ct = Should be 0 or very low
Current_Pending_Sector = Should be 0
Note for Virtual Machines
SMART usually doesn't work on virtual drives. For VMs, you can:
Monitor the host's physical disks (from the hypervisor, not the VM)
Use software-level checks inside the VM:
# Non-destructive read-only test (safe, shown with -s for progress)
sudo badblocks -sv /dev/sda
sudo badblocks -sv /dev/sdb
sudo badblocks -sv /dev/sdc
sudo badblocks -sv /dev/sdd
sudo badblocks -sv /dev/sde
sudo badblocks -sv /dev/sdf
Expected output (healthy disk):
Checking blocks 0 to 2097151
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
⚠️ Warning: Never use badblocks -w (write test) on production data: it's destructive.
Step 14: Configure Auto-Mount at Boot
Why This Is Critical
Without auto-mount:
Array exists but isn't usable after reboot
Applications can't access data
Manual intervention required every boot
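Get Filesystem UUID
sudo blkid -s UUID -o value /dev/md0
Example output:
a1b2c3d4-e5f6-7890-abcd-ef1234567890
Copy this UUID for the next step. It is the UUID of the ext4 filesystem on the array, not the mdadm array UUID shown by mdadm --detail.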
Add to fstab
sudo nano /etc/fstab
Add this line at the end (replace with your UUID):
UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890 /mnt/raid10 ext4 defaults,noatime,nodiratime,nofail 0 2
Understanding Each Field
| Field | Value | Purpose |
| UUID=... | Filesystem UUID from blkid | Identifies the filesystem uniquely |
| /mnt/raid10 | Mount point | Where array appears |
| ext4 | Filesystem type | Tells kernel how to read it |
| defaults,noatime,nodiratime | Mount options | Performance optimization |
| nofail | CRITICAL! | System boots even if array fails |
| 0 | Dump frequency | 0 = don't backup with dump |
| 2 | fsck order | 2 = check after root filesystem |
Understanding nofail (Critical!)
Without nofail:
Boot → Wait for RAID → RAID doesn't assemble → Boot stalls, then typically drops to emergency mode
Result: No normal boot until the entry is fixed from the emergency shell or rescue mode
With nofail:
Boot → Wait for RAID → RAID doesn't assemble → Continue booting anyway
Result: System accessible, you can fix RAID issue
✅ Always use nofail for non-root RAID arrays.
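A quick way to audit an fstab for this rule is to list every non-root, non-swap entry whose options lack nofail. The sketch below is illustrative: missing_nofail is a hypothetical helper, and the UUID=1111 and UUID=2222 lines are placeholder entries invented for the sample input.

```shell
#!/bin/sh
# missing_nofail: reads fstab text on stdin and prints the mount point of
# every non-root, non-swap entry whose options (field 4) lack "nofail".
missing_nofail() {
    awk '$1 !~ /^#/ && NF >= 4 && $2 != "/" && $3 != "swap" {
        n = split($4, opts, ","); ok = 0
        for (i = 1; i <= n; i++) if (opts[i] == "nofail") ok = 1
        if (!ok) print $2
    }'
}

# Sample fstab text; the first and last entries are placeholders
printf '%s\n' \
    'UUID=1111 / ext4 defaults 0 1' \
    'UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890 /mnt/raid10 ext4 defaults,noatime,nodiratime,nofail 0 2' \
    'UUID=2222 /data ext4 defaults 0 2' \
    | missing_nofail

# On a live system: missing_nofail < /etc/fstab
```

Here only /data is reported, because the /mnt/raid10 entry already carries nofail and the root filesystem is skipped on purpose.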
Test Auto-Mount Without Rebooting
# Unmount array
sudo umount /mnt/raid10
# Test fstab entry
sudo mount -a
# Verify it mounted
df -h | grep raid10
Output:
root@rhel:~# df -h | grep raid10
/dev/md0 3.9G 28K 3.7G 1% /mnt/raid10
root@rhel:~#
If error occurs: Check fstab syntax, verify UUID matches.
Test Reboot (Optional)
sudo reboot
After reboot:
df -h | grep raid10
cat /mnt/raid10/test.txt
Output:
/dev/md0 3.9G 28K 3.7G 1% /mnt/raid10
RAID-10 Performance Test
✅ Should work automatically.
Step 15: Complete Cleanup (Lab Only)
⚠️ WARNING: THIS DESTROYS THE ARRAY AND ALL DATA
Only do this in test/lab environments!
Stop Using the Array
# Unmount filesystem
sudo umount /mnt/raid10
# Remove from fstab
sudo sed -i '/raid10/d' /etc/fstab
Stop the Array
sudo mdadm --stop /dev/md0
Output:
mdadm: stopped /dev/md0
Erase RAID Metadata (Critical!)
Why this is necessary:
mdadm stores metadata at start of each partition
Without zeroing: Old metadata confuses new arrays
System might try to auto-assemble old array
sudo mdadm --zero-superblock /dev/sda1
sudo mdadm --zero-superblock /dev/sdb1
sudo mdadm --zero-superblock /dev/sdc1
sudo mdadm --zero-superblock /dev/sdd1
sudo mdadm --zero-superblock /dev/sde1
sudo mdadm --zero-superblock /dev/sdf1
Note: This command gives no output on success. It only outputs errors.
Remove Partitions
for disk in sda sdb sdc sdd sde sdf; do
echo -e "d\nw" | sudo fdisk /dev/$disk
sudo partprobe /dev/$disk 2>/dev/null || true
done
What this does:
d → Delete partition
w → Write changes
Repeats for all disks
Remove Array Configuration
Debian/Ubuntu:
sudo sed -i '/md0/d' /etc/mdadm/mdadm.conf
sudo update-initramfs -u
RHEL/CentOS:
sudo sed -i '/md0/d' /etc/mdadm.conf
sudo dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
Verify Complete Cleanup
# No RAID arrays
cat /proc/mdstat
# Disks are clean
lsblk -o NAME,SIZE,TYPE,FSTYPE
# No RAID metadata
sudo mdadm --examine /dev/sda 2>&1 | grep -i "no md"
Expected output:
Personalities : [raid10]
unused devices: <none>
NAME SIZE TYPE FSTYPE
sda 2G disk
sdb 2G disk
sdc 2G disk
sdd 2G disk
sde 2G disk
sdf 2G disk
Quick Reference Commands
Daily Operations
# Check array status
cat /proc/mdstat
sudo mdadm --detail /dev/md0
# Check array health
sudo mdadm --detail /dev/md0 | grep -E 'State|Active|Failed'
# View rebuild speed
cat /sys/block/md0/md/sync_speed_min
cat /sys/block/md0/md/sync_speed_max
Disk Management
# Mark disk as failed
sudo mdadm --manage /dev/md0 --fail /dev/sda1
# Remove failed disk
sudo mdadm --manage /dev/md0 --remove /dev/sda1
# Add replacement disk
sudo mdadm --manage /dev/md0 --add /dev/sde1
# Add hot spare
sudo mdadm --manage /dev/md0 --add-spare /dev/sdf1
Maintenance Commands
# Start manual scrub (integrity check)
echo check | sudo tee /sys/block/md0/md/sync_action
# Check scrub progress
cat /proc/mdstat
# View mismatch count (should be 0)
cat /sys/block/md0/md/mismatch_cnt
# Stop scrub (if needed)
echo idle | sudo tee /sys/block/md0/md/sync_action
Performance Testing
# Write speed test
sudo dd if=/dev/zero of=/mnt/raid10/write_test \
bs=1M count=1000 oflag=direct status=progress
# Read speed test
sudo dd if=/mnt/raid10/write_test of=/dev/null \
bs=1M iflag=direct status=progress
# Random I/O test (requires fio)
sudo fio --name=randwrite --ioengine=libaio --iodepth=16 \
--rw=randwrite --bs=4k --direct=1 --size=1G \
--numjobs=4 --runtime=60 --group_reporting \
--filename=/mnt/raid10/fiotest
# Cleanup
sudo rm /mnt/raid10/write_test /mnt/raid10/fiotest
Production Deployment Checklist
Before putting RAID-10 into production, verify:
Hardware
[ ] All disks are same size and model
[ ] Disks are from different manufacturing batches
[ ] SMART monitoring enabled on all disks
[ ] Hardware RAID controller (if used) configured correctly
[ ] UPS power protection in place
Configuration
[ ] Array created with --bitmap=internal
[ ] Bitmap visible in mdadm --detail output
[ ] Filesystem created with proper stride and stripe-width
[ ] Mount options include noatime,nodiratime,nofail
[ ] Rebuild speed limits configured in /etc/sysctl.d/
Persistence
[ ] Array configuration saved in /etc/mdadm/mdadm.conf
[ ] Initramfs/dracut updated with new config
[ ] fstab entry uses UUID (not /dev/md0)
[ ] fstab includes nofail option
Monitoring
[ ] Monthly scrub scheduled (/etc/cron.monthly/raid-check)
[ ] Daily health checks scheduled (crontab)
[ ] Email alerts configured (mdadm daemon or custom script)
[ ] Logging to /var/log/raid-health.log working
Redundancy
[ ] At least one hot spare added
[ ] Spare disk(s) tested (simulate failure)
[ ] Automatic failover verified
[ ] Replacement disk procedure documented
Testing
[ ] Single disk failure tested
[ ] Data verified accessible during degraded state
[ ] Rebuild process tested and timed
[ ] Hot spare activation tested
[ ] System reboot tested (auto-assembly)
[ ] Performance benchmarks recorded
Backup
[ ] RAID is NOT a backup!
[ ] Regular backups to external system configured
[ ] Backup restore procedure tested
[ ] Recovery time objective (RTO) documented
⚠️ Common Mistakes and How to Avoid Them
Mistake 1: "RAID is my backup"
Wrong:
RAID protects against: Disk failure
RAID does NOT protect against: Accidental deletion, ransomware,
corruption, fire, theft, user error
Right:
RAID = Availability (keeps system running)
Backup = Data protection (recovers from disasters)
You need BOTH!
Mistake 2: Forgetting --bitmap=internal
Impact:
Unclean shutdown → Full array resync (hours/days)
Extended vulnerability window
Poor performance during recovery
✓ Always specify --bitmap=internal when creating the array
Mistake 3: No hot spare
Without spare:
Disk fails → You get paged → Drive to datacenter → Replace disk →
Start rebuild (30 minutes to hours elapsed)
With spare:
Disk fails → Spare activates immediately → Rebuild starts (30 seconds elapsed)
Mistake 4: Skipping filesystem alignment
Performance loss: roughly 20-30% slower without proper stride/stripe-width, depending on workload
✓ Always calculate and specify alignment parameters
Mistake 5: NOT using nofail in fstab
Without nofail: On systemd systems, boot stalls and typically drops to emergency mode if the array cannot be mounted
✓ Always include nofail for non-root arrays
Mistake 6: Same-batch disks
Problem:
Disks from the same manufacturing batch tend to wear out at around the same time
Higher chance of losing both mirrors simultaneously
Solution:
Buy disks from different vendors/batches
Stagger disk purchases over time
Advanced Topics
Understanding RAID-10 Layouts
This tutorial uses near=2 (default), but mdadm supports three layouts:
Layout 1: near=2 (Default - What We Use)
Disk 0: [A1][A2][A3][A4] ─┐
Disk 1: [A1][A2][A3][A4] ─┘ Mirror pair (0↔1)
Disk 2: [B1][B2][B3][B4] ─┐
Disk 3: [B1][B2][B3][B4] ─┘ Mirror pair (2↔3)
Characteristics:
✓ Good read performance (reads are balanced across both disks in each pair)
✓ Good write performance
✓ Simple to understand
✓ Recommended for most use cases
Layout 2: far=2
Disk 0: [A1][A5] ... [A4][A8]
Disk 1: [A2][A6] ... [A1][A5]
Disk 2: [A3][A7] ... [A2][A6] ← Second copy sits in the far half of each disk,
Disk 3: [A4][A8] ... [A3][A7]   shifted by one disk
Characteristics:
✓ Best sequential read performance (data is striped across all disks, RAID-0 style)
⚠️ Slower random writes
Use case: Read-heavy workloads (media streaming)
To use:
mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=4 ...
Layout 3: offset=2
Disk 0: [A1][A4][A5][A8]
Disk 1: [A2][A1][A6][A5]
Disk 2: [A3][A2][A7][A6] ← Each stripe is followed by a copy of itself,
Disk 3: [A4][A3][A8][A7]   shifted by one disk
Characteristics:
Balance between near and far
Rarely used in practice
Recommendation: Stick with near=2 (default) unless you have specific sequential read requirements.
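For the near layout recommended above, the placement rule can be sketched in a few lines. This is a simplified model, assuming copies of a chunk are laid out consecutively across the devices, so copy k of chunk c lands on device (c × copies + k) mod n. The function name near_devices is hypothetical.

```shell
# Simplified model of near-layout placement (an illustration, not mdadm code).
#   $1 = chunk number, $2 = number of devices, $3 = copies (default 2)
near_devices() {
    local chunk=$1 ndisks=$2 copies=${3:-2}
    local k=0 out=""
    while [ "$k" -lt "$copies" ]; do
        out="$out $(( (chunk * copies + k) % ndisks ))"
        k=$((k + 1))
    done
    echo "${out# }"
}

# Show where the first four chunks land on a 4-disk array
for c in 0 1 2 3; do
    echo "chunk $c -> devices $(near_devices "$c" 4)"
done
```

With 4 devices the output alternates between devices 0 1 and devices 2 3, which matches the adjacent mirror pairs described in this guide.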
Performance Optimization
SSD-Specific Optimizations
For SSDs, add the discard option:
sudo mount -o noatime,nodiratime,discard /dev/md0 /mnt/raid10
Or in fstab:
UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime,discard,nofail 0 2
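What discard does:
✓ Enables online TRIM
✓ Tells the SSD which blocks are free as files are deleted
✓ Helps maintain long-term write performance
✓ Helps SSD longevity (a periodic fstrim run is a common alternative to the discard mount option)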
Troubleshooting Guide
Problem: Array won't assemble after reboot
Symptoms:
cat /proc/mdstat
# Shows: Personalities : [raid10]
# unused devices: <none>
Solutions:
- Check if disks are detected:
lsblk -o NAME,SIZE,TYPE,FSTYPE
# Verify sd{a,b,c,d}1 exist
- Try manual assembly:
sudo mdadm --assemble --scan --verbose
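- Check configuration:
sudo grep md0 /etc/mdadm/mdadm.conf
# Should show ARRAY /dev/md0 ...
- Assemble explicitly by naming the member devices:
sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
# Add --force only as a last resort, when the members are healthy but their metadata looks out of date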
Problem: Slow rebuild speed
Symptoms:
cat /proc/mdstat
# Shows: speed=10000K/sec (very slow)
Solutions:
- Check speed limits:
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max
- Increase limits:
echo 50000 | sudo tee /proc/sys/dev/raid/speed_limit_min
echo 500000 | sudo tee /proc/sys/dev/raid/speed_limit_max
- Check I/O load:
iostat -x 2
# If disks are busy with other I/O, rebuild will be slow
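To compare the live rebuild rate against the configured limits, the speed= field can be pulled out of /proc/mdstat. This is a minimal sketch; the function name sync_speed_kib is hypothetical, and it assumes the progress line format shown in the symptoms above (speed=NNNNNK/sec).

```shell
# Hypothetical helper: print the current resync/recovery speed in KiB/s.
# Reads /proc/mdstat-formatted text on standard input.
# Prints nothing when no resync or recovery is running.
sync_speed_kib() {
    sed -n 's/.*speed=\([0-9][0-9]*\)K\/sec.*/\1/p'
}

# Typical use on a live system:
#   sync_speed_kib < /proc/mdstat
#   cat /proc/sys/dev/raid/speed_limit_min
```

A rate that stays pinned near speed_limit_min usually means the rebuild is being throttled in favour of other I/O rather than limited by the disks.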
Problem: Mismatch count increasing
Symptoms:
cat /sys/block/md0/md/mismatch_cnt
# Shows: 42 (non-zero)
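This can indicate:
Possible bitrot (data corruption)
Failing disk
Memory errors
Bad SATA cable
Solutions:
- Run repair (this makes the copies consistent again, but md cannot tell which copy was the correct one, so check SMART data and backups as well):
echo repair | sudo tee /sys/block/md0/md/sync_action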
- Check SMART status:
sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdb
# Look for reallocated sectors, pending sectors
- Test individual disks:
sudo badblocks -sv /dev/sda1
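The mismatch check is easy to wrap for use in a cron job or monitoring script. This is a minimal sketch; the function name check_mismatch and its exit codes are assumptions, and the sysfs path defaults to the md0 array used throughout this guide.

```shell
# Hypothetical helper: warn when a mismatch_cnt file holds a non-zero value.
#   $1 = path to the counter (defaults to the md0 sysfs attribute)
# Exit status: 0 = clean, 1 = mismatches found, 2 = counter unreadable.
check_mismatch() {
    local file=${1:-/sys/block/md0/md/mismatch_cnt}
    local count
    count=$(cat "$file") || return 2
    if [ "$count" -eq 0 ]; then
        echo "OK: mismatch_cnt=0"
    else
        echo "WARNING: mismatch_cnt=$count"
        return 1
    fi
}

# Typical use after a scrub has finished:
#   check_mismatch || echo "investigate before running repair"
```

The counter is only meaningful after a check pass has completed, so run it once sync_action has returned to idle.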
Problem: Array degraded but no failed disk shown
Check detailed status:
sudo mdadm --detail /dev/md0
cat /sys/block/md*/md/sync_action
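Possible causes:
A member dropped out or was not detected at boot (listed as "removed" rather than "faulty")
Bitmap corruption
Cache coherency issues
Solution (unmount the filesystem first; this form relies on md0 being listed in mdadm.conf):
sudo mdadm --stop /dev/md0
sudo mdadm --assemble /dev/md0 --force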
Final Notes
What You've Learned
✓ Conceptual understanding:
CORRECTED: RAID-10 mirror pairs work as 0↔1 and 2↔3 with near=2
set-A and set-B indicate striping roles, not mirror partners
Failure tolerance patterns (can lose 2 disks if from different pairs)
Difference between mdadm --level=10 and true nested RAID-1+0
✓ Practical skills:
Creating production-grade RAID-10
Proper filesystem alignment
Monitoring and maintenance
Disaster recovery procedures
✓ Best practices:
Write-intent bitmaps (--bitmap=internal)
Hot spare configuration
Auto-mount with failsafe options (nofail)
Regular integrity checks
Next Steps for Production
Implement monitoring alerts:
Configure email notifications
Set up Nagios/Zabbix checks
Create runbooks for failures
Document your setup:
Hardware inventory
Disk serial numbers
Recovery procedures
Contact information
Test disaster scenarios:
Multiple disk failures
Power loss during rebuild
Full array recovery from scratch
Establish backup system:
Regular backups to external storage
Test restore procedures
Document retention policies
Key Takeaways
Remember These Critical Points:
Mirror Pairing in mdadm --level=10 with near=2:
✓ CORRECT: Adjacent pairs (0↔1, 2↔3)
✗ WRONG: Vertical pairs (0↔2, 1↔3)
The set-A/set-B labels indicate striping positions, not mirror partners
mdadm --level=10 ≠ True RAID-1+0:
mdadm version: Automatic adjacent mirrors (probabilistic failure tolerance)
Nested version: User-defined mirrors (guaranteed failure tolerance)
Choose nested for mission-critical systems
Always use --bitmap=internal:
Prevents hours-long resyncs after power loss
Only ~1MB overhead per 256GB
Mandatory for production
Filesystem alignment matters:
Calculate stride = chunk_size ÷ block_size
Calculate stripe-width = stride × number_of_data_disks
Impact: 20-30% performance difference
RAID is NOT backup:
RAID = Availability (protects against disk failure)
Backup = Data protection (protects against everything else)
Always have external backups
Always use nofail in fstab:
Without it: System won't boot if RAID fails
With it: System boots, you can fix the issue
Critical for non-root arrays
Hot spares save downtime:
Automatic failover
Immediate rebuild
Essential for 24/7 systems
Reasonable rebuild speeds:
HDDs: 100-200 MB/s
SATA SSDs: 300-500 MB/s
NVMe SSDs: 500 MB/s - 1 GB/s
Don't set too high (will starve normal I/O)
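These rates translate directly into how long the array stays degraded. A rough estimate, assuming the rebuild runs at a sustained rate and ignoring competing I/O, can be sketched as follows. The function name rebuild_minutes and the 2000 GiB member size are illustrative assumptions.

```shell
# Rough estimate only: minutes needed to copy one member.
#   $1 = member size in GiB
#   $2 = sustained rebuild speed in KiB/s (the unit used by speed_limit_*)
rebuild_minutes() {
    local size_gib=$1 speed_kib=$2
    echo $(( size_gib * 1024 * 1024 / speed_kib / 60 ))
}

# A 2000 GiB member at 150000 KiB/s (about 150 MB/s, mid-range for HDDs)
rebuild_minutes 2000 150000   # prints 233
```

At the kernel's 10 MB/s floor mentioned elsewhere in this guide, the same member would take more than two days, which is why raising speed_limit_min matters on large disks.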
Teaching the Contradictions - What Was Fixed
Contradiction #1: Mirror Pairing (MAJOR FIX)
Original guide said:
Mirror Set 1 → Disk 0 ↔ Disk 2 (vertical pairing)
Mirror Set 2 → Disk 1 ↔ Disk 3 (vertical pairing)
Reality with near=2:
Mirror Pair 1 → Disk 0 ↔ Disk 1 (horizontal/adjacent pairing)
Mirror Pair 2 → Disk 2 ↔ Disk 3 (horizontal/adjacent pairing)
How to verify yourself:
# After creating array, fail a disk and check what happens
sudo mdadm --manage /dev/md0 --fail /dev/sda1 # Fail position 0
cat /proc/mdstat
# You'll see [_UUU] - position 0 down, others working
# Data served from position 1 (sdb1), NOT position 2
Visual proof:
If 0↔2 were mirrors (wrong):
Fail sda1 → Data served from sdc1
Reality (0↔1 are mirrors):
Fail sda1 → Data served from sdb1 ✓
Contradiction #2: set-A and set-B Meaning
Original guide implied:
set-A = Mirror Set 1
set-B = Mirror Set 2
Actually means:
set-A = First position in each mirror pair (0, 2)
set-B = Second position in each mirror pair (1, 3)
These are striping labels, not mirror identifiers!
How to understand it:
mdadm --detail output:
Position 0: set-A ─┐
Position 1: set-B ─┘ These mirror each other
Position 2: set-A ─┐
Position 3: set-B ─┘ These mirror each other
set-A and set-B indicate how data is striped across the pairs,
not which disks mirror each other.
Contradiction #3: Mistake 5 (nofail)
Original guide said:
Mistake 5: Using nofail in fstab
(Then immediately contradicted itself)
Corrected:
Mistake 5: NOT using nofail in fstab
✓ Always USE nofail for non-root RAID arrays
Why this matters:
# Without nofail in fstab:
UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime 0 2
# → Boot stalls (and may drop to emergency mode) if the array fails to mount
# With nofail:
UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime,nofail 0 2
# → System boots even if array fails, you can investigate
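A quick way to audit an existing fstab for this mistake is sketched below. The function name missing_nofail is hypothetical; it assumes the standard six-field fstab layout, with the mount point in field 2 and the option list in field 4.

```shell
# Hypothetical helper: print fstab entries for a mount point that lack nofail.
#   $1 = mount point, $2 = fstab path (defaults to /etc/fstab)
missing_nofail() {
    local mnt=$1 fstab=${2:-/etc/fstab}
    awk -v mnt="$mnt" '
        /^[[:space:]]*#/ { next }                                   # skip comments
        $2 == mnt && index("," $4 ",", ",nofail,") == 0 { print }   # option list lacks nofail
    ' "$fstab"
}

# Typical use:
#   missing_nofail /mnt/raid10
```

Any line it prints is an entry that would hold up boot if the array were unavailable.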
Contradiction #4: Stripe-Width Terminology
Original guide said:
Stripe-width = Stride ร Number of Stripe Groups
Number of stripe groups = 2 for RAID-10 with 4 disks
Corrected terminology:
Stripe-width = Stride ร Number of Data Disks
Number of data disks = 2 for RAID-10 with 4 disks
(The other 2 disks hold mirrors, not unique data)
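With that definition, both values follow from simple arithmetic. The sketch below assumes a 512 KiB chunk (the mdadm default) and a 4 KiB ext4 block size; the function name ext4_alignment is hypothetical.

```shell
# Sketch: compute ext4 stride and stripe-width.
#   $1 = md chunk size in KiB (e.g. 512)
#   $2 = filesystem block size in KiB (e.g. 4)
#   $3 = number of data disks (raid-devices / 2 for near=2)
ext4_alignment() {
    local chunk_kib=$1 block_kib=$2 data_disks=$3
    local stride=$(( chunk_kib / block_kib ))
    echo "stride=${stride},stripe-width=$(( stride * data_disks ))"
}

ext4_alignment 512 4 2    # prints: stride=128,stripe-width=256
```

The output is already in the form that mkfs.ext4 accepts after -E, so it could be passed along as sudo mkfs.ext4 -E "$(ext4_alignment 512 4 2)" /dev/md0 on a system where those sizes apply.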
Why "data disks" is clearer:
RAID-10 with 4 disks: 2 store data, 2 store mirrors
Stripe-width should span all unique data
"Stripe groups" is non-standard terminology
Contradiction #5: Rebuild Speed Values
Original guide recommended:
dev.raid.speed_limit_min = 10000 (10 MB/s)
dev.raid.speed_limit_max = 500000 (500 MB/s)
But showed example output:
dev.raid.speed_limit_min = 1000000 (1000 MB/s)
dev.raid.speed_limit_max = 2000000 (2000 MB/s)
Why the example was wrong:
1-2 GB/s is too aggressive for most hardware
Will starve normal I/O operations
Only suitable for high-end NVMe arrays with no other workload
Corrected recommendation:
# General purpose (balanced)
dev.raid.speed_limit_min = 50000 (50 MB/s)
dev.raid.speed_limit_max = 500000 (500 MB/s)
# Adjust based on hardware:
- HDDs: 100000-200000
- SATA SSDs: 300000-500000
- NVMe (idle system): 500000-1000000
Contradiction #6: Partition Type (Minor)
Original guide used:
Hex code: fd (Linux RAID autodetect)
Note added:
This is legacy but still works
Modern systems can use 83 (Linux) or 8e (Linux LVM), because mdadm finds members by their superblock, not by partition type
mdadm 3.0+ doesn't require autodetect type
Not wrong, just slightly outdated
Both work fine:
# Legacy (still works)
fdisk: t → fd
# Modern (also works)
fdisk: t → 83
Quick Summary of All Fixes
| Issue | Original | Corrected |
|---|---|---|
| Mirror pairs | 0↔2, 1↔3 (vertical) | 0↔1, 2↔3 (horizontal) |
| set-A/set-B | Mirror identifiers | Striping position labels |
| Mistake 5 | "Using nofail" (contradictory) | "NOT using nofail" |
| Stripe-width term | "Stripe groups" | "Data disks" |
| Rebuild speed | Example showed 1-2 GB/s | Use 50-500 MB/s |
| Partition type | Only mentioned fd | Added note about 83/8e |
How to Verify the Corrections Yourself
Test 1: Verify Mirror Pairing
# Create array
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
--bitmap=internal /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
# Wait for the initial sync to finish, then fail position 0
sudo mdadm --wait /dev/md0
sudo mdadm --manage /dev/md0 --fail /dev/sda1
# Read straight from the array (bypassing the page cache) while watching per-disk I/O
sudo dd if=/dev/md0 of=/dev/null bs=1M count=1000 iflag=direct &
iostat -x 1 5
wait
# sda shows no reads; sdb alone serves every chunk of pair 0↔1,
# while sdc and sdd share the reads for pair 2↔3
# This is consistent with 0↔1 being mirrors, not 0↔2
Test 2: Verify Failure Tolerance
# Start fresh
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
--bitmap=internal /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
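# Case 1: Fail disks from the same pair
sudo mdadm --manage /dev/md0 --fail /dev/sda1 /dev/sdb1
cat /proc/mdstat
# Result: the pair cannot lose both members. Recent kernels typically refuse to
# mark sdb1 faulty because it holds the last copy; if both disks really die, the array FAILS
# Recreate the array, then Case 2: Fail disks from different pairs
sudo mdadm --stop /dev/md0
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
--bitmap=internal /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
sudo mdadm --manage /dev/md0 --fail /dev/sda1 /dev/sdc1
cat /proc/mdstat
# Result: Array SURVIVES (each pair still has one disk)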
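The rule behind both results can be written down as a small predicate. This sketch only models near=2 with an even number of devices, where positions 2k and 2k+1 form a mirror pair; the function name survives_near2 is hypothetical.

```shell
# Sketch: does a near=2 array survive a given set of failed positions?
#   $1 = number of devices (even), remaining args = failed positions
# Prints SURVIVES or FAILS; exit status is 1 when the array is lost.
survives_near2() {
    local ndisks=$1; shift
    local failed=" $* " p=0
    while [ "$p" -lt "$ndisks" ]; do
        case "$failed" in
            *" $p "*)
                # First member of the pair is gone: check its partner
                case "$failed" in
                    *" $((p + 1)) "*) echo "FAILS"; return 1 ;;
                esac ;;
        esac
        p=$((p + 2))
    done
    echo "SURVIVES"
}

# Typical use:
#   survives_near2 4 0 2    (sda1 and sdc1 failed)
#   survives_near2 4 0 1    (sda1 and sdb1 failed)
```

Of the six possible two-disk failures on a 4-disk array, only the two same-pair combinations (0 1 and 2 3) are fatal under this model.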
Test 3: Verify nofail Behavior
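# Add to fstab WITHOUT nofail
UUID=... /mnt/raid10 ext4 defaults 0 2
# Unmount and stop the array so the device disappears
sudo umount /mnt/raid10
sudo mdadm --stop /dev/md0
# Try to mount everything in fstab
sudo systemctl daemon-reload
sudo mount -a
# Result: mount reports an error for the missing device;
# on a real boot, systemd would wait for it and then drop to emergency mode
# Now add nofail
UUID=... /mnt/raid10 ext4 defaults,nofail 0 2
# Try again
sudo mount -a
# Result: the missing device is skipped without an error, and a real boot continues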
Additional Resources
Official Documentation
mdadm man page: man mdadm
Linux RAID Wiki: https://raid.wiki.kernel.org/
md driver documentation: /usr/share/doc/mdadm/
Recommended Reading
Understanding RAID levels and their trade-offs
Linux kernel md driver architecture
Filesystem alignment for RAID arrays
Backup strategies for RAID systems
Community Support
Linux RAID mailing list: linux-raid@vger.kernel.org
Stack Exchange: Unix & Linux / Server Fault
Reddit: r/sysadmin, r/homelab
Final Checklist
Before considering this guide complete, verify:
Understanding:
[ ] I understand how near=2 creates mirror pairs (0↔1, 2↔3)
[ ] I know the difference between mdadm --level=10 and nested RAID-1+0
[ ] I understand what set-A and set-B actually mean
[ ] I can calculate filesystem alignment parameters
[ ] I know why nofail is critical in fstab
Practical Skills:
[ ] I can create a RAID-10 array with proper parameters
[ ] I can simulate and recover from disk failures
[ ] I can configure hot spares
[ ] I can set up monitoring and alerts
[ ] I can configure auto-mount correctly
Production Readiness:
[ ] I have tested failure scenarios
[ ] I have backup systems in place
[ ] I have documented my configuration
[ ] I have monitoring alerts configured
[ ] I understand this is NOT a backup solution
Congratulations!
You now have a corrected, production-ready understanding of RAID-10 with mdadm.
Key achievements:
✓ Understand true mirror pairing behavior
✓ Can build optimized RAID-10 arrays
✓ Know how to handle failures and recoveries
✓ Understand the difference between RAID and backup
✓ Can deploy this knowledge in production
Remember: RAID provides availability, not data protection. Always maintain proper backups!
Questions or Issues?
If you encounter problems:
Check the Troubleshooting Guide (above)
Review the Quick Reference Commands
Verify array status: sudo mdadm --detail /dev/md0
Check system logs: dmesg | grep -i raid or journalctl -xe
Consult community resources (listed above)
Stay safe, keep backups, and happy RAID-ing!



