Linux RAID-10 Setup Guide with mdadm

⚠️ CRITICAL WARNING

THIS TUTORIAL DESTROYS DATA ON SPECIFIED DISKS

Use virtual disks or dedicated physical disks only
Never use system disks
Verify with lsblk before every command
Test in VM first

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
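
The "verify before every command" rule can be partly automated. As a minimal sketch (the function name mounted_devices is made up for this guide, not a standard tool), the filter below reads lsblk output and prints every device that has a mount point, so the names that are off limits stand out:

```bash
# mounted_devices is a hypothetical filter: it reads "NAME MOUNTPOINT"
# pairs (the format of `lsblk -nro NAME,MOUNTPOINT`) on stdin and prints
# the names that have a mount point, i.e. devices you must not touch.
mounted_devices() {
  awk 'NF >= 2 { print $1 }'
}

# Demo with canned input; on a real system pipe lsblk into the filter:
#   lsblk -nro NAME,MOUNTPOINT | mounted_devices
printf '%s\n' 'sda ' 'sdb ' 'vda ' 'vda1 /boot' 'vda2 /' | mounted_devices
```

Anything this prints is in use. It is only a first check: a disk can also be busy as an LVM or RAID member without being mounted.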

💡 Understanding RAID-10: Two Different Approaches

What Does "RAID-10" Actually Mean?

RAID-10 literally means RAID 1+0 — mirror first, then stripe:

Traditional RAID-1+0 (Nested):
Step 1: Create RAID-1 mirrors → Step 2: Stripe those mirrors

mdadm --level=10 (Single Array):
Step 1: Create one array → Step 2: mdadm handles mirroring via layout

They're NOT the same thing! Here's why it matters:

Method 1: mdadm --level=10 (What This Tutorial Uses)

Single array with automatic mirror placement.

🔍 How Mirroring Actually Works with near=2

With 4 disks and near=2 layout:

┌─────────┬─────────┬─────────┬─────────┐
│ Disk 0  │ Disk 1  │ Disk 2  │ Disk 3  │
│ (sda1)  │ (sdb1)  │ (sdc1)  │ (sdd1)  │
├─────────┼─────────┼─────────┼─────────┤
│   A1    │   A1    │   B1    │   B1    │  ← Mirrors horizontally (0↔1, 2↔3)
│   A2    │   A2    │   B2    │   B2    │
│   A3    │   A3    │   B3    │   B3    │
│   A4    │   A4    │   B4    │   B4    │
└─────────┴─────────┴─────────┴─────────┘

How mdadm decides mirror pairs:

Mirror Pair 1 → Disk 0 ↔ Disk 1 → /dev/sda1 ↔ /dev/sdb1
Mirror Pair 2 → Disk 2 ↔ Disk 3 → /dev/sdc1 ↔ /dev/sdd1

Explanation:

near=2 means each data block is stored twice ("2 copies")
mdadm places the two copies on adjacent disk positions (0↔1, 2↔3)
The "A" chunks are mirrored on positions 0-1, the "B" chunks on positions 2-3
Consecutive chunks alternate between the two pairs, which is the striping that gives RAID-10 its performance

Understanding set-A and set-B Labels:

When you see this in mdadm --detail:

Number   Major   Minor   RaidDevice State
   0       8        1        0      active sync set-A   /dev/sda1
   1       8       17        1      active sync set-B   /dev/sdb1
   2       8       33        2      active sync set-A   /dev/sdc1
   3       8       49        3      active sync set-B   /dev/sdd1

What set-A and set-B actually mean:

set-A = First position in each mirror pair (positions 0 and 2)
set-B = Second position in each mirror pair (positions 1 and 3)
Each set on its own holds one complete copy of the data, so the labels do NOT identify mirror partners
The actual mirrors are: 0↔1 and 2↔3

Failure Tolerance:

✅ Can survive:

Loss of Disk 0 OR Disk 1 (not both) → Mirror Pair 1 survives
Loss of Disk 2 OR Disk 3 (not both) → Mirror Pair 2 survives
Loss of Disk 0 AND Disk 2 → ✅ Both pairs still have one disk
Loss of Disk 0 AND Disk 3 → ✅ Both pairs still have one disk
Loss of Disk 1 AND Disk 2 → ✅ Both pairs still have one disk
Loss of Disk 1 AND Disk 3 → ✅ Both pairs still have one disk

❌ Will fail:

Loss of Disk 0 AND Disk 1 → ❌ Mirror Pair 1 completely lost
Loss of Disk 2 AND Disk 3 → ❌ Mirror Pair 2 completely lost

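Summary: The array always survives one disk failure. It survives a second failure only if the two failed disks belong to different mirror pairs, which is 4 of the 6 possible two-disk combinations.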
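
The survive/fail lists above can be checked by brute force. This is a small sketch that assumes a 4-device near=2 array where slots 2k and 2k+1 mirror each other. The function names are illustrative, and nothing here talks to mdadm:

```bash
# Assumption: 4-device near=2 array, slots 2k and 2k+1 form a mirror pair.
# survives_two_failures and count_survivable are hypothetical helpers.
survives_two_failures() {
  # Two failed slots are fatal only if they belong to the same mirror pair.
  [ $(( $1 / 2 )) -ne $(( $2 / 2 )) ]
}

count_survivable() {
  ok=0; total=0
  for a in 0 1 2 3; do
    for b in 0 1 2 3; do
      [ "$a" -lt "$b" ] || continue        # visit each combination once
      total=$((total + 1))
      if survives_two_failures "$a" "$b"; then
        ok=$((ok + 1))
        echo "slots $a+$b: array survives"
      else
        echo "slots $a+$b: array lost"
      fi
    done
  done
  echo "$ok of $total two-disk failures are survivable"
}

count_survivable
```

The last line it prints is "4 of 6 two-disk failures are survivable", which matches the four ✅ and two ❌ cases listed above.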

Method 2: True RAID-1+0 (Nested Arrays)

User-controlled mirror pairs + explicit striping.

Step 1: Create Two RAID-1 Mirrors
┌─────────┬─────────┐         ┌─────────┬─────────┐
│  sda1   │  sdc1   │         │  sdb1   │  sdd1   │
│   A1    │   A1    │         │   B1    │   B1    │
│   A2    │   A2    │         │   B2    │   B2    │
│   A3    │   A3    │         │   B3    │   B3    │
└─────────┴─────────┘         └─────────┴─────────┘
    md1 (RAID-1)                  md2 (RAID-1)

Step 2: Stripe the Mirrors
         ┌─────────────┐
         │ md0 (RAID-0) │
         │  Stripes:    │
         │  md1 + md2   │
         └─────────────┘

Commands (nested approach):

# Create two RAID-1 mirrors
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdc1
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdd1

# Stripe them
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2

Failure tolerance:

✅ Survives the loss of one disk from each mirror
You chose which disks mirror each other
Lose sda1 AND sdb1? ✅ Still works (different mirrors)
Lose sda1 AND sdc1? ❌ md1 is gone, and the md0 stripe fails with it
More control, more complexity

Method Comparison Table

Feature	mdadm --level=10	True RAID-1+0 (Nested)
Command Complexity	Simple (one command)	Complex (three commands)
Mirror Control	Set by device order (adjacent positions 0↔1, 2↔3)	Explicit (one RAID-1 array per pair)
Best Case	Survive 2 disk failures	Survive 2 disk failures
Worst Case	❌ Fail with 2 disk losses (same pair)	❌ Fail with 2 disk losses (same mirror)
Two-Disk Failure Odds (4 disks)	4 of 6 combinations survivable	4 of 6 combinations survivable
Performance	Excellent	Excellent
Management	Single array	Multiple arrays
Use Case	Testing, general purpose, most production	Setups that need each mirror visible as its own array

Which Method Does This Tutorial Use?

This tutorial uses mdadm --level=10 because:

✅ Simpler to learn and demonstrate
✅ Good for understanding RAID-10 concepts
✅ Sufficient for most use cases
✅ Single array management

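⚠️ When to consider nested RAID-1+0 instead:

You want each mirror to exist as its own md device that can be inspected and managed separately
You want the pairing spelled out explicitly, for example one disk per controller or enclosure in every mirror
Production systems with same-batch disks (higher correlated failure risk): with either method, avoid putting two disks from the same batch in one mirror

Note: Neither method survives the loss of both disks in the same mirror. If any two disk failures must be survivable, RAID-10 with two copies is not enough on its own.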

Math: Usable Capacity

Raw capacity    = 4 disks × 2GB = 8GB
Mirror overhead = 50% (everything duplicated)
Usable capacity = 8GB ÷ 2 = 4GB

Formula: (Total Disks ÷ 2) × Disk Size

This applies to BOTH methods — you always lose 50% to mirroring.
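
The formula is easy to check in the shell. A minimal sketch follows; raid10_usable is a hypothetical helper name, and the third argument (number of copies) is an assumption added so the same function also covers layouts other than 2 copies:

```bash
# raid10_usable is a hypothetical helper. Arguments: disk count, size of
# one disk (any unit), number of copies (2 for near=2).
raid10_usable() {
  echo $(( $1 * $2 / $3 ))
}

raid10_usable 4 2 2       # the tutorial's array: 4 disks x 2GB, 2 copies -> 4
raid10_usable 6 4000 2    # six 4000GB disks, 2 copies -> 12000
```

The result is in the same unit as the disk size. The real array is slightly smaller because mdadm reserves space for metadata and the bitmap.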

🔍 Step 1: Verify Available Disks

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE

Output:

NAME             SIZE TYPE MOUNTPOINT
sda                2G disk 
sdb                2G disk 
sdc                2G disk 
sdd                2G disk 
sde                2G disk 
sdf                2G disk 
vda               20G disk

Requirements:

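Minimum 4 disks for the layout used here (mdadm's raid10 also accepts other device counts, but an even number keeps the mirror pairs clean)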
Same size strongly recommended
Not mounted anywhere
Not your system disk!

Example for this tutorial:

/dev/sda through /dev/sdd → RAID-10 array (4 disks)
/dev/sde, /dev/sdf → Hot spares

🔧 Step 2: Partition Disks Properly

Why Partitioning Matters

Don't skip this! Using raw disks (/dev/sda) instead of partitions (/dev/sda1) causes:

Boot loader conflicts
Disk identification issues
Problems with disk replacement

Create Partition on First Disk

sudo fdisk /dev/sda

Inside fdisk (type exactly):

Command: n         ← Create new partition
Type: p            ← Primary partition
Number: 1          ← Partition number 1
First sector: [ENTER]   ← Use default (starts at beginning)
Last sector: [ENTER]    ← Use default (uses entire disk)

Command: t         ← Change partition type
Hex code: fd       ← Linux RAID autodetect (legacy but works)
Command: w         ← Write changes and exit

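Note: With the default 1.2 metadata, mdadm finds members by their superblock, so the partition type is only a label. fd is the traditional choice and 83 (Linux) also works. Avoid 8e (Linux LVM), because it mislabels the partition for other tools.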

Critical: Force Kernel to Update

This step prevents "partition doesn't exist" errors:

sudo partprobe /dev/sda 2>/dev/null || true
sudo sync
sleep 1

What this does:

partprobe → Tells kernel to re-read partition table
sync → Flushes pending writes from memory to disk
sleep 1 → Gives kernel time to process

Repeat for All Disks

# Disk sdb
sudo fdisk /dev/sdb
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdb 2>/dev/null || true
sudo sync && sleep 1

# Disk sdc
sudo fdisk /dev/sdc
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdc 2>/dev/null || true
sudo sync && sleep 1

# Disk sdd
sudo fdisk /dev/sdd
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdd 2>/dev/null || true
sudo sync && sleep 1

# Spare disk sde
sudo fdisk /dev/sde
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sde 2>/dev/null || true
sudo sync && sleep 1

# Spare disk sdf
sudo fdisk /dev/sdf
# (n, p, 1, ENTER, ENTER, t, fd, w)
sudo partprobe /dev/sdf 2>/dev/null || true
sudo sync && sleep 1
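
Typing the same fdisk answers six times is error-prone, so the repetition can be scripted. The sketch below is one possible approach, not the method used in the rest of this guide: it assumes util-linux sfdisk is installed and that the lab disks really are sda through sdf. DRY_RUN, run and partition_disk are made-up names. By default it only prints the commands it would run.

```bash
# Assumptions: util-linux sfdisk is installed, the six lab disks are
# sda..sdf, and each disk should get one MBR partition of type fd.
# DRY_RUN, run and partition_disk are illustrative names.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; set to 0 to run them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

partition_disk() {
  disk=$1
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ echo 'type=fd' | sudo sfdisk --label dos $disk"
  else
    # One script line with only a type: default start, maximum size.
    echo 'type=fd' | sudo sfdisk --label dos "$disk"
  fi
  run sudo partprobe "$disk"
}

for d in a b c d e f; do
  partition_disk "/dev/sd$d"
done
```

Review the printed commands against lsblk first, then rerun with DRY_RUN=0. Like the fdisk steps, this destroys whatever is on the listed disks.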

Verify Partitions Exist

lsblk -o NAME,SIZE,TYPE,FSTYPE
ls -la /dev/sd{a,b,c,d,e,f}1

Expected output:

root@rhel:~# lsblk -o NAME,SIZE,TYPE,FSTYPE
NAME             SIZE TYPE FSTYPE
sda                2G disk 
└─sda1             2G part 
sdb                2G disk 
└─sdb1             2G part 
sdc                2G disk 
└─sdc1             2G part 
sdd                2G disk 
└─sdd1             2G part 
sde                2G disk 
└─sde1             2G part 
sdf                2G disk 
└─sdf1             2G part 


root@rhel:~# ls -la /dev/sd{a,b,c,d,e,f}1
brw-rw----. 1 root disk 8,  1 Oct 30 11:21 /dev/sda1
brw-rw----. 1 root disk 8, 17 Oct 30 11:21 /dev/sdb1
brw-rw----. 1 root disk 8, 33 Oct 30 11:21 /dev/sdc1
brw-rw----. 1 root disk 8, 49 Oct 30 11:21 /dev/sdd1
brw-rw----. 1 root disk 8, 65 Oct 30 11:21 /dev/sde1
brw-rw----. 1 root disk 8, 81 Oct 30 11:21 /dev/sdf1
root@rhel:~#

If partitions don't appear: Run partprobe and sync again.

📦 Step 3: Install mdadm

Debian/Ubuntu:

sudo apt update
sudo apt install mdadm -y

RHEL/CentOS/Rocky/AlmaLinux:

sudo dnf install mdadm -y

Verify installation:

mdadm --version

Expected: mdadm - v4.x - ...

🏗️ Step 4: Create RAID-10 Array

The Critical Command

sudo mdadm --create --verbose /dev/md0 \
  --level=10 \
  --raid-devices=4 \
  --bitmap=internal \
  /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Understanding Each Parameter

Parameter	Meaning	Why It Matters
`--create /dev/md0`	Create new array named md0	Standard naming convention
`--level=10`	RAID-10 (mirrored stripe)	Balance of speed + safety
`--raid-devices=4`	Use exactly 4 disks	Two mirror pairs, the smallest classic RAID-10
`--bitmap=internal`	Track changed blocks	Prevents full resync after crash
Device order	sda1 sdb1 sdc1 sdd1	Creates pairs: 0↔1 and 2↔3
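
Because near=2 pairs adjacent positions, the order of devices on the command line decides which disks mirror each other. The helper below (show_pairs is a hypothetical name) prints the pairing that a given argument order would produce, assuming near=2 and an even number of devices:

```bash
# show_pairs is a hypothetical helper: pass the devices in the same order
# you plan to give them to mdadm --create (near=2, even device count).
show_pairs() {
  slot=0
  while [ "$#" -ge 2 ]; do
    echo "slots $slot+$((slot + 1)): $1 <-> $2"
    shift 2
    slot=$((slot + 2))
  done
}

show_pairs /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
```

For the order used in this tutorial it prints sda1 with sdb1 and sdc1 with sdd1. Reordering the arguments (for example sda1 sdc1 sdb1 sdd1) is how you would place mirror partners on different controllers.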

Why --bitmap=internal Is Strongly Recommended

Without bitmap:

Power loss → Unclean shutdown → mdadm doesn't know which blocks changed
Result: Full array rescan (hours or days)

With bitmap:

Power loss → mdadm checks bitmap → Only resync changed blocks
Result: Minutes of recovery time

Trade-offs:

Overhead: ~1MB per 256GB of array size
Slight write penalty: ~1-3% (negligible)
In production, always use --bitmap=internal

What Happens Next

Prompt:

mdadm: layout defaults to n2
Continue creating array?

Type: y then press ENTER

What n2 means:

n = "near" layout (mirrors are physically adjacent positions)
2 = 2 copies of each block
Result: Positions (0,1) mirror each other, (2,3) mirror each other

Monitor Initial Synchronization

watch -n 2 cat /proc/mdstat

During sync:

md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      4190208 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [===>.................]  resync = 15.3% (642560/4190208) finish=1.2min speed=45000K/sec
      bitmap: 1/1 pages [4KB], 65536KB chunk

What to look for:

[4/4] [UUUU] → All 4 disks active
bitmap: 1/1 pages → Write-intent bitmap working
resync = 15.3% → Initial sync in progress

Press Ctrl+C once the resync line disappears (the sync has finished)

Or wait automatically:

while grep -q resync /proc/mdstat; do sleep 2; done
echo "✓ Sync complete"
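
The loop above only looks for the word resync, so it would not wait for a rebuild, which /proc/mdstat reports as recovery. A slightly more general sketch is shown below. mdstat_busy and wait_for_md are made-up names, and the optional path argument exists only so the functions can be tried on a saved copy of the file:

```bash
# mdstat_busy and wait_for_md are hypothetical helpers. The optional path
# argument lets you test against a saved copy instead of /proc/mdstat.
mdstat_busy() {
  grep -Eq '(resync|recovery|reshape|check) *=' "${1:-/proc/mdstat}"
}

wait_for_md() {
  while mdstat_busy "$1"; do sleep 2; done
  echo "Sync complete"
}

# Demo on a sample taken from the output shown earlier
sample=$(mktemp)
cat > "$sample" <<'EOF'
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      4190208 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [===>.................]  resync = 15.3% (642560/4190208) finish=1.2min speed=45000K/sec
EOF
if mdstat_busy "$sample"; then echo "still syncing"; fi
rm -f "$sample"
```

On a live system, calling wait_for_md with no argument polls /proc/mdstat. mdadm also has a built-in equivalent, sudo mdadm --wait /dev/md0, which blocks until resync or recovery of that array finishes.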

Verify Array Configuration

sudo mdadm --detail /dev/md0

Expected output:

root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Thu Oct 30 11:23:43 2025
        Raid Level : raid10
        Array Size : 4188160 (3.99 GiB 4.29 GB)
     Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Oct 30 11:23:47 2025
             State : clean 
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : rhel:0  (local to host rhel)
              UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
            Events : 17

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync set-A   /dev/sda1
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1

🔍 Understanding the Mirror Pairs

RaidDevice Position    Disk        Mirror Partner
       0               /dev/sda1   ┐
       1               /dev/sdb1   ┘ Mirror each other (Pair 1)

       2               /dev/sdc1   ┐
       3               /dev/sdd1   ┘ Mirror each other (Pair 2)

Note: The set-A / set-B labels indicate striping positions, not mirror partners.
What matters: Position mapping (0↔1, 2↔3) — that's how mirroring works in RAID-10 near=2.

Quick Mirror Pair Verification (Run Anytime)

Want to see mirror pairs instantly? Use this:

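#!/bin/bash
# Print each active member with its mirror pair (valid for near=2 only).
echo "Mirror Pairs (RAID-10 near=2):"
sudo mdadm --detail /dev/md0 2>/dev/null | awk '
  # Member rows: Number Major Minor RaidDevice State... device
  # Use $4 (RaidDevice), not $1 (Number): they differ after a disk is replaced.
  $1 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && /active sync/ {
    slot = $4; dev = $NF; pair = int(slot / 2)
    role = (slot % 2 == 0) ? "set-A" : "set-B"
    printf "  %-8s → Pair %d (position %d, %s)\n", dev, pair, slot, role
  }
' | sort -n -k6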

Expected output:

Mirror Pairs (RAID-10 near=2):
  /dev/sda1 → Pair 0 (position 0, set-A)
  /dev/sdb1 → Pair 0 (position 1, set-B)
  /dev/sdc1 → Pair 1 (position 2, set-A)
  /dev/sdd1 → Pair 1 (position 3, set-B)
root@rhel:~#

What this shows:

Pair 0 = /dev/sda1 and /dev/sdb1 mirror each other
Pair 1 = /dev/sdc1 and /dev/sdd1 mirror each other
Positions 0&1 are one mirror group, positions 2&3 are another

✅ This matches the horizontal (0↔1, 2↔3) pairing, not a vertical (0↔2, 1↔3) one. The script derives the pair from the slot number, so it only applies to near=2.

💾 Step 5: Make Configuration Persistent

Why This Step Matters

Without saving the configuration:

Array won't assemble automatically after reboot
You'll have to manually reassemble with mdadm --assemble
System might not boot if it expects the array

Save Array Configuration

Debian/Ubuntu:

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

RHEL/CentOS/Rocky/AlmaLinux:

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf

Verify it saved correctly

# Debian/Ubuntu
sudo grep md0 /etc/mdadm/mdadm.conf

# RHEL/CentOS
sudo grep md0 /etc/mdadm.conf

Expected output:

root@rhel:~# sudo mdadm --detail --scan  | sudo tee -a /etc/mdadm.conf
ARRAY /dev/md0 metadata=1.2 UUID=eec6fc91:3e2b911f:37dd1dda:0b661777
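
One thing to watch: tee -a appends, so running the save command twice leaves two identical ARRAY lines in the file. A minimal sketch of an append-once helper follows. add_array_line is a hypothetical name, and the demo writes to a scratch file rather than the real mdadm.conf:

```bash
# add_array_line is a hypothetical helper. $1 = config file, $2 = one
# ARRAY line as printed by `mdadm --detail --scan`.
add_array_line() {
  conf=$1
  line=$2
  # Pull the UUID=... token out of the ARRAY line.
  uuid=$(printf '%s\n' "$line" | sed -n 's/.*\(UUID=[0-9a-f:]*\).*/\1/p')
  if [ -n "$uuid" ] && grep -qF "$uuid" "$conf" 2>/dev/null; then
    echo "already present: $uuid"
  else
    printf '%s\n' "$line" >> "$conf"
    echo "added: $line"
  fi
}

# Demo against a scratch file instead of the real mdadm.conf
conf=$(mktemp)
entry='ARRAY /dev/md0 metadata=1.2 UUID=eec6fc91:3e2b911f:37dd1dda:0b661777'
add_array_line "$conf" "$entry"
add_array_line "$conf" "$entry"
grep -c '^ARRAY' "$conf"    # prints 1
rm -f "$conf"
```

The second call reports that the UUID is already present, so the file ends up with a single entry. To use it on the real file you would need to run it as root.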

Update Boot System (Critical!)

Debian/Ubuntu:

sudo update-initramfs -u

RHEL/CentOS:

sudo dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

What this does:

Embeds mdadm configuration into boot image
Ensures array assembles before root filesystem mounts
Required for arrays that contain system files

📂 Step 6: Create Optimized Filesystem

Why Alignment Matters

Misaligned filesystem = up to 20-30% performance loss, depending on workload

Without Alignment:
Write 1MB file → Crosses chunk boundaries → Extra reads → Slower

With Alignment:
Write 1MB file → Fits within chunks → Direct writes → Faster

Calculate Alignment Parameters

For RAID-10 with 512K chunks:

Filesystem block size = 4KB (4096 bytes)
RAID chunk size = 512KB (524288 bytes)

Stride = Chunk Size ÷ Block Size
       = 524288 ÷ 4096
       = 128 blocks

Stripe-width = Stride × Number of Data Disks
             = 128 × 2
             = 256 blocks

Why × 2?
RAID-10 with 4 disks has 2 data disks actively storing unique data (the other 2 hold mirrors).
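
The same arithmetic as a reusable function, in case your chunk size or disk count differs from this lab. This is only a sketch: ext4_raid_opts is a hypothetical name, and it assumes the number of data disks equals member disks divided by copies, as explained above.

```bash
# ext4_raid_opts is a hypothetical helper.
# $1 = chunk size in KiB, $2 = filesystem block size in bytes,
# $3 = number of member disks, $4 = copies (2 for near=2)
ext4_raid_opts() {
  stride=$(( $1 * 1024 / $2 ))
  data_disks=$(( $3 / $4 ))
  echo "stride=$stride,stripe-width=$(( stride * data_disks ))"
}

ext4_raid_opts 512 4096 4 2    # prints stride=128,stripe-width=256
```

The printed string is in the form that mkfs.ext4 expects after -E.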


Create Filesystem with Proper Alignment

sudo mkfs.ext4 \
  -L SPEED_RAID10 \
  -b 4096 \
  -E stride=128,stripe-width=256 \
  /dev/md0

Parameter breakdown:

Parameter	Value	Purpose
`-L SPEED_RAID10`	Label	Easy to identify in df and mount
`-b 4096`	4K blocks	Matches modern 4K sector disks
`-E stride=128`	128 blocks	Aligns writes to chunk boundaries
`-E stripe-width=256`	256 blocks	Aligns to full stripe across data disks

Expected output:

root@rhel:~# sudo mkfs.ext4 \
  -L SPEED_RAID10 \
  -b 4096 \
  -E stride=128,stripe-width=256 \
  /dev/md0
mke2fs 1.47.1 (20-May-2024)
/dev/md0 contains a ext4 file system labelled 'SPEED_RAID10'
    last mounted on /mnt/raid10 on Wed Oct 29 18:03:04 2025
Proceed anyway? (y,N) y
Discarding device blocks: done                            
Creating filesystem with 1047040 4k blocks and 262144 inodes
Filesystem UUID: 5979f165-f4e5-45c5-8603-f07f6810a62c
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

root@rhel:~#

Mount the Array

sudo mkdir -p /mnt/raid10
sudo mount -o noatime,nodiratime /dev/md0 /mnt/raid10

Mount options explained:

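noatime → Don't update file access times (reduces writes)
nodiratime → Don't update directory access times (already implied by noatime, listed here for clarity)
Performance impact: Fewer metadata writes, which helps most in read-heavy workloads (the size of the gain depends on the workload)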

Verify Mount

df -hT /mnt/raid10
mount | grep raid10

Output:

root@rhel:~# df -hT /mnt/raid10
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/md0       ext4  3.9G   24K  3.7G   1% /mnt/raid10
root@rhel:~# mount | grep raid10
/dev/md0 on /mnt/raid10 type ext4 (rw,noatime,nodiratime,seclabel,stripe=256)
root@rhel:~#

✅ Check for stripe=256 — confirms alignment is active.

⚡ Step 7: Optimize Rebuild Performance

Why This Matters

Default rebuild speed = too slow for modern hardware

Default: 200 MB/s max (speed_limit_max = 200000 KB/s)
Modern SSD: Can handle 500+ MB/s
Result: Rebuild can take 2.5× longer than necessary

During rebuild, array is vulnerable — faster rebuild = safer.

Set Permanent Rebuild Speeds

sudo tee /etc/sysctl.d/99-raid.conf > /dev/null <<EOF
# RAID rebuild speed optimization
dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 500000
EOF

Apply immediately:

sudo sysctl -p /etc/sysctl.d/99-raid.conf

Expected output:

dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 500000

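My output (from a run where speed_limit_min was set to 100000, so the first value differs from the file above):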

root@rhel:~# sudo sysctl -p /etc/sysctl.d/99-raid.conf
dev.raid.speed_limit_min = 100000
dev.raid.speed_limit_max = 500000
root@rhel:~#

Verify:

cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

Understanding the Values

Hardware	Min (KB/s)	Max (KB/s)	Why
HDD	50000	200000	Avoid starving apps during rebuild
SATA SSD	100000	500000	Can handle full speed safely
NVMe SSD	200000	1000000	Only if system is mostly idle

Explanation:

speed_limit_min = Rate the kernel tries to sustain even while other I/O is active
speed_limit_max = Cap to prevent I/O starvation
Higher values = faster rebuild BUT less responsive system

⚠️ Don't set too high: Rebuild will starve normal I/O operations.

Note: Values of 1000000-2000000 (1-2 GB/s) shown in some examples are too aggressive for most systems. Reserve them for mostly idle NVMe arrays.
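
To get a feel for what these limits mean in wall-clock time, divide the amount of data to rebuild by the speed. This is a rough sketch that assumes the rebuild actually runs at the given rate for its whole duration, which real hardware may not manage. rebuild_seconds is a hypothetical helper:

```bash
# rebuild_seconds is a hypothetical helper.
# $1 = data to rebuild in KiB, $2 = sustained speed in KiB/s
rebuild_seconds() {
  echo $(( ($1 + $2 - 1) / $2 ))   # integer division, rounded up
}

rebuild_seconds 2094080 45000       # lab member at the speed seen earlier: 47
rebuild_seconds 3906250000 200000   # 4 TB disk at the default cap: 19532
rebuild_seconds 3906250000 500000   # same disk at 500000 KB/s: 7813
```

For the 4 TB example that is roughly 5.4 hours at the default cap versus about 2.2 hours at 500000 KB/s, the same 2.5× ratio as the limits themselves.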

🔍 Step 8: Verify Array Health

sudo mdadm --detail /dev/md0

Healthy array checklist:

root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Thu Oct 30 11:23:43 2025
        Raid Level : raid10
        Array Size : 4188160 (3.99 GiB 4.29 GB)
     Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Oct 30 12:45:21 2025
             State : clean 
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : rhel:0  (local to host rhel)
              UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
            Events : 17

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync set-A   /dev/sda1
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1
root@rhel:~#

Check bitmap location:

cat /sys/block/md0/md/bitmap/location

Expected: +8 or +1024 (not none)

If shows none: Bitmap is disabled. Add one to the existing array with sudo mdadm --grow --bitmap=internal /dev/md0 (there is no need to recreate the array).

🧪 Step 9: Test With Data

Create Test Files

# Small text file
echo "RAID-10 Performance Test" | sudo tee /mnt/raid10/test.txt

# Large file (100MB with progress)
sudo dd if=/dev/zero of=/mnt/raid10/speedtest.dat \
  bs=1M count=100 oflag=direct status=progress

What oflag=direct does:

Bypasses OS cache
Forces direct writes to disk
Shows true RAID performance

Expected output:

root@rhel:~# echo "RAID-10 Performance Test" | sudo tee /mnt/raid10/test.txt
RAID-10 Performance Test
root@rhel:~# sudo dd if=/dev/zero of=/mnt/raid10/speedtest.dat \
  bs=1M count=100 oflag=direct status=progress
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.300927 s, 348 MB/s
root@rhel:~#

Verify Files

ls -lh /mnt/raid10/
cat /mnt/raid10/test.txt

Output:

root@rhel:~# ls -lh /mnt/raid10/
cat /mnt/raid10/test.txt
total 101M
drwx------. 2 root root  16K Oct 30 12:44 lost+found
-rw-r--r--. 1 root root 100M Oct 30 12:50 speedtest.dat
-rw-r--r--. 1 root root   25 Oct 30 12:49 test.txt

RAID-10 Performance Test
root@rhel:~#

Measure Performance

# Write speed
sudo dd if=/dev/zero of=/mnt/raid10/write_test \
  bs=1M count=500 oflag=direct status=progress

# Read speed
sudo dd if=/mnt/raid10/write_test of=/dev/null \
  bs=1M iflag=direct status=progress

Expected (RAID-10 with 4 disks):

Write: 1.5-2× single disk speed
Read: 2-3× single disk speed

Cleanup

sudo rm /mnt/raid10/speedtest.dat /mnt/raid10/write_test

💥 Step 10: Simulate Disk Failure

Mark Disk as Failed

sudo mdadm --manage /dev/md0 --fail /dev/sda1

What happens:

mdadm marks disk as failed immediately
Mirror partner (/dev/sdb1) continues serving data
Array enters "degraded" state

Check Array Status

cat /proc/mdstat

Output:

root@rhel:~# cat /proc/mdstat
Personalities : [raid10] 
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)
      4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
root@rhel:~#

Indicators:

sda1[0](F) → Failed disk
[4/3] → 4 total, 3 working
[_UUU] → Position 0 failed, others OK
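
The status string can also be decoded mechanically. Below is a small sketch (failed_slots is a made-up helper, not part of mdadm) that takes the text between the brackets and prints the index of every position that is not U:

```bash
# failed_slots is a hypothetical helper: it takes the bracketed status
# string from /proc/mdstat (for example _UUU) and prints the index of
# every slot that is not "U".
failed_slots() {
  status=$1; i=0; out=""
  while [ -n "$status" ]; do
    first=${status%"${status#?}"}    # first character of what is left
    [ "$first" = "U" ] || out="$out$i "
    status=${status#?}               # drop that character
    i=$((i + 1))
  done
  echo "${out% }"
}

failed_slots "_UUU"    # prints 0
failed_slots "_U_U"    # prints 0 2
```

With near=2, each printed slot s has its mirror partner in the adjacent slot of the same pair (0 with 1, 2 with 3), so "0 2" still leaves one disk alive in each pair.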

Verify Data Is Still Accessible

cat /mnt/raid10/test.txt
ls -la /mnt/raid10/

Output:

root@rhel:~# cat /mnt/raid10/test.txt
RAID-10 Performance Test

root@rhel:~# ls -la /mnt/raid10/
total 24
drwxr-xr-x. 3 root root  4096 Oct 30 12:51 .
drwxr-xr-x. 3 root root    20 Oct 29 18:02 ..
drwx------. 2 root root 16384 Oct 30 12:44 lost+found
-rw-r--r--. 1 root root    25 Oct 30 12:49 test.txt
root@rhel:~#

✅ Still works! Data served from mirror (/dev/sdb1).

Check Detailed Status

sudo mdadm --detail /dev/md0

Shows:

root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Thu Oct 30 11:23:43 2025
        Raid Level : raid10
        Array Size : 4188160 (3.99 GiB 4.29 GB)
     Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Oct 30 12:51:39 2025
             State : clean, degraded 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 1
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : rhel:0  (local to host rhel)
              UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
            Events : 21

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1

       0       8        1        -      faulty   /dev/sda1

Remove Failed Disk

sudo mdadm --manage /dev/md0 --remove /dev/sda1

Output:

mdadm: hot removed /dev/sda1 from /dev/md0

Verify removal:

sudo mdadm --detail /dev/md0 | grep 'State :'

Shows:

State : clean, degraded

🔧 Step 11: Replace Failed Disk

Add Replacement Disk

sudo mdadm --manage /dev/md0 --add /dev/sde1

Output:

mdadm: added /dev/sde1

What happens:

mdadm detects array is degraded
Automatically starts rebuilding to /dev/sde1
/dev/sde1 becomes active member after rebuild

Monitor Rebuild Progress

watch -n 2 'cat /proc/mdstat'

During rebuild:

md0 : active raid10 sde1[4] sdd1[3] sdc1[2] sdb1[1]
      4190208 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
      [====>................]  recovery = 23.5% (986112/4190208) finish=0.8min speed=45000K/sec
      bitmap: 1/1 pages [4KB], 65536KB chunk

Progress indicators:

sde1[4] → New disk (device number 4, rebuilding into RaidDevice slot 0)
[4/3] → 4 slots, 3 fully synced (rebuild in progress)
recovery = 23.5% → Current progress
finish=0.8min → Estimated time remaining
speed=45000K/sec → Current rebuild speed

Press Ctrl+C when complete.

Verify Recovery Complete

sudo mdadm --detail /dev/md0

Should show:

root@rhel:~# sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Thu Oct 30 11:23:43 2025
        Raid Level : raid10
        Array Size : 4188160 (3.99 GiB 4.29 GB)
     Used Dev Size : 2094080 (2045.00 MiB 2144.34 MB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Oct 30 12:53:47 2025
             State : clean 
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : rhel:0  (local to host rhel)
              UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
            Events : 41

    Number   Major   Minor   RaidDevice State
       4       8       65        0      active sync set-A   /dev/sde1
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1

✅ Note: /dev/sde1 took position 0 (where /dev/sda1 was).

🏆 Step 12: Add Hot Spare

What Is a Hot Spare?

Hot spare = standby disk that automatically activates on failure

Normal operation:    Disk fails:           Auto-rebuild:
┌─────┐              ┌─────┐              ┌─────┐
│sde1 │              │sde1 │ ✗            │spare│ → activated
│sdb1 │              │sdb1 │              │sdb1 │ ← rebuilding
│sdc1 │              │sdc1 │              │sdc1 │
│sdd1 │              │sdd1 │              │sdd1 │
│spare│ (idle)       │spare│ → activates  └─────┘
└─────┘              └─────┘

Benefits:

✅ No waiting for someone to swap a disk before the rebuild can start
✅ Rebuild starts immediately (no human intervention)
✅ Array stays degraded only for the duration of the rebuild

Add Spare Disk

sudo mdadm --manage /dev/md0 --add-spare /dev/sdf1

Output:

mdadm: added /dev/sdf1

Verify Spare Added

sudo mdadm --detail /dev/md0 | tail -10

Output:

root@rhel:~# sudo mdadm --detail /dev/md0 | tail -10
              UUID : eec6fc91:3e2b911f:37dd1dda:0b661777
            Events : 42

    Number   Major   Minor   RaidDevice State
       4       8       65        0      active sync set-A   /dev/sde1
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1

       5       8       81        -      spare   /dev/sdf1
root@rhel:~#

Look for: spare in State column

Test Automatic Failover

Simulate another failure:

sudo mdadm --manage /dev/md0 --fail /dev/sde1

Check immediately:

cat /proc/mdstat

Output (progressing rapidly):

md0 : active raid10 sdf1[5] sde1[4](F) sdd1[3] sdc1[2] sdb1[1]
      4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
      [======>..............]  recovery = 32.3% (677056/2094080) finish=0.0min speed=677056K/sec
      bitmap: 0/1 pages [0KB], 65536KB chunk

What happened:

⚠️ /dev/sde1 marked as failed
⚡ /dev/sdf1 (spare) automatically activated
🔄 Rebuild started immediately (no manual intervention!)

After rebuild completes:

md0 : active raid10 sdf1[5] sde1[4](F) sdd1[3] sdc1[2] sdb1[1]
      4188160 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

Verify Spare Activation

sudo mdadm --detail /dev/md0 | tail -10

Output:

              UUID : de5845c1:f2b6a4ab:87bc4816:6ba93b9d
            Events : 59

    Number   Major   Minor   RaidDevice State
       5       8       81        0      active sync set-A   /dev/sdf1  ← Spare became active!
       1       8       17        1      active sync set-B   /dev/sdb1
       2       8       33        2      active sync set-A   /dev/sdc1
       3       8       49        3      active sync set-B   /dev/sdd1

       4       8       65        -      faulty   /dev/sde1  ← Old disk failed

Remove failed disk

sudo mdadm --manage /dev/md0 --remove /dev/sde1

✅ This is why hot spares are critical in production.

📊 Step 13: Set Up Monitoring

Why Monitoring Matters

RAID arrays fail silently:

Disk starts having errors → No immediate notification
Second disk fails → Data loss
Bitrot corrupts data gradually → Undetected until too late

Proper monitoring prevents disasters.

In production systems, you need to know when disks fail BEFORE you lose data.

Check RAID Health Regularly

# Quick status check
cat /proc/mdstat

# Detailed health report
sudo mdadm --detail /dev/md0

# Check all RAID arrays
sudo mdadm --detail --scan

Set Up Automated Daily Checks (Optional)

# Edit crontab
sudo crontab -e

# Add this line (records the array state every day at 2 AM)
0 2 * * * /usr/sbin/mdadm --detail /dev/md0 > /var/log/raid-check.log 2>&1
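
A log file only helps if someone reads it. As a minimal sketch of a check that can fail loudly, the functions below scan /proc/mdstat for a missing member and return a non-zero status when one is found. md_degraded and check_md are hypothetical names, and the optional path argument is there so the logic can be tested against saved output:

```bash
# md_degraded and check_md are hypothetical helpers; $1 defaults to the
# live /proc/mdstat when omitted.
md_degraded() {
  # A degraded array shows at least one "_" inside the [UU_U] marker.
  grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

check_md() {
  if md_degraded "$1"; then
    echo "WARNING: degraded md array detected"
    return 1
  fi
  echo "OK: all md arrays have every member active"
}

# Demo on the degraded sample from Step 10
sample=$(mktemp)
cat > "$sample" <<'EOF'
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)
      4188160 blocks super 1.2 512K chunks 2 near-copies [4/3] [_UUU]
EOF
check_md "$sample" || true
rm -f "$sample"
```

Run from cron, the non-zero exit status and the warning text can feed whatever alerting you already have. mdadm's own monitor mode (mdadm --monitor) is the more usual way to get notifications, so treat this as a supplement.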

Check Disk Health with SMART

# Install smartmontools if not already installed
sudo apt install smartmontools -y  # Debian/Ubuntu
sudo dnf install smartmontools -y  # RHEL/CentOS

# Check individual disk health
sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdb
sudo smartctl -a /dev/sdc
sudo smartctl -a /dev/sdd
sudo smartctl -a /dev/sde
sudo smartctl -a /dev/sdf

Look for:

SMART Health Status: OK (SCSI/SAS) or "SMART overall-health self-assessment test result: PASSED" (ATA/SATA) = Good
Reallocated_Sector_Ct = Should be 0 or very low
Current_Pending_Sector = Should be 0

💡 Note for Virtual Machines

SMART data is usually not available on virtual drives. For VMs, you can:

Monitor the host's physical disks (from the hypervisor, not the VM)
Use software-level checks inside the VM:

# Non-destructive read-only test (safe, shown with -s for progress)
sudo badblocks -sv /dev/sda
sudo badblocks -sv /dev/sdb
sudo badblocks -sv /dev/sdc
sudo badblocks -sv /dev/sdd
sudo badblocks -sv /dev/sde
sudo badblocks -sv /dev/sdf

Expected output (healthy disk):

Checking blocks 0 to 2097151
Checking for bad blocks (read-only test): done                                                 
Pass completed, 0 bad blocks found. (0/0/0 errors)

⚠️ Warning: Never use badblocks -w (write test) on production data — it's destructive.

🏁 Step 14: Configure Auto-Mount at Boot

Why This Is Critical

Without auto-mount:

Array exists but isn't usable after reboot
Applications can't access data
Manual intervention required every boot

Get Filesystem UUID

sudo blkid -s UUID -o value /dev/md0

Example output:

a1b2c3d4-e5f6-7890-abcd-ef1234567890

Copy this UUID (it identifies the ext4 filesystem and is not the array UUID that mdadm --detail prints). You'll need it next.

Add to fstab

sudo nano /etc/fstab

Add this line at the end (replace with your UUID):

UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890  /mnt/raid10  ext4  defaults,noatime,nodiratime,nofail  0  2

Understanding Each Field

Field	Value	Purpose
`UUID=...`	Your filesystem's UUID	Identifies the filesystem uniquely
`/mnt/raid10`	Mount point	Where array appears
`ext4`	Filesystem type	Tells kernel how to read it
`defaults,noatime,nodiratime`	Mount options	Performance optimization
`nofail`	CRITICAL!	System boots even if array fails
`0`	Dump frequency	0 = don't backup with dump
`2`	fsck order	2 = check after root filesystem

Understanding nofail (Critical!)

Without nofail:

Boot → Wait for RAID → RAID doesn't assemble → Boot stalls (systemd waits about 90 seconds by default), then drops to emergency mode
Result: System does not finish booting; needs console access or rescue mode

With nofail:

Boot → Wait for RAID → RAID doesn't assemble → Continue booting anyway
Result: System accessible, you can fix RAID issue

✅ Always use nofail for non-root RAID arrays.
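
A quick way to confirm that an fstab entry really carries the option is to parse the options field. This is a minimal sketch; the function name fstab_has_nofail is hypothetical, and it only inspects the text it is given on stdin.

```shell
# Return 0 if the fstab entry for the given mount point includes nofail.
# $1 = mount point to look up; fstab-formatted text is read from stdin.
fstab_has_nofail() {
    awk -v mp="$1" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        $2 == mp {
            n = split($4, opts, ",")           # field 4 holds the mount options
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail") ok = 1
        }
        END { exit (ok ? 0 : 1) }
    '
}

# Usage against the live file:
#   fstab_has_nofail /mnt/raid10 < /etc/fstab && echo "nofail present"
```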

Test Auto-Mount Without Rebooting

# Unmount array
sudo umount /mnt/raid10

# Test fstab entry
sudo mount -a

# Verify it mounted
df -h | grep raid10

Output:

/dev/md0               3.9G   28K  3.7G   1% /mnt/raid10

If an error occurs: check the fstab syntax and verify that the UUID matches.

Test Reboot (Optional)

sudo reboot

After reboot:

df -h | grep raid10
cat /mnt/raid10/test.txt

Output:

/dev/md0               3.9G   28K  3.7G   1% /mnt/raid10
RAID-10 Performance Test

✅ Should work automatically.

🧹 Step 15: Complete Cleanup (Lab Only)

⚠️ WARNING: THIS DESTROYS THE ARRAY AND ALL DATA

Only do this in test/lab environments!

Stop Using the Array

# Unmount filesystem
sudo umount /mnt/raid10

# Remove from fstab
sudo sed -i '/raid10/d' /etc/fstab

Stop the Array

sudo mdadm --stop /dev/md0

Output:

mdadm: stopped /dev/md0

Erase RAID Metadata (Critical!)

Why this is necessary:

mdadm stores its superblock (metadata) near the start of each member partition (with the default 1.2 metadata format)
Without zeroing: Old metadata confuses new arrays
System might try to auto-assemble old array

sudo mdadm --zero-superblock /dev/sda1
sudo mdadm --zero-superblock /dev/sdb1
sudo mdadm --zero-superblock /dev/sdc1
sudo mdadm --zero-superblock /dev/sdd1
sudo mdadm --zero-superblock /dev/sde1
sudo mdadm --zero-superblock /dev/sdf1

Note: This command gives no output on success. It only outputs errors.

Remove Partitions

for disk in sda sdb sdc sdd sde sdf; do
    echo -e "d\nw" | sudo fdisk /dev/$disk
    sudo partprobe /dev/$disk 2>/dev/null || true
done

What this does:

d → Delete partition
w → Write changes
Repeats for all disks

Remove Array Configuration

Debian/Ubuntu:

sudo sed -i '/md0/d' /etc/mdadm/mdadm.conf
sudo update-initramfs -u

RHEL/CentOS:

sudo sed -i '/md0/d' /etc/mdadm.conf
sudo dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

Verify Complete Cleanup

# No RAID arrays
cat /proc/mdstat

# Disks are clean
lsblk -o NAME,SIZE,TYPE,FSTYPE

# No RAID metadata
sudo mdadm --examine /dev/sda 2>&1 | grep -i "no md"

Expected output:

Personalities : [raid10] 
unused devices: <none>

NAME             SIZE TYPE FSTYPE
sda                2G disk 
sdb                2G disk 
sdc                2G disk 
sdd                2G disk 
sde                2G disk 
sdf                2G disk

📚 Quick Reference Commands

Daily Operations

# Check array status
cat /proc/mdstat
sudo mdadm --detail /dev/md0

# Check array health
sudo mdadm --detail /dev/md0 | grep -E 'State|Active|Failed'

# View rebuild speed limits for this array (KiB/s)
cat /sys/block/md0/md/sync_speed_min
cat /sys/block/md0/md/sync_speed_max
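
For scripted health checks, the member status string in /proc/mdstat (for example [UUUU] or [_UUU]) is the quickest signal. The sketch below assumes the standard mdstat layout; the function name degraded_arrays is hypothetical.

```shell
# Print each md array whose member status string contains "_" (a missing member).
# Reads /proc/mdstat-formatted text on stdin; prints nothing when all arrays are healthy.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the array being described
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\[[U_]+\]$/ && index($i, "_") > 0)
                    print name, $i
        }
    '
}

# Usage:
#   degraded_arrays < /proc/mdstat
```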

Disk Management

# Mark disk as failed
sudo mdadm --manage /dev/md0 --fail /dev/sda1

# Remove failed disk
sudo mdadm --manage /dev/md0 --remove /dev/sda1

# Add replacement disk
sudo mdadm --manage /dev/md0 --add /dev/sde1

# Add hot spare
sudo mdadm --manage /dev/md0 --add-spare /dev/sdf1

Maintenance Commands

# Start manual scrub (integrity check)
echo check | sudo tee /sys/block/md0/md/sync_action

# Check scrub progress
cat /proc/mdstat

# View mismatch count (should be 0)
cat /sys/block/md0/md/mismatch_cnt

# Stop scrub (if needed)
echo idle | sudo tee /sys/block/md0/md/sync_action

Performance Testing

# Write speed test 
sudo dd if=/dev/zero of=/mnt/raid10/write_test \
  bs=1M count=1000 oflag=direct status=progress

# Read speed test
sudo dd if=/mnt/raid10/write_test of=/dev/null \
  bs=1M iflag=direct status=progress

# Random I/O test (requires fio)
sudo fio --name=randwrite --ioengine=libaio --iodepth=16 \
  --rw=randwrite --bs=4k --direct=1 --size=1G \
  --numjobs=4 --runtime=60 --group_reporting \
  --filename=/mnt/raid10/fiotest

# Cleanup
sudo rm /mnt/raid10/write_test /mnt/raid10/fiotest

🎯 Production Deployment Checklist

Before putting RAID-10 into production, verify:

Hardware

[ ] All disks are same size and model
[ ] Disks are from different manufacturing batches
[ ] SMART monitoring enabled on all disks
[ ] Hardware RAID controller (if used) configured correctly
[ ] UPS power protection in place

Configuration

[ ] Array created with --bitmap=internal
[ ] Bitmap visible in mdadm --detail output
[ ] Filesystem created with proper stride and stripe-width
[ ] Mount options include noatime,nodiratime,nofail
[ ] Rebuild speed limits configured in /etc/sysctl.d/

Persistence

[ ] Array configuration saved in /etc/mdadm/mdadm.conf (Debian/Ubuntu) or /etc/mdadm.conf (RHEL/CentOS)
[ ] Initramfs/dracut updated with new config
[ ] fstab entry uses UUID (not /dev/md0)
[ ] fstab includes nofail option

Monitoring

[ ] Monthly scrub scheduled (/etc/cron.monthly/raid-check)
[ ] Daily health checks scheduled (crontab)
[ ] Email alerts configured (mdadm daemon or custom script)
[ ] Logging to /var/log/raid-health.log working

Redundancy

[ ] At least one hot spare added
[ ] Spare disk(s) tested (simulate failure)
[ ] Automatic failover verified
[ ] Replacement disk procedure documented

Testing

[ ] Single disk failure tested
[ ] Data verified accessible during degraded state
[ ] Rebuild process tested and timed
[ ] Hot spare activation tested
[ ] System reboot tested (auto-assembly)
[ ] Performance benchmarks recorded

Backup

[ ] RAID is NOT a backup!
[ ] Regular backups to external system configured
[ ] Backup restore procedure tested
[ ] Recovery time objective (RTO) documented

⚠️ Common Mistakes and How to Avoid Them

Mistake 1: "RAID is my backup"

Wrong:

RAID protects against: Disk failure
RAID does NOT protect against: Accidental deletion, ransomware, 
  corruption, fire, theft, user error

Right:

RAID = Availability (keeps system running)
Backup = Data protection (recovers from disasters)

You need BOTH!

Mistake 2: Forgetting --bitmap=internal

Impact:

Unclean shutdown → Full array resync (hours/days)
Extended vulnerability window
Poor performance during recovery

✅ Always specify: --bitmap=internal when creating array

Mistake 3: No hot spare

Without spare:

Disk fails → You get paged → Drive to datacenter → Replace disk → 
  Start rebuild (30 minutes to hours elapsed)

With spare:

Disk fails → Spare activates immediately → Rebuild starts (30 seconds elapsed)

Mistake 4: Skipping filesystem alignment

Performance loss: up to 20-30% slower on some workloads without proper stride/stripe-width

✅ Always calculate and specify alignment parameters
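
The alignment arithmetic is easy to script. This is a minimal sketch using the formulas from this guide (stride = chunk size ÷ block size, stripe-width = stride × number of data disks); the function name raid10_alignment is hypothetical, and the example values (512 KiB chunk, 4 KiB block, 4 devices) are assumptions.

```shell
# Compute ext4 alignment parameters for an mdadm RAID-10 array.
raid10_alignment() {
    chunk_kib=$1      # mdadm chunk size in KiB (e.g. 512)
    block_bytes=$2    # filesystem block size in bytes (e.g. 4096)
    raid_devices=$3   # number of active devices in the array
    copies=${4:-2}    # copies of each block (2 for near=2)

    stride=$(( chunk_kib * 1024 / block_bytes ))
    data_disks=$(( raid_devices / copies ))
    stripe_width=$(( stride * data_disks ))
    echo "stride=$stride,stripe-width=$stripe_width"
}

# Usage (destroys any existing filesystem on /dev/md0):
#   sudo mkfs.ext4 -E "$(raid10_alignment 512 4096 4)" /dev/md0
```

Check the real chunk size with mdadm --detail /dev/md0 before relying on the result.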

Mistake 5: NOT using nofail in fstab

Without nofail: System won't boot if RAID fails

✅ Always include nofail for non-root arrays

Mistake 6: Same-batch disks

Problem:

Disks from same manufacturing batch fail together
Higher chance of losing both mirrors simultaneously

Solution:

Buy disks from different vendors/batches
Stagger disk purchases over time

🔍 Advanced Topics

Understanding RAID-10 Layouts

This tutorial uses near=2 (default), but mdadm supports three layouts:

Layout 1: near=2 (Default - What We Use)

Disk 0: [A1][A2][A3][A4]  ←┐
Disk 1: [A1][A2][A3][A4]  ←┘ Mirror pair (0↔1)

Disk 2: [B1][B2][B3][B4]  ←┐
Disk 3: [B1][B2][B3][B4]  ←┘ Mirror pair (2↔3)

Characteristics:

✅ Good random read performance (a read can be served by either disk in a pair)
✅ Good write performance
✅ Simple to understand
✅ Recommended for most use cases

Layout 2: far=2

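Disk 0: [1][5] ... [4][8]
Disk 1: [2][6] ... [1][5]
Disk 2: [3][7] ... [2][6]  ← Second copy sits in the far half, shifted by one disk
Disk 3: [4][8] ... [3][7]
(Numbers are chunks in logical order; the first half of every disk is striped like RAID-0)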

Characteristics:

✅ Best sequential read performance (all disks contribute)
⚠️ Slower random writes
Use case: Read-heavy workloads (media streaming)

To use:

mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=4 ...

Layout 3: offset=2

Disk 0: [1][4][5][8]
Disk 1: [2][1][6][5]
Disk 2: [3][2][7][6]  ← Each stripe is followed by a copy shifted by one disk
Disk 3: [4][3][8][7]
(Numbers are chunks in logical order)

Characteristics:

Balance between near and far
Rarely used in practice

Recommendation: Stick with near=2 (default) unless you have specific sequential read requirements.
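
To make the layouts concrete, the sketch below computes which two devices hold a given chunk. It is a simplified model of the placement rules described in md(4) for exactly 2 copies, not code taken from the kernel; the function name chunk_devices is hypothetical, and chunks are numbered from 0.

```shell
# Print the two device positions that hold a logical chunk (2 copies assumed).
chunk_devices() {
    layout=$1   # near | far | offset
    chunk=$2    # logical chunk number, starting at 0
    ndisks=$3   # number of devices in the array
    case "$layout" in
        near)       first=$(( (chunk * 2) % ndisks )) ;;   # copies sit side by side
        far|offset) first=$(( chunk % ndisks )) ;;         # copy is shifted by one device
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
    second=$(( (first + 1) % ndisks ))
    echo "$first $second"
}

# Usage:
#   chunk_devices near 1 4    # second chunk of a 4-disk near=2 array
#   chunk_devices far 3 4     # fourth chunk of a 4-disk far=2 array
```

In this model far and offset place copies on the same devices; they differ only in where on the disk the second copy is stored.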

🚀 Performance Optimization

SSD-Specific Optimizations

For SSDs, add the discard option:

sudo mount -o noatime,nodiratime,discard /dev/md0 /mnt/raid10

Or in fstab:

UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime,discard,nofail 0 2

What discard does:

✅ Enables TRIM support
✅ Tells SSD which blocks are free
✅ Maintains long-term performance
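✅ Helps the SSD's garbage collection and wear leveling over time
⚠️ Continuous discard can add latency on some drives; periodic TRIM with fstrim is a common alternative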

📖 Troubleshooting Guide

Problem: Array won't assemble after reboot

Symptoms:

cat /proc/mdstat
# Shows: Personalities : [raid10]
#        unused devices: <none>

Solutions:

Check if disks are detected:

lsblk -o NAME,SIZE,TYPE,FSTYPE
# Verify sd{a,b,c,d}1 exist

Try manual assembly:

sudo mdadm --assemble --scan --verbose

Check configuration:

sudo grep md0 /etc/mdadm/mdadm.conf   # RHEL/CentOS: /etc/mdadm.conf
# Should show ARRAY /dev/md0 ...

Assemble explicitly by naming the members (add --force only as a last resort):

sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Problem: Slow rebuild speed

Symptoms:

cat /proc/mdstat
# Shows: speed=10000K/sec (very slow)

Solutions:

Check speed limits:

cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

Increase limits:

echo 50000 | sudo tee /proc/sys/dev/raid/speed_limit_min
echo 500000 | sudo tee /proc/sys/dev/raid/speed_limit_max

Check I/O load:

iostat -x 2
# If disks are busy with other I/O, rebuild will be slow
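
To judge whether a given speed is acceptable, it helps to estimate how long the rebuild will take. A rough sketch, assuming a constant speed and that one full member device must be rewritten; the function name rebuild_eta_minutes is hypothetical.

```shell
# Estimate rebuild time in minutes (rounded up) from device size and rebuild speed.
rebuild_eta_minutes() {
    size_gib=$1        # size of the member being rebuilt, in GiB
    speed_kib_s=$2     # sustained speed in KiB/s, the unit /proc/mdstat reports
    seconds=$(( size_gib * 1024 * 1024 / speed_kib_s ))
    echo $(( (seconds + 59) / 60 ))
}

# Usage:
#   rebuild_eta_minutes 2000 10000     # 2000 GiB member at the slow speed shown above
#   rebuild_eta_minutes 2000 100000    # same member at roughly 100 MB/s
```

Real rebuilds slow down under competing I/O, so treat the result as a lower bound.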

Problem: Mismatch count increasing

Symptoms:

cat /sys/block/md0/md/mismatch_cnt
# Shows: 42 (non-zero)

This indicates:

Possible bitrot (data corruption)
Failing disk
Memory errors
Bad SATA cable

Solutions:

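Run repair (note: on RAID-10, md cannot tell which copy is correct; it overwrites the other copies with one of them):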

echo repair | sudo tee /sys/block/md0/md/sync_action

Check SMART status:

sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdb
# Look for reallocated sectors, pending sectors

Test individual disks:

sudo badblocks -sv /dev/sda1

Problem: Array degraded but no failed disk shown

Check detailed status:

sudo mdadm --detail /dev/md0
cat /sys/block/md*/md/sync_action

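Possible causes:

A member dropped out and is listed as "removed" rather than "faulty" (for example, it was not detected at boot)
A member has a stale event count after an unclean shutdown
Bitmap corruption

Solution:

sudo umount /mnt/raid10
sudo mdadm --stop /dev/md0
sudo mdadm --assemble /dev/md0 --force   # relies on the ARRAY entry in mdadm.conf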

📝 Final Notes

What You've Learned

✅ Conceptual understanding:

CORRECTED: RAID-10 mirror pairs work as 0↔1 and 2↔3 with near=2
set-A and set-B indicate striping roles, not mirror partners
Failure tolerance patterns (can lose 2 disks if from different pairs)
Difference between mdadm --level=10 and true nested RAID-1+0

✅ Practical skills:

Creating production-grade RAID-10
Proper filesystem alignment
Monitoring and maintenance
Disaster recovery procedures

✅ Best practices:

Write-intent bitmaps (--bitmap=internal)
Hot spare configuration
Auto-mount with failsafe options (nofail)
Regular integrity checks

Next Steps for Production

Implement monitoring alerts:
- Configure email notifications
- Set up Nagios/Zabbix checks
- Create runbooks for failures
Document your setup:
- Hardware inventory
- Disk serial numbers
- Recovery procedures
- Contact information
Test disaster scenarios:
- Multiple disk failures
- Power loss during rebuild
- Full array recovery from scratch
Establish backup system:
- Regular backups to external storage
- Test restore procedures
- Document retention policies

🎓 Key Takeaways

Remember These Critical Points:

Mirror Pairing in mdadm --level=10 with near=2:
- ✅ CORRECT: Adjacent pairs (0↔1, 2↔3)
- ❌ WRONG: Vertical pairs (0↔2, 1↔3)
- The set-A/set-B labels indicate striping positions, not mirror partners
mdadm --level=10 ≠ True RAID-1+0:
- mdadm version: Automatic adjacent mirrors (probabilistic failure tolerance)
- Nested version: User-defined mirrors (guaranteed failure tolerance)
- Choose nested for mission-critical systems
Always use --bitmap=internal:
- Prevents hours-long resyncs after power loss
- Only ~1MB overhead per 256GB
- Mandatory for production
Filesystem alignment matters:
- Calculate stride = chunk_size ÷ block_size
- Calculate stripe-width = stride × number_of_data_disks
- Impact: 20-30% performance difference
RAID is NOT backup:
- RAID = Availability (protects against disk failure)
- Backup = Data protection (protects against everything else)
- Always have external backups
Always use nofail in fstab:
- Without it: System won't boot if RAID fails
- With it: System boots, you can fix the issue
- Critical for non-root arrays
Hot spares save downtime:
- Automatic failover
- Immediate rebuild
- Essential for 24/7 systems
Reasonable rebuild speeds:
- HDDs: 100-200 MB/s
- SATA SSDs: 300-500 MB/s
- NVMe SSDs: 500 MB/s - 1 GB/s
- Don't set too high (will starve normal I/O)

🔧 Teaching the Contradictions - What Was Fixed

Contradiction #1: Mirror Pairing (MAJOR FIX)

Original guide said:

Mirror Set 1 → Disk 0 ↔ Disk 2 (vertical pairing)
Mirror Set 2 → Disk 1 ↔ Disk 3 (vertical pairing)

Reality with near=2:

Mirror Pair 1 → Disk 0 ↔ Disk 1 (horizontal/adjacent pairing)
Mirror Pair 2 → Disk 2 ↔ Disk 3 (horizontal/adjacent pairing)

How to verify yourself:

# After creating array, fail a disk and check what happens
sudo mdadm --manage /dev/md0 --fail /dev/sda1  # Fail position 0
cat /proc/mdstat
# You'll see [_UUU] - position 0 down, others working
# Data served from position 1 (sdb1), NOT position 2

Visual proof:

If 0↔2 were mirrors (wrong):
  Fail sda1 → Data served from sdc1

Reality (0↔1 are mirrors):
  Fail sda1 → Data served from sdb1 ✓

Contradiction #2: set-A and set-B Meaning

Original guide implied:

set-A = Mirror Set 1
set-B = Mirror Set 2

Actually means:

set-A = First position in each mirror pair (0, 2)
set-B = Second position in each mirror pair (1, 3)

These are striping labels, not mirror identifiers!

How to understand it:

mdadm --detail output:
  Position 0: set-A  ┐
  Position 1: set-B  ┘ These mirror each other
  Position 2: set-A  ┐
  Position 3: set-B  ┘ These mirror each other

set-A and set-B indicate how data is striped across the pairs,
not which disks mirror each other.
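
The relationship between position, set label, and mirror partner can be written down in a few lines. This sketch assumes a near=2 array with an even number of devices, matching the mdadm --detail output shown in this guide; the function name describe_position is hypothetical.

```shell
# Describe a RaidDevice position in a near=2 array: its set label and mirror partner.
describe_position() {
    pos=$1
    # Even positions are labelled set-A, odd positions set-B
    if [ $(( pos % 2 )) -eq 0 ]; then label=A; else label=B; fi
    # The mirror partner is the adjacent position within the same pair (0↔1, 2↔3)
    partner=$(( pos ^ 1 ))
    echo "position $pos: set-$label, mirrors position $partner"
}

# Usage:
#   for p in 0 1 2 3; do describe_position "$p"; done
```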

Contradiction #3: Mistake 5 (nofail)

Original guide said:

Mistake 5: Using nofail in fstab
(Then immediately contradicted itself)

Corrected:

Mistake 5: NOT using nofail in fstab

✅ Always USE nofail for non-root RAID arrays

Why this matters:

# Without nofail in fstab:
UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime 0 2
# → Boot stalls and drops to emergency mode if the array fails to mount

# With nofail:
UUID=... /mnt/raid10 ext4 defaults,noatime,nodiratime,nofail 0 2
# → System boots even if array fails, you can investigate

Contradiction #4: Stripe-Width Terminology

Original guide said:

Stripe-width = Stride × Number of Stripe Groups
Number of stripe groups = 2 for RAID-10 with 4 disks

Corrected terminology:

Stripe-width = Stride × Number of Data Disks
Number of data disks = 2 for RAID-10 with 4 disks

(The other 2 disks hold mirrors, not unique data)

Why "data disks" is clearer:

RAID-10 with 4 disks: usable capacity equals 2 disks; the other half of the space holds mirror copies
Stripe-width should span all unique data
"Stripe groups" is non-standard terminology

Contradiction #5: Rebuild Speed Values

Original guide recommended:

dev.raid.speed_limit_min = 10000   (10 MB/s)
dev.raid.speed_limit_max = 500000  (500 MB/s)

But showed example output:

dev.raid.speed_limit_min = 1000000  (1000 MB/s)
dev.raid.speed_limit_max = 2000000  (2000 MB/s)

Why the example was wrong:

1-2 GB/s is too aggressive for most hardware
Will starve normal I/O operations
Only suitable for high-end NVMe arrays with no other workload

Corrected recommendation:

# General purpose (balanced)
dev.raid.speed_limit_min = 50000   (50 MB/s)
dev.raid.speed_limit_max = 500000  (500 MB/s)

# Adjust based on hardware:
- HDDs: 100000-200000
- SATA SSDs: 300000-500000
- NVMe (idle system): 500000-1000000
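
One way to make these values persistent is a small file under /etc/sysctl.d/. The sketch below only generates the file content; the function name raid_sysctl_fragment and the file name 90-raid.conf are hypothetical choices, not something mdadm requires.

```shell
# Print a sysctl fragment with the chosen rebuild speed limits (values in KiB/s).
raid_sysctl_fragment() {
    min_kib=$1
    max_kib=$2
    if [ "$min_kib" -gt "$max_kib" ]; then
        echo "speed_limit_min must not exceed speed_limit_max" >&2
        return 1
    fi
    printf 'dev.raid.speed_limit_min = %s\n' "$min_kib"
    printf 'dev.raid.speed_limit_max = %s\n' "$max_kib"
}

# Usage (writes the file and reloads all sysctl settings):
#   raid_sysctl_fragment 50000 500000 | sudo tee /etc/sysctl.d/90-raid.conf
#   sudo sysctl --system
```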

Contradiction #6: Partition Type (Minor)

Original guide used:

Hex code: fd  (Linux RAID autodetect)

Note added:

This is legacy but still works
Modern systems can use 83 (Linux); 8e (Linux LVM) also works but is intended for LVM physical volumes
mdadm 3.0+ doesn't require autodetect type
Not wrong, just slightly outdated

Both work fine:

# Legacy (still works)
fdisk: t → fd

# Modern (also works)
fdisk: t → 83

🎯 Quick Summary of All Fixes

Issue	Original	Corrected
Mirror pairs	0↔2, 1↔3 (vertical)	0↔1, 2↔3 (horizontal)
set-A/set-B	Mirror identifiers	Striping position labels
Mistake 5	"Using nofail" (contradictory)	"NOT using nofail"
Stripe-width term	"Stripe groups"	"Data disks"
Rebuild speed	Example showed 1-2 GB/s	Use 50-500 MB/s
Partition type	Only mentioned `fd`	Added note about `83`/`8e`

🧪 How to Verify the Corrections Yourself

Test 1: Verify Mirror Pairing

# Create array
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
  --bitmap=internal /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Fail position 0
sudo mdadm --manage /dev/md0 --fail /dev/sda1

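# Check which disk serves data
# (assumes the filesystem from Step 6 is mounted at /mnt/raid10 and test.txt exists)
sudo dd if=/mnt/raid10/test.txt of=/dev/null iflag=direct
iostat -x 1 5   # run in a second terminal while reading
# Reads for chunks that lived on sda1 are now served by sdb1 (position 1)
# This proves 0↔1 are mirrors, not 0↔2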

Test 2: Verify Failure Tolerance

# Start fresh
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
  --bitmap=internal /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Case A: Fail both disks of the same pair
sudo mdadm --manage /dev/md0 --fail /dev/sda1 /dev/sdb1
cat /proc/mdstat
# Result: Array FAILS (both mirrors of pair 1 gone)
# (depending on kernel version, md may refuse to mark the last disk of a pair as faulty)

# Case B: Stop and recreate the array as above, then fail disks from different pairs
sudo mdadm --manage /dev/md0 --fail /dev/sda1 /dev/sdc1
cat /proc/mdstat
# Result: Array SURVIVES (each pair still has one disk)
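
The expected outcome of such a test can be predicted before touching any disks. This is a minimal sketch of the rule stated above (the array survives as long as every adjacent pair keeps at least one member), assuming near=2 and an even device count; the function name near2_survives is hypothetical.

```shell
# Predict whether a near=2 array survives a set of failed positions.
# $1 = number of devices (even); remaining arguments = distinct failed positions.
near2_survives() {
    ndisks=$1
    shift
    pair=0
    while [ "$pair" -lt $(( ndisks / 2 )) ]; do
        lost=0
        for pos in "$@"; do
            # Positions 2k and 2k+1 form mirror pair k
            if [ $(( pos / 2 )) -eq "$pair" ]; then
                lost=$(( lost + 1 ))
            fi
        done
        if [ "$lost" -ge 2 ]; then
            echo "FAILS"
            return 1
        fi
        pair=$(( pair + 1 ))
    done
    echo "SURVIVES"
}

# Usage:
#   near2_survives 4 0 1    # both members of pair 0
#   near2_survives 4 0 2    # one member from each pair
```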

Test 3: Verify nofail Behavior

# Add to fstab WITHOUT nofail
UUID=... /mnt/raid10 ext4 defaults 0 2

# Stop array
sudo mdadm --stop /dev/md0

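# Simulate the mount step (a real boot is needed to see systemd's wait)
sudo systemctl daemon-reload
sudo mount -a
# Result: mount -a reports an error because the UUID cannot be found;
# at boot, systemd would wait for the device and then enter emergency mode

# Now add nofail
UUID=... /mnt/raid10 ext4 defaults,nofail 0 2

# Try again
sudo mount -a
# Result: The missing device is skipped without an error; boot would continue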

📖 Additional Resources

Official Documentation

mdadm man page: man mdadm
Linux RAID Wiki: https://raid.wiki.kernel.org/
md driver documentation: /usr/share/doc/mdadm/

Community Support

Linux RAID mailing list: linux-raid@vger.kernel.org
Stack Exchange: Unix & Linux / Server Fault
Reddit: r/sysadmin, r/homelab

✅ Final Checklist

Before considering this guide complete, verify:

Understanding:

[ ] I understand how near=2 creates mirror pairs (0↔1, 2↔3)
[ ] I know the difference between mdadm --level=10 and nested RAID-1+0
[ ] I understand what set-A and set-B actually mean
[ ] I can calculate filesystem alignment parameters
[ ] I know why nofail is critical in fstab

Practical Skills:

[ ] I can create a RAID-10 array with proper parameters
[ ] I can simulate and recover from disk failures
[ ] I can configure hot spares
[ ] I can set up monitoring and alerts
[ ] I can configure auto-mount correctly

Production Readiness:

[ ] I have tested failure scenarios
[ ] I have backup systems in place
[ ] I have documented my configuration
[ ] I have monitoring alerts configured
[ ] I understand this is NOT a backup solution

🎉 Congratulations!

You now have a corrected, production-ready understanding of RAID-10 with mdadm.

Key achievements:

✅ Understand true mirror pairing behavior
✅ Can build optimized RAID-10 arrays
✅ Know how to handle failures and recoveries
✅ Understand the difference between RAID and backup
✅ Can deploy this knowledge in production

Remember: RAID provides availability, not data protection. Always maintain proper backups!

📞 Questions or Issues?

If you encounter problems:

Check the Troubleshooting Guide (above)
Review the Quick Reference Commands
Verify array status: sudo mdadm --detail /dev/md0
Check system logs: dmesg | grep -i raid or journalctl -xe
Consult community resources (listed above)

Stay safe, keep backups, and happy RAID-ing! 🚀
