Every once in a while hardware needs to be replaced. Sometimes boards, sometimes CPUs and sometimes drives. Drives are the ones most difficult to replace as they hold all that precious data one does not want to lose.
I personally hate reinstalling stuff, so I try as much as possible to “migrate” my data around with the least effort (on my part) and the least downtime.
For clarity, the system details are:
- /dev/sda, /dev/sdb – are the two new 3TB drives
- /dev/sdc, /dev/sdd – are another raid set which is not affected by this procedure
- /dev/sde, /dev/sdf – are the two old 500GB drives, with 2 raid-1 arrays on them (md0 and md1)
- /dev/md0 (raid 1) made up of sde1 and sdf1, to be migrated to sda1 and sdb1
- /dev/md1 (raid 1) made up of sde2 and sdf2, to be migrated to sda2 and sdb2
- the other arrays are not affected by the transition
You’ll be able to get the drive designations in order by re-arranging the cables/drives/ports before starting the new procedure (Linux raid is smart enough to recognise its drives no matter the ports/connectors they’re on).
This will only require one reboot so the downtime should be minimal. The plan is that the rest of the procedure be done on the live system without more downtime (not even when removing the old drives, if your board/system supports hotplug).
When making changes to important data and/or on a live system, ALWAYS TRIPLE CHECK targets before performing destructive operations. And ALWAYS HAVE BACKUPS. Otherwise you can lose everything within seconds.
Create the new partition table on one of the new bigger drives. You can use either fdisk or gparted or any other tool you like.
Since my drives are larger than 2TB (and I’ll be using a partition that’s also larger than 2TB) I had to use the new GPT partition table format.
In the commands for this step, sda is the source and sdb is the destination. The last command generates new random GUIDs on the second disk, so the two disks do not end up with identical identifiers. sgdisk was not available in most repositories when this was written, but it can be installed from source. (Edit: The sgdisk application is part of the gdisk package, which as of 2017 is available on most current distros.)
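The commands for this copy step are not shown above. Here is a minimal sketch of a three-step copy that fits the description, assuming sgdisk's --backup, --load-backup and -G (randomize GUIDs) options. The file name table.gpt and the helper names run and copy_gpt are made up for illustration. By default it only prints the commands, so you can check the source/destination order before running anything for real.

```shell
#!/bin/sh
# Sketch: copy a GPT partition table from one disk to another with sgdisk.
# Assumptions: sgdisk is installed; "table.gpt" is a hypothetical backup file name.
# With DRY_RUN=1 (the default) the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

copy_gpt() {
    src="$1"; dst="$2"
    # Refuse obviously wrong calls: missing arguments or identical disks.
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: copy_gpt SOURCE_DISK DEST_DISK (must differ)" >&2
        return 1
    fi
    run sgdisk --backup=table.gpt "$src"       # save the source table to a file
    run sgdisk --load-backup=table.gpt "$dst"  # write that table to the destination
    run sgdisk -G "$dst"                       # randomize disk and partition GUIDs
}

copy_gpt /dev/sda /dev/sdb
```

Set DRY_RUN=0 only after the printed commands look right.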
Alternatively, the process can be compressed down to one command:
sgdisk -R <NewDisk> <ExistingDisk>
However, be ABSOLUTELY sure you write the disk identifiers in the correct order: the new (destination) disk comes first, the existing (source) disk second.
For MS-DOS (MBR) partition tables,
sfdisk -d /dev/sda | sfdisk /dev/sdb
will work just fine (sda is source/old, sdb is destination/new disk).
Changing the drives in the arrays
First off, disable the write-intent bitmap (if enabled).
mdadm --grow /dev/md0 --bitmap=none
mdadm --grow /dev/md1 --bitmap=none
Now it’s time to add the partitions on the new drives to the arrays:
mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md0 /dev/sdb1
mdadm --grow --raid-devices=4 /dev/md0
Then the second array
mdadm --add /dev/md1 /dev/sda2
mdadm --add /dev/md1 /dev/sdb2
mdadm --grow --raid-devices=4 /dev/md1
Now we wait for the data to sync (this will take minutes to hours depending on size). The sync status can be monitored with
watch cat /proc/mdstat
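If you would rather not sit and watch, here is a small sketch that polls /proc/mdstat until no array reports a recovery, resync or reshape. The function names and the optional file argument are my own additions for illustration. The one-minute polling interval is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: report whether any md array is still syncing, based on /proc/mdstat text.
# The file argument is a hypothetical convenience so the check can be tried
# on a saved copy of mdstat.
md_syncing() {
    mdstat="${1:-/proc/mdstat}"
    # While an array rebuilds, mdstat shows a progress line such as
    # "[==>......]  recovery = 12.6% (...)"; resync and reshape look the same,
    # and queued arrays show "resync=DELAYED".
    grep -Eq '(recovery|resync|reshape) *=' "$mdstat"
}

# Block until every array has finished, polling once a minute.
wait_for_sync() {
    while md_syncing "$@"; do
        sleep 60
    done
    echo "all arrays in sync"
}
```

Calling wait_for_sync with no arguments reads the live /proc/mdstat.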
If you notice the sync speed is too low (and you expect the disks to be capable of more speed), you can check the sync limits that are in place with:
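The command for checking the limits is not shown here. One way to do it, as a sketch, is to read the two files the kernel exposes under /proc/sys/dev/raid (the same values that sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max prints). The function name and its directory argument are hypothetical.

```shell
#!/bin/sh
# Sketch: print the current md sync speed limits (KiB/s, per device).
# The directory argument is hypothetical, for trying this outside a real system.
show_raid_limits() {
    dir="${1:-/proc/sys/dev/raid}"
    for name in speed_limit_min speed_limit_max; do
        if [ -r "$dir/$name" ]; then
            echo "dev.raid.$name = $(cat "$dir/$name")"
        else
            echo "dev.raid.$name = (not available)"
        fi
    done
}

show_raid_limits
```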
You could, for example, increase the minimum limit with
sysctl -w dev.raid.speed_limit_min=value
A couple of hours later, when the migration is complete, it’s time to distance yourself from the old disks and remove them from the arrays (but don’t be too hasty in throwing them away; they hold a backup of all your data, after all).
Continue by making the new disks bootable.
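The boot loader commands are not shown here, and they depend on your setup. As a sketch only, assuming GRUB 2 on a BIOS machine (where booting from GPT also needs a small BIOS boot partition on each disk), installing to both new disks would look like the following. Some distros name the tool grub2-install instead. The helper name is made up, and it only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: install the boot loader on both new disks so either one can boot alone.
# Assumptions: GRUB 2 on a BIOS system; the tool is called grub-install.
# Commands are printed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

install_boot_loader() {
    for disk in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "grub-install $disk"
        else
            grub-install "$disk"
        fi
    done
}

install_boot_loader /dev/sda /dev/sdb
```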
Then soft-fail and remove the old disks from the arrays:
mdadm --fail /dev/md0 /dev/sde1
mdadm --remove /dev/md0 /dev/sde1
mdadm --fail /dev/md0 /dev/sdf1
mdadm --remove /dev/md0 /dev/sdf1
mdadm --grow --raid-devices=2 /dev/md0
mdadm --fail /dev/md1 /dev/sde2
mdadm --remove /dev/md1 /dev/sde2
mdadm --fail /dev/md1 /dev/sdf2
mdadm --remove /dev/md1 /dev/sdf2
mdadm --grow --raid-devices=2 /dev/md1
Check /proc/mdstat again to make sure you removed the right disks and everything is correct and synced.
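To make that check less error-prone, here is a small sketch that lists the members of one array as /proc/mdstat reports them, so you can compare against the partitions you expect (sda1 and sdb1 for md0, sda2 and sdb2 for md1). The function name and the optional file argument are hypothetical.

```shell
#!/bin/sh
# Sketch: list the member partitions of one array as shown in /proc/mdstat.
# The file argument is a hypothetical convenience for trying this on a saved copy.
md_members() {
    array="$1"; mdstat="${2:-/proc/mdstat}"
    # An array line looks like: "md0 : active raid1 sdb1[3] sda1[2]"
    # Members are the fields carrying a "[N]" slot suffix.
    awk -v a="$array" '$1 == a && $2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\[[0-9]+\]/) { sub(/\[.*/, "", $i); print $i }
    }' "$mdstat" | sort
}
```

For example, md_members md0 should print only sda1 and sdb1 once the old disks are out.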
You can now disconnect the old drives (don’t erase them just yet, they still hold all your data from a few hours ago).
Enlarging the arrays to use up all the new available space
This step is a simplified version of my previous tutorial on enlarging a RAID1 array.
Start by checking that you can live-resize the array. You can do this with mdadm --examine /dev/sda1 (and /dev/sdb1)
# mdadm --examine /dev/sda1
[...]
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 629145456 (300.00 GiB 322.12 GB)   « available space
Array Size : 131071928 (125.00 GiB 134.22 GB)         needs to be larger than
Used Dev Size : 262143856 (125.00 GiB 134.22 GB)    « used space
Super Offset : 629145584 sectors
State : clean
[...]
If the available space is the same as the used space, you’ll need to use the offline procedure described in the tutorial linked above.
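The same comparison can be scripted. This is a rough sketch that reads mdadm --examine output on stdin and succeeds only if "Avail Dev Size" is larger than "Used Dev Size". The function name is made up, and it assumes the field labels look like the sample output above.

```shell
#!/bin/sh
# Sketch: decide from `mdadm --examine` output whether a member has spare room.
# Reads the examine text on stdin; relies on the "Avail Dev Size" and
# "Used Dev Size" labels, both reported in sectors.
can_grow() {
    awk -F: '
        /Avail Dev Size/ { split($2, f, " "); avail = f[1] }
        /Used Dev Size/  { split($2, f, " "); used = f[1] }
        END { exit !(avail + 0 > used + 0) }
    '
}

# usage: mdadm --examine /dev/sda1 | can_grow && echo "can grow online"
```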
Enlarge the arrays using
mdadm --grow /dev/md0 --size=max
mdadm --grow /dev/md1 --size=max
Check that the arrays have been resized with --examine. The arrays will sync once more (wait for them to finish).
Then grow the filesystems to the available space
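The resize command depends on the filesystem, and it is not shown here. As a sketch, assuming ext2/3/4 filesystems sitting directly on the arrays (no LVM in between), resize2fs with no size argument grows the filesystem to fill the device. The helper name is hypothetical, and it only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: grow the filesystem on each array to fill it.
# Assumption: the arrays hold ext2/3/4 filesystems directly; other
# filesystems need their own tool. Commands are printed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

grow_filesystems() {
    for dev in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "resize2fs $dev"
        else
            resize2fs "$dev"   # no size given: grow to the size of the device
        fi
    done
}

grow_filesystems /dev/md0 /dev/md1
```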
Re-enable write intent bitmap (to improve sync speeds and fault tolerance)
mdadm --grow /dev/md0 --bitmap=internal
mdadm --grow /dev/md1 --bitmap=internal
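Then record the arrays in mdadm.conf (this appends to the file, so check it afterwards and remove any duplicate ARRAY lines)
mdadm --examine --scan >> /etc/mdadm/mdadm.conf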
Schedule a filesystem check at the next reboot just to keep things clean
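The command for this is not shown here. On sysvinit-style systems the usual way is to create an empty /forcefsck file, which the boot scripts notice and remove after the check. This is a sketch under that assumption. Systemd-based distros generally expect the fsck.mode=force kernel parameter instead. The function name and its root argument are hypothetical.

```shell
#!/bin/sh
# Sketch: request a filesystem check on the next boot by creating /forcefsck.
# Assumption: a sysvinit-style boot that honours this flag file.
# The root argument is hypothetical, so the function can be tried in a
# scratch directory; with no argument it touches /forcefsck (needs root).
schedule_fsck() {
    root="${1:-}"
    touch "$root/forcefsck"
}
```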
To reboot right away and perform a filesystem check
shutdown -rF now