Replacing a Failed Drive in an mdadm Array on Ubuntu

Published: October 30, 2020 at 4:10:40 PM UTC

If you're in the dreaded situation of having a drive failure in an mdadm RAID array, this article explains how to correctly replace it on an Ubuntu system.

The information in this post is based on Ubuntu 18.04 and the version of mdadm included in its repositories (v4.1-rc1 at the time of writing). It may or may not be valid for other versions.

I recently had a sudden drive failure in my home file server, which consists of nine drives in an mdadm RAID-6 array. That's always scary, but fortunately I was able to quickly source a replacement drive that was delivered the very next day, so I could get the rebuild started.

I was admittedly a bit too cheap when I originally set up the file server; only two of the drives are actual NAS drives (Seagate IronWolf), while the rest are desktop drives (Seagate Barracuda). Not surprisingly, it was one of the desktop drives that had given up (after almost three years of service, though). It was completely dead; after moving it to a desktop USB enclosure all I got out of it was an unnerving clicking sound and neither Ubuntu 20.04 nor Windows 10 was able to detect it.

Oh well, on to the replacement part (and yes, the new drive I bought was an IronWolf, lesson learned) - as scary as it is losing a drive in a running array, it's even scarier if you don't know the correct procedure for replacing it. It's not the first time I've had to replace a failed drive in an mdadm array, but fortunately it's so rare that I usually have to look up the proper commands. This time I decided to whip up my own little guide for future reference.

So, first of all, when you get the dreaded fail event e-mail from mdadm, you need to identify which drive has failed. Sure, it will tell you the device name (in my case /dev/sdf), but it's probably not obvious which physical drive that actually is, as those names can change when the machine is booted.

If you're not even sure which device name has failed, you can use the following command to find out (replace /dev/md0 with your RAID device):

mdadm --query --detail /dev/md0

As mentioned, in my case it was /dev/sdf, so let's continue with that.
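If you'd rather not scan the full --detail output by eye, the kernel also flags failed members with (F) in /proc/mdstat. Here is a minimal sketch of pulling those out, assuming the usual mdstat layout where members are listed like sdf1[5](F). The function name failed_members is made up for this example.

```shell
# List array members that the kernel has marked as failed.
# Reads mdstat-formatted text on stdin; failed members look like "sdf1[5](F)".
failed_members() {
    grep -oE '[[:alnum:]_-]+\[[0-9]+\]\(F\)' | sed 's/\[.*//'
}

# Usage on a live system:
#   failed_members < /proc/mdstat
```

It prints one member name per line and nothing at all if no member is flagged.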

Then, you can try to find the serial number of the failed drive by issuing this command:

smartctl --all /dev/sdf | grep -i 'Serial'

(if smartctl is not found, you need to install the smartmontools package on Ubuntu)

The serial number can then be compared to the serial numbers on the physical label on the drives to figure out which one has failed.

This time, I wasn't so lucky, though. The drive was completely dead and even refused to provide SMART or other data, including the serial number.
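A possible fallback in that situation is elimination: collect the serial numbers the system can still read and compare them with the ones printed on the drive labels. Whichever label serial is not reported belongs to the dead drive. The sketch below assumes you have typed the label serials into a text file yourself, one per line. The function name and the file names are hypothetical.

```shell
# Print the serials that appear on the drive labels but are no longer
# reported by the system. Both arguments are text files, one serial per line.
missing_serials() {
    label_file=$1
    detected_file=$2
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -vxFf "$detected_file" "$label_file"
}

# Usage on a live system (labels.txt is a list you write by hand):
#   lsblk -d -n -o SERIAL > detected.txt
#   missing_serials labels.txt detected.txt
```

This obviously requires being able to read the labels without pulling the drives, which may or may not be the case in your chassis.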

Since I had physical access to the server (which you really need if you're going to replace a physical drive yourself, I suppose ;-)) and the server was actually running when the disk failed (and continued to run fine thanks to the RAID-6 redundancy), I went with the really primitive, but actually highly effective and obvious, method of simply copying a large file to the server and watching which HDD light didn't flicker. Within a few seconds I had identified the culprit.

Now, before yanking out the physical drive, it's a good idea to formally inform mdadm of this intent, by issuing this command (replace device names with your own as appropriate):

mdadm --manage /dev/md0 --remove /dev/sdf1

On success, mdadm will reply with a message saying that it "hot removed" the drive, apparently because the virtual RAID device is actually running at the time.

If it fails with an error message similar to "device or resource busy", it may be that mdadm has not actually registered the drive as failed. To make it do that, issue this command (again, remember to replace device names with your own as appropriate):

mdadm --manage /dev/md0 --fail /dev/sdf1

After that, you should be able to remove the device from the array with the previous command.
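Since a typo in a device name is the last thing you want here, it can help to wrap the two steps in a small function that prints the commands first. This is only a sketch: the run and fail_and_remove names and the DRY_RUN variable are made up for this example, and it assumes the array member is the partition /dev/sdf1, as in the remove command above.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark a member as failed, then remove it from the array.
# The remove step only runs if the fail step succeeded.
fail_and_remove() {
    array=$1
    member=$2
    run mdadm --manage "$array" --fail "$member" &&
        run mdadm --manage "$array" --remove "$member"
}

# Usage:
#   DRY_RUN=1 fail_and_remove /dev/md0 /dev/sdf1   # print only
#   fail_and_remove /dev/md0 /dev/sdf1             # for real (as root)
```

Marking a member that is already failed as failed again should be harmless, but I haven't verified that on every mdadm version, so do the dry run first.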

Now it's time to actually replace the drive. If you're really, really - like, really - certain your machine and controller support hot swapping, you could do this without shutting down the machine. That would be the way to go on critical production systems running on real, proper server hardware that you know for a fact can handle it. My home file server is based on a consumer grade desktop motherboard with a couple of semi-noname SATA controllers in the PCIe slots to provide more SATA ports, though.

Although SATA generally should support hot swapping, I wasn't about to risk anything in this setup, so I opted for shutting down the machine while replacing the drive.

Before doing that, it's a good idea to comment out the raid device in the /etc/fstab file so that Ubuntu won't try to mount it automatically on the next boot, because it might hang and force you into recovery mode due to the degraded RAID array. That may not be a big issue if it's a desktop system, but I run this server headless without monitor or keyboard attached, so this would be a bit of a hassle.
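If you prefer not to edit the file by hand, the same thing can be done with sed. This is a minimal sketch, assuming the fstab entry starts with the device path itself (e.g. /dev/md0). Entries that use UUID=... or LABEL=... need a different pattern. The function name comment_out_mount is made up for this example.

```shell
# Comment out the fstab entry for a given device, keeping a .bak copy
# of the original file next to it.
comment_out_mount() {
    fstab=$1
    device=$2
    # "&" re-inserts the matched text, so the line just gains a leading "#"
    sed -i.bak "s|^${device}[[:space:]]|#&|" "$fstab"
}

# Usage (as root):
#   comment_out_mount /etc/fstab /dev/md0
```

Undoing it later is just a matter of removing the leading # again, or restoring the .bak copy if nothing else in the file has changed in the meantime.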

After booting the machine with the shiny new drive installed, use lsblk or some other means to identify it. If you haven't changed anything else, it will probably (but not necessarily) get the same name as the drive you replaced. In my case it did, so the new one is also called /dev/sdf.

As my array is based on partitions rather than physical devices, I needed to copy the partition table from a working drive to the new drive in order to make sure they're exactly the same. If you run your array on physical devices instead, you can skip this step.

I used sgdisk for this purpose, copying the partition table from /dev/sdc to /dev/sdf. Make sure to replace device names to match your own as appropriate.

Notice the order here: you list the "to" drive first! This is a bit counter-intuitive to me, but just make sure you get it right so you don't get another drive failure in the array ;-)

sgdisk -R /dev/sdf /dev/sdc

Then to avoid UUID conflicts, generate new UUIDs for the new drive:

sgdisk -G /dev/sdf
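Because swapping the two arguments would overwrite the partition table of a healthy drive, a small guard can be worth the extra typing. The sketch below refuses to continue if the target drive still shows up as an array member in /proc/mdstat, which is a strong hint that the arguments are the wrong way around. The function name and the MDSTAT and DRY_RUN variables are made up for this example. The sgdisk calls are the same two as above.

```shell
# Copy the GPT from a healthy member (source) to the replacement drive
# (target), then give the target fresh GUIDs.
# MDSTAT can point at another file for testing; DRY_RUN=1 only prints.
copy_partition_table() {
    source_disk=$1
    target_disk=$2
    target_name=${target_disk##*/}   # /dev/sdf -> sdf
    # An active member shows up in mdstat as e.g. "sdf1[5]"
    if grep -q "${target_name}[0-9]*\[" "${MDSTAT:-/proc/mdstat}"; then
        echo "refusing: $target_disk is in use by an array" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sgdisk -R $target_disk $source_disk"
        echo "sgdisk -G $target_disk"
    else
        sgdisk -R "$target_disk" "$source_disk" && sgdisk -G "$target_disk"
    fi
}

# Usage:
#   DRY_RUN=1 copy_partition_table /dev/sdc /dev/sdf
```

Note that the function takes the source first and the target second, the opposite of sgdisk -R itself, so read the dry-run output carefully before running it for real.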

And now finally the time has come to add the new drive to the array and get the rebuild party started! (Okay, it's not really a party, it's actually a quite slow and unnerving process as you really, really don't want another drive failing at this time. Beer might help, though)

Anyway, to add the new drive to the array, issue this command (again, make sure to replace device names with your own as appropriate):

mdadm --manage /dev/md0 --add /dev/sdf1

If all goes well, the drive will be added to the array without hiccups. I believe it is actually added as a "hot spare" by default, but since this array is missing a disk (the one that failed), it is immediately put into use and the rebuild process will start.

You can keep an eye on it like so:

watch cat /proc/mdstat
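If you only care about the numbers (for a cron job or a quick check over SSH, say), the progress line can be boiled down to its essentials. This is a sketch that assumes the usual progress line format in /proc/mdstat, with "recovery = N%" followed by finish= and speed= fields. The function name rebuild_progress is made up for this example.

```shell
# Reduce mdstat-formatted text on stdin to one short line per rebuilding
# array: the operation, percentage done, estimated finish and current speed.
rebuild_progress() {
    sed -nE 's/.*(recovery|resync) *= *([0-9.]+%).*(finish=[^ ]+) +(speed=[^ ]+).*/\1 \2 \3 \4/p'
}

# Usage on a live system:
#   rebuild_progress < /proc/mdstat
```

It prints nothing when no rebuild is in progress.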

This will probably take a while; on my lowly server (based largely on consumer grade hardware and desktop drives, mind you) it was able to reach just under 100 MB/sec. Bear in mind that this is RAID-6, so there are a lot of parity calculations involved in a rebuild; a RAID-10 would have been much faster. This particular machine has an AMD A10 9700E quad core CPU (the "E" meaning that it's an under-clocked energy efficient model, i.e. not super fast), just to give you an idea of what to expect. With the nine 8 TB drives in my setup, the full rebuild took just over 24 hours.
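Those numbers roughly add up: the new drive has to be written once from start to end, so the drive size divided by the rebuild speed gives a lower bound for the duration. Here is a back-of-the-envelope sketch, assuming decimal terabytes as printed on the drive label and a constant speed (which it won't be in practice). The function name rebuild_hours is made up for this example.

```shell
# Estimate rebuild time in hours from drive size (TB) and speed (MB/s).
rebuild_hours() {
    awk -v tb="$1" -v mbs="$2" \
        'BEGIN { printf "%.1f\n", tb * 1000000 / mbs / 3600 }'
}

rebuild_hours 8 100   # prints 22.2
rebuild_hours 8 90    # prints 24.7
```

So at a steady 100 MB/sec an 8 TB drive would take about 22 hours, and an average closer to 90 MB/sec lands at just over 24.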

During the rebuild, you can mount the filesystem on the array and use it as normal if you wish, but I prefer to leave it alone until the rebuild is done. Bear in mind that if one drive fails, another may soon follow, so you want the rebuild to finish as quickly as possible. Therefore, don't burden the array with other I/O that isn't strictly necessary.

Once it's done, uncomment the RAID device in your /etc/fstab file, reboot and enjoy your files :-)