I have often had to ask Hetzner to replace a disk in a RAID array, and in this article I will describe one such case.
One morning, after a disk dropped out of an array, mdadm sent me an email notification.
I checked the status of the arrays:
cat /proc/mdstat
The output showed that the first disk had dropped out:
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid6 sdc3 sdd3 sdb3
      208218112 blocks super 1.0 level 6, 512k chunk, algorithm 2 [4/3] [_UUU]

md1 : active raid1 sdc2 sdd2 sdb2 sda2(F)
      524224 blocks super 1.0 [4/3] [_UUU]

md0 : active raid1 sdc1 sdd1 sdb1
      12582784 blocks super 1.0 [4/3] [_UUU]
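Reading this output by eye works, but the same check can be scripted. Below is a minimal sketch that prints the arrays whose status map (such as [_UUU]) contains an underscore, meaning a member slot is empty. The function name degraded_arrays is my own invention for illustration, and the sketch assumes array names of the form mdN and the usual /proc/mdstat layout shown above.

```shell
# Sketch: list md arrays with at least one missing member.
# degraded_arrays is a hypothetical helper name, not part of mdadm.
# It reads an mdstat-format file (default: /proc/mdstat).
degraded_arrays() {
    mdstat="${1:-/proc/mdstat}"
    awk '
        # Remember the array name when its header line is seen.
        /^md[0-9]+ :/ { name = $1 }
        # A status map such as [_UUU] with "_" in it marks a missing member.
        name != "" && /\[[U_]*_[U_]*\]/ { print name; name = "" }
    ' "$mdstat"
}
```

Called without arguments it reads /proc/mdstat; an empty result means no array is degraded.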
I checked which drives were present in the system:
fdisk -l | grep '/dev/sd'
In my case, there were sdb, sdc, sdd, and sda was missing:
Disk /dev/sdc: 120.0 GB, 120034123776 bytes
/dev/sdc1 1 1567 12582912+ fd Linux raid autodetect
/dev/sdc2 1567 1633 524288+ fd Linux raid autodetect
/dev/sdc3 1633 14594 104109528+ fd Linux raid autodetect
Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
/dev/sdd1 1 1567 12582912+ fd Linux raid autodetect
/dev/sdd2 1567 1633 524288+ fd Linux raid autodetect
/dev/sdd3 1633 14594 104109528+ fd Linux raid autodetect
Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
/dev/sdb1 1 1567 12582912+ fd Linux raid autodetect
/dev/sdb2 1567 1633 524288+ fd Linux raid autodetect
/dev/sdb3 1633 14594 104109528+ fd Linux raid autodetect
I confirmed that the system did not see the disk:
smartctl -x /dev/sda
Smartctl open device: /dev/sda failed: No such device
To request a replacement disk, you must provide the serial number of the failed disk. If the disk is not visible to the system, you must instead report the serial numbers of all the working disks.
I submitted the request through the site robot.your-server.de
Let’s look at the serial numbers of the working disks:
smartctl -x /dev/sdb
smartctl -x /dev/sdc
smartctl -x /dev/sdd
I got the following information:
...
=== START OF INFORMATION SECTION ===
Model Family: SandForce Driven SSDs
Device Model: Corsair CSSD-F120GB2
Serial Number: 10446526320009980370
LU WWN Device Id: 5 000000 009980370
Firmware Version: 1.1
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 6
Local Time is: Tue Nov 20 21:40:16 2018 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, NOT FROZEN [SEC1]
...
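Since only the serial number is needed for the request, it can be picked out of the smartctl output instead of being copied by hand. This is a small sketch of how I would do it; serial_from_smartctl is a hypothetical helper name, and it assumes the "Serial Number:" line looks like the one shown above.

```shell
# Sketch: print the serial number from smartctl output read on stdin.
# serial_from_smartctl is a hypothetical helper name, not a smartctl feature.
serial_from_smartctl() {
    # Split on the colon and any spaces after it, then print the value.
    awk -F': *' '/^Serial Number:/ { print $2 }'
}
```

For example, smartctl -i /dev/sdb | serial_from_smartctl prints just the serial of sdb, and the same pipeline can be repeated for sdc and sdd.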
In my cases, Hetzner replaced the drive within about half an hour, free of charge, with a used SSD about five years old.
After the disk is replaced, check that the system sees it, then partition it the same way as one of the disks already installed:
fdisk -l | grep '/dev/sd'
sfdisk -d /dev/sdc | sfdisk --force /dev/sda
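The sfdisk command overwrites the partition table of the target disk, so mixing up source and target would wipe a working array member. As a precaution, a check like the sketch below can be run first. The name disk_is_active_member is hypothetical, and the sketch assumes sdX-style device names. A member flagged as failed, such as sda2(F) in the output earlier, is deliberately not counted as active.

```shell
# Sketch: succeed (exit 0) if any partition of the given disk, e.g. "sda",
# is listed as an active, non-failed member in an mdstat-format file.
# disk_is_active_member is a hypothetical helper name.
disk_is_active_member() {
    disk="$1"
    mdstat="${2:-/proc/mdstat}"
    # Matches tokens such as "sdc3" or "sdc3[0]", but not "sda2(F)".
    grep -Eq "(^|[[:space:]])${disk}[0-9]+(\[[0-9]+\])?([[:space:]]|\$)" "$mdstat"
}
```

If disk_is_active_member sda succeeds, the disk is still in use by an array and should not be repartitioned.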
Next, add the partitions of the new disk to the arrays and wait for synchronization to finish:
mdadm /dev/md0 -a /dev/sda1
cat /proc/mdstat
mdadm /dev/md1 -a /dev/sda2
cat /proc/mdstat
mdadm /dev/md2 -a /dev/sda3
cat /proc/mdstat
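Rather than rerunning cat /proc/mdstat by hand, the waiting can be scripted. The sketch below checks whether any array still reports recovery, resync or reshape activity. The name resync_running is hypothetical, and it assumes the usual progress lines of the form "recovery = 12.6% ..." that the kernel prints in /proc/mdstat. As far as I know, mdadm --wait /dev/md0 is a built-in alternative that blocks until the array is idle.

```shell
# Sketch: succeed (exit 0) while any array reports rebuild activity.
# resync_running is a hypothetical helper name; it reads an mdstat-format
# file (default: /proc/mdstat).
resync_running() {
    mdstat="${1:-/proc/mdstat}"
    # Progress lines look like "recovery = 12.6% (...)"; queued work shows
    # up as "resync=DELAYED" or "resync=PENDING".
    grep -Eq '(recovery|resync|reshape) *=' "$mdstat"
}
```

A simple loop such as: while resync_running; do sleep 30; done returns once all arrays are in sync.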
Now all that remains is to install the bootloader on the new disk. For GRUB2:
grub-install /dev/sda
update-grub
Or GRUB1 (hd0 is /dev/sda, hd0,1 – /dev/sda2):
cat /boot/grub/device.map
grub
device (hd0) /dev/sda
root (hd0,1)
setup (hd0)
quit
My server used GRUB1, so this is how the installation went:
grub
Probing devices to guess BIOS drives. This may take a long time.
GNU GRUB version 0.97 (640K lower / 3072K upper memory)
[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename.]
grub> device (hd0) /dev/sda
device (hd0) /dev/sda
grub> root (hd0,1)
root (hd0,1)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 27 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd0) (hd0)1+27 p (hd0,1)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.
grub> quit
P.S. Here is an example of backing up and restoring a partition table, first for MBR (sfdisk) and then for GPT (sgdisk):
sfdisk --dump /dev/sda > sda_parttable_mbr.bak
sfdisk /dev/sda < sda_parttable_mbr.bak
sgdisk --backup=sda_parttable_gpt.bak /dev/sda
sgdisk --load-backup=sda_parttable_gpt.bak /dev/sda
See also my article:
How to fix the problem with mdadm disks