How I made a request to Hetzner to replace a disk in a RAID array

I have already written to Hetzner quite a few times asking them to replace disks in a RAID array, and in this article I will describe one of those cases.

So, one morning, after a disk dropped out of a RAID array, mdadm sent me an email notification.

Updating the Linux kernel on Hetzner servers

One day I needed to update the CentOS kernel on a server at Hetzner that had been running without a reboot for about two years.

I checked general information about the system, the versions of the installed kernels, and the currently running kernel:

lsb_release -a
uname -r
uname -a
cat /proc/version
sudo rpm -q kernel
ls /boot | grep vmlinuz

I updated the kernel:

yum -y update
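
Before rebooting, you can additionally check which kernel the bootloader will load by default; on CentOS this can be done, for example, with grubby:

grubby --default-kernel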

I rebooted the server using the reboot link in cPanel: https://example.com:2087/scripts/dialog?dialog=reboot

You can also use the command:

reboot

After the reboot the server did not come back up, so I immediately opened a support request via https://robot.your-server.de.
Technical support responded within two minutes: they reported that the server showed a black screen and did not respond to keystrokes, and that after a physical power-off and power-on it started successfully.
Such things can happen when rebooting, so you have to be ready for anything.
On subsequent reboots the hang did not occur again.

How to fix the problem with mdadm disks

I received three email messages from one of the servers at Hetzner with information about the RAID arrays md0, md1 and md2:

DegradedArray event on /dev/md/0:example.com
This is an automatically generated mail message from mdadm
running on example.com
A DegradedArray event had been detected on md device /dev/md/0.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] [raid1]
md2 : active raid6 sdb3[1] sdd3[3]
208218112 blocks super 1.0 level 6, 512k chunk, algorithm 2 [4/2] [_U_U]
md1 : active raid1 sdb2[1] sdd2[3]
524224 blocks super 1.0 [4/2] [_U_U]
md0 : active raid1 sdb1[1] sdd1[3]
12582784 blocks super 1.0 [4/2] [_U_U]
unused devices: <none>

I looked at the information about RAID and disks:

cat /proc/mdstat
cat /proc/partitions
mdadm --detail /dev/md0
mdadm --detail /dev/md1
mdadm --detail /dev/md2
fdisk -l | grep '/dev/sd'
fdisk -l | less

I was going to open a ticket with technical support and plan the replacement of the dropped SSD disks.
I saved the SMART data of the dropped disks to files, which also contain their serial numbers:

smartctl -x /dev/sda > sda.log
smartctl -x /dev/sdc > sdc.log
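
The serial numbers needed for the support ticket can also be printed directly, for example:

smartctl -i /dev/sda | grep -i 'serial number'
smartctl -i /dev/sdc | grep -i 'serial number'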

If possible, remove the disks from the RAID arrays:

mdadm /dev/md0 -r /dev/sda1
mdadm /dev/md1 -r /dev/sda2
mdadm /dev/md2 -r /dev/sda3

mdadm /dev/md0 -r /dev/sdc1
mdadm /dev/md1 -r /dev/sdc2
mdadm /dev/md2 -r /dev/sdc3

If some partition of the disk is still shown as active but the disk has to be removed, first mark that partition as failed and then remove it. For example, if /dev/sda1 and /dev/sda2 have dropped out but /dev/sda3 is still active:

mdadm /dev/md2 -f /dev/sda3
mdadm /dev/md2 -r /dev/sda3

In my case, after examining the information about the dropped disks, I found that they were intact and healthy, in even better condition than the active ones.

I checked the disk partition tables (in fdisk, p prints the partition table and q quits without saving changes):

fdisk /dev/sda
p
q
fdisk /dev/sdc
p
q

The partitions were laid out the same way as before:

Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00015e3f

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1567    12582912+  fd  Linux raid autodetect
/dev/sda2            1567        1633      524288+  fd  Linux raid autodetect
/dev/sda3            1633       14594   104109528+  fd  Linux raid autodetect

Therefore, I added these disks back to the RAID arrays, waiting for each one to finish resynchronizing:

mdadm /dev/md0 -a /dev/sda1
mdadm /dev/md1 -a /dev/sda2
mdadm /dev/md2 -a /dev/sda3

mdadm /dev/md0 -a /dev/sdc1
mdadm /dev/md1 -a /dev/sdc2
mdadm /dev/md2 -a /dev/sdc3
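
The resynchronization progress can be watched, for example, with:

watch -n 5 cat /proc/mdstat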

When the resynchronization finished, cat /proc/mdstat showed [UUUU] for all arrays.

If the disks are replaced with new ones, they need to be partitioned in the same way as the installed ones.
An example of copying the MBR partition table from /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk --force /dev/sdb

An example of copying a GPT partition table from /dev/sda to /dev/sdb and then randomizing the new disk's GUIDs:

sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb
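
In either case, you can check that the new disk's layout matches the old one, for example with lsblk, and then add the new partitions to the arrays with mdadm -a as shown above:

lsblk -o NAME,SIZE,TYPE /dev/sda /dev/sdb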

You also need to install the bootloader on the newly installed disk:

grub-install --version
grub-install /dev/sdb
update-grub

Or via the GRUB shell (hd0 is /dev/sda, hd0,1 is /dev/sda2):

cat /boot/grub/device.map
grub
device (hd0) /dev/sda
root (hd0,1)
setup (hd0)
quit

If GRUB is being installed from the rescue system, you first need to find and mount the root partition. For example, if RAID is not used:

ls /dev/[hsv]d[a-z]*[0-9]*
mount /dev/sda3 /mnt

If you are using software RAID:

ls /dev/md*
mount /dev/md2 /mnt
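
If the arrays are not assembled automatically in the rescue system, they can usually be assembled first with:

mdadm --assemble --scan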

Or LVM:

ls /dev/mapper/*
mount /dev/mapper/vg0-root /mnt

Then prepare the environment and chroot into it:

chroot-prepare /mnt
chroot /mnt
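
chroot-prepare is a helper available in the Hetzner rescue system; if it is not present, the pseudo-filesystems can be bind-mounted manually before chrooting (a minimal sketch):

mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt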

After mounting, you can restore GRUB as I wrote above.

See also my other articles:
How I made a request to Hetzner to replace a disk in a RAID array
The solution to the error “md: kicking non-fresh sda1 from array”
The solution to the warning “mismatch_cnt is not 0 on /dev/md*”
mdadm – a utility for managing software RAID arrays
Description of RAID types
Diagnosing HDDs using smartmontools
Recovering GRUB in Linux

Setting up the backup space at Hetzner.de

To test this, I logged into https://robot.your-server.de/, opened Main functions – Servers, selected the server and, in the Backup tab, activated the free 100 GB of backup space, which is included at no charge for servers costing 39 € or more.
I activated WebDAV for testing; Samba was already enabled. It is also possible to connect via FTP, FTPS, SFTP and SCP using a user name and password, and over SFTP/SCP you can additionally connect using a key.
The data transfer speed to the backup server depends on the number of connected users and their traffic.
When connecting, use the domain name, for example USER.your-backup.de, because the IP address can change.
Also note that you cannot create directories named /etc and /lib on the backup space.

As an example, on Ubuntu Server I will mount the backup space using Samba/CIFS.
Install the necessary utilities and create the directory where it will be mounted:

sudo apt-get install cifs-utils
sudo mkdir /backup

Temporarily mount the backup space with the command:

sudo mount.cifs -o user=USER,pass=PASSWORD //USER.your-backup.de/backup /backup

To automatically mount it after a system reboot, add the following line to the /etc/fstab file:

//USER.your-backup.de/backup /backup   cifs  iocharset=utf8,rw,credentials=/etc/backup-credentials.txt,uid=SYSTEM_USER,gid=SYSTEM_GROUP,file_mode=0660,dir_mode=0770 0 0

You can edit the file, for example, with the nano text editor (Ctrl+X to exit, then y/n to save or discard the changes):

sudo nano /etc/fstab

And add the following lines to the file /etc/backup-credentials.txt:

username=USER
password=PASSWORD

For security reasons, set the permissions so that only the file owner can read it:

sudo chmod 600 /etc/backup-credentials.txt
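
After that, the fstab entry can be tested without rebooting, for example:

sudo mount -a
df -h /backup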

If you use the Windows operating system, you need to create a system user with the same login and password as the backup space.
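
After creating the user, the backup share can be mapped as a network drive from the Windows command line, for example (the drive letter B: is arbitrary; you will be prompted for the password):

net use B: \\USER.your-backup.de\backup /user:USER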

Now I will test the connection via WebDAV.
Install the necessary utilities and create the directory where it will be mounted:

sudo apt-get install davfs2
sudo mkdir /backup

In CentOS (the davfs2 package is usually available from the EPEL repository):

yum install davfs2
mkdir /backup

You can temporarily mount the backup space via WebDAV with the command:

sudo mount -t davfs https://USER.your-backup.de /backup

To automatically mount after rebooting the system, add the following line to the /etc/fstab file:

https://USER.your-backup.de /backup davfs rw,uid=SYSTEM_USER,gid=SYSTEM_GROUP,file_mode=0660,dir_mode=0770 0 0

And add the following line to the /etc/davfs2/secrets file:

https://USER.your-backup.de USER PASSWORD
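
As with the CIFS credentials file, restrict access to the secrets file so that only root can read it:

sudo chmod 600 /etc/davfs2/secrets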

That’s all; in my case, backups can now be saved to the /backup directory.
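
For example, a simple archive of /etc written to the mounted backup space might look like this (the file name is only an illustration):

sudo tar -czf /backup/etc-$(date +%F).tar.gz /etc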