Monitoring mdadm in Zabbix

Here is a simple example of monitoring an mdadm software RAID in Zabbix.

Let’s say a RAID 1 mirror is configured on the server:

cat /proc/mdstat

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda2[1] sdb2[2]
      234295296 blocks super 1.2 [2/2] [UU]
      bitmap: 2/2 pages [8KB], 65536KB chunk

In the status field, each working disk is shown as the letter U, and a disk that has dropped out of the array is shown as an underscore. In the example above both disks of the mirror are working, so the field reads [UU].

Now let’s run a command that counts the lines in /proc/mdstat whose status field contains an underscore (if there are none, the output is 0):

grep -Ec "\[.*_.*\]" /proc/mdstat
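
To see what the command reports for a failed disk without breaking a real array, the same pattern can be run against sample text. This is only a minimal sketch: count_degraded is a hypothetical helper name, and the degraded sample is made up by dropping one disk from the output shown above.

```shell
#!/bin/sh
# count_degraded: hypothetical helper. Counts the lines in the given file
# whose bracketed status field contains an underscore (a missing member).
count_degraded() {
    grep -Ec '\[.*_.*\]' "$1"
}

# Sample data, made up for illustration: a healthy and a degraded mirror.
healthy=$(mktemp)
degraded=$(mktemp)
printf '%s\n' 'md0 : active raid1 sda2[1] sdb2[2]' \
    '      234295296 blocks super 1.2 [2/2] [UU]' > "$healthy"
printf '%s\n' 'md0 : active raid1 sda2[1]' \
    '      234295296 blocks super 1.2 [2/1] [U_]' > "$degraded"

echo "healthy:  $(count_degraded "$healthy")"
echo "degraded: $(count_degraded "$degraded")"

rm -f "$healthy" "$degraded"
```

The healthy sample gives 0 and the degraded one gives 1. Because the command counts matching lines, the value is the number of degraded arrays rather than the number of missing disks.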

Now let’s add it to the configuration file /etc/zabbix/zabbix_agentd.conf:

UserParameter=cat_mdstat,grep -Ec "\[.*_.*\]" /proc/mdstat

Restart zabbix-agent to apply the changes:

systemctl restart zabbix-agent

You can check that the new key works:

zabbix_agentd -t cat_mdstat
cat_mdstat [t|0]

Now you can create a template, add an item with the key cat_mdstat and the type "Zabbix agent" to it, and create a trigger that fires if the value is greater than 0.
I also saved my template here

Note that a disk, an SSD in particular, can begin to fail partially while still remaining in the array. It is therefore worth setting up email notifications from smartd (the smartmontools daemon that accompanies smartctl), so that you receive SMART test results when there are errors and, if necessary, can remove the problematic disk from the array yourself.
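
Alongside the notifications, a disk's overall SMART status can also be checked by hand. The sketch below is an illustration, not part of the template: smart_failing is a hypothetical helper, /dev/sda is an assumed device name, and the check relies on the exit status documented in the smartctl man page, where bit 3 (value 8) means the SMART status check returned "DISK FAILING".

```shell
#!/bin/sh
# smart_failing: hypothetical helper. Takes the exit status of
# `smartctl -H <device>` and succeeds if bit 3 (value 8) is set, which the
# smartctl man page describes as: SMART status check returned "DISK FAILING".
smart_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Usage (not run here; /dev/sda is an assumed device name):
#   smartctl -H /dev/sda >/dev/null; status=$?
#   smart_failing "$status" && echo "/dev/sda reports a failing SMART status"
```

A disk can pass this check and still be degrading, so it complements the SMART test results and error logs rather than replacing them.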

See my other articles about Zabbix
mdadm – utility for managing software RAID arrays
