The reason for increasing RX overruns on the network adapter

I once noticed on one of the servers that the RX overruns counter was slowly growing.

I executed several commands (where p2p1 and p2p2 are the names of network interfaces):

ifconfig p2p1
ifconfig p2p2

Of all the error counters, only RX overruns was growing, by about 10 packets per second, with traffic at about 2 Gb/s (100,000 packets per second).
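To measure such a rate instead of comparing two ifconfig runs by eye, the counter can be sampled twice. This is a minimal sketch, assuming that the "overruns" value printed by ifconfig corresponds to the RX "fifo" column of /proc/net/dev. The function names rx_fifo and overrun_rate are made up for illustration:

```shell
# Print the RX "fifo" counter (shown by ifconfig as "overruns") for an interface.
# $1 = interface name, $2 = stats file (defaults to /proc/net/dev).
rx_fifo() {
    awk -v ifc="$1" '
        {
            line = $0
            sub(/:/, ": ", line)   # some kernels print "eth0:1234" without a space
            split(line, f, " ")
            # f[1]="iface:" f[2]=bytes f[3]=packets f[4]=errs f[5]=drop f[6]=fifo
            if (f[1] == (ifc ":")) print f[6]
        }' "${2:-/proc/net/dev}"
}

# Print the average growth of the counter per second.
# $1 = interface name, $2 = sampling interval in seconds (default 10).
overrun_rate() (
    secs="${2:-10}"
    before=$(rx_fifo "$1")
    sleep "$secs"
    after=$(rx_fifo "$1")
    echo $(( (after - before) / secs ))
)

# Usage: overrun_rate p2p1 10
```

A longer interval smooths out bursts, and the integer division rounds down, so a result of 0 does not prove the counter stood still.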
The server was equipped with the network adapter “HP NC552SFP 10Gb 2-Port Ethernet Server Adapter” with a network controller from Emulex.

I checked the maximum and current sizes of the ring buffers:

ethtool -g p2p1
ethtool -g p2p2

It turned out that the buffers were already set to their maximum: the TX buffer was 4096, but the largest possible RX buffer was only 512.
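The comparison between the current and maximum ring sizes can be condensed into two lines. This is a sketch that assumes the usual ethtool -g layout (a "Pre-set maximums:" section followed by "Current hardware settings:"). The function name ring_summary is hypothetical:

```shell
# Read `ethtool -g <iface>` output on stdin and print the current size and
# the pre-set maximum of the RX and TX rings.
ring_summary() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        # Only the plain "RX:" and "TX:" rows; "RX Mini:" etc. have $1 == "RX".
        $1 == "RX:" || $1 == "TX:" {
            name = $1
            sub(/:/, "", name)
            val[section, name] = $2
        }
        END {
            printf "RX current=%s max=%s\n", val["cur", "RX"], val["max", "RX"]
            printf "TX current=%s max=%s\n", val["cur", "TX"], val["max", "TX"]
        }'
}

# Usage: ethtool -g p2p1 | ring_summary
```

When the current value is below the maximum, it can be raised with ethtool -G (for example, ethtool -G p2p1 rx 4096). On this adapter there was no headroom left for RX.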
See also my article – Changing TX and RX network interface buffers in Linux

I then checked how the network card's interrupts were distributed across the processor cores:

grep p2p1 /proc/interrupts

The network adapter had at most 4 IRQs, so irqbalance had assigned them to only 4 of the server's 24 cores.
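The same check can be scripted to compare the number of interrupt vectors with the number of cores. This is a rough sketch, assuming (as the grep above does) that the vector names in /proc/interrupts contain the interface name. The function names irq_count and irq_numbers are made up for illustration:

```shell
# Count the interrupt vectors registered for an interface.
# $1 = interface name, $2 = interrupts file (defaults to /proc/interrupts).
irq_count() {
    grep -c -- "$1" "${2:-/proc/interrupts}"
}

# Print the IRQ numbers of an interface, one per line.
irq_numbers() {
    awk -v ifc="$1" 'index($0, ifc) { sub(/:/, "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Usage (on the server itself):
#   echo "vectors: $(irq_count p2p1), cores: $(nproc)"
#   for irq in $(irq_numbers p2p1); do
#       printf 'IRQ %s -> CPUs %s\n' "$irq" "$(cat /proc/irq/$irq/smp_affinity_list)"
#   done
```

If the vector count is much lower than the core count, as with 4 against 24 here, receive processing is confined to a few cores no matter how irqbalance places them.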

The problem was solved by replacing the network adapter with a more expensive one: “665249-B21 HP 10Gb 2-port 560SFP+ Adapter” with an Intel 82599 network controller.
After that, the overruns no longer appeared, both the RX and TX buffers were 4096, and the IRQs were distributed across all 24 cores.
After a couple of days, the error counters remained at zero:

p2p1      RX packets:62535001155 errors:0 dropped:0 overruns:0 frame:0
          TX packets:36343078751 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:77395016742081 (77.3 TB)  TX bytes:10991051263063 (10.9 TB)

p2p2      RX packets:35672087256 errors:0 dropped:0 overruns:0 frame:0
          TX packets:58598868464 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:10996254475480 (10.9 TB)  TX bytes:73378418623349 (73.3 TB)

The previous network adapter was probably some kind of cut-down version, since it cost half as much. In general, for serious workloads I find it better to use network adapters with an Intel controller.

See also my article – Configuring the Network in Linux

Monitoring Linux ISG in Zabbix

Today I wanted to monitor Linux ISG sessions in Zabbix.

I ran the following command on one of the servers:

/opt/ISG/bin/ISG.pl show_count

The output was:

Approved sessions count: 2021
Unapproved sessions count: 2

The Zabbix agent was already installed on the server, so I opened its configuration file (in the nano editor, press Ctrl+X to exit, then y/n to save or discard the changes):

nano /etc/zabbix/zabbix_agentd.conf

I added the following user parameters:

UserParameter=isg.approved, /opt/ISG/bin/ISG.pl show_count | grep "Approved sessions count:" | awk '{print $4}'
UserParameter=isg.unapproved, /opt/ISG/bin/ISG.pl show_count | grep "Unapproved sessions count:" | awk '{print $4}'
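The grep and awk pair can also be collapsed into a single awk call, which is easy to try out against the sample output before touching the agent configuration. This is a small sketch, and the function name isg_count is hypothetical:

```shell
# Read `ISG.pl show_count` output on stdin and print the counter whose label
# starts with the given word ("Approved" or "Unapproved").
isg_count() {
    awk -v label="$1" \
        '$1 == label && $2 == "sessions" && $3 == "count:" { print $4 }'
}

# Usage: /opt/ISG/bin/ISG.pl show_count | isg_count Approved
```

Matching on the first field exactly keeps "Approved" and "Unapproved" apart without relying on letter case. Once the keys are in the configuration file, zabbix_agentd -t isg.approved can be used to check from the command line what the agent would return.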

Allow the Zabbix agent to run as root by specifying:

AllowRoot=1

Restart the Zabbix agent to apply the changes:

sudo /etc/init.d/zabbix-agent restart

On the Zabbix server, create an ISG template and add items to it with the type Zabbix agent and the keys isg.approved and isg.unapproved.
Create graphs for these items.

Apply the template to the desired hosts.

Done.