Mellanox ConnectX errors “Could not join netdev: No space left on device”

After installing the Mellanox ConnectX-3 Pro EN / HP 544+QSFP (764284-B21) network adapter in an HPE DL380 Gen10 server, I found these errors in /var/log/syslog:

systemd-networkd: vlan972: netdev ready
systemd-networkd: vlan951: netdev ready
systemd-networkd: vlan601: netdev ready
systemd-networkd: ens4d1: Could not join netdev: No space left on device
systemd-networkd: ens4d1: Failed
systemd-networkd: ens2d1: Link UP
systemd-networkd: ens2d1: Gained carrier
systemd-networkd: vlan959: Link is not managed by us
systemd-networkd: vlan991: Link is not managed by us

Then I found more errors, this time from the PCI subsystem:

pci 0000:12:00.0: BAR 9: no space for [mem size 0x20000000 64bit pref]
pci 0000:12:00.0: BAR 9: failed to assign [mem size 0x20000000 64bit pref]
pci 0000:12:00.0: BAR 6: no space for [mem size 0x00100000 pref]
pci 0000:12:00.0: BAR 6: failed to assign [mem size 0x00100000 pref]
pci 0000:12:00.0: BAR 2: assigned [mem 0xc4000000000-0xc4001ffffff 64bit pref]
pci 0000:12:00.0: BAR 9: assigned [mem 0xc4002000000-0xc4021ffffff 64bit pref]

I had installed two network adapters at once. Here is the normal driver output logged while the server boots:

dmesg -T
dmesg -T | grep mlx
mlx4_core: Mellanox ConnectX core driver v4.0-0
mlx4_core: Initializing 0000:12:00.0
mlx4_core 0000:12:00.0: DMFS high rate steer mode is: disabled performance optimized steering
mlx4_core 0000:12:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
mlx4_core 0000:12:00.0: PCIe link width is x8, device supports x8
mlx4_core: Initializing 0000:af:00.0
mlx4_core 0000:af:00.0: DMFS high rate steer mode is: disabled performance optimized steering
mlx4_core 0000:af:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
mlx4_core 0000:af:00.0: PCIe link width is x8, device supports x8
mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
mlx4_en 0000:12:00.0: Activating port:2
mlx4_en: 0000:12:00.0: Port 2: Using 28 TX rings
mlx4_en: 0000:12:00.0: Port 2: Using 16 RX rings
mlx4_en: 0000:12:00.0: Port 2: Initializing port
mlx4_en 0000:12:00.0: registered PHC clock
mlx4_en 0000:af:00.0: Activating port:2
mlx4_core 0000:12:00.0 ens2d1: renamed from eth0
mlx4_en: 0000:af:00.0: Port 2: Using 28 TX rings
mlx4_en: 0000:af:00.0: Port 2: Using 16 RX rings
mlx4_en: 0000:af:00.0: Port 2: Initializing port
mlx4_en 0000:af:00.0: registered PHC clock
<mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
mlx4_core 0000:af:00.0 ens4d1: renamed from eth0
<mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0
<mlx4_ib> mlx4_ib_add: counter index 2 for port 2 allocated 1
<mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0
<mlx4_ib> mlx4_ib_add: counter index 2 for port 2 allocated 1

The first network adapter worked well and without errors. One IP address was configured on its first port, and the second port was not used.
The second network adapter worked only partially and produced the errors shown above. Its netplan configuration put everything on the first port: 200 VLANs without IP addresses, a couple of VLANs with an IP address, and one IP address on the port itself without a VLAN tag.

I counted the VLAN interfaces and found only 126 of them:

ip a | grep ': vlan' | wc -l
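The total alone does not show which port the VLANs sit on. A minimal sketch that breaks the count down per parent interface, assuming the usual "vlanNNN@parent" naming in the one-line output of "ip -o link show type vlan" (the function name count_vlans_per_parent is my own, not part of any tool):

```shell
# Hypothetical helper: read `ip -o link` style lines on stdin and print
# "<parent> <count>" for every parent that carries VLAN sub-interfaces.
count_vlans_per_parent() {
    awk -F': ' '
        {
            # Field 2 looks like "vlan972@ens4d1"; keep the part after "@".
            split($2, a, "@")
            if (a[2] != "") n[a[2]]++
        }
        END { for (p in n) print p, n[p] }
    ' | sort
}

# Typical use on a live system:
#   ip -o link show type vlan | count_vlans_per_parent
```

A parent whose count stays stuck just below 128 no matter how many VLANs are configured is a hint that a per-port limit is involved.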

Each time I ran “netplan apply”, some VLANs disappeared and others appeared, but the total number of running VLANs did not change.

I updated the driver and firmware of the network adapters to the latest versions, but it didn’t help.

Later I learned that ConnectX®-3, ConnectX®-3 Pro, ConnectX®-4, ConnectX®-4 Lx and ConnectX®-5 Ex network adapters only support 128 MAC/VLANs per port.
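The arithmetic behind this limit is simple ceiling division. A small sketch, assuming the 128 figure applies to each port independently (ports_needed is a hypothetical helper name):

```shell
# Hypothetical helper: how many ports are needed to carry TOTAL VLANs
# when each port accepts at most LIMIT of them (ceiling division).
ports_needed() {
    total=$1
    limit=$2
    echo $(( (total + limit - 1) / limit ))
}
```

For example, "ports_needed 200 128" prints 2. Only 126 VLANs actually came up on my port, so the usable budget may be slightly below the nominal limit and it is worth leaving some headroom.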

I found no mention of a per-port VLAN limit in the documentation for the HP 544+QSFP (764284-B21) network adapter, for example:
https://h20195.www2.hpe.com/v2/default.aspx?cc=my&lc=en&oid=6938455
The Mellanox ConnectX-3 Pro EN product brief does not mention it either:
https://www.mellanox.com/files/doc-2020/pb-connectx-3-pro-card-en.pdf
Little is written about this for other models as well, but this document does mention it:
https://www.mellanox.com/related-docs/prod_adapter_cards/ConnectX3_EN_Card.pdf

To solve the problem, I took a second QSFP+ cable, connected it to the second port of the network adapter, and configured 100 VLANs on each port. After I applied the network configuration with “netplan apply”, all 200 VLANs worked.
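Writing 200 VLAN stanzas by hand is tedious, so the split can be generated. This is only a sketch: the VLAN ID ranges (100-199 and 200-299) and the function names are made up for illustration, and the two port names must be replaced with the real interface names of the adapter. It uses the standard netplan "vlans" keys "id" and "link":

```shell
# Print netplan "vlans" entries for COUNT consecutive VLAN IDs on one link.
# Usage: gen_vlans FIRST_ID COUNT LINK
gen_vlans() {
    first=$1
    count=$2
    link=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        id=$((first + i))
        printf '    vlan%s:\n      id: %s\n      link: %s\n' "$id" "$id" "$link"
        i=$((i + 1))
    done
}

# Print a complete netplan document with 100 VLANs on each of two ports.
# The ID ranges below are hypothetical placeholders.
# Usage: gen_split_config PORT_A PORT_B
gen_split_config() {
    port_a=$1
    port_b=$2
    printf 'network:\n  version: 2\n  ethernets:\n'
    printf '    %s:\n      dhcp4: false\n' "$port_a" "$port_b"
    printf '  vlans:\n'
    gen_vlans 100 100 "$port_a"
    gen_vlans 200 100 "$port_b"
}
```

The output can be redirected into a file under /etc/netplan/ and checked with “netplan try” before it is applied, so that each port stays well under the per-port limit.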
