Mellanox NICs with VLAN-Aware Bridges on Linux
A member of my Discord came to me with an interesting problem: enabling the VLAN-aware bridge in Proxmox stopped all network traffic on the physical card, entirely. Definitely a frustrating issue, especially since the kernel logs made no sense.
The Problem
Here’s what he sent from dmesg:
[ 32.732509] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4080) > (512) max vport list size, some vlans will be dropped
[ 32.735782] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4081) > (512) max vport list size, some vlans will be dropped
[ 32.739011] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4082) > (512) max vport list size, some vlans will be dropped
[ 32.742247] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4083) > (512) max vport list size, some vlans will be dropped
[ 32.745550] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4084) > (512) max vport list size, some vlans will be dropped
[ 32.748835] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4085) > (512) max vport list size, some vlans will be dropped
[ 32.751987] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4086) > (512) max vport list size, some vlans will be dropped
[ 32.755209] mlx5_core 0000:19:00.1: mlx5e_vport_context_update_vlans:179:(pid 13470): netdev vlans list size (4087) > (512) max vport list size, some vlans will be dropped
So somehow the mlx5 driver is unhappy with how many VLANs he’s tagging (all of them), and only wants to pass 512. But even when he reduced that number, it still didn’t work. For reference, this is what /etc/network/interfaces looks like with this configuration in Proxmox:
#Network adapter
auto enp6s0f1np1
iface enp6s0f1np1 inet manual
#Proxmox bridge
auto vmbr0
iface vmbr0 inet6 static
address 2001:db8:beef:cafe::6969/64
gateway 2001:db8:beef:cafe::420
bridge-ports enp6s0f1np1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
Note the bridge-vlan-aware and bridge-vids options. In this case, the bridge supports VLANs, and VIDs 2-4094 are tagged (so VLAN 1 is untagged). He tried cutting the list down to 255, still no dice.
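To make the numbers in that warning a bit more concrete, here is a minimal sketch that counts how many VLAN IDs a bridge-vids value expands to and compares it against the 512 "max vport list size" from the dmesg output. The count_vids function and the MAX_VPORT_VLANS variable are names I made up for illustration, and the script assumes a whitespace-separated list of single IDs and ranges. It only explains the warning; as noted, staying under the limit did not bring his traffic back.

```shell
#!/bin/sh
# Hypothetical helper: count the VLAN IDs in a bridge-vids style list.
# Assumes whitespace-separated single IDs and ranges, e.g. "2-4094" or "10 20 100-110".
count_vids() {
    total=0
    for item in $1; do
        case "$item" in
            *-*)
                # Range "start-end": both ends are included.
                start=${item%-*}
                end=${item#*-}
                total=$((total + end - start + 1))
                ;;
            *)
                # Single VLAN ID.
                total=$((total + 1))
                ;;
        esac
    done
    printf '%s\n' "$total"
}

# 512 is the "max vport list size" reported by mlx5_core in the log above.
MAX_VPORT_VLANS=512

vids="2-4094"
n=$(count_vids "$vids")
if [ "$n" -gt "$MAX_VPORT_VLANS" ]; then
    echo "bridge-vids '$vids' expands to $n VLANs, more than $MAX_VPORT_VLANS"
else
    echo "bridge-vids '$vids' expands to $n VLANs, within $MAX_VPORT_VLANS"
fi
```

On a running system, bridge vlan show dev enp6s0f1np1 lists what the bridge actually programmed on the port.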
What made this even more interesting is that we happen to have the exact same card! A Mellanox ConnectX-4 dual 25G, both Dell branded as well! So I hopped on my test setup to figure out what the heck was going on, since my setup was working perfectly fine.
The Solution
After searching the forums, we found two things:
- Nvidia are total assholes of the Linux community, their support forums just said ‘oh it looks like you’re using the Linux kernel drivers, we can’t help unless you use our proprietary drivers’. Linus was right to hate them! (“Nvidia has been one of the worst trouble spots we’ve had with hardware manufacturers”).
- Someone managed to fix it despite their non-help by enabling promiscuous mode.
So now that the problem is pseudo-solved, why does my setup work when his does not, when we are using identical cards with the same kernel version (Linux 6.2.16-12-pve and Proxmox VE 8.0.4)? Maybe my dmesg will help:
[ 17.708193] device bond0 entered promiscuous mode
[ 68.300117] device enp6s0f1np1 entered promiscuous mode
I’m using a bond, and the bond sets its slaves into promiscuous mode. In my case, I’m using an active-backup bond between my 25G link and the 1G on the motherboard as a backup. He’s not using a bond, so no promiscuous mode, so the Mellanox driver does weird things and can’t handle VLANs properly.
Here’s the solution: ip link set dev enp6s0f1np1 promisc on
We can add that to our /etc/network/interfaces like this:
#Network adapter
auto enp6s0f1np1
iface enp6s0f1np1 inet manual
#Enable promiscuous mode on the slave interface
post-up ip link set dev $IFACE promisc on
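If you want to confirm the setting actually stuck after a reboot, here is a small sketch that reads the interface flags from sysfs and tests the IFF_PROMISC bit (0x100 in linux/if.h). The function names and the IFACE_TO_CHECK variable are my own for illustration, and I am assuming /sys/class/net/<iface>/flags is available on your system, as it is on typical Linux installs.

```shell
#!/bin/sh
# IFF_PROMISC bit as defined in <linux/if.h>.
IFF_PROMISC=$((0x100))

# Hypothetical helper: succeeds if a hex flags value has IFF_PROMISC set.
has_promisc_flag() {
    [ $(( $1 & $IFF_PROMISC )) -ne 0 ]
}

# Hypothetical helper: read the flags of one interface from sysfs and report.
report_promisc() {
    flags_file="/sys/class/net/$1/flags"
    if [ ! -r "$flags_file" ]; then
        echo "$1: no such interface"
        return 2
    fi
    if has_promisc_flag "$(cat "$flags_file")"; then
        echo "$1: promiscuous mode on"
    else
        echo "$1: promiscuous mode off"
    fi
}

# Interface name taken from the config above; override with IFACE_TO_CHECK.
IFACE_TO_CHECK=${IFACE_TO_CHECK:-enp6s0f1np1}
report_promisc "$IFACE_TO_CHECK" || true
```

ip -d link show dev enp6s0f1np1 also prints a promiscuity counter if you would rather check by eye.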
Even More Solution
While this got the traffic flowing again, he was still getting a bunch of "vlans list size > 512" errors after expanding bridge-vids back to 4094. So yes, it fixed the initial issue, but the driver is still trying to offload all VLANs and not succeeding. If you still get that error in dmesg, then maybe this additional fix will work for you (it wasn’t necessary for me with the bond, so the bond appears to also disable hardware offload of VLANs): ethtool -K enp6s0f1np1 rx-vlan-filter off. If you just want to see the status, you can use ethtool -k enp6s0f1np1 (note the little k to read, big K to write). The new /etc/network/interfaces would be:
#Network adapter
auto enp6s0f1np1
iface enp6s0f1np1 inet manual
#Enable promiscuous mode on the slave interface
post-up ip link set dev $IFACE promisc on
#Disable RX VLAN filtering in hardware offload
pre-up ethtool -K $IFACE rx-vlan-filter off
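To check that the filter really ended up off, you can pick the one line you care about out of the ethtool -k output. This is just a sketch: feature_state is a name I invented, and the sample text below is illustrative rather than copied from either of our machines.

```shell
#!/bin/sh
# Hypothetical helper: print the state ("on" or "off") of one feature,
# reading `ethtool -k` style output from stdin.
feature_state() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*\([a-z]*\).*/\1/p"
}

# Illustrative sample in the "name: state" layout that ethtool -k prints.
sample='Features for enp6s0f1np1:
rx-vlan-offload: on
tx-vlan-offload: on
rx-vlan-filter: off'

printf '%s\n' "$sample" | feature_state rx-vlan-filter

# On the real host (needs ethtool installed):
#   ethtool -k enp6s0f1np1 | feature_state rx-vlan-filter
```

If it still reports on after a reboot, the pre-up line probably did not run, so check the ifupdown logs before blaming the card.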
Hope the tip helps you!