[Bug 1892132] Re: Failure to get the correct UpLink Representor

Frode Nordahl 1892132 at bugs.launchpad.net
Mon Sep 6 12:06:58 UTC 2021


Proposed libvirt package on Hirsute system with original unmodified kernel and driver:
$ uname -a
Linux node-laveran 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ sudo lshw|grep mlx5_core
                configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=5.11.0-31-generic firmware=16.31.1014 (MT_0000000183) latency=0 link=no multicast=yes
...

$ lspci -nnvv | grep Mellanox
03:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
        Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:0061]
03:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
        Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:0061]
03:00.2 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] [15b3:1018]
        Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] [15b3:0061]
...

$ dpkg -l | grep libvirt
ii  libvirt-clients                       7.0.0-2ubuntu2.1                                                     amd64        Programs for the libvirt library
ii  libvirt-daemon                        7.0.0-2ubuntu2.1                                                     amd64        Virtualization daemon
ii  libvirt-daemon-config-network         7.0.0-2ubuntu2.1                                                     all          Libvirt daemon configuration files (default network)
ii  libvirt-daemon-config-nwfilter        7.0.0-2ubuntu2.1                                                     all          Libvirt daemon configuration files (default network filters)
ii  libvirt-daemon-driver-qemu            7.0.0-2ubuntu2.1                                                     amd64        Virtualization daemon QEMU connection driver
ii  libvirt-daemon-system                 7.0.0-2ubuntu2.1                                                     amd64        Libvirt daemon configuration files
ii  libvirt-daemon-system-systemd         7.0.0-2ubuntu2.1                                                     all          Libvirt daemon configuration files (systemd)
ii  libvirt0:amd64                        7.0.0-2ubuntu2.1                                                     amd64        library for interfacing with different virtualization systems

$ sudo grep -A6 hostdev /etc/libvirt/qemu/instance-00000002.xml
    <interface type='hostdev' managed='yes'>
      <mac address='fa:16:3e:3e:af:3d'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x2'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

$ openstack server list --long
| 9de43d8a-b63b-4747-9f6a-5ca20a501450 | fnord-node-laveran-1 | ACTIVE | None       | Running     | network=10.42.3.233                 | auto-sync/ubuntu-focal-20.04-amd64-server-20210825-disk1.img | ee06a053-c350-474c-a03f-cf0afcb35591 | m1.large    | 29873860-7b2e-49ad-a290-9a2d0600369d | nova              | node-laveran.maas |            |

$ ssh fnord-node-laveran-1 lspci
...
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
...

<install OFED drivers, reboot and restart instances>

$ uname -a
Linux node-laveran 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ sudo lshw|grep mlx5_core
                configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=5.4-1.0.3 duplex=full firmware=16.31.1014 (MT_0000000183) latency=0 link=yes multicast=yes slave=yes speed=10Gbit/s

$ ssh fnord-node-laveran-1 lspci
...
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
...


** Tags removed: verification-needed verification-needed-hirsute
** Tags added: verification-done verification-done-hirsute

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-os-vif in Ubuntu.
https://bugs.launchpad.net/bugs/1892132

Title:
  Failure to get the correct UpLink Representor

Status in os-vif:
  Fix Released
Status in os-vif victoria series:
  Fix Committed
Status in libvirt package in Ubuntu:
  Fix Released
Status in python-os-vif package in Ubuntu:
  Fix Released
Status in libvirt source package in Focal:
  Fix Committed
Status in python-os-vif source package in Focal:
  In Progress
Status in libvirt source package in Groovy:
  Won't Fix
Status in python-os-vif source package in Groovy:
  Won't Fix
Status in libvirt source package in Hirsute:
  Fix Committed
Status in python-os-vif source package in Hirsute:
  Fix Released
Status in libvirt source package in Impish:
  Fix Released
Status in python-os-vif source package in Impish:
  Fix Released

Bug description:
  [Impact] 
  An update to the mlx5_core driver [1], which will reach users of stable releases through both HWE kernels and the DKMS packages provided by NVIDIA/Mellanox [2], exposes assumptions that OS-VIF and Libvirt make about the sysfs layout.

  To allow users with this hardware to keep running their existing
  systems with the most recent drivers, updates to os-vif and libvirt
  are required.

  Without this update these systems will stop functioning when upgrading
  to the new mlx5_core driver.

  [Test Plan]
  Note: Hardware making use of the mlx5_core driver with support for HWOL is required to test these changes.

  1. Deploy OpenStack on machines with HWOL enabled using a kernel without [1]
  2. Create an instance using an HWOL port
  3. Confirm the instance can start and that it has connectivity
  4. Upgrade to a kernel with [1] and re-confirm

  [Regression Potential]
  For OS-VIF the changes are made to code paths used exclusively by consumers of this type of hardware with HWOL enabled. They are also made in a backward-compatible way, so they work with both the old and the new driver.

  For Libvirt the change is made in such a way that it will behave as
  before when used to look up hardware that populates net/phys_port_id.
  When used with hardware that does not populate net/phys_port_id but
  uses net/phys_port_name instead, which is typical for the hardware in
  question, the new behavior is used.
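
  The selection between the two behaviors can be sketched roughly as
  follows. This is an illustration only, not the Libvirt code: the helper
  names are hypothetical, and it assumes that an unreadable or empty
  phys_port_id attribute is what "does not populate" means in practice.

```python
import os


def read_sysfs_attr(netdev_dir, name):
    """Return the stripped attribute value, or None if unreadable or empty.

    Reading an unsupported attribute in sysfs typically fails with an
    OSError, which is treated here the same as an empty value.
    """
    try:
        with open(os.path.join(netdev_dir, name)) as f:
            value = f.read().strip()
    except OSError:
        return None
    return value or None


def lookup_strategy(netdev_dir):
    """Pick the attribute to match on (hypothetical helper).

    Prefer phys_port_id when the device populates it (old behavior);
    otherwise fall back to phys_port_name (new behavior).
    """
    if read_sysfs_attr(netdev_dir, "phys_port_id") is not None:
        return "phys_port_id"
    if read_sysfs_attr(netdev_dir, "phys_port_name") is not None:
        return "phys_port_name"
    return None
```

  Here netdev_dir would be a directory such as /sys/class/net/<ifname>.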

  [Original Bug Description]
  Due to a new kernel patch [1], the PF and VF representors are now linked to their parent PCI device.

  Old Structure:
  The VF's <vf-pci-address>/physfn/net directory contains only the PF of that VF

  $ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
  enp2s0f0

  $ ls -l /sys/class/net
  ...
  lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_0 -> ../../devices/virtual/net/enp2s0f0_0
  lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_1 -> ../../devices/virtual/net/enp2s0f0_1
  lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_2 -> ../../devices/virtual/net/enp2s0f0_2
  lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_3 -> ../../devices/virtual/net/enp2s0f0_3
  ...

  New Structure:
  The VF's <vf-pci-address>/physfn/net directory contains the PF of that VF and the VF representors

  $ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
  enp3s0f0  enp3s0f0_0  enp3s0f0_1  enp3s0f0_2  enp3s0f0_3

  $ ls -l /sys/class/net
  ...
  lrwxrwxrwx. 1 root root    0 Aug 17 08:43 enp3s0f0_0 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_0
  lrwxrwxrwx. 1 root root    0 Aug 17 08:43 enp3s0f0_1 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_1
  lrwxrwxrwx. 1 root root    0 Aug 17 08:43 enp3s0f0_2 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_2
  lrwxrwxrwx. 1 root root    0 Aug 17 08:43 enp3s0f0_3 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_3
  ...
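
  With the new layout, code that assumed physfn/net holds a single entry
  has to tell the PF apart from the VF representors. A minimal sketch of
  one way to do that is shown here. It is not the os-vif fix itself: the
  function names are hypothetical, and it assumes VF representors report
  a phys_port_name of the form "pf0vf1" (or "vf1"), while the PF reports
  something else or nothing at all.

```python
import os
import re

# Assumed naming scheme for VF representors, e.g. "pf0vf3" or "vf3".
VF_REP_RE = re.compile(r"^(?:pf\d+)?vf\d+$")


def read_phys_port_name(netdev_dir):
    """Return the phys_port_name of a netdev, or "" if it cannot be read."""
    try:
        with open(os.path.join(netdev_dir, "phys_port_name")) as f:
            return f.read().strip()
    except OSError:
        return ""


def pf_candidates(physfn_net_dir):
    """List entries under physfn/net that do not look like VF representors."""
    result = []
    for ifname in sorted(os.listdir(physfn_net_dir)):
        port_name = read_phys_port_name(os.path.join(physfn_net_dir, ifname))
        if not VF_REP_RE.match(port_name):
            result.append(ifname)
    return result
```

  On the old layout the directory holds only the PF, so the filter leaves
  the result unchanged.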

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1
  [2] https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed

To manage notifications about this bug go to:
https://bugs.launchpad.net/os-vif/+bug/1892132/+subscriptions



