ACK/Cmnt: [SRU][N/O][PATCH v2 0/1] s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs
Zixing Liu
zixing.liu at canonical.com
Wed May 28 10:01:45 UTC 2025
On Wed, 28 May 2025 at 17:54, Stefan Bader <stefan.bader at canonical.com> wrote:
>
> On 26.05.25 16:45, Zixing Liu wrote:
> > BugLink: https://bugs.launchpad.net/bugs/2111599
> >
> > SRU Justification:
> >
> > [Impact]
> >
> > * For non-VFs, function zpci_bus_is_isolated_vf() should return false,
> > because they aren't VFs.
> > While zpci_iov_find_parent_pf() specifically checks if a function is a VF,
> > it then simply returns that there is no parent.
> >
> > * The simplistic check for a parent then leads to these functions being
> > confused with isolated VFs, and isolating them on their own domain even
> > if sibling PFs should share the domain.
> >
> > [Fix]
> >
> > * This is fixed by explicitly checking if a function is not a VF.
> > (Notice that at this point the case where RIDs are ignored is already
> > handled - and in this case all PCI functions get isolated by being
> > properly detected in zpci_bus_is_multifunction_root().)
> >
> > * 8691abd3afaa "s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs"
> >
> > [Test Plan]
> >
> > * Setup Ubuntu Server (24.04/24.10) for s390x on an IBM z17
> > or LinuxONE 5 LPAR.
> >
> > * Have at least two PCIe adapters that support PF/VF (like RoCE Express)
> > available in that LPAR.
> >
> > * Attach multiple PFs of a PF access mode device (notice that this is only
> > possible with z17 and L1-5 hardware), such as the two PFs of a NETD
> > to the same LPAR.
> >
> > * Observe that they are put into separate PCI domains
> > instead of sharing the same domain as expected by drivers.
> >
> > * Due to lack of hardware, the verification will be conducted by IBM.
> >
> > * The fix was discussed upstream and flagged a stable kernel update.
> >
> > [Where problems could occur]
> >
> > * The modification is limited to one additional if statement
> > across two lines) in function zpci_bus_is_isolated_vf()
> > in file arch/s390/pci/pci_bus.c.
> >
> > * Hence the modification will be limited to the s390x-specific parts of
> > the PCI code in the kernel (sometimes refers to as zPCI),
> > and will NOT impact any other architecture!
> >
> > * If add. if-statement is not correct and a wrong bool is returned,
> > function zpci_bus_is_isolated_vf() might report and incorrect zpci_bus
> > status. Either isolated when it's not or not-isolated when it really is.
> >
> > * And since the new if statement got inserted before the already existing
> > 'if (!pdev)', the latter code in function zpci_bus_is_isolated_vf()
> > might be accidentally skipped.
> >
> > * All this might lead to a similar confusion of the functions in regard to
> > isolated VFs status and whether isolating them on their own domain even
> > or not, hence proper testing is needed.
> >
> > [Other Info]
> >
> > * Since the commit got upstream accepted with v6.15-rc1,
> > 'Questing' is not affected.
> >
> > * Plucky got fixed with
> > https://bugs.launchpad.net/bugs/2108854
> > (cherry-picked as https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/plucky/commit/?id=58412b2aa12aa43c90d5414e22898a041ec05e8a)
> >
> > * Hence the only remaining affected releases, that contain the offending
> > commit ("s390/pci: Fix handling of isolated VFs"), are oracular and noble.
> >
> > * Since this issue does not happen (thus cannot be recreated nor tested)
> > on the IBM Z hardware we have in Canonical, the verification(s) will be
> > done by IBM on most recent hardware.
> >
> > Niklas Schnelle (1):
> > s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs
> >
> > arch/s390/pci/pci_bus.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
>
> Two comments on this which should not hinder the patch being applied:
> 1. The "impact" portion of the justification gives me no real idea about
> the impact.
> Note this is meant normally more for the SRU team which are no kernel
> developers and potentially neither versed in the architecture. This
> applies to the
> complete SRU justification. It should allow someone looking at it
> to understand
> what the problem is and how to check it (provided one has a
> mainframe in the
> basement...)
Frank from the Partner Engineering team prepared the SRU bug comment
after talking to IBM about this bug.
We also don't own an IBM z17 for testing (Canonical does not own this
new model). Frank stated that someone from IBM will help us test this
fix.
> 2. The patch for N and O appear to be the same (and both say cherry pick).
> If that is true why not make it "[SRU] [N/O][PATCH ...]"?
>
Understood.
> But regardless,
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
More information about the kernel-team
mailing list