ACK/Cmnt: [SRU][J][PATCH 0/1] PCI: Batch BAR sizing operations

Kuba Pawlak kuba.pawlak at canonical.com
Wed Apr 16 21:24:42 UTC 2025


On 14.04.2025 17:15, Keifer Snedeker wrote:
> BugLink: https://bugs.launchpad.net/bugs/2097389
>
> SRU Justification:
>
> [ Impact ]
>
> VM guests that have large-BAR GPUs passed through to them
> take roughly twice as long to initialize those devices' BARs without
> this patch.
>
> [ Test Plan ]
>
> I verified that this patch applies cleanly to the Jammy kernel at
> 5.15.0-138.148 and resolves the bug on DGX H100 and DGX A100. I observed
> no regressions. This can be verified on any machine with a GPU that has
> a sufficiently large BAR and that can be passed through to a VM using
> vfio.
>
> ppa:ks0/jammy-pci-probe-patch contains
> the jammy-generic kernel with this patch applied and can be
> used to validate this patch.
>
> To verify no regressions, I installed the kernel from that PPA
> on the guest VM, then rebooted and confirmed that:
> 1. The measured PCI initialization time on boot was ~50% of the
> time measured with the unmodified kernel
> 2. Relevant parts of /proc/iomem mappings, the PCI init section
> of dmesg output, and lspci -vv output remained unchanged between
> the system with the unmodified kernel and with the patched kernel
> 3. The Nvidia driver still successfully loaded and was shown via
> nvidia-smi after the patch was applied
>
> [ Fix ]
>
> Roughly half of the time-consuming device configuration operations
> invoked during the PCI probe function can be eliminated by
> rearranging the memory and I/O disable/enable calls so that
> they occur once per device rather than once per BAR. This is what the
> upstream patch does, and it reliably eliminates roughly half of the
> excess initialization time during VM boot.
>
> [ Where problems could occur ]
>
> I do not expect any regressions. The only callers of ABIs changed
> by this patch are also adjusted within this patch, and the functional
> change only removes entirely redundant calls to disable/enable PCI
> memory/IO. With that said, the main altered function is the PCI
> probe function, which is highly used across Ubuntu deployments, so
> we should pay attention to any user reports regarding PCI device
> initialization just in case they might be related.
>
> [ Additional Context ]
>
> Upstream patch: https://lore.kernel.org/all/20250111210652.402845-1-alex.williamson@redhat.com/
> Upstream bug report: https://lore.kernel.org/all/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com/
> SRU request for this patch in Noble & Oracular (approved): https://lists.ubuntu.com/archives/kernel-team/2025-February/156788.html
>
>
>
> Alex Williamson (1):
>    PCI: Batch BAR sizing operations
>
>   drivers/pci/iov.c   |  8 +++-
>   drivers/pci/pci.h   |  4 +-
>   drivers/pci/probe.c | 93 +++++++++++++++++++++++++++++++++------------
>   3 files changed, 78 insertions(+), 27 deletions(-)
>
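
For context, the per-BAR versus per-device toggling described under
[ Fix ] can be sketched as below. This is a minimal illustration and not
the kernel code: struct fake_dev, write_command(), size_one_bar() and both
size_bars_*() functions are hypothetical names, and the sketch only counts
writes to a simulated command register. It says nothing about real timings.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_IO     0x1u   /* same bit as the kernel's PCI_COMMAND_IO */
#define CMD_MEMORY 0x2u   /* same bit as the kernel's PCI_COMMAND_MEMORY */
#define CMD_DECODE (CMD_IO | CMD_MEMORY)

/* Hypothetical stand-in for a device's config space. */
struct fake_dev {
    uint16_t command;             /* simulated command register */
    unsigned int command_writes;  /* number of writes to that register */
};

static void write_command(struct fake_dev *dev, uint16_t val)
{
    dev->command = val;
    dev->command_writes++;
}

/* Placeholder for the sizing probe of one BAR; decoding must be off. */
static void size_one_bar(struct fake_dev *dev, unsigned int bar)
{
    assert((dev->command & CMD_DECODE) == 0);
    (void)bar;
}

/* Old shape: disable and restore decoding around every single BAR. */
unsigned int size_bars_per_bar(struct fake_dev *dev, unsigned int howmany)
{
    for (unsigned int i = 0; i < howmany; i++) {
        uint16_t orig = dev->command;

        write_command(dev, (uint16_t)(orig & ~CMD_DECODE));
        size_one_bar(dev, i);
        write_command(dev, orig);
    }
    return dev->command_writes;
}

/* Batched shape: disable once, size every BAR, restore once. */
unsigned int size_bars_batched(struct fake_dev *dev, unsigned int howmany)
{
    uint16_t orig = dev->command;

    write_command(dev, (uint16_t)(orig & ~CMD_DECODE));
    for (unsigned int i = 0; i < howmany; i++)
        size_one_bar(dev, i);
    write_command(dev, orig);
    return dev->command_writes;
}
```

With six BARs and decoding initially enabled, size_bars_per_bar()
performs 12 command writes in this model and size_bars_batched()
performs 2.
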

There is a follow-up commit for this one:

commit 472ff48e2c09e49f2f90eeb6922f747306559506
Author: Alex Williamson <alex.williamson at redhat.com>
Date:   Wed Feb 12 11:53:32 2025 -0700

     PCI: Fix BUILD_BUG_ON usage for old gcc

     As reported in the below link, it seems older versions of gcc cannot
     determine that the howmany variable is known for all callers. Include
     a test so that newer compilers can enforce this sanity check and older
     compilers can still work.  Add __always_inline attribute to give the
     compiler an even better chance to know the inputs.

     Link: https://lore.kernel.org/r/20250212185337.293023-1-alex.williamson@redhat.com
     Fixes: 4453f360862e ("PCI: Batch BAR sizing operations")
     Reported-by: Oleg Nesterov <oleg at redhat.com>
     Link: https://lore.kernel.org/all/20250209154512.GA18688@redhat.com
     Signed-off-by: Alex Williamson <alex.williamson at redhat.com>
     Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>
     Tested-by: Oleg Nesterov <oleg at redhat.com>
     Tested-by: Mitchell Augustin <mitchell.augustin at canonical.com>
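
The guard that this commit message describes can be sketched in userspace
roughly as below. This is an approximation under stated assumptions, not
the upstream diff: SKETCH_BUILD_BUG_ON, sketch_build_bug(), STD_NUM_BARS
and read_bases() are hypothetical names, the macro only imitates the
kernel's BUILD_BUG_ON via the GCC/Clang "error" function attribute, and
the runtime assert is an addition for the sketch only.

```c
#include <assert.h>

#define STD_NUM_BARS 6   /* same value as the kernel's PCI_STD_NUM_BARS */

/*
 * Userspace imitation of a build-time check: a call to a function declared
 * with the "error" attribute fails the build unless the optimizer removes
 * the call as unreachable. Without optimization dead calls are not
 * removed, so the check is compiled out in that case.
 */
#if defined(__OPTIMIZE__) && defined(__GNUC__)
extern void sketch_build_bug(void)
    __attribute__((error("SKETCH_BUILD_BUG_ON failed")));
#define SKETCH_BUILD_BUG_ON(cond) \
    do { if (cond) sketch_build_bug(); } while (0)
#else
#define SKETCH_BUILD_BUG_ON(cond) \
    do { (void)sizeof(cond); } while (0)
#endif

/* always_inline gives the compiler the best chance to see the argument. */
static inline __attribute__((always_inline))
unsigned int read_bases(unsigned int howmany)
{
    /*
     * Enforce the bound at build time only when the compiler can prove
     * that howmany is a constant. A compiler that cannot prove it skips
     * the check instead of failing the build.
     */
    if (__builtin_constant_p(howmany))
        SKETCH_BUILD_BUG_ON(howmany > STD_NUM_BARS);

    /* Runtime fallback: part of this sketch only, not of the commit. */
    assert(howmany <= STD_NUM_BARS);

    return howmany;   /* placeholder for the real work */
}
```

In this sketch, read_bases(6) builds either way, while a literal
argument above STD_NUM_BARS is expected to be rejected at build time by
an optimizing compiler that can see the constant.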


I don't know which "older versions of gcc" are affected, but consider
including that patch in this submission.


Acked-by: Kuba Pawlak <kuba.pawlak at canonical.com>
