APPLIED: [SRU][J][PATCH 0/1] PCI: Batch BAR sizing operations
Stefan Bader
stefan.bader at canonical.com
Wed Apr 23 08:01:58 UTC 2025
On 14.04.25 17:15, Keifer Snedeker wrote:
> BugLink: https://bugs.launchpad.net/bugs/2097389
>
> SRU Justification:
>
> [ Impact ]
>
> Without this patch, VM guests that have large-BAR GPUs passed
> through to them take about 2x as long to initialize those
> devices' BARs.
>
> [ Test Plan ]
>
> I verified that this patch applies cleanly to the Jammy kernel
> at 5.15.0-138.148 and resolves the bug on DGX H100 and DGX A100.
> I observed no regressions. This can be verified on any machine
> that has a GPU with a sufficiently large BAR and can pass it
> through to a VM using vfio.
>
> ppa:ks0/jammy-pci-probe-patch contains
> the jammy-generic kernel with this patch applied and can be
> used to validate this patch.
>
> To verify no regressions, I installed the kernel from that PPA
> in the guest VM, then rebooted and confirmed that:
> 1. The measured PCI initialization time on boot was ~50% of that
> measured with the unmodified kernel
> 2. Relevant parts of /proc/iomem mappings, the PCI init section
> of dmesg output, and lspci -vv output remained unchanged between
> the system with the unmodified kernel and with the patched kernel
> 3. The Nvidia driver still successfully loaded and was shown via
> nvidia-smi after the patch was applied
>
> [ Fix ]
>
> Roughly half of the time-consuming device configuration operations
> invoked during the PCI probe function can be eliminated by
> rearranging the memory and I/O disable/enable calls such that
> they occur once per device rather than once per BAR. This is what
> the upstream patch does, and it reliably eliminates roughly half
> of the excess initialization time during VM boot.
>
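For readers following the [ Fix ] description above, here is a minimal
toy model of the idea. It is not the kernel code from the patch: every
name in it (toy_dev, toy_probe_per_bar, toy_probe_batched, ...) is
hypothetical, BAR flag bits and 64-bit BARs are ignored, and it only
counts how often decoding is toggled via the command register when
BARs are sized one at a time versus in a batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; none of these names exist in the kernel. */
#define TOY_NUM_BARS   6
#define TOY_CMD_DECODE 0x3u /* I/O space (bit 0) + memory space (bit 1) */

struct toy_dev {
	uint16_t command;                 /* simulated command register */
	uint32_t bar[TOY_NUM_BARS];       /* simulated BAR registers */
	uint32_t size_mask[TOY_NUM_BARS]; /* writable address bits per BAR */
	unsigned cmd_writes;              /* number of command register writes */
};

void toy_write_command(struct toy_dev *d, uint16_t val)
{
	d->command = val;
	d->cmd_writes++;
}

/* Size one BAR: write all ones, read back, restore the old value. */
uint32_t toy_size_bar(struct toy_dev *d, size_t i)
{
	assert(i < TOY_NUM_BARS);
	uint32_t orig = d->bar[i];
	d->bar[i] = 0xffffffffu & d->size_mask[i]; /* only writable bits stick */
	uint32_t readback = d->bar[i];
	d->bar[i] = orig;
	return readback ? (~readback + 1u) : 0; /* 0 means unimplemented */
}

/* Old shape: decoding is disabled and re-enabled around every BAR. */
void toy_probe_per_bar(struct toy_dev *d, uint32_t sizes[TOY_NUM_BARS])
{
	for (size_t i = 0; i < TOY_NUM_BARS; i++) {
		uint16_t orig = d->command;
		int toggle = (orig & TOY_CMD_DECODE) != 0;

		if (toggle)
			toy_write_command(d, (uint16_t)(orig & ~TOY_CMD_DECODE));
		sizes[i] = toy_size_bar(d, i);
		if (toggle)
			toy_write_command(d, orig);
	}
}

/* Batched shape: decoding is disabled once, all BARs sized, then restored. */
void toy_probe_batched(struct toy_dev *d, uint32_t sizes[TOY_NUM_BARS])
{
	uint16_t orig = d->command;
	int toggle = (orig & TOY_CMD_DECODE) != 0;

	if (toggle)
		toy_write_command(d, (uint16_t)(orig & ~TOY_CMD_DECODE));
	for (size_t i = 0; i < TOY_NUM_BARS; i++)
		sizes[i] = toy_size_bar(d, i);
	if (toggle)
		toy_write_command(d, orig);
}
```

In this model, a device with decoding enabled and six BAR slots sees 12
command register writes with toy_probe_per_bar() and 2 with
toy_probe_batched(), and both report the same sizes. The model says
nothing about how expensive each write is in a real guest; that part
comes from the measurements in the cover letter.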
> [ Where problems could occur ]
>
> I do not expect any regressions. The only callers of ABIs changed
> by this patch are also adjusted within this patch, and the functional
> change only removes entirely redundant calls to disable/enable PCI
> memory/IO. With that said, the main altered function is the PCI
> probe function, which is highly used across Ubuntu deployments, so
> we should pay attention to any user reports regarding PCI device
> initialization just in case they might be related.
>
> [ Additional Context ]
>
> Upstream patch: https://lore.kernel.org/all/20250111210652.402845-1-alex.williamson@redhat.com/
> Upstream bug report: https://lore.kernel.org/all/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com/
> SRU request for this patch in Noble & Oracular (approved): https://lists.ubuntu.com/archives/kernel-team/2025-February/156788.html
>
>
>
> Alex Williamson (1):
> PCI: Batch BAR sizing operations
>
> drivers/pci/iov.c | 8 +++-
> drivers/pci/pci.h | 4 +-
> drivers/pci/probe.c | 93 +++++++++++++++++++++++++++++++++------------
> 3 files changed, 78 insertions(+), 27 deletions(-)
>
Applied to jammy:linux/master-next. Thanks.
-Stefan