[Bug 2101903] Re: Backport "OvmfPkg: Use user-specified opt/ovmf/X-PciMmio64Mb value unconditionally" to Noble
Launchpad Bug Tracker
2101903 at bugs.launchpad.net
Mon May 19 13:47:15 UTC 2025
This bug was fixed in the package edk2 - 2024.05-2ubuntu0.2
---------------
edk2 (2024.05-2ubuntu0.2) oracular; urgency=medium
* ovmf: cherry-pick patch from upstream to "use user-specified
opt/ovmf/X-PciMmio64Mb value unconditionally". LP: #2101903.
- d/p/0001-OvmfPkg-Use-user-specified-opt-ovmf-X-PciMmio64Mb-va.patch
-- Mitchell Augustin <mitchell.augustin at canonical.com> Wed, 26 Mar
2025 16:30:00 -0500
** Changed in: edk2 (Ubuntu Oracular)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to edk2 in Ubuntu.
https://bugs.launchpad.net/bugs/2101903
Title:
Backport "OvmfPkg: Use user-specified opt/ovmf/X-PciMmio64Mb value
unconditionally" to Noble
Status in EDK II:
Fix Released
Status in edk2 package in Ubuntu:
Fix Released
Status in edk2 source package in Noble:
In Progress
Status in edk2 source package in Oracular:
Fix Released
Status in edk2 source package in Plucky:
Fix Released
Bug description:
Upstream patch: https://github.com/tianocore/edk2/pull/10856/commits/f8a8bb717c53c651750025aefaa5654f383bd02e
(To be added to Plucky via Debian)
SRU Justification:
[ Impact ]
Due to an inefficiency in the way older host kernels manage pfnmaps
for guest VM memory ranges[0], guests with large-BAR GPUs passed
through have a very long (multiple minutes) initialization time when
the MMIO window advertised by OVMF is sufficiently sized for the
passed-through BARs (i.e., the correct OVMF behavior). However, in the
past, users have benefited from fast guest boot times when OVMF
advertised an MMIO window that was too small to accommodate the full
BAR, since this resulted in the long PCI initialization process being
skipped (and retried later in a way that omitted the slow path, if
pci=realloc pci=nocrs were set).
While the root cause is being fully addressed in the upstream
kernel[1], the solution relies on huge pfnmap support, which is not
expected to be backported into the 6.8 or 6.11 -generic kernels. As a
result, the only kernel improvement supported on those kernels is this
patch[2], which reduces the extra boot time by about half.
Unfortunately, that boot time is still an average of 1-3 minutes
longer per VM boot than what can be achieved when the host is running
a version of OVMF without PlatformDynamicMmioWindow (PDMW) support,
which was introduced in [3] (Jammy's version of OVMF did not have it).
[ Test Plan ]
I have confirmed that this cleanly applies to the latest noble OVMF
and prepared a test PPA:
https://launchpad.net/~mitchellaugustin/+archive/ubuntu/edk2-honor-user-mmio-window
I have verified that this knob works as expected for values large enough for the GPU MMIO windows (as supported by the original behavior) and for values smaller than what PDMW computes (newly introduced by this patch).
On DGX H100, forcing a value of 1024 (lower than required for the passed-through GPUs) results in the desired fast boot time, with the GPUs still being usable as long as pci=nocrs pci=realloc are set in the guest, even on legacy kernels. I also observed no regressions, and no change in behavior when X-PciMmio64Mb is absent or above the PDMW-calculated value.
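For anyone reproducing the test above, the knob is normally passed to the guest firmware as a QEMU fw_cfg string entry. The following is only a minimal sketch of how such an argument could be assembled; the helper name pci_mmio64_args is hypothetical, and the value 1024 is simply the one used in the DGX H100 test. The guest kernel command line (pci=realloc pci=nocrs) is configured separately and is not covered here.

```python
from typing import List, Optional

# fw_cfg key named in the bug title; the value is a size in MiB.
FW_CFG_KEY = "opt/ovmf/X-PciMmio64Mb"


def pci_mmio64_args(size_mb: Optional[int]) -> List[str]:
    """Return extra QEMU arguments for the 64-bit MMIO window knob.

    Hypothetical helper: None means "leave the knob unset", in which
    case OVMF falls back to the PDMW-computed window.
    """
    if size_mb is None:
        return []
    if size_mb <= 0:
        raise ValueError("size_mb must be a positive number of MiB")
    return ["-fw_cfg", f"name={FW_CFG_KEY},string={size_mb}"]


if __name__ == "__main__":
    # Value from the test plan: deliberately smaller than the GPUs need.
    print(" ".join(pci_mmio64_args(1024)))
```

The resulting two arguments would be appended to an existing QEMU command line (or expressed through the equivalent libvirt passthrough mechanism).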
[ Fix ]
Since there is no way to force the use of the classic MMIO window
size[4] in any version of OVMF after [3], and since we have a use case for
such functionality on legacy distro kernels that would yield
significant, recurring compute time savings across all impacted VMs,
apply this change to this knob's behavior to make this workaround
possible on Noble.
[ Where problems could occur ]
If there are user deployments on Noble in which X-PciMmio64Mb is
currently explicitly set to a value smaller than the PDMW-computed
value, those deployments are currently ignoring the X-PciMmio64Mb
value and instead using that which is calculated by PDMW. If any such
deployments exist, *and* are specifying values that are too small for
their GPUs' MMIO windows, *and* do not have `pci=realloc pci=nocrs`
set, their passed-through GPUs will stop working until they either
raise X-PciMmio64Mb to be large enough for their MMIO windows, remove
X-PciMmio64Mb from their config (if PDMW's value is high enough), or
add `pci=nocrs pci=realloc` to their guest kernel config to obtain the
benefits of this patch.
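The conditions in the paragraph above can be written out as a small model. This is a sketch of the behavior as described in this bug report, not OVMF's actual code: the function names are hypothetical, "before" is modeled as taking the larger of the user value and the PDMW value (which matches the statement that smaller user values are currently ignored), and the example sizes are made up for illustration.

```python
from typing import Optional


def window_before_patch(user_mb: Optional[int], pdmw_mb: int) -> int:
    """Modeled pre-patch behavior: a user value below PDMW's is ignored."""
    if user_mb is None:
        return pdmw_mb
    return max(user_mb, pdmw_mb)


def window_after_patch(user_mb: Optional[int], pdmw_mb: int) -> int:
    """Modeled post-patch behavior: a user value is used unconditionally."""
    return pdmw_mb if user_mb is None else user_mb


def gpus_may_stop_working(user_mb: Optional[int], pdmw_mb: int,
                          required_mb: int,
                          guest_has_realloc_nocrs: bool) -> bool:
    """True only when all three conditions from the text hold."""
    if user_mb is None:
        return False  # knob not set: nothing changes for this deployment
    previously_ignored = user_mb < pdmw_mb
    too_small_for_gpus = user_mb < required_mb
    return (previously_ignored and too_small_for_gpus
            and not guest_has_realloc_nocrs)


if __name__ == "__main__":
    # Illustrative sizes in MiB; not measurements from any real system.
    pdmw, required = 262144, 262144
    print(window_before_patch(1024, pdmw), window_after_patch(1024, pdmw))
    print(gpus_may_stop_working(1024, pdmw, required, False))
    print(gpus_may_stop_working(1024, pdmw, required, True))
```

In this model, deployments that leave the knob unset, or set it above the PDMW value, get the same window before and after the patch.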
However, from the perspective of OVMF, we are making the X-PciMmio64Mb
behavior more consistent, so I do not believe the above risk should be
a blocker for including this patch. (I also suspect that those
circumstances are uncommon, since anyone whose setup is affected by
X-PciMmio64Mb today must be specifying a value larger than the one
PDMW computes, and those users will not be impacted by this change.)
Additionally, this patch only adds new opt-in functionality and does
not impact anyone not using X-PciMmio64Mb, so it shouldn't have much
regression risk outside of that.
[0]: https://lore.kernel.org/all/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com/
[1]: https://lore.kernel.org/all/20250205231728.2527186-1-alex.williamson@redhat.com/
[2]: https://lore.kernel.org/all/20250111210652.402845-1-alex.williamson@redhat.com/
[3]: https://github.com/tianocore/edk2/commit/ecb778d0ac62560aa172786ba19521f27bc3f650
[4]: https://edk2.groups.io/g/devel/topic/109651206?p=Created,,,20,1,0,0
To manage notifications about this bug go to:
https://bugs.launchpad.net/edk2/+bug/2101903/+subscriptions