APPLIED/Cmnt: [SRU][J][PATCH 0/5] gve: Add support for non-4k page sizes.
Stefan Bader
stefan.bader at canonical.com
Thu May 8 15:48:22 UTC 2025
On 29.04.25 18:16, Ian Whitfield wrote:
> BugLink: https://bugs.launchpad.net/bugs/2109537
>
> [Impact]
>
> During startup on one of Google Compute Engine's C4A machines, the gVNIC will
> fail to initialize:
> [ 1.071899] gvnic 0000:00:00.0: enabling device (0010 -> 0012)
> [ 1.076631] ACPI: \_SB_.PCI0.GSI2: Enabled at IRQ 37
> [ 1.078075] nvme nvme0: pci function 0000:00:02.0
> [ 1.093687] nvme nvme0: 4/0/0 default/read/poll queues
> [ 1.097563] nvme0n1: p1 p15
> [ 3.886472] gvnic 0000:00:00.0: AQ commands timed out, need to reset AQ
> [ 3.888151] gvnic 0000:00:00.0: Could not get device information: err=-131
> [ 3.891458] gvnic: probe of 0000:00:00.0 failed with error -131
>
> Because this is a cloud instance, network failure means the instance is unusable.
>
> [Fix]
>
> A patchset to make the GVE driver work on both 64k page size and 4k page size
> kernels was applied in Linux 6.8, so Noble and later kernels do not have this
> problem. Backporting the patchset to 5.15 appears to fix the issue, as I was
> able to boot and connect to the machine using the patched kernel.
>
> Patchset link: https://lore.kernel.org/all/20231128002648.320892-1-jfraker@google.com/
> Hashes:
> 955f4d3bf0a45 ("gve: Perform adminq allocations through a dma_pool.")
> 8ae980d24195f ("gve: Deprecate adminq_pfn for pci revision 0x1.")
> ce260cb114bbf ("gve: Remove obsolete checks that rely on page size.")
> 513072fb4bf81 ("gve: Add page size register to the register_page_list command.")
> da7d4b42caf1b ("gve: Remove dependency on 4k page size.")
>
> [Test plan]
>
> Boot the 64k flavor of the patched kernel on a C4A Google Compute Engine
> instance, and verify that you can ssh to it.
>
> [Regression potential]
>
> Of the applied patches, "gve: Remove dependency on 4k page size." was the only
> one to have conflicts. It's possible that there are uses of the native PAGE_SIZE
> definition that aren't covered by the backport of the patch. This patchset is
> being backported without including other major GVE driver patchsets that had
> been applied before it in mainline.
>
> Since the patches are isolated to the GVE driver, and since generic-64k
> previously didn't work on gVNIC instances at all, the possibility of failure
> is limited to configurations which were already not working, so any such
> failures would not be regressions.
>
> John Fraker (5):
> gve: Perform adminq allocations through a dma_pool.
> gve: Deprecate adminq_pfn for pci revision 0x1.
> gve: Remove obsolete checks that rely on page size.
> gve: Add page size register to the register_page_list command.
> gve: Remove dependency on 4k page size.
>
> drivers/net/ethernet/google/gve/gve.h | 8 +-
> drivers/net/ethernet/google/gve/gve_adminq.c | 88 ++++++++++++-------
> drivers/net/ethernet/google/gve/gve_adminq.h | 3 +-
> drivers/net/ethernet/google/gve/gve_ethtool.c | 2 +-
> drivers/net/ethernet/google/gve/gve_main.c | 2 +-
> .../net/ethernet/google/gve/gve_register.h | 9 ++
> drivers/net/ethernet/google/gve/gve_rx.c | 12 +--
> drivers/net/ethernet/google/gve/gve_tx.c | 2 +-
> 8 files changed, 78 insertions(+), 48 deletions(-)
>
Applied to [1+2 fuzz] jammy:linux/master-next. Thanks.
-Stefan