Cmnt: [SRU][J:linux-gcp/J:linux-gke][PATCH v2 00/13] IDPF: TX timeout and crash
Ian Whitfield
ian.whitfield at canonical.com
Tue Apr 1 16:19:05 UTC 2025
On Mon, Mar 31, 2025 at 01:25:39PM +0200, Massimiliano Pellizzer wrote:
> On Fri, 28 Mar 2025 at 00:42, Ian Whitfield <ian.whitfield at canonical.com> wrote:
> >
> > BugLink: https://bugs.launchpad.net/bugs/2093622
> >
> > [Impact]
> > Google has requested a patchset to be backported to resolve an issue in
> > the IDPF module. The issue can lead to timeouts and potentially a system
> > crash.
> >
> > [Fix]
> > Backport a patchset from linux upstream to all IDPF-enabled Google
> > kernels. This thread is for Jammy; separate threads were sent for Noble
> > and Oracular because each requires its own patchset. Plucky has
> > already received these fixes through upstream updates. For more details
> > on the backport process, each commit message includes a note about any
> > manual patch edits that were needed.
> >
> > [Test Plan]
> > Google provided an iperf3 stress test, which I performed on two bare
> > metal machines using the IDPF driver. There were no errors reported by
> > the kernel after running the stress test for 24+ hours.
> >
> > [Regression Potential]
> > This is a large backport and affects several kernels. Because these
> > kernels have diverged from upstream, most of the patches did not apply
> > cleanly, which increases the chance of human error in the backport
> > process. Problems related to this fix could lead to further issues in
> > the Intel IDPF module, but there have been no changes to other drivers
> > or the core kernel.
> >
> > v2: The patchset was previously reverted for all kernels over regression
> > concerns, and has been more thoroughly tested since. Google asked
> > that two additional patches be included in v2:
> > e4891e4687c8 ("idpf: split &idpf_queue into 4 strictly-typed queue structures")
> > f01032a2ca09 ("idpf: fix memory leaks and crashes while performing a soft reset")
> > which also led to the inclusion of additional commits:
> > c00d33f1fc79 ("idpf: make virtchnl2.h self-contained")
> > 66c27e3b19d5 ("idpf: stop using macros for accessing queue descriptors")
> > fea7b71b8751 ("idpf: fix corrupted frames and skb leaks in singleq mode")
> > 6aa7ca3c7dcc ("idpf: refactor some missing field get/prep conversions")
> > dd19e827d63a ("idpf: fix kernel panic on unknown packet types")
> > d38b4d0d95bc ("idpf: set scheduling mode for completion queue")
> > to reduce merge conflicts.
> >
> > Alexander Lobakin (5):
> > idpf: make virtchnl2.h self-contained
> > idpf: stop using macros for accessing queue descriptors
> > idpf: fix corrupted frames and skb leaks in singleq mode
> > idpf: split &idpf_queue into 4 strictly-typed queue structures
> > idpf: fix memory leaks and crashes while performing a soft reset
> >
> > Jesse Brandeburg (1):
> > idpf: refactor some missing field get/prep conversions
> >
> > Joshua Hay (4):
> > idpf: fix kernel panic on unknown packet types
> > idpf: enable WB_ON_ITR
> > idpf: add support for SW triggered interrupts
> > idpf: trigger SW interrupt when exiting wb_on_itr mode
> >
> > Michal Kubiak (1):
> > idpf: set scheduling mode for completion queue
> >
> > Pavan Kumar Linga (2):
> > idpf: avoid vport access in idpf_get_link_ksettings
> > idpf: fix idpf_vc_core_init error path
> >
> > drivers/net/ethernet/intel/idpf/idpf.h | 8 +-
> > drivers/net/ethernet/intel/idpf/idpf_dev.c | 5 +
> > .../net/ethernet/intel/idpf/idpf_ethtool.c | 134 ++-
> > .../net/ethernet/intel/idpf/idpf_lan_txrx.h | 2 +
> > drivers/net/ethernet/intel/idpf/idpf_lib.c | 77 +-
> > .../ethernet/intel/idpf/idpf_singleq_txrx.c | 176 +--
> > drivers/net/ethernet/intel/idpf/idpf_txrx.c | 1050 ++++++++++-------
> > drivers/net/ethernet/intel/idpf/idpf_txrx.h | 474 +++++---
> > drivers/net/ethernet/intel/idpf/idpf_vf_dev.c | 5 +
> > .../net/ethernet/intel/idpf/idpf_virtchnl.c | 82 +-
> > drivers/net/ethernet/intel/idpf/virtchnl2.h | 4 +-
> > 11 files changed, 1189 insertions(+), 828 deletions(-)
> >
> > --
> > 2.43.0
> >
> >
>
> Smatch testing the patchset resulted in the following:
>
> drivers/net/ethernet/intel/idpf/idpf_txrx.c: In function ‘idpf_rxq_group_alloc’:
> drivers/net/ethernet/intel/idpf/idpf_txrx.c:1436:28: warning: unused variable ‘q’ [-Wunused-variable]
>  1436 |         struct idpf_queue *q;
>       |                            ^
>
> The patch
>   [PATCH v2 07/13] idpf: split &idpf_queue into 4 strictly-typed queue structures
> declares multiple variables named "q" inside for loops, shadowing the
> outer-scope variable struct idpf_queue *q. This may lead to bugs.
> What do you think?
>
> --
> Massimiliano Pellizzer
That's reasonable. That patch should have removed the outer-scope
struct idpf_queue *q declaration; I must have missed it while fixing
conflicts with the commit:
9b1aa3ef2328 ("idpf: add get/set for Ethtool's header split ringparam")
which required the "adapter" declaration to stay, but not the
idpf_queue one. I removed the q declaration and rebuilt the kernel
without error, so I'll submit a v3 of this patchset for Jammy with that
change.
-Ian Whitfield
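
For illustration only, here is a minimal sketch of why a leftover outer-scope declaration that is shadowed inside a loop can turn into a real bug. In the backport discussed above the outer q was merely unused, so the only symptom was the compiler warning. The struct and function names below (demo_queue, find_first_busy_shadowed, find_first_busy) are hypothetical and are not taken from the idpf driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queue structure; not the idpf definition. */
struct demo_queue {
	int desc_count;
};

/*
 * Buggy variant: the loop declares its own 'q', which shadows the outer
 * 'q'. The outer pointer is never updated, so the function always
 * returns NULL. Building with -Wshadow reports the inner declaration.
 */
const struct demo_queue *
find_first_busy_shadowed(const struct demo_queue *queues, size_t n)
{
	const struct demo_queue *q = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct demo_queue *q = &queues[i];	/* shadows outer q */

		if (q->desc_count > 0)
			break;
	}
	return q;	/* refers to the outer q, which is still NULL */
}

/*
 * Fixed variant: the outer declaration is gone, so only one 'q' exists
 * and the result is returned from inside the loop.
 */
const struct demo_queue *
find_first_busy(const struct demo_queue *queues, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct demo_queue *q = &queues[i];

		if (q->desc_count > 0)
			return q;
	}
	return NULL;
}
```

Removing the stale outer declaration, as planned for v3, leaves a single q per scope and silences the unused-variable warning.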