ACK: [SRU][J:linux-bluefield][PATCH v1 0/3] UBUNTU: SAUCE: Revert "can: gw: fix RCU/BH usage in cgw_create_job()"
Massimiliano Pellizzer
massimiliano.pellizzer at canonical.com
Fri Jul 18 14:25:29 UTC 2025
On Fri, 18 Jul 2025 at 09:56, Kuba Pawlak <kuba.pawlak at canonical.com> wrote:
>
> On 17.07.2025 16:27, Stav Aviram wrote:
> > BugLink: https://bugs.launchpad.net/bugs/2117163
> >
> > SRU Justification:
> >
> > [Impact]
> > In Ubuntu-bluefield-5.15.0-1071.73, which included commits from upstream
> > stable version 5.15.183, the system crashes after building the kernel,
> > building the OFED driver, and restarting the driver:
> >
> > Oops: 0000 [#1] SMP NOPTI
> > Workqueue: events kfree_rcu_work
> > RIP: 0010:kmem_cache_free_bulk+0x137/0x1d0
> > Call Trace:
> > kfree_rcu_work+0x1e7/0x250
> > process_one_work+0x1b0/0x350
> > worker_thread+0x50/0x3a0
> > kthread+0x124/0x150
> > ret_from_fork+0x1f/0x30
> >
> > The crash is caused by the k[v]free_rcu_mightsleep() functions, which
> > were introduced by the faulty commit 5dc583481a0a ("rcu/kvfree: Add
> > kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()"). This commit
> > introduces new mightsleep functions but lacks critical infrastructure
> > changes required for proper operation. Our analysis indicates the root
> > cause is an incomplete API migration, which causes mightsleep macros to
> > pass void pointers where rcu_callback_t function pointers are expected:
> > BF5.15 (broken): void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > Required:        void kvfree_call_rcu(struct rcu_head *head, void *ptr)
> > This results in invalid pointer arithmetic that generates tiny memory
> > addresses (like 0x17) which crash the kernel when freed.
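
To illustrate the analysis quoted above, here is a minimal user-space sketch of how such an argument mismatch could yield a tiny address. It is a model, not kernel code. It assumes the older callee treats its second argument as the offset of the rcu_head inside the object, while a newer-style caller passes the object pointer itself. The names fake_rcu_head, job, old_style_decode() and demo() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rcu_head (user-space model only). */
struct fake_rcu_head {
    struct fake_rcu_head *next;
    void (*func)(struct fake_rcu_head *head);
};

/* Hypothetical object that embeds the head at a non-zero offset. */
struct job {
    uint64_t id;
    struct fake_rcu_head rcu;
};

/*
 * Assumed old-style decoding: the second argument is interpreted as the
 * offset of the head within the object, so the object address is
 * recovered as head - offset.
 */
static uintptr_t old_style_decode(uintptr_t head, uintptr_t second_arg)
{
    return head - second_arg;
}

/* Runs both calling conventions through the old-style decoder. */
static void demo(void)
{
    struct job j = { .id = 1 };
    uintptr_t head = (uintptr_t)&j.rcu;
    uintptr_t obj = (uintptr_t)&j;

    /* Caller and callee agree: the object address is recovered. */
    uintptr_t ok = old_style_decode(head, offsetof(struct job, rcu));
    assert(ok == obj);

    /* New-style caller passes the object pointer: only the offset is left. */
    uintptr_t bad = old_style_decode(head, obj);
    assert(bad == offsetof(struct job, rcu));

    printf("matched:    0x%" PRIxPTR "\n", ok);
    printf("mismatched: 0x%" PRIxPTR "\n", bad);
}
```

In this model the mismatched call reduces to the field offset, a small integer rather than a valid allocation, which is consistent with the kind of tiny address reported in the oops. Whether this is the exact path taken in the OFED case is not confirmed here.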
> >
> > [Fix]
> > Phase 1 (Immediate):
> > Revert the problematic commit to restore stability, along with the two
> > other commits from the same series:
> > * 57818f6fec6c ("can: gw: fix RCU/BH usage in cgw_create_job()")
> > * 5dc583481a0a ("rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()") (main problematic commit)
> > * 82683fabcb28 ("can: gw: use call_rcu() instead of costly synchronize_rcu()")
> >
> > Phase 2 (Proper Implementation):
> > The results of our research should be verified and applied to Jammy to
> > enable proper *_mightsleep() support for the OFED driver. The most critical
> > commit to verify and apply is the upstream commit introducing the
> > kvfree_call_rcu() signature transformation:
> > * 04a522b7da3d ("rcu: Refactor kvfree_call_rcu() and high-level helpers")
> > Additionally, the following commits should be examined to determine
> > whether they are essential for avoiding future issues:
> > * 7e3f926bf453 ("rcu/kvfree: Eliminate k[v]free_rcu() single argument macro")
> > * 5da7cb193db3 ("rcu/kvfree: Avoid freeing new kfree_rcu() memory after old grace period")
> > * 23532061ad30 ("net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()")
> > A deeper investigation should also be conducted to ensure no additional
> > crucial commits are required for proper integration of this feature into
> > Jammy. Once all necessary commits are backported, the *_mightsleep()
> > functions can be safely re-introduced into Jammy.
> >
> > [Test Case]
> > Phase 1:
> > After reverting the three commits mentioned above, compilation
> > completed successfully on the master-next branch. With the reverts
> > applied, compiling the kernel, rebooting, building OFED, and restarting
> > the driver produced no crash.
> >
> > Phase 2:
> > After applying all required infrastructure commits and re-adding the
> > mightsleep functions, the system should remain stable when building OFED
> > and restarting the driver.
> >
> > [Regression Potential]
> > Phase 1 (Revert):
> > Very low risk. Simply removes the problematic new functionality and
> > returns to the stable state that existed before the faulty commit.
> >
> > Phase 2 (Proper implementation):
> > Medium risk as it requires backporting multiple upstream RCU
> > infrastructure changes to an older kernel base.
> >
> > Stav Aviram (3):
> > UBUNTU: SAUCE: Revert "can: gw: fix RCU/BH usage in cgw_create_job()"
> > UBUNTU: SAUCE: Revert "rcu/kvfree: Add kvfree_rcu_mightsleep() and
> > kfree_rcu_mightsleep()"
> > UBUNTU: SAUCE: Revert "can: gw: use call_rcu() instead of costly
> > synchronize_rcu()"
> >
> > include/linux/rcupdate.h | 3 -
> > net/can/gw.c | 165 +++++++++++++++------------------------
> > 2 files changed, 65 insertions(+), 103 deletions(-)
> Acked-by: Kuba Pawlak <kuba.pawlak at canonical.com>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Acked-by: Massimiliano Pellizzer <massimiliano.pellizzer at canonical.com>
--
Massimiliano Pellizzer