ACK: [SRU][J:linux-bluefield][PATCH v1 0/3] UBUNTU: SAUCE: Revert "can: gw: fix RCU/BH usage in cgw_create_job()"

Kuba Pawlak kuba.pawlak at canonical.com
Fri Jul 18 07:54:49 UTC 2025


On 17.07.2025 16:27, Stav Aviram wrote:
> BugLink: https://bugs.launchpad.net/bugs/2117163
>
> SRU Justification:
>
> [Impact]
> In Ubuntu-bluefield-5.15.0-1071.73, which included commits from upstream
> stable version 5.15.183, the system crashes after building the kernel,
> building the OFED driver and restarting the driver:
>
> Oops: 0000 [#1] SMP NOPTI
> Workqueue: events kfree_rcu_work
> RIP: 0010:kmem_cache_free_bulk+0x137/0x1d0
> Call Trace:
>   kfree_rcu_work+0x1e7/0x250
>   process_one_work+0x1b0/0x350
>   worker_thread+0x50/0x3a0
>   kthread+0x124/0x150
>   ret_from_fork+0x1f/0x30
>
> The crash is caused by the k[v]free_rcu_mightsleep() functions, which
> were introduced by the faulty commit 5dc583481a0a ("rcu/kvfree: Add
> kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()").  This commit
> introduces the new mightsleep functions but lacks critical
> infrastructure changes required for proper operation.  Our analysis
> indicates the root cause is an incomplete API migration, which causes
> the mightsleep macros to pass void pointers where rcu_callback_t
> function pointers are expected:
>
>   BF5.15 (broken):
>     void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>   Required:
>     void kvfree_call_rcu(struct rcu_head *head, void *ptr)
>
> This results in invalid pointer arithmetic that generates tiny memory
> addresses (like 0x17), which crash the kernel when freed.
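
For anyone following the thread, the failure mode described above can be
modeled in userspace.  This is only a minimal sketch, not kernel code: it
assumes the callee recovers the object address as "head minus second
argument" (the offset-encoding convention), while a mismatched caller
passes the object pointer itself.  The names fake_rcu_head, item,
old_callee_decode, old_style_call and new_style_call are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct rcu_head (hypothetical layout). */
struct fake_rcu_head {
    void *next;
    void *func;
};

/* Hypothetical object with the head embedded at a non-zero offset. */
struct item {
    char payload[24];
    struct fake_rcu_head rcu;
};

/*
 * Model of a callee that treats its second argument as the offset of
 * the head inside the object and recovers the object address as
 * head - offset.  Integer arithmetic is used to keep the model well
 * defined in userspace.
 */
uintptr_t old_callee_decode(struct fake_rcu_head *head, uintptr_t arg)
{
    return (uintptr_t)head - arg;
}

/* Matched caller: passes the offset, so decoding yields the object. */
uintptr_t old_style_call(struct item *it)
{
    uintptr_t decoded = old_callee_decode(&it->rcu,
                                          offsetof(struct item, rcu));

    assert(decoded == (uintptr_t)it);
    return decoded;
}

/*
 * Mismatched caller: passes the object pointer where the callee
 * expects an offset.  head - object collapses to the offset of the
 * head within the object, i.e. a tiny bogus "address".
 */
uintptr_t new_style_call(struct item *it)
{
    return old_callee_decode(&it->rcu, (uintptr_t)it);
}
```

In this model the mismatched call returns offsetof(struct item, rcu), a
small integer rather than a heap address, which is the same shape as the
tiny pointer values reported in the crash.  Whether this is exactly what
happens in the BF5.15 and OFED combination is an assumption that the
Phase 2 investigation would need to confirm.
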
>
> [Fix]
> Phase 1 (Immediate):
> Revert the problematic commit to restore stability, along with the two
> other commits from the same series:
> * 57818f6fec6c ("can: gw: fix RCU/BH usage in cgw_create_job()")
> * 5dc583481a0a ("rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()") (main problematic commit)
> * 82683fabcb28 ("can: gw: use call_rcu() instead of costly synchronize_rcu()")
>
> Phase 2 (Proper Implementation):
> The results of our research should be verified and applied to Jammy to
> enable proper *_mightsleep() support for the OFED driver.  The most critical
> commit to verify and apply is the upstream commit introducing the
> kvfree_call_rcu() signature transformation:
> * 04a522b7da3d ("rcu: Refactor kvfree_call_rcu() and high-level helpers")
> Additionally, the following commits should be examined to determine
> whether they are essential for avoiding future issues:
> * 7e3f926bf453 ("rcu/kvfree: Eliminate k[v]free_rcu() single argument macro")
> * 5da7cb193db3 ("rcu/kvfree: Avoid freeing new kfree_rcu() memory after old grace period")
> * 23532061ad30 ("net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()")
> A deeper investigation should also be conducted to ensure no additional
> crucial commits are required for proper integration of this feature into
> Jammy.  Once all necessary commits are backported, the *_mightsleep()
> functions can be safely re-introduced into Jammy.
>
> [Test Case]
> Phase 1:
> After reverting the three commits mentioned above, compilation
> completed successfully on the master-next branch.  After compiling
> the kernel, rebooting, building OFED and restarting the driver, no
> crash occurred.
>
> Phase 2:
> After applying all required infrastructure commits and re-adding the
> mightsleep functions, the system should remain stable when building
> OFED and restarting the driver.
>
> [Regression Potential]
> Phase 1 (Revert):
> Very low risk. Simply removes the problematic new functionality and
> returns to the stable state that existed before the faulty commit.
>
> Phase 2 (Proper implementation):
> Medium risk as it requires backporting multiple upstream RCU
> infrastructure changes to an older kernel base.
>
> Stav Aviram (3):
>    UBUNTU: SAUCE: Revert "can: gw: fix RCU/BH usage in cgw_create_job()"
>    UBUNTU: SAUCE: Revert "rcu/kvfree: Add kvfree_rcu_mightsleep() and
>      kfree_rcu_mightsleep()"
>    UBUNTU: SAUCE: Revert "can: gw: use call_rcu() instead of costly
>      synchronize_rcu()"
>
>   include/linux/rcupdate.h |   3 -
>   net/can/gw.c             | 165 +++++++++++++++------------------------
>   2 files changed, 65 insertions(+), 103 deletions(-)
Acked-by: Kuba Pawlak <kuba.pawlak at canonical.com>