[SRU][jammy PATCH v3 0/1] NFS: fix deadlock with pNFS flexfiles IO retry error path

Mike Snitzer snitzer at kernel.org
Mon Dec 16 19:37:46 UTC 2024


Please extend me the professional courtesy of letting me know when to
expect a jammy 5.15 kernel update that includes this fix.  It is now
in upstream stable/linux-5.15.y 5.15.174 (more below).

Thanks,
Mike

BugLink: https://bugs.launchpad.net/bugs/2089410

SRU Justification:
Impact: In production at a mutual "hyperscaler" customer that uses the
Ubuntu jammy kernel's NFS client with Hammerspace's pNFS flexfiles,
the NFS client deadlocked because of upstream commit 7be7b3ca16a59
("NFS: Ensure we immediately start writeback on rescheduled writes").
That regression was fixed in August 2022 by upstream commit
b1a28f2eb9ea7 ("NFS: nfs_async_write_reschedule_io must not recurse
into the writeback code"), but the fix unfortunately wasn't marked for
stable@ at the time. That has since been rectified, and Greg
Kroah-Hartman has now included it in 5.15.174.

Fix:
Apply upstream stable/linux-5.15.y commit 31545f4b7cdb6 ("NFS:
nfs_async_write_reschedule_io must not recurse into the writeback
code"), or rebase on 5.15.174.

Testcase:
Cause buffered IO issued by an NFS client using pNFS flexfiles to hit
error paths. In production this happened under heavy enterprise use
with container limits imposed: OOM within a container makes memory
allocation errors particularly likely, _and_ there were additional
reasons for NFS IO to be retransmitted, e.g. volume down/up bounces.
This can lead to deadlock in NFS due to recursion with page locks
already held, e.g.:
[<0>] wait_on_page_bit_common+0x10c/0x3d0
[<0>] wait_on_page_bit+0x3f/0x50
[<0>] wait_on_page_writeback+0x26/0x80
[<0>] write_cache_pages+0x138/0x460
[<0>] nfs_writepages+0x10d/0x200 [nfs]
[<0>] do_writepages+0xd4/0x200
[<0>] filemap_fdatawrite_wbc+0x89/0xe0
[<0>] filemap_fdatawrite_range+0x54/0x70
[<0>] nfs_async_write_reschedule_io+0x69/0x80 [nfs]
[<0>] ff_layout_reset_write+0x73/0xe0 [nfs_layout_flexfiles]
[<0>] ff_layout_write_release+0x7a/0x90 [nfs_layout_flexfiles]
[<0>] rpc_free_task+0x3d/0x70 [sunrpc]
[<0>] rpc_async_release+0x30/0x50 [sunrpc]
[<0>] process_one_work+0x228/0x3d0
[<0>] worker_thread+0x53/0x420
[<0>] kthread+0x127/0x150
[<0>] ret_from_fork+0x1f/0x30
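
For readers less familiar with this class of bug, the trace above can
be illustrated with a small userspace analogy. This is only a sketch
and not kernel code: ToyPage, toy_writepages and toy_release are
hypothetical names, a non-reentrant Python lock stands in for the page
state that the completion path already holds, and a timeout is used so
that the hang becomes observable instead of blocking forever.

```python
import threading


class ToyPage:
    """Hypothetical stand-in for a page; the lock models state the caller holds."""

    def __init__(self):
        # Non-reentrant on purpose: a second acquire from the same thread blocks.
        self.busy = threading.Lock()


def toy_writepages(page, timeout):
    """Models the writeback path: it must wait until the page is no longer busy."""
    if not page.busy.acquire(timeout=timeout):
        # The real code would wait indefinitely; the timeout just exposes the hang.
        return False
    page.busy.release()
    return True


def toy_release(page, recurse, timeout=0.1):
    """Models the completion path of a failed write that has to be retried.

    recurse=True mimics re-entering writeback from the completion path.
    recurse=False mimics a completion path that does not recurse.
    Returns True if the completion path could make progress.
    """
    with page.busy:  # state held for the duration of the completion path
        if recurse:
            return toy_writepages(page, timeout)
        return True


print(toy_release(ToyPage(), recurse=True))   # False: waits on state it holds itself
print(toy_release(ToyPage(), recurse=False))  # True
```

In this toy model the recursing variant can never succeed, because the
only thread able to release the page is the one waiting for it. The
non-recursing variant completes, which is the property the commit
title asks for.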

Trond Myklebust (1):
  NFS: nfs_async_write_reschedule_io must not recurse into the writeback code

 fs/nfs/write.c | 2 --
 1 file changed, 2 deletions(-)

-- 
2.44.0



