[PULL][SRU Focal] Avoid overwhelming NFSv4 server with recalled delegations
dann frazier
dann.frazier at canonical.com
Mon Jan 31 18:47:54 UTC 2022
BugLink: https://launchpad.net/bugs/1957986
A user reported that they were seeing terrible NFSv4 performance with
one of their workloads, which they were able to simulate using a
benchmark. I could reproduce the problem using that benchmark, and
observed read performance quickly dropping from ~700MiB/s to <
10MiB/s. This is reproducible with usptream v5.4, but not more recent
upstream kernels, so I used bisection to identify the following fix:
10717f45639f NFSv4: Limit the total number of cached delegations
= OK, but what are these other patches? =
The identified patch was patch 5/5 of this series:
https://www.spinics.net/lists/linux-nfs/msg76242.html
It has hard dependencies on patches 1-4, so all 5 are included here.
Further, that set of 5 patches depends on the following 20-part series:
https://www.spinics.net/lists/linux-nfs/msg75367.html
The first 2 of which already arrived in focal via stable, so it'd be
18 new patches there. I did investigate whether or not we need all 18
of them. I found that we could technically omit 8 of them, and the
rest of the patches would still apply and still prevent the performance
issue. This pull request does include those 8 patches anyway, based on
a recommendation from Nivedita.
Finally, I searched for any upstream patches that are marked as
"Fixes:" for these patches, and 2 were identified. Those patches are
also included here.
All patches cherry-picked cleanly except for one (see annotation) that
required a trivial context adjustment in a header file.
= Exploration of other options =
We exhausted 2 other options with the user: using the HWE kernel,
which already includes these fixes, and forcing NFSv3 mode, which is
not impacted:
== HWE Kernel ==
They have a complex stack containing other 3rd party software/drivers
that are validated only against the LTS kernel. Using the HWE kernel
would cause support issues w/ that 3rd party.
== NFSv3 ==
While NFSv3 is not impacted by this issue, it does have another issue
that causes the buffer cache to grow too large after a couple of days,
making it a poor replacement.
= Testing =
== Validation ==
The customer has tested this and confirmed that it resolves the
problem for them (not just the benchmark).
== Regression ==
I performed regression testing using the delegation tests from the
nfstest project:
https://wiki.linux-nfs.org/wiki/index.php/NFStest
I see the same passes and failures before and after applying these
patches. I discussed the failures with the maintainer of these tests,
and learned that those failures are expected with a Linux NFSv4 server
because our implementation doesn't support all possible delegations. I
did due diligence to see if I could test against a server that does -
specifically NetApp's implementation - but that proved unsuccessful.
-dann
The following changes since commit 206c12bf2ef2d607a3942b8e1b7c8959b4e95ec8:
UBUNTU: Ubuntu-5.4.0-98.111 (2022-01-28 10:54:24 +0100)
are available in the Git repository at:
git://git.launchpad.net/~dannf/ubuntu/+source/linux/+git/linux focal-nfsv4-submit
for you to fetch changes up to 93d0c71c551cbc8956726e6f52e15cfb53a69fdc:
NFSv4: Ensure the delegation cred is pinned when we call delegreturn (2022-01-31 10:48:54 -0700)
----------------------------------------------------------------
Trond Myklebust (25):
NFSv4: Fix delegation handling in update_open_stateid()
NFSv4: nfs4_callback_getattr() should ignore revoked delegations
NFSv4: Delegation recalls should not find revoked delegations
NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was revoked
NFS: Rename nfs_inode_return_delegation_noreclaim()
NFSv4: Don't remove the delegation from the super_list more than once
NFSv4: Hold the delegation spinlock when updating the seqid
NFSv4: Clear the NFS_DELEGATION_REVOKED flag in nfs_update_inplace_delegation()
NFSv4: Update the stateid seqid in nfs_revoke_delegation()
NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
NFSv4: Ignore requests to return the delegation if it was revoked
NFSv4: Don't reclaim delegations that have been returned or revoked
NFSv4: nfs4_return_incompatible_delegation() should check delegation validity
NFSv4: Fix nfs4_inode_make_writeable()
NFS: nfs_inode_find_state_and_recover() fix stateid matching
NFSv4: Fix races between open and delegreturn
NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn
NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done()
NFSv4: nfs_inode_evict_delegation() should set NFS_DELEGATION_RETURNING
NFS: Clear NFS_DELEGATION_RETURN_IF_CLOSED when the delegation is returned
NFSv4: Try to return the delegation immediately when marked for return on close
NFSv4: Add accounting for the number of active delegations held
NFSv4: Limit the total number of cached delegations
NFSv4: Ensure the delegation is pinned in nfs_do_return_delegation()
NFSv4: Ensure the delegation cred is pinned when we call delegreturn
fs/nfs/callback_proc.c | 4 +-
fs/nfs/delegation.c | 274 +++++++++++++++++++++++++++++++++++++------------
fs/nfs/delegation.h | 5 +-
fs/nfs/nfs4_fs.h | 6 ++
fs/nfs/nfs4proc.c | 18 ++--
fs/nfs/nfs4state.c | 8 +-
fs/nfs/nfs4super.c | 4 +-
7 files changed, 237 insertions(+), 82 deletions(-)
More information about the kernel-team
mailing list