[PULL][SRU Focal] Avoid overwhelming NFSv4 server with recalled delegations

dann frazier dann.frazier at canonical.com
Mon Jan 31 18:47:54 UTC 2022


BugLink: https://launchpad.net/bugs/1957986

A user reported that they were seeing terrible NFSv4 performance with
one of their workloads, which they were able to simulate using a
benchmark. I could reproduce the problem using that benchmark, and
observed read performance quickly dropping from ~700MiB/s to
<10MiB/s. This is reproducible with upstream v5.4, but not with more
recent upstream kernels, so I used bisection to identify the following
fix:

10717f45639f NFSv4: Limit the total number of cached delegations

= OK, but what are these other patches? =
The identified patch was patch 5/5 of this series:
  https://www.spinics.net/lists/linux-nfs/msg76242.html
It has hard dependencies on patches 1-4, so all 5 are included here.

Further, that set of 5 patches depends on the following 20-part series:
  https://www.spinics.net/lists/linux-nfs/msg75367.html
The first 2 of those already arrived in focal via stable, leaving 18
new patches from that series. I investigated whether we need all 18
of them and found that we could technically omit 8: the remaining
patches would still apply and still prevent the performance issue.
This pull request includes those 8 patches anyway, based on a
recommendation from Nivedita.

Finally, I searched for any upstream patches that are marked as
"Fixes:" for these patches, and 2 were identified. Those patches are
also included here.
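
For readers who want to repeat that kind of search, here is a minimal
sketch (not the tooling actually used for this pull request) of matching
"Fixes:" trailers against a set of backported commits. The function name,
the candidate SHAs and the example commit messages are hypothetical; only
10717f45639f comes from this message.

```python
import re

# Matches the kernel-style trailer: Fixes: <sha> ("subject").
# Captures the (possibly abbreviated) SHA.
FIXES_RE = re.compile(r"^Fixes:\s+([0-9a-f]{8,40})\b", re.MULTILINE | re.IGNORECASE)


def find_followups(backported, candidates):
    """Return SHAs of candidates whose Fixes: trailer names a backported commit.

    backported: iterable of abbreviated or full SHAs being backported.
    candidates: dict mapping candidate SHA -> full commit message.
    """
    backported = [sha.lower() for sha in backported]
    hits = []
    for sha, message in candidates.items():
        for ref in FIXES_RE.findall(message):
            ref = ref.lower()
            # Abbreviated SHAs vary in length, so compare by common prefix.
            if any(b.startswith(ref) or ref.startswith(b) for b in backported):
                hits.append(sha)
                break
    return hits


# Hypothetical candidate commits, for illustration only.
candidates = {
    "aaaaaaaaaaaa": 'NFSv4: example follow-up\n\n'
                    'Fixes: 10717f45639f ("NFSv4: Limit the total number of cached delegations")\n',
    "bbbbbbbbbbbb": 'unrelated change\n\nFixes: 0123456789ab ("something else")\n',
}
followups = find_followups(["10717f45639f"], candidates)
print(followups)
```

In practice the commit messages would come from the upstream history
rather than a hand-written dictionary.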

All patches cherry-picked cleanly except for one (see annotation) that
required a trivial context adjustment in a header file.
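
As a quick sanity check on the patch accounting above, the counts can be
tallied as follows. This is only an arithmetic sketch; the variable names
are mine, and the numbers are the ones stated in this message.

```python
# Patch accounting for this pull request, using the counts stated above.
limit_series = 5        # the identified fix (patch 5/5) plus patches 1-4
dependency_series = 20  # the 20-part series the 5 patches depend on
already_in_focal = 2    # first 2 of the 20 already arrived via stable
fixes_followups = 2     # upstream patches marked "Fixes:" for the above

new_from_dependency = dependency_series - already_in_focal
total = limit_series + new_from_dependency + fixes_followups

print(f"new from 20-part series: {new_from_dependency}")
print(f"total patches in pull:   {total}")
```

The total agrees with the 25 commits listed in the shortlog below.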

= Exploration of other options =
We exhausted 2 other options with the user: using the HWE kernel,
which already includes these fixes, and forcing NFSv3 mode, which is
not impacted:

== HWE Kernel ==
The user has a complex stack containing other 3rd-party software and
drivers that are validated only against the LTS kernel. Using the HWE
kernel would cause support issues with that 3rd party.

== NFSv3 ==
While NFSv3 is not impacted by this issue, it does have another issue
that causes the buffer cache to grow too large after a couple of days,
making it a poor replacement.

= Testing =
== Validation ==
The customer has tested this and confirmed that it resolves the
problem for them (not just the benchmark).

== Regression ==
I performed regression testing using the delegation tests from the
nfstest project:
  https://wiki.linux-nfs.org/wiki/index.php/NFStest
I see the same passes and failures before and after applying these
patches. I discussed the failures with the maintainer of these tests,
and learned that those failures are expected with a Linux NFSv4 server
because our implementation doesn't support all possible delegations. I
did due diligence to see if I could test against a server that does -
specifically NetApp's implementation - but that proved unsuccessful.

  -dann

The following changes since commit 206c12bf2ef2d607a3942b8e1b7c8959b4e95ec8:

  UBUNTU: Ubuntu-5.4.0-98.111 (2022-01-28 10:54:24 +0100)

are available in the Git repository at:

  git://git.launchpad.net/~dannf/ubuntu/+source/linux/+git/linux focal-nfsv4-submit

for you to fetch changes up to 93d0c71c551cbc8956726e6f52e15cfb53a69fdc:

  NFSv4: Ensure the delegation cred is pinned when we call delegreturn (2022-01-31 10:48:54 -0700)

----------------------------------------------------------------
Trond Myklebust (25):
      NFSv4: Fix delegation handling in update_open_stateid()
      NFSv4: nfs4_callback_getattr() should ignore revoked delegations
      NFSv4: Delegation recalls should not find revoked delegations
      NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was revoked
      NFS: Rename nfs_inode_return_delegation_noreclaim()
      NFSv4: Don't remove the delegation from the super_list more than once
      NFSv4: Hold the delegation spinlock when updating the seqid
      NFSv4: Clear the NFS_DELEGATION_REVOKED flag in nfs_update_inplace_delegation()
      NFSv4: Update the stateid seqid in nfs_revoke_delegation()
      NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
      NFSv4: Ignore requests to return the delegation if it was revoked
      NFSv4: Don't reclaim delegations that have been returned or revoked
      NFSv4: nfs4_return_incompatible_delegation() should check delegation validity
      NFSv4: Fix nfs4_inode_make_writeable()
      NFS: nfs_inode_find_state_and_recover() fix stateid matching
      NFSv4: Fix races between open and delegreturn
      NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn
      NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done()
      NFSv4: nfs_inode_evict_delegation() should set NFS_DELEGATION_RETURNING
      NFS: Clear NFS_DELEGATION_RETURN_IF_CLOSED when the delegation is returned
      NFSv4: Try to return the delegation immediately when marked for return on close
      NFSv4: Add accounting for the number of active delegations held
      NFSv4: Limit the total number of cached delegations
      NFSv4: Ensure the delegation is pinned in nfs_do_return_delegation()
      NFSv4: Ensure the delegation cred is pinned when we call delegreturn

 fs/nfs/callback_proc.c |   4 +-
 fs/nfs/delegation.c    | 274 +++++++++++++++++++++++++++++++++++++------------
 fs/nfs/delegation.h    |   5 +-
 fs/nfs/nfs4_fs.h       |   6 ++
 fs/nfs/nfs4proc.c      |  18 ++--
 fs/nfs/nfs4state.c     |   8 +-
 fs/nfs/nfs4super.c     |   4 +-
 7 files changed, 237 insertions(+), 82 deletions(-)


