[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation
Robert Williams
2062568 at bugs.launchpad.net
Thu Sep 26 07:13:39 UTC 2024
Performed two upgrades from 22.04 yesterday, and both machines locked up
overnight with the trace below. This looks like the same issue; can
anyone confirm for me?
kernel: INFO: task nfsd:2029 blocked for more than 122 seconds.
kernel: Tainted: G OE 6.8.0-45-generic #45-Ubuntu
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: task:nfsd state:D stack:0 pid:2029 tgid:2029 ppid:2 flags:0x00004000
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x27c/0x6b0
kernel: ? __smp_call_single_queue+0xe0/0x180
kernel: schedule+0x33/0x110
kernel: schedule_timeout+0x157/0x170
kernel: wait_for_completion+0x88/0x150
kernel: __flush_workqueue+0x140/0x3e0
kernel: ? nfsd4_run_cb+0x30/0x70 [nfsd]
kernel: nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
kernel: nfsd4_destroy_session+0x186/0x260 [nfsd]
kernel: nfsd4_proc_compound+0x3b7/0x780 [nfsd]
kernel: nfsd_dispatch+0xd7/0x220 [nfsd]
kernel: svc_process_common+0x450/0x710 [sunrpc]
kernel: ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
kernel: svc_process+0x132/0x1b0 [sunrpc]
kernel: svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
kernel: svc_recv+0x18b/0x2e0 [sunrpc]
kernel: ? __pfx_nfsd+0x10/0x10 [nfsd]
kernel: nfsd+0x8b/0xe0 [nfsd]
kernel: kthread+0xf2/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x47/0x70
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
The most annoying part is that nothing seems to allow recovery without a reload, unless someone knows a trick for getting it back up?
These hosts are VMs, and the NFS share is purely for some bulk data
backups. I will shift them to the kernel mentioned above later today. If
this proves to be a fix, how soon might it roll out? I've got a large
number of hosts to move to 24.04 and will be holding off until this is
fixed, as it's quite a showstopper.
Cheers!
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/2062568
Title:
nfsd gets unresponsive after some hours of operation
Status in linux package in Ubuntu:
Confirmed
Status in nfs-utils package in Ubuntu:
Confirmed
Bug description:
I installed the 24.04 Beta on two test machines that were running
22.04 without issues before. One of them exports two volumes that are
mounted by the other machine, which primarily uses them as secondary
storage for ccache.
After being up for a couple of hours (happened twice since yesterday
evening) it seems that nfsd on the machine exporting the volumes hangs
on something.
From dmesg on the server (repeated a few times):
[11183.290548] INFO: task nfsd:1419 blocked for more than 1228 seconds.
[11183.290558] Not tainted 6.8.0-22-generic #22-Ubuntu
[11183.290563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11183.290582] task:nfsd state:D stack:0 pid:1419 tgid:1419 ppid:2 flags:0x00004000
[11183.290587] Call Trace:
[11183.290602] <TASK>
[11183.290606] __schedule+0x27c/0x6b0
[11183.290612] schedule+0x33/0x110
[11183.290615] schedule_timeout+0x157/0x170
[11183.290619] wait_for_completion+0x88/0x150
[11183.290623] __flush_workqueue+0x140/0x3e0
[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
[11183.290744] nfsd4_proc_compound+0x3af/0x770 [nfsd]
[11183.290798] nfsd_dispatch+0xd4/0x220 [nfsd]
[11183.290851] svc_process_common+0x44d/0x710 [sunrpc]
[11183.290924] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[11183.290976] svc_process+0x132/0x1b0 [sunrpc]
[11183.291041] svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
[11183.291105] svc_recv+0x18b/0x2e0 [sunrpc]
[11183.291168] ? __pfx_nfsd+0x10/0x10 [nfsd]
[11183.291220] nfsd+0x8b/0xe0 [nfsd]
[11183.291270] kthread+0xef/0x120
[11183.291274] ? __pfx_kthread+0x10/0x10
[11183.291276] ret_from_fork+0x44/0x70
[11183.291279] ? __pfx_kthread+0x10/0x10
[11183.291281] ret_from_fork_asm+0x1b/0x30
[11183.291286] </TASK>
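For anyone checking their own logs for the same signature, here is a minimal Python sketch that pulls hung-task reports for nfsd out of dmesg or journal output. It is not part of the original report: the function name `find_hung_tasks` and the regular expression are illustrative assumptions based only on the message format quoted above, and task names containing spaces are not handled.

```python
import re

# Matches the kernel hung-task report line, for example:
# "INFO: task nfsd:1419 blocked for more than 1228 seconds."
# The pattern is an assumption derived from the log lines in this report.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines, comm="nfsd"):
    """Return (pid, seconds) tuples for hung-task reports of the given task name."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match and match.group("comm") == comm:
            hits.append((int(match.group("pid")), int(match.group("secs"))))
    return hits


# Sample lines copied from the traces in this bug.
sample = [
    "[11183.290548] INFO: task nfsd:1419 blocked for more than 1228 seconds.",
    "[11183.290558] Not tainted 6.8.0-22-generic #22-Ubuntu",
    "kernel: INFO: task nfsd:2029 blocked for more than 122 seconds.",
]

print(find_hung_tasks(sample))
```

The same function can be fed the lines of a saved dmesg dump or journal export; a match on nfsd alone does not prove it is this bug, so the call trace still needs to be compared by hand.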
From dmesg on the client (repeated a number of times):
[ 6596.911785] RPC: Could not send backchannel reply error: -110
[ 6596.972490] RPC: Could not send backchannel reply error: -110
[ 6837.281307] RPC: Could not send backchannel reply error: -110
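The -110 in the client messages is a negated kernel errno value; on Linux, 110 is ETIMEDOUT ("Connection timed out"). A small Python sketch for decoding such values follows. It is not part of the original report, the helper name `describe_kernel_errno` is hypothetical, and the numeric mapping is platform-specific, so it should be run on a Linux host.

```python
import errno
import os


def describe_kernel_errno(value):
    """Map a negative kernel-style error code to (symbolic name, message).

    Kernel log messages usually print errors as negative errno values,
    so the sign is dropped before the lookup. The mapping comes from the
    platform running this script, which is assumed to be Linux.
    """
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# -110 is the value from "Could not send backchannel reply error: -110".
print(describe_kernel_errno(-110))
```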
ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: nfs-kernel-server 1:2.6.4-3ubuntu5
ProcVersionSignature: Ubuntu 6.8.0-22.22-generic 6.8.1
Uname: Linux 6.8.0-22-generic x86_64
.etc.request-key.d.id_resolver.conf: create id_resolver * * /usr/sbin/nfsidmap -t 600 %k %d
ApportVersion: 2.28.1-0ubuntu1
Architecture: amd64
CasperMD5CheckResult: pass
Date: Fri Apr 19 14:10:25 2024
InstallationDate: Installed on 2024-04-16 (3 days ago)
InstallationMedia: Ubuntu-Server 24.04 LTS "Noble Numbat" - Beta amd64 (20240410.1)
NFSMounts:
NFSv4Mounts:
ProcEnviron:
LANG=en_US.UTF-8
PATH=(custom, no user)
SHELL=/bin/bash
TERM=xterm-256color
XDG_RUNTIME_DIR=<set>
SourcePackage: nfs-utils
UpgradeStatus: No upgrade log present (probably fresh install)