[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation
Silas Horton
2062568 at bugs.launchpad.net
Wed Jan 22 23:06:00 UTC 2025
As another data point, we're seeing this bug on one of our NFS servers
running Ubuntu 24.04.1 with the 6.8.0-51 kernel. Like one of the users
above, we were also unable to start nfs-kernel-server when trying to
restart the service.
[2164406.282362] INFO: task nfsd:8039 blocked for more than 122 seconds.
[2164406.282421] Tainted: P O 6.8.0-51-generic #52-Ubuntu
[2164406.282483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2164406.282547] task:nfsd state:D stack:0 pid:8039 tgid:8039 ppid:2 flags:0x00004000
[2164406.282580] Call Trace:
[2164406.282585] <TASK>
[2164406.282593] __schedule+0x27c/0x6b0
[2164406.282612] schedule+0x33/0x110
[2164406.282624] io_schedule+0x46/0x80
[2164406.282636] folio_wait_bit_common+0x136/0x330
[2164406.282654] ? __pfx_wake_page_function+0x10/0x10
[2164406.282668] folio_wait_bit+0x18/0x30
[2164406.282679] folio_wait_writeback+0x2b/0xa0
[2164406.282695] truncate_inode_partial_folio+0x81/0x180
[2164406.282715] truncate_inode_pages_range+0x23a/0x530
[2164406.282746] truncate_inode_pages_final+0x40/0x50
[2164406.282761] evict+0x298/0x2b0
[2164406.282775] ? vfs_getattr_nosec+0xb7/0x100
[2164406.282791] ? vfs_getattr+0x4d/0x80
[2164406.282805] iput+0x144/0x250
[2164406.282817] nfsd_unlink+0x179/0x370 [nfsd]
[2164406.283053] nfsd3_proc_remove+0x68/0xd0 [nfsd]
[2164406.283275] nfsd_dispatch+0xd4/0x220 [nfsd]
[2164406.283495] svc_process_common+0x4fd/0x750 [sunrpc]
[2164406.283794] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[2164406.284018] svc_process+0x132/0x1b0 [sunrpc]
[2164406.284291] svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
[2164406.284558] svc_recv+0x18b/0x2e0 [sunrpc]
[2164406.284788] ? __pfx_nfsd+0x10/0x10 [nfsd]
[2164406.284986] nfsd+0x8b/0xe0 [nfsd]
[2164406.285175] kthread+0xef/0x120
[2164406.285186] ? __pfx_kthread+0x10/0x10
[2164406.285196] ret_from_fork+0x44/0x70
[2164406.285205] ? __pfx_kthread+0x10/0x10
[2164406.285213] ret_from_fork_asm+0x1b/0x30
[2164406.285225] </TASK>
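For anyone collecting more data points, here is a minimal sketch of pulling the hung-task headers out of saved dmesg output. It assumes the header format shown in the report above; the function name `find_hung_tasks` and the dict keys are made up for illustration and do not come from any existing tool.

```python
import re

# Matches the header line of a kernel hung-task report, e.g.
# "[2164406.282362] INFO: task nfsd:8039 blocked for more than 122 seconds."
HUNG_TASK_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(dmesg_text):
    """Return one dict per hung-task header line found in dmesg_text."""
    reports = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK_RE.match(line.strip())
        if m:
            reports.append({
                "uptime_s": float(m.group("ts")),   # seconds since boot
                "comm": m.group("comm"),            # task name
                "pid": int(m.group("pid")),
                "blocked_s": int(m.group("secs")),  # reported blocked time
            })
    return reports


# The two header lines quoted in this bug report.
sample = """\
[2164406.282362] INFO: task nfsd:8039 blocked for more than 122 seconds.
[11183.290548] INFO: task nfsd:1419 blocked for more than 1228 seconds.
"""

for r in find_hung_tasks(sample):
    print(r["comm"], r["pid"], r["blocked_s"])
```

Feeding it the text of `dmesg` or a saved kernel log should give one entry per report, which makes it easier to see whether the same nfsd thread keeps showing up.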
× nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; preset: enabled)
Drop-In: /run/systemd/generator/nfs-server.service.d
└─order-with-mounts.conf
Active: failed (Result: timeout) since Wed 2025-01-22 14:39:33 PST; 11min ago
Duration: 1month 4d 18h 49min 1.130s
Main PID: 8031 (code=exited, status=0/SUCCESS)
Tasks: 2 (limit: 462611)
Memory: 392.0K (peak: 524.0K)
CPU: 3ms
CGroup: /system.slice/nfs-server.service
├─3754609 /usr/sbin/rpc.nfsd 0
└─3755576 /usr/sbin/exportfs -au
Jan 22 14:35:02 nfs-server systemd[1]: nfs-server.service: Processes still around after SIGKILL. Ignoring.
Jan 22 14:36:32 nfs-server systemd[1]: nfs-server.service: State 'stop-post' timed out. Terminating.
Jan 22 14:38:02 nfs-server systemd[1]: nfs-server.service: State 'final-sigterm' timed out. Killing.
Jan 22 14:38:02 nfs-server systemd[1]: nfs-server.service: Killing process 3755576 (exportfs) with signal SIGKILL.
Jan 22 14:38:02 nfs-server systemd[1]: nfs-server.service: Killing process 3754609 (rpc.nfsd) with signal SIGKILL.
Jan 22 14:39:33 nfs-server systemd[1]: nfs-server.service: Processes still around after final SIGKILL. Entering failed mode.
Jan 22 14:39:33 nfs-server systemd[1]: nfs-server.service: Failed with result 'timeout'.
Jan 22 14:39:33 nfs-server systemd[1]: nfs-server.service: Unit process 3754609 (rpc.nfsd) remains running after unit stopped.
Jan 22 14:39:33 nfs-server systemd[1]: nfs-server.service: Unit process 3755576 (exportfs) remains running after unit stopped.
Jan 22 14:39:33 nfs-server systemd[1]: Stopped nfs-server.service - NFS server and services.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/2062568
Title:
nfsd gets unresponsive after some hours of operation
Status in linux package in Ubuntu:
In Progress
Status in nfs-utils package in Ubuntu:
Incomplete
Status in linux source package in Noble:
In Progress
Status in nfs-utils source package in Noble:
Incomplete
Bug description:
I installed the 24.04 Beta on two test machines that were running
22.04 without issues before. One of them exports two volumes that are
mounted by the other machine, which primarily uses them as secondary
storage for ccache.
After being up for a couple of hours (happened twice since yesterday
evening) it seems that nfsd on the machine exporting the volumes hangs
on something.
From dmesg on the server (repeated a few times):
[11183.290548] INFO: task nfsd:1419 blocked for more than 1228 seconds.
[11183.290558] Not tainted 6.8.0-22-generic #22-Ubuntu
[11183.290563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11183.290582] task:nfsd state:D stack:0 pid:1419 tgid:1419 ppid:2 flags:0x00004000
[11183.290587] Call Trace:
[11183.290602] <TASK>
[11183.290606] __schedule+0x27c/0x6b0
[11183.290612] schedule+0x33/0x110
[11183.290615] schedule_timeout+0x157/0x170
[11183.290619] wait_for_completion+0x88/0x150
[11183.290623] __flush_workqueue+0x140/0x3e0
[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
[11183.290744] nfsd4_proc_compound+0x3af/0x770 [nfsd]
[11183.290798] nfsd_dispatch+0xd4/0x220 [nfsd]
[11183.290851] svc_process_common+0x44d/0x710 [sunrpc]
[11183.290924] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[11183.290976] svc_process+0x132/0x1b0 [sunrpc]
[11183.291041] svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
[11183.291105] svc_recv+0x18b/0x2e0 [sunrpc]
[11183.291168] ? __pfx_nfsd+0x10/0x10 [nfsd]
[11183.291220] nfsd+0x8b/0xe0 [nfsd]
[11183.291270] kthread+0xef/0x120
[11183.291274] ? __pfx_kthread+0x10/0x10
[11183.291276] ret_from_fork+0x44/0x70
[11183.291279] ? __pfx_kthread+0x10/0x10
[11183.291281] ret_from_fork_asm+0x1b/0x30
[11183.291286] </TASK>
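The trace above blocks under nfsd4_destroy_session while flushing a workqueue, whereas the trace in the 6.8.0-51 comment at the top blocks in folio_wait_writeback under nfsd_unlink. When comparing reports like these, it may help to extract the innermost frame that belongs to the nfsd module. The following is only a sketch that assumes the frame format seen in these logs; `reliable_frames` and `innermost_in_module` are hypothetical helper names.

```python
import re

# One call-trace frame, e.g.
# "[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]"
# A leading "? " marks an address the unwinder could not verify.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<unreliable>\? )?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>[\w-]+)\])?\s*$"
)


def reliable_frames(trace_text):
    """Return (symbol, module) pairs, skipping frames marked with '?'."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and not m.group("unreliable"):
            frames.append((m.group("sym"), m.group("module")))
    return frames


def innermost_in_module(trace_text, module):
    """Return the first (innermost) reliable frame in the given module."""
    for sym, mod in reliable_frames(trace_text):
        if mod == module:
            return sym
    return None


# A few frames copied from the two traces in this report.
description_trace = """\
[11183.290606] __schedule+0x27c/0x6b0
[11183.290623] __flush_workqueue+0x140/0x3e0
[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
[11183.290924] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
"""
comment_trace = """\
[2164406.282679] folio_wait_writeback+0x2b/0xa0
[2164406.282805] iput+0x144/0x250
[2164406.282817] nfsd_unlink+0x179/0x370 [nfsd]
[2164406.283053] nfsd3_proc_remove+0x68/0xd0 [nfsd]
"""

print(innermost_in_module(description_trace, "nfsd"))
print(innermost_in_module(comment_trace, "nfsd"))
```

Grouping reports by that frame is one way to tell apart hangs that share the same symptom but wait in different code paths.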
From dmesg on the client (repeated a number of times):
[ 6596.911785] RPC: Could not send backchannel reply error: -110
[ 6596.972490] RPC: Could not send backchannel reply error: -110
[ 6837.281307] RPC: Could not send backchannel reply error: -110
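The -110 in these client messages is a negated errno value; on Linux x86-64, 110 is ETIMEDOUT. A small sketch for decoding such lines follows. It assumes the message format shown above and relies on the errno table of the machine running the script, so the symbolic name is only meaningful when run on the same kind of system that produced the log. `backchannel_errors` is a hypothetical helper name.

```python
import errno
import re

# Matches e.g. "RPC: Could not send backchannel reply error: -110"
BACKCHANNEL_RE = re.compile(
    r"RPC: Could not send backchannel reply error: (-?\d+)"
)


def backchannel_errors(dmesg_text):
    """Return (errno_value, symbolic_name) for each backchannel failure."""
    results = []
    for m in BACKCHANNEL_RE.finditer(dmesg_text):
        code = abs(int(m.group(1)))  # kernel logs the value negated
        # Look the value up in the local errno table; fall back to a
        # placeholder if this platform does not define it.
        results.append((code, errno.errorcode.get(code, "UNKNOWN")))
    return results


# Two of the client-side lines quoted in this bug report.
client_log = """\
[ 6596.911785] RPC: Could not send backchannel reply error: -110
[ 6837.281307] RPC: Could not send backchannel reply error: -110
"""

for code, name in backchannel_errors(client_log):
    print(code, name)
```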
ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: nfs-kernel-server 1:2.6.4-3ubuntu5
ProcVersionSignature: Ubuntu 6.8.0-22.22-generic 6.8.1
Uname: Linux 6.8.0-22-generic x86_64
.etc.request-key.d.id_resolver.conf: create id_resolver * * /usr/sbin/nfsidmap -t 600 %k %d
ApportVersion: 2.28.1-0ubuntu1
Architecture: amd64
CasperMD5CheckResult: pass
Date: Fri Apr 19 14:10:25 2024
InstallationDate: Installed on 2024-04-16 (3 days ago)
InstallationMedia: Ubuntu-Server 24.04 LTS "Noble Numbat" - Beta amd64 (20240410.1)
NFSMounts:
NFSv4Mounts:
ProcEnviron:
LANG=en_US.UTF-8
PATH=(custom, no user)
SHELL=/bin/bash
TERM=xterm-256color
XDG_RUNTIME_DIR=<set>
SourcePackage: nfs-utils
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2062568/+subscriptions