[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
David D Lowe
1906476 at bugs.launchpad.net
Sat Oct 16 10:05:20 UTC 2021
I installed an Ubuntu 21.10 system with ZFS and encryption, and I installed
all APT updates. I quickly started experiencing filesystem corruption
within an hour or two, and now my system won't boot. I see that this bug
has been marked "Fix Released", but I am still experiencing it.
Even before the bug fix is released, I urge the developers to take
urgent preventative measures to stop users from experiencing filesystem
corruption, especially users who are upgrading, since they have more
data to lose. Specifically, I ask you to take these three actions:
1. Make the upgrade tool refuse to upgrade to Ubuntu 21.10 if ZFS is
being used, until a permanent fix is released.
2. If possible, don't allow users to install a fresh installation of
Ubuntu 21.10 with ZFS enabled. I don't know if it's possible to release
a new ISO, but please consider that it might be worth releasing a
21.10.01 ISO.
3. The release notes for Ubuntu 21.10 do contain warnings about
filesystem corruption when using ZFS, but you have to read 2800 words
before reaching the paragraph that warns about this. Please consider
modifying the release notes to make this warning more prominent. Also,
modify any other web pages that users might check before installing
Ubuntu 21.10 or upgrading to it.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ubuntu-release-upgrader in
Ubuntu.
https://bugs.launchpad.net/bugs/1906476
Title:
PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 ==
sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED,
&zp->z_sa_hdl)) failed
Status in Native ZFS for Linux:
New
Status in linux package in Ubuntu:
Invalid
Status in ubuntu-release-upgrader package in Ubuntu:
New
Status in zfs-linux package in Ubuntu:
Fix Released
Status in linux source package in Impish:
Fix Committed
Status in ubuntu-release-upgrader source package in Impish:
New
Status in zfs-linux source package in Impish:
Fix Released
Bug description:
Since today while running Ubuntu 21.04 Hirsute I started getting a ZFS
panic in the kernel log which was also hanging Disk I/O for all
Chrome/Electron Apps.
I have narrowed down a few important notes:
- It does not happen with module version 0.8.4-1ubuntu11 built and included with 5.8.0-29-generic
- It was happening when using zfs-dkms 0.8.4-1ubuntu16 built with DKMS
on the same kernel and also on 5.8.18-acso (a custom kernel).
- For whatever reason multiple Chrome/Electron apps were affected,
specifically Discord, Chrome and Mattermost. In all cases they seem to
hang trying to open files in their 'Cache' directory, e.g.
~/.cache/google-chrome/Default/Cache and ~/.config/Mattermost/Cache.
(I was unable to strace the processes, so it was hard to confirm 100%;
this is by deduction from /proc/PID/fd and the hanging ls.) While the
issue was going on I could not list that directory either; "ls" would
just hang.
- Once I removed zfs-dkms only to revert to the kernel built-in
version it immediately worked without changing anything, removing
files, etc.
- It happened over multiple reboots and kernels every time, all my
Chrome apps weren't working but for whatever reason nothing else
seemed affected.
- It would log a series of spl_panic dumps into kern.log that look like this:
Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()
I could only find one other google reference to this issue, with 2 other users reporting the same error but on 20.04 here:
https://github.com/openzfs/zfs/issues/10971
- I was not experiencing the issue on 0.8.4-1ubuntu14 and fairly sure
it was working on 0.8.4-1ubuntu15 but broken after upgrade to
0.8.4-1ubuntu16. I will reinstall those zfs-dkms versions to verify
that.
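For anyone wanting to check their own kern.log for the same signature, here is a minimal sketch in Python. It assumes the PANIC line always has the shape quoted above (file:line:function()); the function name find_spl_panics and the regular expression are illustrative, not part of any ZFS tooling.

```python
import re

# Assumed shape of the SPL panic line, taken from the log excerpt in this report.
PANIC_RE = re.compile(r"PANIC at (?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)")

def find_spl_panics(lines):
    """Return (source file, line number, function) for each PANIC line found."""
    hits = []
    for text in lines:
        m = PANIC_RE.search(text)
        if m:
            hits.append((m.group("file"), int(m.group("line")), m.group("func")))
    return hits

# The two kern.log lines quoted in the bug description.
sample = [
    "Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == sa_handle_get_from_db("
    "zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed",
    "Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()",
]
print(find_spl_panics(sample))
```

To scan a real log, pass an open file object instead of the sample list, since the function only iterates over lines.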
There were a few originating call stacks but the first one I hit was
Call Trace:
dump_stack+0x74/0x95
spl_dumpstack+0x29/0x2b [spl]
spl_panic+0xd4/0xfc [spl]
? sa_cache_constructor+0x27/0x50 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dmu_buf_set_user_ie+0x54/0x80 [zfs]
zfs_znode_sa_init+0xe0/0xf0 [zfs]
zfs_znode_alloc+0x101/0x700 [zfs]
? arc_buf_fill+0x270/0xd30 [zfs]
? __cv_init+0x42/0x60 [spl]
? dnode_cons+0x28f/0x2a0 [zfs]
? _cond_resched+0x19/0x40
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? aggsum_add+0x153/0x170 [zfs]
? spl_kmem_alloc_impl+0xd8/0x110 [spl]
? arc_space_consume+0x54/0xe0 [zfs]
? dbuf_read+0x4a0/0xb50 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dnode_rele_and_unlock+0x5a/0xc0 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dmu_object_info_from_dnode+0x84/0xb0 [zfs]
zfs_zget+0x1c3/0x270 [zfs]
? dmu_buf_rele+0x3a/0x40 [zfs]
zfs_dirent_lock+0x349/0x680 [zfs]
zfs_dirlook+0x90/0x2a0 [zfs]
? zfs_zaccess+0x10c/0x480 [zfs]
zfs_lookup+0x202/0x3b0 [zfs]
zpl_lookup+0xca/0x1e0 [zfs]
path_openat+0x6a2/0xfe0
do_filp_open+0x9b/0x110
? __check_object_size+0xdb/0x1b0
? __alloc_fd+0x46/0x170
do_sys_openat2+0x217/0x2d0
? do_sys_openat2+0x217/0x2d0
do_sys_open+0x59/0x80
__x64_sys_openat+0x20/0x30
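In Linux kernel stack dumps, entries prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain. The sketch below filters a trace like the one above down to the unprefixed frames, which makes the path from the open call to the panic easier to read. The name reliable_frames and the regular expression are assumptions for illustration, and the sample is only an excerpt of the trace.

```python
import re

# Matches "symbol+0xOFF/0xSIZE [module]" with an optional leading "? " marker.
FRAME_RE = re.compile(
    r"^\s*(?P<guess>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?\s*$"
)

def reliable_frames(trace_lines):
    """Return (symbol, module) for each frame that is not marked with '?'."""
    frames = []
    for line in trace_lines:
        m = FRAME_RE.match(line)
        if m and not m.group("guess"):
            frames.append((m.group("sym"), m.group("mod")))
    return frames

# Excerpt of the call trace from this report.
sample = """\
spl_panic+0xd4/0xfc [spl]
? sa_cache_constructor+0x27/0x50 [zfs]
zfs_znode_sa_init+0xe0/0xf0 [zfs]
zfs_znode_alloc+0x101/0x700 [zfs]
? arc_buf_fill+0x270/0xd30 [zfs]
zfs_zget+0x1c3/0x270 [zfs]
path_openat+0x6a2/0xfe0
""".splitlines()

for sym, mod in reliable_frames(sample):
    print(sym, mod or "(core kernel)")
```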
To manage notifications about this bug go to:
https://bugs.launchpad.net/zfs/+bug/1906476/+subscriptions