[Bug 1914456] Re: 120 sec kernel timeout is seen during SCSI remove_device.
Guilherme G. Piccoli
1914456 at bugs.launchpad.net
Fri Mar 19 21:04:10 UTC 2021
Hi Jitendra, I am sorry for the delay - I tried to reproduce the issue both with a regular SCSI device in a virtual machine and with iSCSI, but neither attempt triggered it. Based on the stack traces I see in your "echo w" output, it seems to be related to your special iSCSI target.
Also, did the SCSI device that was removed hold a btrfs filesystem?
I don't see any benefit in keeping the machine in this state, so please go
ahead and repurpose it to try reproducing the issue. Before that, I'd like
you to set up kdump, if possible, so we can collect a dump when the hang
happens. Also, please try the latest Bionic kernel, 4.15.0-139.
To set up kdump, please run the following as the root user:
1) apt-get update; apt-get install linux-crashdump
2) Answer the installer questions using the default responses
3) Edit the file "/etc/default/grub.d/kdump-tools.cfg" and change the
crashkernel setting to be something like "crashkernel=440M" - you can
try a bit less memory, but since it's hard to reproduce, it's safer to
keep a large value to prevent kdump failure
4) Please execute, as root user:
echo "kernel.hung_task_panic=1" >> /etc/sysctl.conf
5) Reboot the machine and check the output of "kdump-config show" - it
should show that kdump is ready. If so, please try a dummy kdump to
check if it's working, by running:
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
6) If it works, the node should reboot and you should have a dump collected in /var/crash/.
In that case, go ahead and try to reproduce.
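Before starting the reproducer, a quick sanity check along these lines may help confirm that the settings from steps 3 and 4 took effect after the reboot. This is only a sketch: the function name check_kdump_prereqs and its two optional path arguments are hypothetical and not part of kdump-tools; by default it reads /proc/cmdline and /proc/sys/kernel/hung_task_panic.

```shell
# Hypothetical helper (not shipped by kdump-tools): checks that the running
# kernel was booted with a crashkernel= reservation and that a hung task
# will trigger a panic (and therefore a kdump).
check_kdump_prereqs() {
    cmdline_file="${1:-/proc/cmdline}"
    panic_file="${2:-/proc/sys/kernel/hung_task_panic}"

    # The crashkernel= parameter must be present on the kernel command line.
    if ! grep -q 'crashkernel=' "$cmdline_file" 2>/dev/null; then
        echo "crashkernel= missing from kernel command line"
        return 1
    fi

    # kernel.hung_task_panic must be 1 so the 120 sec hang produces a dump.
    if [ "$(cat "$panic_file" 2>/dev/null)" != "1" ]; then
        echo "kernel.hung_task_panic is not set to 1"
        return 1
    fi

    echo "kdump prerequisites look OK"
    return 0
}
```

It could be run as "check_kdump_prereqs && kdump-config show"; it does not replace the dummy crash test from step 5.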
Thanks for your effort here - I'll be out next week; as soon as I'm back I'll continue the work.
Cheers,
Guilherme
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/1914456
Title:
120 sec kernel timeout is seen during SCSI remove_device.
Status in open-iscsi package in Ubuntu:
Confirmed
Bug description:
Ubuntu Release
--------------
#lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04
#cat /proc/version_signature
Ubuntu 4.15.0-122.124-generic 4.15.18
Package version
---------------
#apt-cache policy open-iscsi
open-iscsi:
Installed: 2.0.874-5ubuntu2.10
Candidate: 2.0.874-5ubuntu2.10
Problem statement and details
-----------------------------
During automation testing of a vendor's iSCSI target using the open-iscsi initiator, the initiator host reported a 120-second kernel hang. The automation test was performing a SCSI remove_device operation when the issue was observed.
The automation test performs the following sequence of operations:
1. Establish iSCSI session
2. Create a bunch of iSCSI LUNs.
3. Discover LUNs through sysfs scan
4. Format LUNs
5. Perform IO
6. Remove LUN
7. Delete LUN
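For readers unfamiliar with steps 3 and 6, they are typically driven through sysfs roughly as sketched here. This is an assumption about how such a test might do it, not the reporter's actual automation: the function names, the SYSFS_ROOT override, and the host/device names in the usage comments are placeholders. The write to the "delete" attribute is the path that reaches sdev_store_delete and scsi_remove_device in the stack trace.

```shell
# SYSFS_ROOT is a hypothetical override for dry runs; on a real host it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Step 3: ask a SCSI host to rescan all channels, targets and LUNs.
# usage: scan_scsi_host host3
scan_scsi_host() {
    echo "- - -" > "$SYSFS_ROOT/class/scsi_host/$1/scan"
}

# Step 6: remove a SCSI device from the initiator.
# The kernel handles this write in sdev_store_delete().
# usage: remove_scsi_device sdb
remove_scsi_device() {
    echo 1 > "$SYSFS_ROOT/block/$1/device/delete"
}
```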
Observations from initiator host:
1. Already discovered iSCSI LUNs went to offline state.
2. New LUNs are not being discovered.
3. NOP-in/NOP-out PDU exchange works fine from the iSCSI session.
Note: A single iSCSI session is present between initiator and target.
Expected behavior
-----------------
SCSI remove_device should succeed and the automation test should continue.
The issue is observed even with the following commit, which has a fix for a similar issue.
https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/?id=27dfa4073289ee5737d45b4cfa40b11f5cdeeaa5
Stack trace
-----------
[91832.800739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[91832.809982] Call Trace:
[91832.809994] __schedule+0x24e/0x880
[91832.810002] ? __enqueue_entity+0x5c/0x60
[91832.810006] ? select_task_rq_fair+0x642/0xab0
[91832.810008] schedule+0x2c/0x80
[91832.810010] schedule_preempt_disabled+0xe/0x10
[91832.810012] __mutex_lock.isra.5+0x276/0x4e0
[91832.810017] ? kernfs_name_hash+0x17/0x80
[91832.810020] __mutex_lock_slowpath+0x13/0x20
[91832.810021] ? __mutex_lock_slowpath+0x13/0x20
[91832.810023] mutex_lock+0x2f/0x40
[91832.810030] scsi_remove_device+0x1e/0x40
[91832.810033] sdev_store_delete+0x55/0xa0
[91832.810036] dev_attr_store+0x1b/0x30
[91832.810039] sysfs_kf_write+0x3c/0x50
[91832.810040] kernfs_fop_write+0x125/0x1a0
[91832.810046] __vfs_write+0x1b/0x40
[91832.810048] vfs_write+0xb1/0x1a0
[91832.810050] SyS_write+0x5c/0xe0
[91832.810055] do_syscall_64+0x73/0x130
[91832.810058] entry_SYSCALL_64_after_hwframe+0x41/0xa6
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1914456/+subscriptions