[Bug 1569925] Re: Shutdown hang on 16.04 with iscsi targets
Rafael David Tinoco
rafael.tinoco at canonical.com
Wed Aug 23 14:22:59 UTC 2017
Hypothesis,
Test (1) - The error is NEVER propagated to upper layers:
# xfs and ext4 mounted automatically
inaddy at iscsihang:~$ mount | grep _netde
/dev/sda1 on /ext4 type ext4 (rw,relatime,stripe=32,data=ordered,_netdev)
/dev/sdb1 on /xfs type xfs (rw,relatime,attr2,inode64,noquota,_netdev)
# no error propagation
inaddy at iscsihang:~$ sudo iscsiadm -m node -o show | grep timeo.replace
node.session.timeo.replacement_timeout = -1
node.session.timeo.replacement_timeout = -1
# on the target server, drop iSCSI traffic (TCP port 3260) arriving from the guest:
inaddy at machete:~$ sudo iptables -A INPUT -s 192.168.49.8 -p tcp --destination-port 3260 -j DROP
# reboot can't succeed
inaddy at iscsihang:~$ sudo reboot
[ 27.596135] connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294896692, last ping 4294897944, now 4294899196
[ 27.628109] connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294896700, last ping 4294897952, now 4294899204
Systemd hangs forever:
[ OK ] Stopped target Remote File Systems.
Unmounting /ext4...
Unmounting /xfs...
OBS: There is a tight relationship between the connection disappearing
before the umount service runs and systemd's ability to shut the machine
down completely. I would say that the case with no error propagation is
even worse, since the kernel stays locked up forever:
[ 240.132208] INFO: task systemd:1094 blocked for more than 120 seconds.
[ 240.133499] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.134544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.136092] INFO: task umount:1199 blocked for more than 120 seconds.
[ 240.137262] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.138302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.139742] INFO: task umount:1201 blocked for more than 120 seconds.
[ 240.140898] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.141953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Systemd is still trying...
[ OK ] Unmounted /ext4.
[ OK ] Unmounted /xfs.
[ OK ] Stopped File System Check on /dev/disk/by-label/XFS.
[ OK ] Stopped File System Check on /dev/disk/by-label/EXT4.
[ OK ] Removed slice system-systemd\x2dfsck.slice.
[ OK ] Stopped target Remote File Systems (Pre).
Stopping Login to default iSCSI targets...
[ 360.140109] INFO: task systemd:1094 blocked for more than 120 seconds.
[ 360.141219] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.142100] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.143377] INFO: task umount:1199 blocked for more than 120 seconds.
[ 360.144451] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.145333] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.146576] INFO: task umount:1201 blocked for more than 120 seconds.
[ 360.147586] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.148472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
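The hung-task reports above name the same three tasks in every round. A minimal sketch for condensing such output, assuming the kernel log text is available on stdin (for example from dmesg); the function name summarize_hung_tasks is made up for illustration, and task names are assumed to contain no spaces:

```shell
# Read kernel log text on stdin and print one line per blocked task,
# in the form "NAME:PID COUNT", where COUNT is the number of
# hung-task reports seen for that task.
summarize_hung_tasks() {
    sed -n 's/.*INFO: task \(.*\) blocked for more than [0-9]* seconds\..*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Hypothetical usage (not run here):
#   dmesg | summarize_hung_tasks
```

A count that keeps growing for the same umount PIDs would be consistent with the unmounts never completing.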
These hung-task reports repeat forever. I still have to find a way to make
systemd shut down the network and cause this hang, because the error is
likely propagated after the umount service gives up (or something
like it) <-- theory.
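The "no error propagation" condition in this test corresponds to node.session.timeo.replacement_timeout = -1. The comments in open-iscsi's iscsid.conf describe a value below 0 as leaving I/O queued until the session is logged back in or the user logs out. A minimal sketch for spotting such records, assuming the output of iscsiadm -m node -o show is piped in; the function name is made up for illustration:

```shell
# Read `iscsiadm -m node -o show` output on stdin and print every
# replacement_timeout setting whose value is negative (I/O stays queued,
# so no error reaches the upper layers).
# Exit status: 0 if at least one negative value was found, 1 otherwise.
find_infinite_replacement_timeouts() {
    awk -F ' = ' '
        $1 == "node.session.timeo.replacement_timeout" && $2 + 0 < 0 {
            print $1 " = " $2
            found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Hypothetical usage (needs root, not run here):
#   sudo iscsiadm -m node -o show | find_infinite_replacement_timeouts
```

This only reports the setting. Whether a finite value changes the shutdown behaviour is not established by this test.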
--
https://bugs.launchpad.net/bugs/1569925
Title:
Shutdown hang on 16.04 with iscsi targets
Status in systemd package in Ubuntu:
Confirmed
Status in systemd source package in Xenial:
In Progress
Bug description:
I have 4 servers running the latest 16.04 updates from the development
branch (as of right now).
Each server is connected to NetApp storage using the iSCSI software
initiator. There are a total of 56 volumes spread across two NetApp
arrays. Each volume has 4 paths available to it, which are managed by
device mapper.
While logged into the iscsi sessions all I have to do is reboot the
server and I get a hang.
I see a message that says:
"Reached target Shutdown"
followed by
"systemd-shutdown[1]: Failed to finalize DM devices, ignoring"
and then I see 8 lines that say:
"connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection5:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection6:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection7:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
"connection8:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311815***, last ping 43118164**, now 4311817***"
NOTE: the actual values of the *'s differ for each line above.
This seems like a bug somewhere but I am unaware of any additional
logging that I could turn on to pinpoint the problem.
Note I also have similar setups that are not doing iscsi and they
don't have this problem.
Here is a screenshot of what I see on the shell when I try to reboot:
(https://launchpadlibrarian.net/291303059/Screenshot.jpg)
This is being tracked in NetApp bug tracker CQ number 860251.
If I log out of all iscsi sessions before rebooting then I do not
experience the hang:
iscsiadm -m node -U all
We are wondering if this could be some kind of shutdown ordering
problem, for example the network devices have already disappeared when
iscsi tries to perform some operation (hence the ping timeouts).
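The workaround described above (logging out of all sessions with iscsiadm -m node -U all before rebooting) can be sketched as a manual pre-reboot step. Unmounting the _netdev filesystems first is an assumption added here, not something the report states, and the helper name list_netdev_mountpoints is made up for illustration:

```shell
# Read `mount` output on stdin and print the mount point of every entry
# whose option list contains _netdev.
# Assumes mount points contain no spaces.
list_netdev_mountpoints() {
    awk '$2 == "on" && $4 == "type" && $NF ~ /[(,]_netdev[,)]/ { print $3 }'
}

# Hypothetical usage before a reboot (needs root, not run here):
#   mount | list_netdev_mountpoints | while read -r mp; do umount "$mp"; done
#   iscsiadm -m node -U all
```

This only finds filesystems that were mounted with the _netdev option, so anything on iSCSI storage mounted without it would be missed.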