[Bug 1958284] Re: shutdown hangs at "Waiting for process: ..." for 90s, ignoring DefaultTimeoutStopSec
Brian Murray
1958284 at bugs.launchpad.net
Mon Mar 28 18:24:09 UTC 2022
Hello Jean, or anyone else affected,
Accepted systemd into focal-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/systemd/245.4-4ubuntu3.16 in a few
hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from
verification-needed-focal to verification-done-focal. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-focal. In either case, without details of your
testing we will not be able to proceed.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!
N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.
** Changed in: systemd (Ubuntu Focal)
Status: In Progress => Fix Committed
** Tags added: verification-needed verification-needed-focal
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1958284
Title:
shutdown hangs at "Waiting for process: ..." for 90s, ignoring
DefaultTimeoutStopSec
Status in systemd package in Ubuntu:
Fix Released
Status in systemd source package in Focal:
Fix Committed
Bug description:
[Impact]
The systemd shutdown sequence does not honor systemd-system.conf
settings when waiting for remaining processes. This means that, for
example, if a systemd service specifies KillMode=process and a process
remaining from that service does not properly handle SIGTERM, then the
remaining process will not be killed until after the compiled-in
default value of DefaultTimeoutStopSec (90s), even if the user has
changed the setting of DefaultTimeoutStopSec. In such cases, this
impacts users by significantly increasing the time required for
shutdown/reboot.
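The stop sequence described above (SIGTERM, a bounded wait, then SIGKILL) can be imitated outside systemd to see why the length of the wait matters. The following is only a minimal sketch: stop_with_timeout is a hypothetical helper written for illustration, not systemd's implementation, and it polls once per second.

```shell
# Hypothetical helper, not part of systemd: send SIGTERM to PID, wait up to
# TIMEOUT seconds for it to exit, then fall back to SIGKILL.
# Usage: stop_with_timeout PID TIMEOUT
stop_with_timeout() {
    local pid="$1" timeout="$2" waited=0
    kill -TERM "$pid" 2>/dev/null
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            echo "SIGKILL after ${waited}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "exited on SIGTERM after ${waited}s"
}
```

For a process that ignores SIGTERM, the helper only returns once the full timeout has elapsed. The same effect is what makes a 90s wait at shutdown so noticeable when a much shorter value was configured.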
[Test Plan]
* Create a new script, /usr/local/bin/loop-ignore-sigterm:
```
#!/bin/bash
loop_forever() {
while true; do sleep 1; done
}
(
trap 'echo Ignoring SIGTERM...' SIGTERM
loop_forever
)
loop_forever
```
This script will spawn a subshell which will loop forever and ignore
SIGTERM. This will force systemd to wait for the subprocess at
reboot/shutdown, and eventually send SIGKILL after TimeoutStopSec
(DefaultTimeoutStopSec in this case).
* Make the script executable:
$ chmod +x /usr/local/bin/loop-ignore-sigterm
* Create a systemd service for this script. Add the following to
/etc/systemd/system/loop-ignore-sigterm.service:
```
[Service]
KillMode=process
ExecStart=/usr/local/bin/loop-ignore-sigterm
```
* Start the service:
$ systemctl start loop-ignore-sigterm.service
* Edit /etc/systemd/system.conf, and uncomment the
'DefaultTimeoutStopSec=90s' line. Modify 90s to something much shorter,
e.g. 20s.
* Re-exec the daemon so this new default takes effect:
$ systemctl daemon-reexec
* Reboot, and monitor the logs. Observe that systemd-shutdown will wait
for the loop-ignore-sigterm process for 90s, instead of the 20s
configured earlier.
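The system.conf edit in the test plan can also be scripted. This is a sketch under stated assumptions: set_default_timeout_stop_sec is a hypothetical helper, it relies on GNU sed (-i and -E), and it takes the file path as an argument so it can be tried on a copy before touching /etc/systemd/system.conf.

```shell
# Hypothetical helper: set DefaultTimeoutStopSec in a systemd manager
# configuration file, uncommenting the line if necessary.
# Usage: set_default_timeout_stop_sec FILE VALUE
set_default_timeout_stop_sec() {
    local conf="$1" value="$2"
    if grep -Eq '^[#[:space:]]*DefaultTimeoutStopSec=' "$conf"; then
        # Replace the existing (possibly commented-out) line in place.
        sed -i -E "s|^[#[:space:]]*DefaultTimeoutStopSec=.*|DefaultTimeoutStopSec=${value}|" "$conf"
    else
        # Assumption: [Manager] is the only section, so appending is safe.
        printf 'DefaultTimeoutStopSec=%s\n' "$value" >> "$conf"
    fi
}
```

For example, running it against a copy of the file and inspecting the result:

$ cp /etc/systemd/system.conf /tmp/system.conf
$ set_default_timeout_stop_sec /tmp/system.conf 20s
$ grep DefaultTimeoutStopSec /tmp/system.conf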
[Where problems could occur]
The patch moves the reset_arguments() call to the end of main, which
means reset_arguments() is no longer called before daemon re-execution
(if that branch is taken). If anything in that code path relied on
reset_arguments() being called before re-executing, those assumptions
could be broken. Any such problems would potentially be seen during
daemon re-execution, e.g. when calling systemctl daemon-reexec.
[ Original Description ]
With systemd v245 as shipped with 20.04, the shutdown sequence does
not use the value of `DefaultTimeoutStopSec` to wait for remaining
processes; it instead uses the compiled-in default of 90s.
This is most visible with services that use `KillMode=process`
(docker, k8s, k3s, etc...), especially if the remaining processes do
not handle `SIGTERM` or choose to ignore it.
For example:
```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower
value ---
```
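To see which processes are holding up shutdown, the names can be pulled out of a captured console log in the format shown above. This is a small sketch: waiting_processes is a hypothetical helper, and it assumes the log is supplied on standard input.

```shell
# Hypothetical helper: read a console log on stdin and print the unique
# process names listed on "Waiting for process:" lines, one per line.
waiting_processes() {
    sed -n 's/.*systemd-shutdown\[[0-9]*]: Waiting for process: //p' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | sort -u
}
```

Fed the line from the example above, this prints containerd-shim and fluent-bit.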
The bug has been fixed upstream here:
https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22
Marc was kind enough to package the patch for 20.04 so I could test it
(https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra)
and with that package, I can confirm that it indeed fixes the issue.
Here are a few GitHub issues I stumbled upon while trying to debug this,
along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html
Of course, it would be much better if all the processes would properly
handle `SIGTERM`, but having a way to enforce a maximum wait time at
shutdown is a decent workaround.
Given that the patch is relatively simple, would it be possible to add
it to the package for 20.04?
Thanks