[Bug 1958284] Autopkgtest regression report (systemd/245.4-4ubuntu3.16)
Ubuntu SRU Bot
1958284 at bugs.launchpad.net
Tue Mar 29 14:31:23 UTC 2022
All autopkgtests for the newly accepted systemd (245.4-4ubuntu3.16) for focal have finished running.
The following regressions have been reported in tests triggered by the package:
gvfs/1.44.1-1ubuntu1 (arm64, ppc64el, amd64)
linux-aws-5.13/5.13.0-1019.21~20.04.1 (arm64)
snapd/2.54.3+20.04.1ubuntu0.2 (arm64, ppc64el, s390x)
docker.io/20.10.7-0ubuntu5~20.04.2 (s390x)
Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].
https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#systemd
[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions
Thank you!
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1958284
Title:
shutdown hangs at "Waiting for process: ..." for 90s, ignoring
DefaultTimeoutStopSec
Status in systemd package in Ubuntu:
Fix Released
Status in systemd source package in Focal:
Fix Committed
Bug description:
[Impact]
The systemd shutdown sequence does not honor systemd-system.conf
settings when waiting for remaining processes. This means that, for
example, if a systemd service specifies KillMode=process and a process
remaining from that service does not properly handle SIGTERM, then the
remaining process will not be killed until after the compiled-in
default value of DefaultTimeoutStopSec (90s), even if the user has
changed the setting of DefaultTimeoutStopSec. In such cases, this
impacts users by significantly increasing the time required for
shutdown/reboot.
[Test Plan]
* Create a new script, /usr/local/bin/loop-ignore-sigterm:
```
#!/bin/bash
loop_forever() {
while true; do sleep 1; done
}
(
trap 'echo Ignoring SIGTERM...' SIGTERM
loop_forever
)
loop_forever
```
This script will spawn a subshell which will loop forever and ignore
SIGTERM. This will force systemd to wait for the subprocess at
reboot/shutdown, and eventually send SIGKILL after TimeoutStopSec
(DefaultTimeoutStopSec in this case).
* Make the script executable:
$ chmod +x /usr/local/bin/loop-ignore-sigterm
* Create a systemd service for this script. Add the following to
/etc/systemd/system/loop-ignore-sigterm.service:
```
[Service]
KillMode=process
ExecStart=/usr/local/bin/loop-ignore-sigterm
```
* Start the service:
$ systemctl start loop-ignore-sigterm.service
* Edit /etc/systemd/system.conf, and uncomment the
'DefaultTimeoutStopSec=90s' line. Modify 90s to something much shorter,
e.g. 20s.
* Re-exec the daemon so this new default takes effect:
$ systemctl daemon-reexec
* Reboot, and monitor the logs. Observe that systemd-shutdown will wait
for the loop-ignore-sigterm process for 90s, instead of the 20s
configured earlier.
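The system.conf edit from the test plan can also be scripted. The following is only a minimal sketch: the function name set_default_timeout_stop_sec is hypothetical and not part of the original test plan, and it assumes GNU sed (as shipped on Ubuntu). It takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original test plan).
# Uncomments and sets DefaultTimeoutStopSec in a system.conf-style file.
# Usage: set_default_timeout_stop_sec <conf-file> <value>
set_default_timeout_stop_sec() {
    local conf="$1" value="$2"
    # Match the setting whether or not the line is commented out.
    sed -i -E "s|^#?DefaultTimeoutStopSec=.*|DefaultTimeoutStopSec=${value}|" "$conf"
    # Print the resulting line so the change can be checked by eye.
    grep '^DefaultTimeoutStopSec=' "$conf"
}
```

Run as root, this would be called as `set_default_timeout_stop_sec /etc/systemd/system.conf 20s`, followed by `systemctl daemon-reexec` as in the step above.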
[Where problems could occur]
The patch moves the reset_arguments() call to the end of main, which
means reset_arguments() is no longer called before daemon re-execution
(if that branch is taken). If anything in that code path relied on
reset_arguments() being called before re-executing, those assumptions
could be broken. Any such problems would potentially be seen during
daemon re-execution, e.g. when calling systemctl daemon-reexec.
[ Original Description ]
With systemd v245 as shipped with 20.04, the shutdown sequence does
not use the value of `DefaultTimeoutStopSec` when waiting for remaining
processes. Instead, it uses the compiled-in default of 90s.
This is most visible with services that use `KillMode=process`
(docker, k8s, k3s, etc...), especially if the remaining processes do
not handle `SIGTERM` or choose to ignore it.
For example:
```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower
value ---
```
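To illustrate why such processes are only removed once SIGKILL is sent, here is a small sketch that can be run without systemd. The function name demo_sigterm_ignored is hypothetical; the child process simply ignores SIGTERM, similar to the leftover processes described above.

```shell
#!/bin/bash
# Hypothetical demo (an assumption, not taken from the bug report):
# a child that ignores SIGTERM keeps running until it receives SIGKILL.
demo_sigterm_ignored() {
    # Start a child that ignores SIGTERM and loops forever.
    bash -c 'trap "" TERM; while true; do sleep 1; done' &
    local pid=$!
    sleep 1                       # give the child time to install its trap
    kill -TERM "$pid"             # polite request, which the child ignores
    sleep 1
    if kill -0 "$pid" 2>/dev/null; then
        echo "still running after SIGTERM"
    fi
    kill -KILL "$pid"             # cannot be trapped or ignored
    wait "$pid" 2>/dev/null       # reap the child
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "gone after SIGKILL"
    fi
}
```

Calling `demo_sigterm_ignored` shows the same pattern as the shutdown hang on a small scale: the time between the two signals is the wait that `DefaultTimeoutStopSec` is meant to bound.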
The bug has been fixed upstream here:
https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22
Marc was kind enough to package the patch for 20.04 so I could test it
(https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra)
and with that package, I can confirm that it indeed fixes the issue.
Here are a few GitHub issues I stumbled upon while trying to debug this,
along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html
Of course, it would be much better if all the processes would properly
handle `SIGTERM`, but having a way to enforce a maximum wait time at
shutdown is a decent workaround.
Given that the patch is relatively simple, would it be possible to add
it to the package for 20.04?
Thanks
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1958284/+subscriptions