[Bug 1811580] Re: systemd fails to start sshd at reboot
Matt P
1811580 at bugs.launchpad.net
Tue Feb 26 14:39:55 UTC 2019
Same situation here, on an Ubuntu 16.04 OpenVZ VPS image of unknown origin.
I minimized the image, ran security updates and rebooted. The OpenSSH
server failed to start because systemd-tmpfiles failed with:
Failed to validate path /var/run/sshd: Too many levels of symbolic links
The ssh server then fails to start with the error:
Missing privilege separation directory: /var/run/sshd
#
# pre breaking update
#
# uname -a
Linux NJ01 2.6.32-openvz-042stab120.18-amd64 #1 SMP Fri Jan 13 10:33:34 MSK 2017 x86_64 x86_64 x86_64 GNU/Linux
# cat /usr/lib/tmpfiles.d/sshd.conf
d /var/run/sshd 0755 root root
# systemd-tmpfiles --version
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN
# systemd-tmpfiles --create /usr/lib/tmpfiles.d/sshd.conf
# # success
# ls -ld /
drwxr-xr-x 23 root root 4096 Feb 26 09:35 /
# ls -ld /var
drwxr-xr-x 12 root root 4096 Nov 26 2016 /var
# ls -ld /var/run
lrwxrwxrwx 1 root root 4 Nov 26 2016 /var/run -> /run
# ls -ld /var/run/sshd
drwxr-xr-x 2 root root 40 Feb 26 09:35 /var/run/sshd
# apt-cache policy systemd
systemd:
Installed: 229-4ubuntu12
Candidate: 229-4ubuntu12
Version table:
*** 229-4ubuntu12 100
100 /var/lib/dpkg/status
#---BREAKING UPDATE START----
apt-get update
# "minimize" the system
export DEBIAN_FRONTEND=noninteractive
apt-get --assume-yes install aptitude ubuntu-minimal
aptitude --assume-yes markauto '~i!?name(ubuntu-minimal~|linux-generic~|openssh-server~|systemd)'
aptitude --assume-yes purge '~c'
# apply security updates
apt-get --assume-yes install unattended-upgrades
unattended-upgrade
# reboot
shutdown -r now
#---BREAKING UPDATE END----
# post update (pre-reboot).
# apt-cache policy systemd
systemd:
Installed: 229-4ubuntu21.16
Candidate: 229-4ubuntu21.16
Version table:
*** 229-4ubuntu21.16 500
500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
100 /var/lib/dpkg/status
229-4ubuntu4 500
500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages
# ls -ld /
drwxr-xr-x 23 root root 4096 Feb 26 09:03 /
# ls -ld /var
drwxr-xr-x 12 root root 4096 Nov 26 2016 /var
# ls -ld /var/run
lrwxrwxrwx 1 root root 4 Nov 26 2016 /var/run -> /run
# ls -ld /var/run/sshd
drwxr-xr-x 2 root root 40 Feb 26 09:03 /var/run/sshd
# systemd-tmpfiles --version
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN
# systemd-tmpfiles --create /usr/lib/tmpfiles.d/sshd.conf
Failed to validate path /var/run/sshd: Too many levels of symbolic links
Anyway, the root cause seems to be this systemd-tmpfiles error. The /var/run/sshd directory gets purged at reboot and doesn't get recreated.
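For anyone recovering from a console, the missing directory can be recreated by hand before starting ssh again. This is a minimal sketch, not a fix for the tmpfiles failure itself: ensure_privsep_dir is a hypothetical helper name, and the 0755 mode is taken from the sshd.conf tmpfiles entry shown above.

```shell
#!/bin/sh
# Sketch: recreate a missing privilege separation directory by hand.
# ensure_privsep_dir is a hypothetical helper, not part of openssh or systemd.
ensure_privsep_dir() {
    # Default to the path sshd complains about; /var/run is a symlink to /run here.
    dir="${1:-/var/run/sshd}"
    # Nothing to do if the directory already exists.
    [ -d "$dir" ] && return 0
    # Same mode as the tmpfiles.d entry (0755); ownership follows the caller (root).
    mkdir -p "$dir" && chmod 0755 "$dir"
}
```

Run as root, something like `ensure_privsep_dir && systemctl start ssh` should bring sshd back until the next reboot, when the directory disappears again.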
It seems pretty major that applying security updates can lock you out of
your server. If I didn't happen to have a serial console with this
particular VPS provider (some others I use don't provide one), I would
have had no idea what was going on.
I get that this might be due to a weird OpenVZ image or an older kernel,
but these Ubuntu OpenVZ images are very common.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1811580
Title:
systemd fails to start sshd at reboot
Status in systemd package in Ubuntu:
Incomplete
Bug description:
So far reported issues turned out to be:
- obsolete/buggy/vulnerable 3rd party provided kernels
- bad permissions on /
Please ensure / is owned by root:root.
Please ensure you are running up to date kernels.
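The two checks requested above can be scripted. This is a rough sketch assuming GNU stat, as shipped on Ubuntu; path_owner and check_root_owner are hypothetical helper names.

```shell
#!/bin/sh
# Sketch of the ownership check; helper names are hypothetical.

# Print "user:group" for the given path (GNU stat syntax).
path_owner() {
    stat -c '%U:%G' "$1"
}

# Report whether / is owned by root:root; returns non-zero if it is not.
check_root_owner() {
    owner="$(path_owner /)"
    if [ "$owner" != "root:root" ]; then
        echo "warning: / is owned by $owner, expected root:root"
        return 1
    fi
    echo "ok: / is owned by root:root"
}
```

Calling `check_root_owner` and then `uname -r` gives the two facts asked for; whether the reported kernel counts as up to date still has to be judged against what the provider or distribution currently ships.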
===
Ubuntu 16.04.5, systemd 229-4ubuntu21.15
The latest systemd update has somehow changed the method it uses to
start 'ssh.service' i.e. 'sshd'. systemd fails to start sshd if
/etc/ssh/sshd_config contains "UsePrivilegeSeparation yes" and
/var/run/sshd/ does not already exist. Since this is the default,
virtually EVERY Ubuntu 16.04 server in the world has
UsePrivilegeSeparation set to yes. Furthermore, at the time when the
user performs 'apt upgrade' and receives the newest version of
systemd, /var/run/sshd/ already exists, so sshd successfully reloads
for as long as the server doesn't get rebooted. BUT, as soon as the
server is rebooted for any reason, /var/run/sshd/ gets cleaned away,
and sshd fails to start, leaving the remote user completely locked
out of the system. This is a MAJOR issue for millions of VPS servers
worldwide, whose administrators are about to be locked out and could
potentially lose data. The next reboot is a ticking time bomb. The
bomb can be defused by explicitly setting
'UsePrivilegeSeparation no' in /etc/ssh/sshd_config, but
unsuspecting administrators are bound to be caught out by the
millions. I got caught by it in the middle of setting up a new server
yesterday, and it took a whole day to find the source.
The appropriate fix would be to ensure that systemd can successfully
'start ssh.service' even when 'UsePrivilegeSeparation yes' is set.
systemd needs to test that /var/run/sshd/ exists before starting sshd,
just as the init.d script for sshd does. openssh could also be patched
so that UsePrivilegeSeparation is no longer enabled by default, but
that would not solve the problem for millions of pre-existing config
files. Only an update to openssh that force-overrides the flag to 'no'
would solve the problem. Thus systemd still needs to be responsible
for initializing sshd properly by ensuring that /var/run/sshd/ exists
before it sends the 'start' command.
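The behaviour asked for above could be approximated locally with a systemd drop-in. This is a minimal sketch, assuming a drop-in directory for ssh.service and using RuntimeDirectory= so that systemd creates /run/sshd before each start. write_sshd_dropin and the file name privsep-dir.conf are hypothetical, and this has not been tested on the affected OpenVZ image.

```shell
#!/bin/sh
# Sketch: write a drop-in that asks systemd to create /run/sshd for ssh.service.
# write_sshd_dropin and privsep-dir.conf are hypothetical names.
write_sshd_dropin() {
    dropin_dir="${1:-/etc/systemd/system/ssh.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # RuntimeDirectory=sshd makes systemd create /run/sshd before ExecStart runs.
    cat > "$dropin_dir/privsep-dir.conf" <<'EOF'
[Service]
RuntimeDirectory=sshd
RuntimeDirectoryMode=0755
EOF
}
```

After writing the file as root, `systemctl daemon-reload` followed by `systemctl restart ssh` would pick it up. Note that systemd removes a RuntimeDirectory= directory again when the service stops.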