[Bug 1608953] Re: PostgreSQL does not start in lx-brand container
Martin Pitt
martin.pitt at ubuntu.com
Wed Sep 7 09:15:18 UTC 2016
** Description changed:
We have a 16.04 Ubuntu lx-brand container image available in our public
cloud and recently discovered a systemd bug that's related to running in
a container environment.
I've forwarded below what one of our engineers discovered:
----
After installing postgres (apt-get install -y -q postgresql), systemd
does not actually start any of the postgres services. We tracked this
down to a failure of sed within the
/lib/systemd/system-generators/postgresql-generator script. The sed
command tries to close stderr (fd 2), which fails, so sed returns an
error code, which causes the entire postgres generator to fail.
The root cause of the problem lies in the systemd code. Because we are
running inside of a container (see detect_container) we don't execute
the following block of code in the systemd main().
- if (getpid() == 1 && detect_container() <= 0) {
+ if (getpid() == 1 && detect_container() <= 0) {
- /* Running outside of a container as PID 1 */
- arg_running_as = MANAGER_SYSTEM;
- make_null_stdio();
+ /* Running outside of a container as PID 1 */
+ arg_running_as = MANAGER_SYSTEM;
+ make_null_stdio();
The make_null_stdio function is what sets up fds 0-2 as /dev/null in
systemd on bare metal. Having those fds set up is what allows the
postgres system-generator to work properly, since sed expects to be able
to close stderr.
Because we never call make_null_stdio when inside any container, the low
fds wind up getting set up later using /dev/console with O_CLOEXEC
(close-on-exec), so when we actually run the system generator script, we
don't have the low fds set up at all like sed expects.
Interestingly, looking at src/core/main.c on the master branch of
systemd, this bug appears to no longer exist. The relevant code block
has been moved so it is no longer conditional on being in a container,
but the commit was not intended to fix this problem. It was apparently
made for color handling on the console:
commit 3a18b60489504056f9b0b1a139439cbfa60a87e1
It would be great if this fix could be pulled in to an update for Ubuntu
16.04.
- ----
+ SRU INFORMATION
+ ===============
+ Fix: https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/commit/?h=ubuntu-xenial&id=6df46531727baa
+
+ Regression potential: very low; this does not affect lxc and lxd (our
+ officially supported container engines) nor nspawn, as they already set
+ up pid1's stdout/stderr. And it's hard to imagine anything depending on
+ pid1's stdout/stderr *not* being existent file descriptors, as in pretty
+ much all cases they already are.
+
+ Test case: Specific to lx-brand, must be verified by reporter. However,
+ we need to verify that LXC, LXD, and nspawn containers still boot with
+ this version.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1608953
Title:
PostgreSQL does not start in lx-brand container
Status in systemd package in Ubuntu:
Fix Released
Status in systemd source package in Xenial:
Fix Committed
Bug description:
We have a 16.04 Ubuntu lx-brand container image available in our
public cloud and recently discovered a systemd bug that's related to
running in a container environment.
I've forwarded below what one of our engineers discovered:
----
After installing postgres (apt-get install -y -q postgresql), systemd
does not actually start any of the postgres services. We tracked this
down to a failure of sed within the
/lib/systemd/system-generators/postgresql-generator script. The sed
command tries to close stderr (fd 2), which fails, so sed returns an
error code, which causes the entire postgres generator to fail.
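To illustrate the failure mode described above, here is a minimal sketch of what happens at the system-call level when a process closes a descriptor that is not open. It is not sed's actual code; the helper names close_and_report and errno_from_closing_closed_fd are hypothetical, and a scratch descriptor on /dev/null stands in for the missing stderr.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close fd and return 0 on success,
 * or the errno value reported by close() on failure. */
static int close_and_report(int fd)
{
    return close(fd) < 0 ? errno : 0;
}

/* Open a scratch descriptor, close it once (which should succeed), then
 * close it again. The second close() acts on a descriptor that is no
 * longer open, which is the situation a program is in when it tries to
 * close a stderr that was never set up. Returns the errno value of the
 * second close, or -1 if the scratch descriptor could not be prepared. */
static int errno_from_closing_closed_fd(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;
    if (close_and_report(fd) != 0)
        return -1;
    return close_and_report(fd);
}
```

On a POSIX system the second close is expected to fail with EBADF. A tool that treats a failed close of its output streams as an error would then exit non-zero, which matches the behavior reported for sed here.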
The root cause of the problem lies in the systemd code. Because we are
running inside of a container (see detect_container) we don't execute
the following block of code in the systemd main().
if (getpid() == 1 && detect_container() <= 0) {
/* Running outside of a container as PID 1 */
arg_running_as = MANAGER_SYSTEM;
make_null_stdio();
The make_null_stdio function is what sets up fds 0-2 as /dev/null in
systemd on bare metal. Having those fds set up is what allows the
postgres system-generator to work properly, since sed expects to be
able to close stderr.
Because we never call make_null_stdio when inside any container, the
low fds wind up getting set up later using /dev/console with
O_CLOEXEC (close-on-exec), so when we actually run the system generator
script, we don't have the low fds set up at all like sed expects.
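As a rough sketch of the idea behind make_null_stdio, the following shows one way a process could make sure fds 0-2 exist and survive exec. This is not systemd's implementation: the helper names ensure_fd_open and ensure_stdio_open are hypothetical, and unlike the behavior described above, this version leaves already-open descriptors alone rather than always pointing them at /dev/null.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: if `target` is not an open descriptor, point it at
 * /dev/null. Returns 0 on success, -1 on error. */
static int ensure_fd_open(int target)
{
    /* F_GETFD succeeds only for an open descriptor. */
    if (fcntl(target, F_GETFD) >= 0)
        return 0;
    if (errno != EBADF)
        return -1;

    /* No O_CLOEXEC here: the descriptor must be inherited across exec. */
    int fd = open("/dev/null", O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;
    if (fd == target)
        return 0; /* open() already returned the slot we wanted */

    /* dup2() creates `target` with the close-on-exec flag cleared. */
    int r = dup2(fd, target);
    close(fd);
    return r < 0 ? -1 : 0;
}

/* Hypothetical helper: make sure stdin, stdout and stderr all exist. */
static int ensure_stdio_open(void)
{
    for (int fd = 0; fd <= 2; fd++)
        if (ensure_fd_open(fd) < 0)
            return -1;
    return 0;
}
```

Calling ensure_stdio_open() before spawning child processes would give each child valid fds 0-2. The key design point is avoiding the close-on-exec flag on those three descriptors, since a descriptor opened with O_CLOEXEC is closed automatically when the child calls exec.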
Interestingly, looking at src/core/main.c on the master branch of
systemd, this bug appears to no longer exist. The relevant code block
has been moved so it is no longer conditional on being in a container,
but the commit was not intended to fix this problem. It was apparently
made for color handling on the console:
commit 3a18b60489504056f9b0b1a139439cbfa60a87e1
It would be great if this fix could be pulled in to an update for
Ubuntu 16.04.
SRU INFORMATION
===============
Fix: https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/commit/?h=ubuntu-xenial&id=6df46531727baa
Regression potential: very low; this does not affect lxc and lxd (our
officially supported container engines) nor nspawn, as they already
set up pid1's stdout/stderr. And it's hard to imagine anything
depending on pid1's stdout/stderr *not* being existent file descriptors,
as in pretty much all cases they already are.
Test case: Specific to lx-brand, must be verified by reporter.
However, we need to verify that LXC, LXD, and nspawn containers still
boot with this version.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1608953/+subscriptions