[Bug 1636912] Re: systemd-networkd runs too late for cloud-init.service (net)
Ryan Harper
1636912 at bugs.launchpad.net
Fri Oct 28 18:46:07 UTC 2016
On Fri, Oct 28, 2016 at 1:02 PM, Martin Pitt <martin.pitt at ubuntu.com> wrote:
> > However, if there isn't a local seed, then we must search again *once*
> > networking is up.
>
> Fair enough, but you can then of course not use that unit to configure
> the network.
Of course we can. We need to cycle the network though.
> But this "if there isn't a local seed" isn't something you
> can express as a static condition, hence my thought that it might be
> better if c-i calls s-n-wait-online if and only if it's necessary. But
> YMMV.
>
Right, though it's not clear to me that we can express this in unit terms.
It may have to be done internally; that is, it's possible that in
cloud-init "net" mode we need to block ourselves until networking is up.
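
A rough sketch of what "blocking ourselves" could look like, in Python
since that is what cloud-init is written in. This is not cloud-init's
actual code: find_datasource, search_local and search_network are
hypothetical names, and the path to systemd-networkd-wait-online is an
assumption (the Ubuntu location). Only the wait-online binary and its
--timeout option are real.

```python
import subprocess

# Assumed Ubuntu path; other distributions may install it under /usr/lib/systemd.
WAIT_ONLINE = "/lib/systemd/systemd-networkd-wait-online"


def wait_for_networkd(timeout=120):
    """Block until networkd reports the network online; raise on failure or timeout."""
    subprocess.run([WAIT_ONLINE, "--timeout=%d" % timeout], check=True)


def find_datasource(search_local, search_network, wait_online=wait_for_networkd):
    """Hypothetical helper: wait for the network only when no local seed exists."""
    datasource = search_local()
    if datasource is not None:
        return datasource    # local seed found, never block on the network
    wait_online()            # no local seed: block until networking is up
    return search_network()  # search again, once, now that we are online
```

The wait is passed in as a callable so the same logic could call
ifupdown's equivalent instead, and so it can be exercised without
touching the real network.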
>
>
> > This works just fine with 'networking.service'
>
> This did/does not really work "fine" IMHO -- all of our cloud images
> hang for a long time at boot unless you give them a local data source or
> disable cloud-init.
This is by design. cloud-init is *interposing* itself on purpose.
> It also imposes the restriction that you must be
> online during boot, which is fine for a cloud environment, but rather
> unfriendly for other scenarios.
>
No, you need to provide a datasource, or indicate (via boot params) that
you're not interested in cloud-init running.
It's certainly true that if someone just runs qemu-system-x86 -hda cloud.img,
it's going to hang. But folks are explicitly booting a *cloud* image
without a cloud. We handle this fine with uvt-kvm, which provides a
nocloud-net seed when booting.
> > due to the "atomic" nature of ifup where once the oneshot service
> > runs, we can assume that networking is up. However, networkd runs and
> > asynchronously brings up networking; which is fine but we now no longer
> > have a clear checkpoint at which cloud-init can run with networking up
> > but before.
>
> Again -- s-n-wait-online.service is exactly the networkd counterpart of
> networking.service for ifupdown, that gives you the "network is fully
> configured" synchronization point. The issue is not that it doesn't
> exist, but that I think that it's not a good thing to depend on either
> one.
>
It is, but it's a separate unit: "networking" == "networkd" +
"networkd-wait-online".
However, the netplan generator only emits the "systemd-networkd" target
wants, so if we use After=systemd-networkd-wait-online, that's never run
since nothing wants it.
If we add it explicitly, then it runs even when networkd doesn't.
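
To make the After= versus Wants= distinction concrete, here is a toy
model in Python. It is not systemd code and the dependency data is
made up and heavily simplified; it only illustrates the rule that
Wants=/Requires= pull a unit into the boot transaction while After=
merely orders units that are already in it.

```python
# Toy model of systemd activation, not real systemd code.

def units_to_start(root, pulls):
    """Return every unit reachable from root over Wants=/Requires= edges."""
    started, stack = set(), [root]
    while stack:
        unit = stack.pop()
        if unit not in started:
            started.add(unit)
            stack.extend(pulls.get(unit, ()))
    return started


WAIT_ONLINE = "systemd-networkd-wait-online.service"

# Hypothetical, simplified dependency data for illustration only.
pulls = {"multi-user.target": ["systemd-networkd.service", "cloud-init.service"]}
after = {"cloud-init.service": [WAIT_ONLINE]}  # ordering only; never consulted

# After= alone: nothing wants wait-online, so it is never started.
ordering_only = units_to_start("multi-user.target", pulls)

# Explicit Wants= on a system where networkd is not in use at all:
# wait-online is now started even though networkd is not.
pulls_no_networkd = {
    "multi-user.target": ["cloud-init.service"],
    "cloud-init.service": [WAIT_ONLINE],
}
explicit_wants = units_to_start("multi-user.target", pulls_no_networkd)

print(WAIT_ONLINE in ordering_only)                  # False
print(WAIT_ONLINE in explicit_wants)                 # True
print("systemd-networkd.service" in explicit_wants)  # False
```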
>
> > we really want something like
> > After=networking|networkd-wait-online
> > which handles determining if networkd was supposed to run or not
>
> That already exists, it's network-online.target -- whatever "implements"
> it (ifupdown, networkd, NM) will hook itself into this target. Nothing
> more, nothing less, so if cloud-init just wants to wait until it's
> online, then just make it Requires/After=network-online.target instead
> of Before= it. (But again -- this is a very strong dependency which is
> very inconvenient anywhere but cloud environments with essentially one
> virtual ethernet card).
>
It may be that network-online.target is the right place. Scott had some
reason for not using that explicitly before; I expect some details from him.
It's no more inconvenient than cloud-init has ever been for users not
providing a data-source, or booting with cloud-init disabled.
>
> BTW, I'm not sure if it came across -- if you play around with this,
> please drop systemd-networkd.service's After=dbus.service; that will get
> rid of the worst dependency cycles, and it's something which we can do
> in Xenial rather easily (not so easy for devel, that's the part we need
> to discuss with upstream or decide if we care enough about this feature,
> but eventually I figure we want to get rid of it either way).
>
I did play with it, but the networkd in xenial blocks for some non-trivial
amount of time (tens of seconds) if dbus.service is not up.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1636912
>
> Title:
> systemd-networkd runs too late for cloud-init.service (net)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/systemd/+bug/1636912/+subscriptions
>
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1636912
Title:
  systemd-networkd runs too late for cloud-init.service (net)

Status in systemd:
  New
Status in cloud-init package in Ubuntu:
  Triaged
Status in systemd package in Ubuntu:
  Triaged
Status in cloud-init source package in Xenial:
  New
Status in systemd source package in Xenial:
  Triaged

Bug description:
  Ubuntu Core 16 images using cloud-init fail to function when the
  DataSource is over the network (like OpenStack) as networking is not
  yet available when cloud-init.service runs.

  cloud-init service unit deps look like this:

    [Unit]
    Description=Initial cloud-init job (metadata service crawler)
    DefaultDependencies=no
    Wants=cloud-init-local.service
    Wants=local-fs.target
    Wants=sshd-keygen.service
    Wants=sshd.service
    After=cloud-init-local.service
    After=networking.service
    Requires=networking.service
    Before=basic.target
    Before=dbus.socket
    Before=network-online.target
    Before=sshd-keygen.service
    Before=sshd.service
    Before=systemd-user-sessions.service
    Conflicts=shutdown.target

  Here's networkd unit deps:

    [Unit]
    Description=Network Service
    Documentation=man:systemd-networkd.service(8)
    ConditionCapability=CAP_NET_ADMIN
    DefaultDependencies=no
    # dbus.service can be dropped once on kdbus, and systemd-udevd.service can be
    # dropped once tuntap is moved to netlink
    After=systemd-udevd.service dbus.service network-pre.target systemd-sysusers.service systemd-sysctl.service
    Before=network.target multi-user.target shutdown.target
    Conflicts=shutdown.target
    Wants=network.target

    # On kdbus systems we pull in the busname explicitly, because it
    # carries policy that allows the daemon to acquire its name.
    Wants=org.freedesktop.network1.busname
    After=org.freedesktop.network1.busname

  And a critical-chain output:

    root@snap-test7:~# systemd-analyze critical-chain systemd-networkd
    Failed to get ID: Unit name systemd-networkd is not valid.
    The time after the unit is active or started is printed after the "@" character.
    The time the unit takes to start is printed after the "+" character.

    root@snap-test7:~# systemd-analyze critical-chain systemd-networkd.service
    The time after the unit is active or started is printed after the "@" character.
    The time the unit takes to start is printed after the "+" character.

    systemd-networkd.service +440ms
    └─dbus.service @11.461s
      └─basic.target @11.403s
        └─sockets.target @11.401s
          └─dbus.socket @11.398s
            └─cloud-init.service @10.127s +1.266s
              └─networking.service @9.305s +799ms
                └─network-pre.target @9.295s
                  └─cloud-init-local.service @3.822s +5.469s
                    └─local-fs.target @3.813s
                      └─run-cgmanager-fs.mount @12.687s
                        └─local-fs-pre.target @1.393s
                          └─systemd-tmpfiles-setup-dev.service @1.116s +195ms
                            └─kmod-static-nodes.service @887ms +193ms
                              └─system.slice @783ms
                                └─-.slice @721ms

  cloud-init would need networkd to run at or before 'networking.service'
  so it can raise networking to then find and use network-based datasources.

    # grep systemd /usr/share/snappy/dpkg.list
    ii  libnss-resolve:amd64  229-4ubuntu11  amd64  nss module to resolve names via systemd-resolved
    ii  libpam-systemd:amd64  229-4ubuntu11  amd64  system and service manager - PAM module
    ii  libsystemd0:amd64     229-4ubuntu11  amd64  systemd utility library
    ii  systemd               229-4ubuntu11  amd64  system and service manager
    ii  systemd-sysv          229-4ubuntu11  amd64  system and service manager - SysV links

    # grep cloud-init /usr/share/snappy/dpkg.list
    ii  cloud-init  0.7.8-201610260005-gf7a5756-0ubuntu1~trunk~ubuntu16.04.1  all  Init scripts for cloud instances