[Bug 1636912] Re: systemd-networkd runs too late for cloud-init.service (net)

Ryan Harper 1636912 at bugs.launchpad.net
Tue Dec 13 13:25:28 UTC 2016


On Tue, Dec 13, 2016 at 10:02 AM, Martin Pitt <martin.pitt at ubuntu.com>
wrote:

> Ryan Harper [2016-12-06 12:54 -0000]:
> > The following change should go against systemd-networkd-wait-
> > online.service
> >
> > + # Ensure that DNS is working before reaching online target
> > + After=systemd-networkd-resolvconf-update.service
>
> For the record, this should be the other way around -- add
> Before=systemd-networkd-wait-online.service to
> s-n-resolvconf-update.service. The latter is a Debian downstream unit
> and thus avoids carrying a patch to an upstream unit that refers to a
> downstream one.
>

Well, ideally we'd have both.  Part of the challenge in dealing with
systemd units is that it's very difficult to determine the ordering
if one doesn't look at the right file.

I won't push for a delta but I do think that these unit relationships ought
to be explicit on both sides.
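
To illustrate the "right file" problem: an ordering declared in only one unit
can still be found by searching every unit directory for After=/Before= lines
that name the unit in question. What follows is a minimal sketch, not an
existing tool. The function name ordering_refs is hypothetical, and the
directories to search are whatever the caller passes in.

```shell
# ordering_refs UNIT DIR...
# Hypothetical helper (not part of systemd): print every After=/Before= line
# found under the given directories that names UNIT, prefixed by its file name.
ordering_refs() {
    unit=$1
    shift
    # Escape dots so "foo.service" is matched literally, not as a regex.
    pattern=$(printf '%s' "$unit" | sed 's/\./\\./g')
    # UNIT must appear as a whole word in the directive's value list.
    grep -rHE "^[[:space:]]*(After|Before)=(.*[[:space:]])?${pattern}([[:space:]]|\$)" "$@" 2>/dev/null
}
```

A possible invocation, assuming the usual unit directories, would be
"ordering_refs systemd-networkd-wait-online.service /lib/systemd/system /etc/systemd/system".
On a running system, "systemctl show -p After -p Before <unit>" should report
the merged ordering for loaded units, which covers the same ground without
searching files.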


>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1636912
>
> Title:
>   systemd-networkd runs too late for cloud-init.service (net)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/systemd/+bug/1636912/+subscriptions
>

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to resolvconf in Ubuntu.
https://bugs.launchpad.net/bugs/1636912

Title:
  systemd-networkd runs too late for cloud-init.service (net)

Status in systemd:
  Fix Released
Status in cloud-init package in Ubuntu:
  Triaged
Status in resolvconf package in Ubuntu:
  Fix Released
Status in systemd package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Confirmed
Status in resolvconf source package in Xenial:
  In Progress
Status in systemd source package in Xenial:
  Fix Committed
Status in cloud-init source package in Yakkety:
  New
Status in resolvconf source package in Yakkety:
  In Progress
Status in systemd source package in Yakkety:
  Fix Committed
Status in resolvconf package in Debian:
  New

Bug description:
  Ubuntu Core 16 images using cloud-init fail to function when the
  DataSource is over the network (like OpenStack) as networking is not
  yet available when cloud-init.service runs.

  cloud-init service unit deps look like this:

  [Unit]
  Description=Initial cloud-init job (metadata service crawler)
  DefaultDependencies=no
  Wants=cloud-init-local.service
  Wants=local-fs.target
  Wants=sshd-keygen.service
  Wants=sshd.service
  After=cloud-init-local.service
  After=networking.service
  Requires=networking.service
  Before=basic.target
  Before=dbus.socket
  Before=network-online.target
  Before=sshd-keygen.service
  Before=sshd.service
  Before=systemd-user-sessions.service
  Conflicts=shutdown.target

  Here's networkd unit deps:

  [Unit]
  Description=Network Service
  Documentation=man:systemd-networkd.service(8)
  ConditionCapability=CAP_NET_ADMIN
  DefaultDependencies=no
  # dbus.service can be dropped once on kdbus, and systemd-udevd.service can be
  # dropped once tuntap is moved to netlink
  After=systemd-udevd.service dbus.service network-pre.target systemd-sysusers.service systemd-sysctl.service
  Before=network.target multi-user.target shutdown.target
  Conflicts=shutdown.target
  Wants=network.target

  # On kdbus systems we pull in the busname explicitly, because it
  # carries policy that allows the daemon to acquire its name.
  Wants=org.freedesktop.network1.busname
  After=org.freedesktop.network1.busname

  And a critical-chain output:

  root@snap-test7:~# systemd-analyze critical-chain systemd-networkd
  Failed to get ID: Unit name systemd-networkd is not valid.
  The time after the unit is active or started is printed after the "@" character.
  The time the unit takes to start is printed after the "+" character.

  root@snap-test7:~# systemd-analyze critical-chain systemd-networkd.service
  The time after the unit is active or started is printed after the "@" character.
  The time the unit takes to start is printed after the "+" character.

  systemd-networkd.service +440ms
  └─dbus.service @11.461s
    └─basic.target @11.403s
      └─sockets.target @11.401s
        └─dbus.socket @11.398s
          └─cloud-init.service @10.127s +1.266s
            └─networking.service @9.305s +799ms
              └─network-pre.target @9.295s
                └─cloud-init-local.service @3.822s +5.469s
                  └─local-fs.target @3.813s
                    └─run-cgmanager-fs.mount @12.687s
                      └─local-fs-pre.target @1.393s
                        └─systemd-tmpfiles-setup-dev.service @1.116s +195ms
                          └─kmod-static-nodes.service @887ms +193ms
                            └─system.slice @783ms
                              └─-.slice @721ms

  cloud-init would need networkd to run at or before
  'networking.service' so it can raise networking to then find and use
  network-based datasources.

  # grep systemd /usr/share/snappy/dpkg.list
  ii  libnss-resolve:amd64          229-4ubuntu11                                            amd64        nss module to resolve names via systemd-resolved
  ii  libpam-systemd:amd64          229-4ubuntu11                                            amd64        system and service manager - PAM module
  ii  libsystemd0:amd64             229-4ubuntu11                                            amd64        systemd utility library
  ii  systemd                       229-4ubuntu11                                            amd64        system and service manager
  ii  systemd-sysv                  229-4ubuntu11                                            amd64        system and service manager - SysV links

  # grep cloud-init /usr/share/snappy/dpkg.list
  ii  cloud-init                    0.7.8-201610260005-gf7a5756-0ubuntu1~trunk~ubuntu16.04.1 all          Init scripts for cloud instances

  SRU INFORMATION FOR systemd
  ===========================
  Fix: For xenial it is sufficient to drop systemd-networkd's After=dbus.service (https://github.com/systemd/systemd/commit/5f004d1e32) and (for xenial only) drop the useless org.freedesktop.network1.busname unit (which is always "condition failed" as there is no kdbus, but it moves systemd-networkd.service after sockets.target, which is too late for cloud-init).

  Regression potential: Low. networkd is not widely being used outside of netplan/snappy in xenial. Running it before dbus.service is running has two consequences:
   - It cannot immediately expose its D-Bus status interface. But it will retry every 5 s until that succeeds, so the D-Bus status interface will continue to work. (see test case)
   - If a DHCP response with a hostname or timezone is received before dbus.service is running, it cannot talk to systemd-hostnamed/systemd-timedated to set these properties (if enabled). However, this is broken in xenial anyway as it fails on polkit permissions (both this and retrying the configuration once D-Bus is up have been fixed in upstream master now).

  As for removing the "*.busname" units in xenial: kdbus has never been
  part of any distribution; there was only an experimental DKMS
  package for it in some PPA. It's dead as an upstream project, so
  dropping the *.busname unit(s) from xenial should have no
  practical effect, as these should always fail to start with "condition
  failed". Yakkety's systemd already has them removed.

  Test case:
   - Install nplan, set up a netplan configuration and remove /etc/network/interfaces.
   - Upgrade to the proposed packages.
   - Ensure that the network is still functional and "busctl" shows org.freedesktop.network1, i.e. networkd successfully connected to the bus.
   - Check in the journal that systemd-networkd.service starts before dbus.service, which should usually be the case with this fix. Compare "Started Network Service." vs. "Started D-Bus System Message Bus." in "journalctl -b".

    If it repeatedly starts the other way around, you can force it with "sudo systemctl edit systemd-networkd.service" and
     [Unit]
     Before=sysinit.target

    (This is effectively what cloud-init.service will do soon.)
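
  The journal comparison in the last test step can be scripted. The sketch
  below is only an illustration. The function name first_started is
  hypothetical, and it assumes the two messages appear in the journal exactly
  as quoted in the test case.

```shell
# first_started: read "journalctl -b" output on stdin and report which of the
# two "Started ..." messages from the test case shows up first.
# Prints "networkd", "dbus", or "neither" (hypothetical helper).
first_started() {
    awk '
        /Started Network Service\./          { print "networkd"; found = 1; exit }
        /Started D-Bus System Message Bus\./ { print "dbus"; found = 1; exit }
        END { if (!found) print "neither" }
    '
}
```

  It would be used as "journalctl -b | first_started". With the fix applied,
  the expected answer is "networkd".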

To manage notifications about this bug go to:
https://bugs.launchpad.net/systemd/+bug/1636912/+subscriptions


