[Bug 1997124] Re: Netplan/Systemd/Cloud-init/Dbus Race
Lukas Märdian
1997124 at bugs.launchpad.net
Wed Jan 25 15:45:38 UTC 2023
I think the "Failed to connect system bus: No such file or directory"
stderr output comes from networkctl [1] rather than from "netplan-dbus"
(Netplan's output would be "... connect TO system bus..."). netplan-dbus
is not involved at all AFAICS, as cloud-init is calling the
"netplan apply" CLI and not the "io.netplan.Netplan Apply()"
DBus method, which would also fail due to the missing DBus communication.
So the root-cause IMO is networkctl trying to talk to systemd-networkd
via DBus, which is not yet ready. Porting this communication to using
varlink instead of dbus could solve this (but is probably a big task).
Are we sure that systemd-networkd.service is already up and running at
this stage and that dbus.service/.socket is the bottleneck? We're ordered
`After=systemd-networkd-wait-online.service`, so I assume: Yes.
Netplan's "apply" CLI could probably implement a "systemctl is-active ..."
check for dbus.service/.socket and/or
systemd-networkd.service/NetworkManager.service (depending on which
backend is about to be (re-)configured). But generally "netplan apply" is
designed to be a userspace tool, and only Netplan's generator is designed
to be executed during early boot. So if it's possible to postpone the
execution of "netplan apply" until after systemd's initial boot
transaction has finished (i.e. into cloud-config.service), this would IMO
be the cleaner solution and could avoid similar future issues related to
early boot.
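
To illustrate the "systemctl is-active ..." idea, here is a rough Python
sketch of what such a pre-flight check might look like. This is not
Netplan's actual code: the BACKEND_UNITS mapping, the function names and
the choice of dbus.socket as the unit to probe are all assumptions made
for illustration. The only real interface used is
"systemctl is-active --quiet UNIT", which exits with 0 when the unit is
active.

```python
import subprocess

# Hypothetical mapping of backend -> units that should be active before
# "netplan apply" talks to that backend. The unit names are taken from the
# comment above; the structure itself is an assumption.
BACKEND_UNITS = {
    "networkd": ["dbus.socket", "systemd-networkd.service"],
    "NetworkManager": ["dbus.socket", "NetworkManager.service"],
}


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active --quiet UNIT` exits with 0."""
    result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def inactive_units(backend, run=subprocess.run):
    """List the units needed by `backend` that are not active yet."""
    return [u for u in BACKEND_UNITS[backend] if not unit_is_active(u, run)]
```

A caller could bail out early (or wait and retry) whenever
inactive_units("networkd") returns a non-empty list. The "run" parameter
exists only so the check can be exercised without a running systemd.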
[1] https://github.com/systemd/systemd/blob/main/src/network/networkctl.c#L2992
** Changed in: netplan
Status: New => Triaged
** Changed in: netplan
Importance: Undecided => Wishlist
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1997124
Title:
Netplan/Systemd/Cloud-init/Dbus Race
Status in cloud-init:
In Progress
Status in netplan:
Triaged
Status in systemd package in Ubuntu:
Confirmed
Bug description:
Cloud-init is seeing intermittent failures while running `netplan
apply`, which appears to be caused by a missing resource at the time
of call.
The symptom in cloud-init logs looks like:
Running ['netplan', 'apply'] resulted in stderr output: Failed to
connect system bus: No such file or directory
I think that this error[1] is likely caused by cloud-init running
netplan apply too early in the boot process (before dbus is active).
Today I stumbled upon this error which was hit in MAAS[2]. We have
also hit it intermittently during tests (we didn't have a reproducer).
Realizing that this may not be a cloud-init error, but possibly a
dependency bug between dbus/systemd we decided to file this bug for
broader visibility to other projects.
I will follow up this initial report with some comments from our
discussion earlier.
[1] https://github.com/canonical/netplan/blob/main/src/dbus.c#L801
[2] https://discourse.maas.io/t/latest-ubuntu-20-04-image-causing-netplan-error/5970
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1997124/+subscriptions