[Bug 2003250] Re: networkctl reload with bond devices causes slaves to go DOWN and UP, causing couple of seconds of network loss
Launchpad Bug Tracker
2003250 at bugs.launchpad.net
Tue Apr 1 06:22:01 UTC 2025
This bug was fixed in the package systemd - 249.11-0ubuntu3.15
---------------
systemd (249.11-0ubuntu3.15) jammy; urgency=medium
* d/systemd.prerm: call d-s-h update-state for resolved on upgrades
(LP: #2078555)
systemd (249.11-0ubuntu3.14) jammy; urgency=medium
[ Ioanna Alifieraki ]
* network: skip to reassign master ifindex if already set
(LP: #2003250)
[ Nick Rosbrook ]
* network: do not bring down a bonding port interface when it is already joined
(This is a follow-up commit required for LP: 2003250)
* networkd-test: skip test_resolved_domain_restricted_dns
(LP: #2009859)
systemd (249.11-0ubuntu3.13) jammy; urgency=medium
[ Lukas Märdian ]
* Fixing GRE6 and VTI6 on newer kernels (LP: #2037667)
[ Nick Rosbrook ]
* debian/tests/tests-in-lxd: update workaround patch (LP: #2055200)
[ Chengen Du ]
* udev: Handle PTP device symlink properly on udev action 'change'
(LP: #2077779)
-- Nick Rosbrook <enr0n at ubuntu.com> Thu, 20 Feb 2025 08:24:02 -0500
** Changed in: systemd (Ubuntu Jammy)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2003250
Title:
networkctl reload with bond devices causes slaves to go DOWN and UP,
causing couple of seconds of network loss
Status in systemd package in Ubuntu:
Fix Released
Status in systemd source package in Jammy:
Fix Released
Status in systemd source package in Kinetic:
Won't Fix
Bug description:
[SRU TEMPLATE]
[DESCRIPTION]
We currently use Ubuntu 22.04.1 LTS including updates for our production cloud (switched from legacy CentOS 7).
Although we like the distribution, we recently hit a serious systemd bug, described in bug report [1], while using packages [2].
Unfortunately, the clouds we run consist of OpenStack on top
of Kubernetes, and we need a complex network configuration
that includes Linux bond devices.
Our observation is that every time we apply our configuration via
CI/CD infrastructure using Ansible and netplan (regardless of whether
there is an actual network configuration change), we see network
interruptions of approximately 8-16 seconds and see bond interfaces
going DOWN and then UP.
We expect bond interfaces to stay UP when there is no network
configuration change.
We went through a couple of options for solving the issue, and the
first one is to add the existing patch [3] to the current
systemd-249.11-0ubuntu3.6.
Could you comment on whether this kind of non-security patch is likely to land in 22.04.1 LTS soon?
We are able to help bring the patch into the systemd package the community way if you suggest the steps.
[TESTING]
On a Jammy system, create a bond interface with two subordinate
devices. Assuming the interfaces ens3 and ens9 exist on the system,
this can be done using the following:
$ cat > /etc/netplan/bond.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: no
    ens9:
      dhcp4: no
  bonds:
    bond0:
      dhcp4: yes
      interfaces:
        - ens3
        - ens9
      parameters:
        mode: active-backup
        primary: ens3
EOF
$ netplan generate && netplan apply
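Before running the tests, it may help to confirm that both interfaces actually joined the bond. The following is a minimal sketch, assuming the kernel bonding driver exposes its status at /proc/net/bonding/bond0 with one "Slave Interface:" line per port; list_bond_ports is a hypothetical helper name, not part of systemd or netplan.

```shell
# list_bond_ports (hypothetical helper): print the names of the port
# interfaces found in bonding-driver status text read from stdin.
# Assumes each port is reported on a line starting "Slave Interface: ".
list_bond_ports() {
    sed -n 's/^Slave Interface: //p'
}

# Usage on the test system (assumes bond0 from the netplan file above):
#   list_bond_ports < /proc/net/bonding/bond0
```

With the configuration above, the expectation is that ens3 and ens9 are both listed.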
From here, there are two tests that can be used to verify the fix.
1. Update the modification time of the generated network files, and
call networkctl reload. From networkctl(1), when "reload" is called:
[...] If a new, modified or removed .network file is found, then all
interfaces which match the file are reconfigured.
Hence, the following will trigger the desired code path:
$ touch /run/systemd/network/*
$ networkctl reload
Without the fix, you can see in the logs the interfaces of the bond
going up and down. With the fix, this should not happen.
$ journalctl -b -u systemd-networkd.service --grep="Link DOWN"
Finally, check that everything is back in the configured state:
$ networkctl status
2. This bug can also be triggered by calling networkctl reconfigure
directly.
$ networkctl reconfigure ens3
$ networkctl reconfigure ens9
Check in the logs that the links were not brought down:
$ journalctl -b -u systemd-networkd.service --grep="Link DOWN"
Finally, check that everything is back in the configured state:
$ networkctl status
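The journalctl checks above use -b, so they also match any "Link DOWN" messages logged earlier in the same boot, for example during the initial netplan apply. The following is a minimal sketch of one way to count only the events logged after a given test step, assuming a timestamp captured just before the step is passed to journalctl's --since option; count_link_down is a hypothetical helper name.

```shell
# count_link_down (hypothetical helper): count the lines read from
# stdin that contain "Link DOWN". grep -c prints 0 and exits non-zero
# when nothing matches, so "|| true" keeps the exit status clean.
count_link_down() {
    grep -c "Link DOWN" || true
}

# Usage on the test system (sketch; the sleep duration is an arbitrary
# assumption to let networkd finish processing the reload):
#   since=$(date '+%Y-%m-%d %H:%M:%S')
#   touch /run/systemd/network/* && networkctl reload
#   sleep 5
#   journalctl -u systemd-networkd.service --since "$since" --no-pager \
#       | count_link_down
```

With the fix applied, the count for either test is expected to be 0.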
[REGRESSION POTENTIAL]
This patch is confined to the SET_LINK_MASTER logic for configuring
links in systemd-networkd. While bond interfaces are the motivation
for the fix, this early return applies to all interface types for
which SET_LINK_MASTER is supported, e.g. bridge interfaces as well.
This logic has seen exercise in newer releases of systemd and Ubuntu
without further modification, so I would not expect to see regressions
for other interface types. Furthermore, the bond type is the only type
where the link is set to down in order to configure the master
interface index, so this call was already effectively a no-op for
those other interface types.
If any problems did occur, it would be related to (re-)configuring
link types which have a master interface set.
[OTHER]
This fix requires two upstream patches:
https://github.com/systemd/systemd/commit/9f913d37a0
https://github.com/systemd/systemd/commit/c3e12de0a6
The second is a follow-up to the first, to complete the fix.
These patches do not apply cleanly to v249, so some trivial conflicts
were resolved to make them apply. Additionally, some logic is added
to the patches so that the link state is correctly set when this new
branch is hit.
Specifically, we decrement the set_link_messages counter and call
link_check_ready() before returning -EALREADY. This is necessary
because this area of systemd-networkd saw a lot of refactoring between
v249 and the version of systemd these patches originate from. In
newer versions of systemd, the message counter is handled correctly
and link_check_ready() is eventually called despite cancelling the
SET_LINK_MASTER request, but this never happens when these patches are
applied to v249 as-is. Hence, we add the necessary steps to the patch.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2003250/+subscriptions