[SRU][Q/N/J][PATCH 0/3] Kernel regression (6.8.0-117.generic) (LP: #2153556)

Edoardo Canepa edoardo.canepa at canonical.com
Mon May 25 11:10:06 UTC 2026


BugLink: https://bugs.launchpad.net/bugs/2153556

The CVE-2026-31419 patch in kernel 6.8.0-117.generic, which was meant to
fix a vulnerability in bonding mode 3 (broadcast), introduced a regression:
mode 3 bonding broke completely on a production cluster. Services were
recovered by booting the .111 fallback kernel. The hardware is ConnectX-6
at 100G. All interfaces, links, and bonding status appear to be up, but no
packets are passed.

The regression was introduced by the CVE-2026-31419 backport
(noble: e661eaea3c07, jammy: 9c4ce0d95706) which replaced the direct
slave-list iteration in bond_xmit_broadcast() with a loop over the
bond->all_slaves array. However, bond->all_slaves is only populated by
bond_update_slave_arr(), which is gated behind bond_mode_can_use_xmit_hash().
That predicate returns false for BOND_MODE_BROADCAST, so the array is
never built, bond_xmit_broadcast() finds slaves_count=0 on every call,
and every frame is dropped.
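
Since only BOND_MODE_BROADCAST (mode 3) takes this path, it can help to
find out which bonds on a host use that mode. The following is a minimal
sketch, not part of the patch series: the function name and its optional
sysfs-root argument are hypothetical, and it assumes the bonding driver's
sysfs "mode" attribute reports "<name> <number>" (e.g. "broadcast 3").

```shell
#!/bin/sh
# Hypothetical helper: print the names of bonds running in broadcast
# mode (mode 3). The sysfs root is a parameter (default /sys) only so
# the function can be pointed at a fake tree for testing.
list_broadcast_bonds() {
    sysfs_root="${1:-/sys}"
    for mode_file in "$sysfs_root"/class/net/*/bonding/mode; do
        # Skip the unexpanded glob when no bonds exist.
        [ -r "$mode_file" ] || continue
        # Assumed attribute format: "<mode name> <mode number>".
        mode_num=$(awk '{ print $2 }' "$mode_file")
        if [ "$mode_num" = "3" ]; then
            bond_dir="${mode_file%/bonding/mode}"
            echo "${bond_dir##*/}"
        fi
    done
}
```

Calling list_broadcast_bonds with no argument on a running system prints
one bond name per line. An empty result suggests the host has no
broadcast-mode bonds.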

Additionally, on Questing the broadcast TX fix is already present, but a
follow-up issue remains: when updelay is configured,
bond_update_slave_arr() populates usable_slaves with zero entries
(because the slave state is BOND_LINK_BACK during enslave). As a result,
bond_miimon_inspect() always sets ignore_updelay to true, so the updelay
parameter is silently ignored.

[Fix]

For Questing, cherry-pick the following commit from mainline:
- 45fc134bcfad bonding: do not set usable_slaves for broadcast mode

For Noble, cherry-pick the following commits from mainline:
- e0caeb24f538 net: bonding: update the slave array for broadcast mode
- 45fc134bcfad bonding: do not set usable_slaves for broadcast mode

For Jammy, cherry-pick the following commits from mainline:
- e0caeb24f538 net: bonding: update the slave array for broadcast mode
- 45fc134bcfad bonding: do not set usable_slaves for broadcast mode

Resolute is not affected.

[Test Plan]

1. Build and install the kernel with the patches applied.

2. Verify broadcast-mode bonding transmits frames:

   modprobe dummy
   ip link add bond-test type bond mode broadcast miimon 100
   ip link add dummy0 type dummy
   ip link add dummy1 type dummy
   ip link set dummy0 down
   ip link set dummy1 down
   ip link set dummy0 master bond-test
   ip link set dummy1 master bond-test
   ip link set dummy0 up
   ip link set dummy1 up
   ip link set bond-test up
   ip addr add 10.99.99.1/24 dev bond-test
   ping -c3 10.99.99.2
   ip -s link show bond-test

   Expected: TX bytes and packets counters increment (no TX drops).
   Before fix: TX dropped increments, zero bytes/packets transmitted.
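
   The pass/fail criterion above can also be checked from the per-device
   counters rather than by reading the ip output by eye. This is a
   minimal sketch under stated assumptions: both function names are
   hypothetical, and it relies on the standard
   /sys/class/net/<dev>/statistics/ counters tx_packets and tx_dropped.

```shell
#!/bin/sh
# Hypothetical helper: read one statistics counter for an interface.
#   $1 = interface name, $2 = counter name (tx_packets, tx_dropped)
#   $3 = sysfs root (default /sys; a parameter only to allow testing)
read_tx_counter() {
    cat "${3:-/sys}/class/net/$1/statistics/$2"
}

# Hypothetical helper: apply the test-plan criterion to four samples.
#   $1/$2 = tx_packets before/after, $3/$4 = tx_dropped before/after
# Prints PASS when packets increased and drops did not change.
check_broadcast_tx() {
    pkts_before=$1
    pkts_after=$2
    drop_before=$3
    drop_after=$4
    if [ "$pkts_after" -gt "$pkts_before" ] &&
       [ "$drop_after" -eq "$drop_before" ]; then
        echo PASS
    else
        echo FAIL
        return 1
    fi
}

# Usage on the test host, as root, around the ping in step 2:
#   p0=$(read_tx_counter bond-test tx_packets)
#   d0=$(read_tx_counter bond-test tx_dropped)
#   ping -c3 10.99.99.2
#   p1=$(read_tx_counter bond-test tx_packets)
#   d1=$(read_tx_counter bond-test tx_dropped)
#   check_broadcast_tx "$p0" "$p1" "$d0" "$d1"
```

   Sampling before and after the ping keeps the check independent of
   any traffic sent while the bond was being set up.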

3. Confirm CVE-2026-31419 protection is not regressed:

   The upstream commit 45fc134bcfad fixes a race condition where
   bond_update_slave_arr() for broadcast mode populates usable_slaves
   with zero entries (slave state is BOND_LINK_BACK during enslave),
   causing bond_miimon_inspect() to set ignore_updelay=true. This
   requires specific timing conditions during enslave and is not
   trivially reproducible in a VM. The fix is verified by code
   inspection: bond_set_slave_arr() now skips usable_slaves for
   broadcast mode, preventing the stale empty array from affecting
   ignore_updelay logic.

[Where problems could occur]

The patches modify bond_enslave(), __bond_release_one(), and
bond_set_slave_arr() — core bonding paths shared by all modes. A
regression could theoretically affect other bonding modes (802.3ad, XOR,
TLB, ALB) if the slave array update logic is inadvertently altered.
However, the risk is low: patch 1 only adds an additional condition to
existing guards without changing the logic for other modes, and patch 2
only short-circuits usable_slaves assignment specifically for broadcast
mode. Both patches are well-tested upstream and already present in HWE
kernels without reported issues.


