[Bug 482419] Re: 802.3ad interface bonding fails if started too early

Karsten Becker 482419 at bugs.launchpad.net
Mon Jun 20 15:08:05 UTC 2011


After applying the update on my servers, I ran into massive network
stability problems. I was running the bond in mode 6 (balance-alb),
which no longer seems to work.

auto bond0
iface bond0 inet static
    hwaddress ether 00:30:88:88:88:88

    address 192.168.111.5
    netmask 255.255.254.0
    network 192.168.111.0
    broadcast 192.168.112.255
    gateway 192.168.111.1
    dns-nameservers 127.0.0.1
    dns-search bar.local

    # Both network interfaces
    slaves eth0 eth1

    # (balance-alb) Adaptive load balancing
    bond_mode 6

    bond_miimon 100
    bond_updelay 200
    bond_downdelay 200
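
As a side note, the network and broadcast values in the stanza above can be cross-checked against the address and netmask. This is a minimal sketch using Python's standard ipaddress module, with the address, netmask and gateway copied from the stanza. For 192.168.111.5 with netmask 255.255.254.0 it computes network 192.168.110.0 and broadcast 192.168.111.255, which differ from the configured values. Whether that mismatch is related to the bonding errors is not established by this report.

```python
import ipaddress

# Address and netmask copied from the bond0 stanza above.
iface = ipaddress.ip_interface("192.168.111.5/255.255.254.0")
net = iface.network

computed_network = str(net.network_address)
computed_broadcast = str(net.broadcast_address)
# Gateway copied from the same stanza.
gateway_in_subnet = ipaddress.ip_address("192.168.111.1") in net

print("network  ", computed_network)
print("broadcast", computed_broadcast)
print("gateway in subnet:", gateway_in_subnet)
```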

My kern.log was flooded with the message "Jun 20 12:33:57 foo kernel:
[ 1043.270668] bonding: bond0: Error: found a client with no channel in
the client's hash table", interspersed from time to time with link
up/down messages:

[...]
Jun 20 12:33:31 foo kernel: [ 1009.900122] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1009.900123] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1009.900125] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1010.774755] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Jun 20 12:33:31 foo kernel: [ 1010.862510] bonding: bond0: link status up for interface eth0, enabling it in 0 ms.
Jun 20 12:33:31 foo kernel: [ 1010.862514] bonding: bond0: link status definitely up for interface eth0.
Jun 20 12:33:31 foo kernel: [ 1010.862517] bonding: bond0: making interface eth0 the new active one.
Jun 20 12:33:31 foo kernel: [ 1010.863951] bonding: bond0: first active interface up!
Jun 20 12:33:31 foo kernel: [ 1010.900005] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1010.900007] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1010.900008] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1010.900010] bonding: bond0: Error: found a client with no channel in the client's hash table
[...]
Jun 20 12:34:53 foo kernel: [ 1050.282624] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:34:53 foo kernel: [ 1050.673490] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Jun 20 12:34:53 foo kernel: [ 1050.733759] bonding: bond0: link status up for interface eth0, enabling it in 0 ms.
Jun 20 12:34:53 foo kernel: [ 1050.733762] bonding: bond0: link status definitely up for interface eth0.
Jun 20 12:34:53 foo kernel: [ 1050.733765] bonding: bond0: making interface eth0 the new active one.
Jun 20 12:34:53 foo kernel: [ 1050.735167] bonding: bond0: first active interface up!
Jun 20 12:34:53 foo kernel: [ 1063.804210] e1000e: eth0 NIC Link is Down
Jun 20 12:34:53 foo kernel: [ 1063.833760] bonding: bond0: link status down for interface eth0, disabling it in 200 ms.
Jun 20 12:34:53 foo kernel: [ 1064.033767] bonding: bond0: link status definitely down for interface eth0, disabling it
Jun 20 12:34:53 foo kernel: [ 1064.034501] bonding: bond0: now running without any active interface !
Jun 20 12:34:53 foo kernel: [ 1064.362507] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:34:53 foo kernel: [ 1064.362510] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:34:53 foo kernel: [ 1064.362512] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:34:53 foo kernel: [ 1064.362514] bonding: bond0: Error: found a client with no channel in the client's hash table
[...]
Jun 20 12:33:31 foo kernel: [ 1010.900118] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1010.900120] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:31 foo kernel: [ 1017.245555] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Jun 20 12:33:31 foo kernel: [ 1017.272973] bonding: bond0: link status up for interface eth1, enabling it in 200 ms.
Jun 20 12:33:31 foo kernel: [ 1017.460008] bonding: bond0: link status definitely up for interface eth1.
Jun 20 12:33:56 foo kernel: [ 1042.481383] e1000e: eth1 NIC Link is Down
Jun 20 12:33:56 foo kernel: [ 1042.540011] bonding: bond0: link status down for interface eth1, disabling it in 200 ms.
Jun 20 12:33:56 foo kernel: [ 1042.740015] bonding: bond0: link status definitely down for interface eth1, disabling it
Jun 20 12:33:56 foo kernel: [ 1042.740021] device eth0 entered promiscuous mode
Jun 20 12:33:56 foo kernel: [ 1042.740435] e1000e: eth0 NIC Link is Down
Jun 20 12:33:56 foo kernel: [ 1042.840015] bonding: bond0: link status down for interface eth0, disabling it in 200 ms.
Jun 20 12:33:57 foo kernel: [ 1043.040009] bonding: bond0: link status definitely down for interface eth0, disabling it
Jun 20 12:33:57 foo kernel: [ 1043.040687] device eth0 left promiscuous mode
Jun 20 12:33:57 foo kernel: [ 1043.040913] bonding: bond0: now running without any active interface !
Jun 20 12:33:57 foo kernel: [ 1043.230020] bonding: bond0: Error: found a client with no channel in the client's hash table
Jun 20 12:33:57 foo kernel: [ 1043.270668] bonding: bond0: Error: found a client with no channel in the client's hash table
[...]
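
To get an overview of an excerpt like the one above, the link transitions and hash-table errors can be tallied with a short script. This is only a sketch: the regular expression is written against the e1000e messages quoted here and may need adjusting for other drivers, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the e1000e link messages quoted above, e.g.
#   "... e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, ..."
#   "... e1000e: eth1 NIC Link is Down"
LINK_RE = re.compile(r"e1000e: (\w+) NIC Link is (Up|Down)")
HASH_ERR = "found a client with no channel in the client's hash table"


def summarize(lines):
    """Count link transitions per (interface, state) and hash-table errors."""
    links = Counter()
    errors = 0
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            links[(m.group(1), m.group(2))] += 1
        elif HASH_ERR in line:
            errors += 1
    return links, errors


# Three lines taken verbatim from the excerpt above.
sample = [
    "Jun 20 12:33:31 foo kernel: [ 1010.774755] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX",
    "Jun 20 12:33:31 foo kernel: [ 1010.900005] bonding: bond0: Error: found a client with no channel in the client's hash table",
    "Jun 20 12:34:53 foo kernel: [ 1063.804210] e1000e: eth0 NIC Link is Down",
]
links, errors = summarize(sample)
print(links, errors)
```

Since `summarize` accepts any iterable of lines, an open file object for the kernel log can be passed in directly. The log's location (commonly /var/log/kern.log) is an assumption here, not something stated in the report.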

Maybe someone can check and confirm this behaviour. Unfortunately, all
of my servers are in production. I had to switch them back to
single-NIC operation to make them reachable again, so I cannot run any
further tests in my environment.

Regards from Berlin/Germany
Karsten

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ifenslave-2.6 in Ubuntu.
https://bugs.launchpad.net/bugs/482419

Title:
  802.3ad interface bonding fails if started too early

Status in “ifenslave-2.6” package in Ubuntu:
  Fix Released
Status in “ifenslave-2.6” source package in Lucid:
  Fix Released
Status in “ifenslave-2.6” package in Debian:
  Unknown

Bug description:
  Impact: see original report below
  How the patch fixes it: pre-up sets up the master before attempting to enslave and set up the slaves
  Patch: https://bugs.edge.launchpad.net/ubuntu/+source/ifenslave-2.6/+bug/482419/+attachment/1455658/+files/ifenslave-2.6-sru.diff
  Reproducing: http://ubuntuforums.org/showpost.php?p=8285696&postcount=3
  Regression potential: none known
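
  The master-first ordering named under "How the patch fixes it" can be illustrated with a dry-run sketch. This is not the content of the linked patch. It assumes the bonding driver's sysfs interface (/sys/class/net/bonding_masters and /sys/class/net/<bond>/bonding/) and only builds the sequence of writes without touching the system. `bond_setup_plan` is a hypothetical helper name.

```python
def bond_setup_plan(master, mode, slaves):
    """Return (sysfs_path, value) writes in master-first order.

    Dry run only: nothing is written to the system.
    """
    base = f"/sys/class/net/{master}/bonding"
    plan = [
        # 1. Create the bond master.
        ("/sys/class/net/bonding_masters", f"+{master}"),
        # 2. Configure the master while it has no slaves yet.
        (f"{base}/mode", mode),
    ]
    # 3. Only then enslave the physical interfaces.
    plan += [(f"{base}/slaves", f"+{s}") for s in slaves]
    return plan


plan = bond_setup_plan("bond0", "802.3ad", ["eth0", "eth1"])
for path, value in plan:
    print(f"echo {value} > {path}")
```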

  == Original report ==
  802.3ad bonding configurations that formerly worked on jaunty are now failing on startup under karmic. After the system has started, restarting networking will bring the bond up correctly. This only applies to bond_mode 4 / 802.3ad: I've tested that switching to bond_mode 0 corrects the issue, and the other users experiencing this bug were all using bond_mode 4 as well.

  dmesg output fills with "bonding: bond0: Warning: Found an
  uninitialized port", even after the system starts up and the port
  should be "initialized"

  It appears to occur on multiple drivers (bnx2, e1000 confirmed).

  One initially wants to blame the startup ordering due to the switch to
  upstart, but I believe it is an edge case that hasn't been seen before
  because we haven't been starting up so quickly that the hardware
  hasn't had time to fully initialize.

  Configuration and output from multiple users is in this thread:
  http://ubuntuforums.org/showthread.php?p=8311572

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifenslave-2.6/+bug/482419/+subscriptions



