[Bug 997978] Re: KVM images lose connectivity with bridged network
Gary Cuozzo
gary at isgsoftware.net
Tue Aug 14 18:36:29 UTC 2012
I have seen this issue on 2 different servers which use bridging but not
bonding.
One server was a customer system, and we were forced to roll the OS back
to an earlier release. They were hitting the issue as often as once a day
and quickly got impatient to have it resolved.
The other server is an internal system which runs multiple vm's. We
have only seen the issue on one of the vm's and only once every 2-3
weeks. The vm which experiences the issue is our LTSP server.
I have been testing a small cluster of 3 host machines which use both
bonding and bridging. I have not seen this issue affect them, but the
usage is quite light and the vm's come & go since it's a testing
environment right now. Due to this bug, we have halted any plans to
upgrade vm hosts to Precise until we can verify it's fixed.
We've seen the following when the issue has occurred:
* Absolutely nothing in any logs, dmesg, etc.
* Host machine cannot ping the guest
* arp shows guest as incomplete
* guest machine can ping its own IP, but nothing else (host, gw, etc)
* restarting networking subsystem is successful (no errors) but has no effect on the problem
* rebooting the guest fixes the problem until it happens again. The reboot does not actually kill the kvm session and get a new process ID, but somehow having the guest go through the init again fixes it (until it happens again some period later).
* This issue has occurred on one 12.04 guest and one 11.10 guest
* Both of the servers on which this occurred are Dell 2950 series machines. I have not seen this issue on any of our HP ProLiant (mostly DL360) machines.
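The "arp shows guest as incomplete" symptom above can be checked for from the host with a small filter. This is only a minimal sketch, assuming the net-tools `arp -n` output format, where an unresolved neighbour shows `(incomplete)` in place of a hardware address. The function name `incomplete_neighbours` is made up for illustration.

```shell
# Print the address of every neighbour that arp lists as "(incomplete)".
# Reads `arp -n` style output on stdin, so it can be fed saved output
# as well as a live table (e.g. `arp -n | incomplete_neighbours`).
incomplete_neighbours() {
    awk '/\(incomplete\)/ { print $1 }'
}
```

Run periodically on the host, it could at least timestamp when a guest drops off, since nothing shows up in the logs.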
If there is some sort of test I can run to help debug, I'm happy to do
that.
Thank you for trying to address this. This is a huge bug for us.
Thanks,
gary
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to bridge-utils in Ubuntu.
https://bugs.launchpad.net/bugs/997978
Title:
KVM images lose connectivity with bridged network
Status in “bridge-utils” package in Ubuntu:
Invalid
Status in “ifenslave” package in Ubuntu:
Confirmed
Status in “libvirt” package in Ubuntu:
Confirmed
Status in “linux” package in Ubuntu:
Confirmed
Status in “qemu-kvm” package in Ubuntu:
Confirmed
Bug description:
System:
-----------
Dell R410 Dual processor 2.4Ghz w/16G RAM
Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise
Setup:
---------
We're running 3 KVM guests, all Ubuntu 12.04 LTS using bridged networking.
From the host:
# cat /etc/network/interfaces
auto br0
iface br0 inet static
address 212.XX.239.98
netmask 255.255.255.240
gateway 212.XX.239.97
bridge_ports eth0
bridge_fd 9
bridge_hello 2
bridge_maxage 12
bridge_stp off
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr d4:ae:52:84:2d:5a
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:11278363 errors:0 dropped:3128 overruns:0 frame:0
TX packets:14437384 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4115980743 (4.1 GB) TX bytes:5451961979 (5.4 GB)
Interrupt:36 Memory:da000000-da012800
# ifconfig br0
br0 Link encap:Ethernet HWaddr d4:ae:52:84:2d:5a
inet addr:212.XX.239.98 Bcast:212.XX.239.111 Mask:255.255.255.240
inet6 addr: fe80::d6ae:52ff:fe84:2d5a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1720861 errors:0 dropped:0 overruns:0 frame:0
TX packets:1708622 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:210152198 (210.1 MB) TX bytes:300858508 (300.8 MB)
# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.d4ae52842d5a       no              eth0
I have no default network configured to autostart in libvirt as we're using bridged networking:
# virsh net-list --all
Name                 State      Autostart
-----------------------------------------
default              inactive   no
# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
mailer03.xxxx.com        ether   52:54:00:82:5f:0f   C                     br0
mailer01.xxxx.com        ether   52:54:00:d2:f7:31   C                     br0
mailer02.xxxx.com        ether   52:54:00:d3:8f:91   C                     br0
dxi-gw2.xxxx.com         ether   00:1a:30:2a:b1:c0   C                     br0
From one of the guests:
<domain type='kvm' id='4'>
<name>mailer01</name>
<uuid>d41d1355-84e8-ae23-e84e-227bc0231b97</uuid>
<memory>2097152</memory>
<currentMemory>2097152</currentMemory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64' machine='pc-1.0'>hvm</type>
<boot dev='hd'/>
</os>
<features>
<acpi/>
</features>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/dev/mapper/vg_main-mailer01--root'/>
<target dev='hda' bus='ide'/>
<alias name='ide0-0-0'/>
<address type='drive' controller='0' bus='0' unit='0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/dev/mapper/vg_main-mailer01--swap'/>
<target dev='hdb' bus='ide'/>
<alias name='ide0-0-1'/>
<address type='drive' controller='0' bus='0' unit='1'/>
</disk>
<controller type='ide' index='0'>
<alias name='ide0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<interface type='bridge'>
<mac address='52:54:00:d2:f7:31'/>
<source bridge='br0'/>
<target dev='vnet0'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/dev/pts/0'/>
<target port='0'/>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/0'>
<source path='/dev/pts/0'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
<listen type='address' address='127.0.0.1'/>
</graphics>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='apparmor' relabel='yes'>
<label>libvirt-d41d1355-84e8-ae23-e84e-227bc0231b97</label>
<imagelabel>libvirt-d41d1355-84e8-ae23-e84e-227bc0231b97</imagelabel>
</seclabel>
</domain>
From within the guest:
# cat /etc/network/interfaces
# The primary network interface
auto eth0
iface eth0 inet static
address 212.XX.239.100
netmask 255.255.255.240
network 212.XX.239.96
broadcast 212.XX.239.111
gateway 212.XX.239.97
# ifconfig
eth0 Link encap:Ethernet HWaddr 52:54:00:d2:f7:31
inet addr:212.XX.239.100 Bcast:212.XX.239.111 Mask:255.255.255.240
inet6 addr: fe80::5054:ff:fed2:f731/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5631830 errors:0 dropped:0 overruns:0 frame:0
TX packets:6683416 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2027322829 (2.0 GB) TX bytes:2076698690 (2.0 GB)
The command line that starts the KVM guest:
/usr/bin/kvm -S -M pc-1.0 -enable-kvm -m 2048 -smp 1,sockets=1,cores=1,threads=1 -name mailer01 -uuid d41d1355-84e8-ae23-e84e-227bc0231b97 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/mailer01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -drive file=/dev/mapper/vg_main-mailer01--root,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/dev/mapper/vg_main-mailer01--swap,if=none,id=drive-ide0-0-1,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=18,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d2:f7:31,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
Problem:
------------
Periodically (at least once a day), one or more of the guests lose network connectivity. Ping responds with 'host unreachable', even from the dom host. Logging in via the serial console shows no problems: eth0 is up and the guest can ping the local host, but it has no outside connectivity. Restarting the network (/etc/init.d/networking restart) does nothing. Rebooting the guest brings it back to life.
I've verified there are no ARP games going on on the primary host (the
ARP tables remain the same before, when the guest had connectivity, and
after, when it doesn't).
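The before/after ARP comparison could be repeated along these lines. This is a rough sketch, not the procedure actually used for this report. It assumes two snapshots saved with `arp -n`, and the helper name `arp_tables_match` and any file paths are hypothetical.

```shell
# Succeed (exit 0) if two saved `arp -n` snapshots contain the same
# entries, ignoring row order; fail (exit 1) if they differ.
arp_tables_match() {
    before_sorted=$(sort "$1")
    after_sorted=$(sort "$2")
    [ "$before_sorted" = "$after_sorted" ]
}
```

For example, save one snapshot with `arp -n > /tmp/arp.before` while the guest is reachable and another with `arp -n > /tmp/arp.after` once it drops off, then run `arp_tables_match /tmp/arp.before /tmp/arp.after && echo unchanged`.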
This is a critical issue affecting production services on the latest
LTS release of Ubuntu. It's similar to an issue which was 'resolved'
in 10.04 but appears to have reared its ugly head again.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bridge-utils/+bug/997978/+subscriptions