[Bug 2033681] Re: Calico still uses vif type tap and it causes failures with libvirt 9.5.0
Ioana Lazea
2033681 at bugs.launchpad.net
Fri Apr 10 07:17:35 UTC 2026
** Description changed:
+ [ Impact ]
+
+ Starting with libvirt 9.5.0, the default behavior for TAP devices changed: libvirt now expects to manage the creation and lifecycle of the TAP device itself.
+ OpenStack Nova’s networking drivers pre-create the TAP device before handing it off to libvirt. Because libvirt, from 9.5.0 onward (including the 10.0.0 that Noble requires), tries to "own" the device Nova already created, it fails to launch the instance, resulting in VirtDriver errors and instances stuck in an ERROR state.
+
+ The patch explicitly adds managed="no" to the interface configuration in
+ the libvirt domain XML. This tells libvirt to skip its management
+ attempt and simply use the device provided by Nova, restoring the
+ intended workflow.
+
+ [ Test Plan ]
+
+ Ubuntu does not package Neutron Calico, but it is not needed to test and verify this issue: the failure can be reproduced by manually adding a pre-created tap device to an existing VM.
+
+ To test this we can do:
+
+ * Deploy OpenStack with Neutron OVN
+
+ * Create a guest vm with one port (the tap device will be created by
+ libvirt)
+
+ # openstack server create --flavor m1.tiny --image cirros --network test-net test-vm
+
+ * Stop the vm
+
+ # openstack server stop test-vm
+ or
+ # virsh shutdown instance-00000001
+
+ * Manually create a new tap device and add it to the vm libvirt xml
+
+ # sudo ip tuntap add dev tap1 mode tap
+ # sudo ip link set tap1 up
+ # virsh edit instance-00000001
+
+ In the <interface> section, replace the existing <target> element with <target dev='tap1'/>
+
+ * Start the vm
+
+ # virsh start instance-00000001
+ error: Failed to start domain 'instance-00000001'
+ error: Requested operation is not valid: The tap1 interface already exists
+
+ * Without the patch this fails with the error above; with the patch, the VM boots successfully.
+
+ This patch explicitly adds managed="no" to the interface configuration
+ in the libvirt domain XML. This tells libvirt to skip its management
+ attempt and simply use the device provided by Nova, restoring the
+ intended workflow.
+
+ To verify this manually, edit the XML again and add managed='no' to the <target> element: <target dev='tap1' managed='no'/>
+
+ Now try to start the VM again:
+ # virsh start instance-00000001
+ Domain 'instance-00000001' started
+
+ [ Where problems could occur ]
+
+ This change specifically targets the XML generation for TAP interfaces.
+ Since Noble requires libvirt >= 10.0.0, we are not worried about backwards compatibility with extremely old libvirt versions that might not recognize the attribute.
+
+ [ Other Info ]
+
+ The bug has been reported upstream:
+ https://bugs.launchpad.net/nova/+bug/2033681
+
+ This fix is already merged upstream in Nova (see:
+ https://review.opendev.org/c/openstack/nova/+/967570) and is required
+ for Nova to function correctly on any distribution using libvirt 9.5.0
+ or newer, which includes Ubuntu Noble.
+
+ [ Old description ]
Description
===========
Calico (out of tree) uses vif type tap, but since libvirt 9.5.0 (https://github.com/libvirt/libvirt/commit/a2ae3d299cf) libvirt rejects pre-existing tap devices. This causes OpenStack clusters that run the Calico networking backend to fail during instance creation.
Steps to reproduce
==================
- Ubuntu does not have support for Neutron Calico (it isnt packaged) but to test and verify this issue we don't need it because it is just a matter of adding a tap device to an existing vm which we can do manually. To test this we can do:
-
- * Deploy openstack with neutron OVN
- * Create a guest vm with one port (the tap device will be created by libvirt)
- * Stop the vm
- * Manually create a new tap device and add it to the vm libvirt xml
- * Start the vm
- * Without the patch this will cause an error but with the patch it should work
Expected result
===============
The VM is able to boot without any problems
Actual result
Other information
=================
13:34:38 < sean-k-mooney> calico is apparently still using vif type tap
https://github.com/projectcalico/calico/blob/cf7fa35475eba84f5afcd7f53ac7d07dcb403202/networking-calico/networking_calico/plugins/ml2/drivers/calico/test/lib.py#L66C31-L66C34
13:35:06 < sean-k-mooney> vif type tap is not supported by our os-vif code so it's using the legacy fallback
13:35:51 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L595-L596
13:36:15 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L420-L430
13:36:48 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/designer.py#L44-L55
13:37:40 < sean-k-mooney> zer0c00l: with that said the tap was always meant to be created by libvirt so it sounds like calico might have been doing things it should not have been
13:38:03 < zer0c00l> sean-k-mooney: Thanks for looking into this. :(
13:38:36 < sean-k-mooney> we could probably correct this with a bug fix
13:38:52 < sean-k-mooney> just setting managed='no'
13:39:13 < sean-k-mooney> here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L427
13:39:54 < sean-k-mooney> the problem is that there is no way to really test this upstream
13:40:06 < sean-k-mooney> well beyond unit/functional tests
13:40:12 < sean-k-mooney> but we don't have any calico ci
13:40:37 < sean-k-mooney> calico should be the only backend using vif_type=tap
13:40:52 < sean-k-mooney> but i'm not sure if we would need a config option in the workarounds section for this or not
Potential patch
===============
diff --git a/nova/virt/libvirt/config.py b/nova/virt/libvirt/config.py
index 47e92e3..5af3ce4 100644
--- a/nova/virt/libvirt/config.py
+++ b/nova/virt/libvirt/config.py
@@ -1749,6 +1749,7 @@
         self.device_addr = None
         self.mtu = None
         self.alias = None
+        self.managed = 'no'
     def __eq__(self, other):
         if not isinstance(other, LibvirtConfigGuestInterface):
@@ -1851,7 +1852,7 @@
             dev.append(vlan_elem)
         if self.target_dev is not None:
-            dev.append(etree.Element("target", dev=self.target_dev))
+            dev.append(etree.Element("target", dev=self.target_dev, managed=self.managed))
         if self.vporttype is not None:
             vport = etree.Element("virtualport", type=self.vporttype)
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/2033681
Title:
Calico still uses vif type tap and it causes failures with libvirt
9.5.0
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive caracal series:
New
Status in Ubuntu Cloud Archive dalmatian series:
New
Status in Ubuntu Cloud Archive epoxy series:
New
Status in Ubuntu Cloud Archive flamingo series:
New
Status in Ubuntu Cloud Archive gazpacho series:
New
Status in OpenStack Compute (nova):
Fix Released
Status in nova package in Ubuntu:
New
Status in nova source package in Noble:
New
Status in nova source package in Questing:
New
Status in nova source package in Resolute:
New
Bug description:
[ Impact ]
Starting with libvirt 9.5.0, the default behavior for TAP devices changed: libvirt now expects to manage the creation and lifecycle of the TAP device itself.
OpenStack Nova’s networking drivers pre-create the TAP device before handing it off to libvirt. Because libvirt, from 9.5.0 onward (including the 10.0.0 that Noble requires), tries to "own" the device Nova already created, it fails to launch the instance, resulting in VirtDriver errors and instances stuck in an ERROR state.
The patch explicitly adds managed="no" to the interface configuration
in the libvirt domain XML. This tells libvirt to skip its management
attempt and simply use the device provided by Nova, restoring the
intended workflow.
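As a rough illustration of the XML the fix aims for, the sketch below builds an interface element whose <target> carries managed="no". It uses Python's standard xml.etree.ElementTree rather than Nova's own config classes; the helper name build_tap_interface, the MAC address and the type='ethernet' interface type are assumptions made for the example, not taken from the patch.

```python
import xml.etree.ElementTree as ET


def build_tap_interface(target_dev: str, mac: str) -> ET.Element:
    """Build an <interface> element for a tap device created outside libvirt.

    Hypothetical helper for illustration only; Nova generates this XML
    through its LibvirtConfigGuestInterface class instead.
    """
    # Assumption: a pre-created tap is described as an 'ethernet' interface.
    iface = ET.Element("interface", type="ethernet")
    ET.SubElement(iface, "mac", address=mac)
    # managed="no" asks libvirt to use the existing tap device as-is
    # instead of trying to create and own it.
    ET.SubElement(iface, "target", dev=target_dev, managed="no")
    return iface


xml_text = ET.tostring(
    build_tap_interface("tap1", "52:54:00:12:34:56"), encoding="unicode"
)
print(xml_text)
```

The only part that matters for this bug is the managed attribute on <target>; everything else in the element is placeholder data.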
[ Test Plan ]
Ubuntu does not package Neutron Calico, but it is not needed to test and verify this issue: the failure can be reproduced by manually adding a pre-created tap device to an existing VM.
To test this we can do:
* Deploy OpenStack with Neutron OVN
* Create a guest vm with one port (the tap device will be created by
libvirt)
# openstack server create --flavor m1.tiny --image cirros --network test-net test-vm
* Stop the vm
# openstack server stop test-vm
or
# virsh shutdown instance-00000001
* Manually create a new tap device and add it to the vm libvirt xml
# sudo ip tuntap add dev tap1 mode tap
# sudo ip link set tap1 up
# virsh edit instance-00000001
In the <interface> section, replace the existing <target> element with <target dev='tap1'/>
* Start the vm
# virsh start instance-00000001
error: Failed to start domain 'instance-00000001'
error: Requested operation is not valid: The tap1 interface already exists
* Without the patch this fails with the error above; with the patch, the VM boots successfully.
This patch explicitly adds managed="no" to the interface configuration
in the libvirt domain XML. This tells libvirt to skip its management
attempt and simply use the device provided by Nova, restoring the
intended workflow.
To verify this manually, edit the XML again and add managed='no' to the <target> element: <target dev='tap1' managed='no'/>
Now try to start the VM again:
# virsh start instance-00000001
Domain 'instance-00000001' started
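The manual XML edit in the test plan could also be scripted. The following is a minimal sketch, assuming the domain XML has been saved to a string (for example from virsh dumpxml); the function name mark_tap_unmanaged and the sample XML are hypothetical and much smaller than a real domain definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML used only for this example.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-00000001</name>
  <devices>
    <interface type='ethernet'>
      <target dev='tap1'/>
    </interface>
  </devices>
</domain>
"""


def mark_tap_unmanaged(domain_xml: str, tap_name: str) -> str:
    """Return domain_xml with managed='no' set on the matching <target>."""
    root = ET.fromstring(domain_xml)
    for target in root.findall("./devices/interface/target"):
        # Only touch the interface that points at the pre-created tap.
        if target.get("dev") == tap_name:
            target.set("managed", "no")
    return ET.tostring(root, encoding="unicode")


patched = mark_tap_unmanaged(SAMPLE_DOMAIN_XML, "tap1")
print(patched)
```

The patched XML would still have to be loaded back into libvirt (for instance with virsh define) before starting the VM; that step is left out here.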
[ Where problems could occur ]
This change specifically targets the XML generation for TAP interfaces.
Since Noble requires libvirt >= 10.0.0, we are not worried about backwards compatibility with extremely old libvirt versions that might not recognize the attribute.
[ Other Info ]
The bug has been reported upstream:
https://bugs.launchpad.net/nova/+bug/2033681
This fix is already merged upstream in Nova (see:
https://review.opendev.org/c/openstack/nova/+/967570) and is required
for Nova to function correctly on any distribution using libvirt 9.5.0
or newer, which includes Ubuntu Noble.
[ Old description ]
Description
===========
Calico (out of tree) uses vif type tap, but since libvirt 9.5.0 (https://github.com/libvirt/libvirt/commit/a2ae3d299cf) libvirt rejects pre-existing tap devices. This causes OpenStack clusters that run the Calico networking backend to fail during instance creation.
Steps to reproduce
==================
Expected result
===============
The VM is able to boot without any problems
Actual result
Other information
=================
13:34:38 < sean-k-mooney> calico is apparently still using vif type tap
https://github.com/projectcalico/calico/blob/cf7fa35475eba84f5afcd7f53ac7d07dcb403202/networking-calico/networking_calico/plugins/ml2/drivers/calico/test/lib.py#L66C31-L66C34
13:35:06 < sean-k-mooney> vif type tap is not supported by our os-vif code so it's using the legacy fallback
13:35:51 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L595-L596
13:36:15 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L420-L430
13:36:48 < sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/designer.py#L44-L55
13:37:40 < sean-k-mooney> zer0c00l: with that said the tap was always meant to be created by libvirt so it sounds like calico might have been doing things it should not have been
13:38:03 < zer0c00l> sean-k-mooney: Thanks for looking into this. :(
13:38:36 < sean-k-mooney> we could probably correct this with a bug fix
13:38:52 < sean-k-mooney> just setting managed='no'
13:39:13 < sean-k-mooney> here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L427
13:39:54 < sean-k-mooney> the problem is that there is no way to really test this upstream
13:40:06 < sean-k-mooney> well beyond unit/functional tests
13:40:12 < sean-k-mooney> but we don't have any calico ci
13:40:37 < sean-k-mooney> calico should be the only backend using vif_type=tap
13:40:52 < sean-k-mooney> but i'm not sure if we would need a config option in the workarounds section for this or not
Potential patch
===============
diff --git a/nova/virt/libvirt/config.py b/nova/virt/libvirt/config.py
index 47e92e3..5af3ce4 100644
--- a/nova/virt/libvirt/config.py
+++ b/nova/virt/libvirt/config.py
@@ -1749,6 +1749,7 @@
         self.device_addr = None
         self.mtu = None
         self.alias = None
+        self.managed = 'no'
     def __eq__(self, other):
         if not isinstance(other, LibvirtConfigGuestInterface):
@@ -1851,7 +1852,7 @@
             dev.append(vlan_elem)
         if self.target_dev is not None:
-            dev.append(etree.Element("target", dev=self.target_dev))
+            dev.append(etree.Element("target", dev=self.target_dev, managed=self.managed))
         if self.vporttype is not None:
             vport = etree.Element("virtualport", type=self.vporttype)