[Bug 2150130] Re: --restart in ovn-controller leaves gateway ports active
Arif Ali
2150130 at bugs.launchpad.net
Wed Apr 29 12:36:23 UTC 2026
** Description changed:
[Impact]
Currently, if an OVN controller node goes down while it is in the list
of gateway chassis for a logical router port (lrp), it is not removed
from that list. OVN will therefore still try to use this gateway, fail,
and only then move on to the next one if the node was at the top of the
priority list.
The backport removes the --restart option from the systemd unit. This
then removes the gateway from the list, so OVN will not try to use this
gateway port.
+ Removing --restart can also affect upgrades of the ovn package. To avoid
+ this, we add d/ovn-host.preinst, which preserves the runtime state of
+ OVN at upgrade.
+
[Test Plan]
- * Install Openstack using the default openstack bundles, ensure to have at least 2 or 3 nova-compute units.
+ * Install OpenStack using the default OpenStack bundles, ensuring there are at least 2 or 3 nova-compute units.
* Find the node that hosts the northbound database and the node that hosts the southbound database, and log in to both. They may be on the same node.
* Then run the following commands (sample output shown):
```
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lr-list
8588bef1-b3fc-4097-87e7-2c34e26f4c69 (neutron-727e3681-8eb4-4025-aa54-046588a2c3ab)
root@juju-71c67f-3-lxd-2:~# ovn-nbctl show 8588bef1-b3fc-4097-87e7-2c34e26f4c69
router 8588bef1-b3fc-4097-87e7-2c34e26f4c69 (neutron-727e3681-8eb4-4025-aa54-046588a2c3ab) (aka provider-router)
- port lrp-3e157281-4075-456d-bc17-61c6fcb5ff1d
- mac: "fa:16:3e:9b:be:95"
- networks: ["10.0.22.1/24"]
- port lrp-35c4d398-9a43-4fbf-9781-bc4a6d8d4a38
- mac: "fa:16:3e:d0:25:3b"
- networks: ["192.168.21.1/24"]
- port lrp-7c725d10-8056-455d-a2d5-b84f33c16045
- mac: "fa:16:3e:81:04:03"
- networks: ["192.168.1.44/24"]
- gateway chassis: [as3-maas-node-03.maas as2-maas-node-04.maas as4-maas-node-03.maas as4-maas-node-04.maas as3-maas-node-06.maas]
- nat 3121ce08-9899-495b-b385-5ecf17be0c1b
- external ip: "192.168.1.44"
- logical ip: "192.168.21.0/24"
- type: "snat"
- nat 8323629e-dbcc-4a45-84a4-a90aa3859cd7
- external ip: "192.168.1.44"
- logical ip: "10.0.22.0/24"
- type: "snat"
- nat 8e54bb28-3418-4052-9dc8-3c4178225322
- external ip: "192.168.1.42"
- logical ip: "10.0.22.187"
- type: "dnat_and_snat"
+ port lrp-3e157281-4075-456d-bc17-61c6fcb5ff1d
+ mac: "fa:16:3e:9b:be:95"
+ networks: ["10.0.22.1/24"]
+ port lrp-35c4d398-9a43-4fbf-9781-bc4a6d8d4a38
+ mac: "fa:16:3e:d0:25:3b"
+ networks: ["192.168.21.1/24"]
+ port lrp-7c725d10-8056-455d-a2d5-b84f33c16045
+ mac: "fa:16:3e:81:04:03"
+ networks: ["192.168.1.44/24"]
+ gateway chassis: [as3-maas-node-03.maas as2-maas-node-04.maas as4-maas-node-03.maas as4-maas-node-04.maas as3-maas-node-06.maas]
+ nat 3121ce08-9899-495b-b385-5ecf17be0c1b
+ external ip: "192.168.1.44"
+ logical ip: "192.168.21.0/24"
+ type: "snat"
+ nat 8323629e-dbcc-4a45-84a4-a90aa3859cd7
+ external ip: "192.168.1.44"
+ logical ip: "10.0.22.0/24"
+ type: "snat"
+ nat 8e54bb28-3418-4052-9dc8-3c4178225322
+ external ip: "192.168.1.42"
+ logical ip: "10.0.22.187"
+ type: "dnat_and_snat"
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lrp-get-gateway-chassis lrp-7c725d10-8056-455d-a2d5-b84f33c16045
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-03.maas 5
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-06.maas 4
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-04.maas 3
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-03.maas 2
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as2-maas-node-04.maas 1
```
This shows that node as3-maas-node-03.maas has the highest priority.
* Now stop the ovn-controller service on the highest priority node, i.e. as3-maas-node-03.maas in this case.
* With the current package, the output of the last command stays the same, although we would expect as3-maas-node-03.maas to be removed.
* Now install the new package and repeat the same process. The node should now be removed, and the output should be similar to the one below.
```
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lrp-get-gateway-chassis lrp-7c725d10-8056-455d-a2d5-b84f33c16045
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-06.maas 5
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-04.maas 4
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-03.maas 3
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as2-maas-node-04.maas 2
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as1-maas-node-06.maas 1
```
-
[Where problems could occur]
The --restart flag was originally added in commit [1] (LP: #1940043) to
ensure that upgrades don't cause issues, so removing it could
potentially reintroduce that issue.
[1]
https://git.launchpad.net/ubuntu/+source/ovn/commit/?h=import/21.09.0_git20210806.d08f89e21-0ubuntu1.1&id=d73df64c24f97b6133448b57cae8d82af51df1fe
[Other Info]
- As part of this SRU, we are also tackling the upgrade issue with LP:
- 1940043
+ Related LP: 1940043
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/2150130
Title:
--restart in ovn-controller leaves gateway ports active
Status in ovn package in Ubuntu:
New
Status in ovn source package in Jammy:
Confirmed
Bug description:
[Impact]
Currently, if an OVN controller node goes down while it is in the list
of gateway chassis for a logical router port (lrp), it is not removed
from that list. OVN will therefore still try to use this gateway, fail,
and only then move on to the next one if the node was at the top of the
priority list.
The backport removes the --restart option from the systemd unit. This
then removes the gateway from the list, so OVN will not try to use this
gateway port.
Removing --restart can also affect upgrades of the ovn package. To avoid
this, we add d/ovn-host.preinst, which preserves the runtime state of
OVN at upgrade.
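Whether an installed unit still passes --restart can be checked by inspecting the unit text. The following is only a minimal sketch and not part of the SRU: `has_restart_flag` is a hypothetical helper name, and the unit name `ovn-controller` in the usage line is an assumption that should be verified on the system under test.

```shell
# Hypothetical helper (not part of the ovn package): read unit file text
# on stdin and exit 0 if it contains --restart as a standalone word,
# non-zero otherwise.
has_restart_flag() {
    grep -q -E -e '(^|[[:space:]])--restart([[:space:]]|$)'
}
```

Possible usage, assuming the unit is called ovn-controller: `systemctl cat ovn-controller | has_restart_flag && echo "unit still uses --restart"`.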
[Test Plan]
* Install OpenStack using the default OpenStack bundles, ensuring there are at least 2 or 3 nova-compute units.
* Find the node that hosts the northbound database and the node that hosts the southbound database, and log in to both. They may be on the same node.
* Then run the following commands (sample output shown):
```
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lr-list
8588bef1-b3fc-4097-87e7-2c34e26f4c69 (neutron-727e3681-8eb4-4025-aa54-046588a2c3ab)
root@juju-71c67f-3-lxd-2:~# ovn-nbctl show 8588bef1-b3fc-4097-87e7-2c34e26f4c69
router 8588bef1-b3fc-4097-87e7-2c34e26f4c69 (neutron-727e3681-8eb4-4025-aa54-046588a2c3ab) (aka provider-router)
port lrp-3e157281-4075-456d-bc17-61c6fcb5ff1d
mac: "fa:16:3e:9b:be:95"
networks: ["10.0.22.1/24"]
port lrp-35c4d398-9a43-4fbf-9781-bc4a6d8d4a38
mac: "fa:16:3e:d0:25:3b"
networks: ["192.168.21.1/24"]
port lrp-7c725d10-8056-455d-a2d5-b84f33c16045
mac: "fa:16:3e:81:04:03"
networks: ["192.168.1.44/24"]
gateway chassis: [as3-maas-node-03.maas as2-maas-node-04.maas as4-maas-node-03.maas as4-maas-node-04.maas as3-maas-node-06.maas]
nat 3121ce08-9899-495b-b385-5ecf17be0c1b
external ip: "192.168.1.44"
logical ip: "192.168.21.0/24"
type: "snat"
nat 8323629e-dbcc-4a45-84a4-a90aa3859cd7
external ip: "192.168.1.44"
logical ip: "10.0.22.0/24"
type: "snat"
nat 8e54bb28-3418-4052-9dc8-3c4178225322
external ip: "192.168.1.42"
logical ip: "10.0.22.187"
type: "dnat_and_snat"
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lrp-get-gateway-chassis lrp-7c725d10-8056-455d-a2d5-b84f33c16045
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-03.maas 5
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-06.maas 4
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-04.maas 3
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-03.maas 2
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as2-maas-node-04.maas 1
```
This shows that node as3-maas-node-03.maas has the highest priority.
* Now stop the ovn-controller service on the highest priority node, i.e. as3-maas-node-03.maas in this case.
* With the current package, the output of the last command stays the same, although we would expect as3-maas-node-03.maas to be removed.
* Now install the new package and repeat the same process. The node should now be removed, and the output should be similar to the one below.
```
root@juju-71c67f-3-lxd-2:~# ovn-nbctl lrp-get-gateway-chassis lrp-7c725d10-8056-455d-a2d5-b84f33c16045
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as3-maas-node-06.maas 5
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-04.maas 4
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as4-maas-node-03.maas 3
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as2-maas-node-04.maas 2
lrp-7c725d10-8056-455d-a2d5-b84f33c16045_as1-maas-node-06.maas 1
```
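The before/after comparison in the test plan can be scripted. This is a minimal sketch under stated assumptions, not part of the SRU: `top_gateway` is a hypothetical helper name, and it assumes each output line has the form `<lrp-name>_<chassis-name> <priority>`, as in the sample output above.

```shell
# Hypothetical helper: read `ovn-nbctl lrp-get-gateway-chassis` output on
# stdin and print the chassis name with the highest priority.
# $1 is the lrp name; its "<lrp-name>_" prefix is stripped from the result.
top_gateway() {
    awk -v prefix="${1}_" '
        # Remember the entry with the numerically largest priority (field 2).
        NR == 1 || $2 + 0 > max { max = $2 + 0; name = $1 }
        END {
            # Drop the "<lrp-name>_" prefix if the entry starts with it.
            if (index(name, prefix) == 1) name = substr(name, length(prefix) + 1)
            if (name != "") print name
        }'
}
```

Possible usage: `ovn-nbctl lrp-get-gateway-chassis "$LRP" | top_gateway "$LRP"`, run once before and once after stopping ovn-controller on the highest priority node. With the fixed package, the two results would be expected to differ.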
[Where problems could occur]
The --restart flag was originally added in commit [1] (LP: #1940043) to
ensure that upgrades don't cause issues, so removing it could
potentially reintroduce that issue.
[1]
https://git.launchpad.net/ubuntu/+source/ovn/commit/?h=import/21.09.0_git20210806.d08f89e21-0ubuntu1.1&id=d73df64c24f97b6133448b57cae8d82af51df1fe
[Other Info]
Related LP: 1940043