[Bug 1907081] Re: Clustered OVN database is not upgraded on package upgrade
Frode Nordahl
1907081 at bugs.launchpad.net
Mon Feb 15 17:12:13 UTC 2021
With the system used to validate Focal in #4 I performed an upgrade to Groovy. We can then immediately verify that an unchanged Groovy package exhibits the problem:
2021-02-15T17:08:14.584Z|00014|ovsdb_idl|WARN|Forwarding_Group table in OVN_Northbound database lacks external_ids column (database needs upgrade?)
2021-02-15T17:08:14.584Z|00015|ovsdb_idl|WARN|Load_Balancer table in OVN_Northbound database lacks selection_fields column (database needs upgrade?)
2021-02-15T17:08:14.584Z|00016|ovsdb_idl|WARN|Logical_Router_Policy table in OVN_Northbound database lacks external_ids column (database needs upgrade?)
2021-02-15T17:08:14.584Z|00017|ovsdb_idl|WARN|Logical_Router_Port table in OVN_Northbound database lacks ipv6_prefix column (database needs upgrade?)
2021-02-15T17:08:14.584Z|00018|ovsdb_idl|WARN|NAT table in OVN_Northbound database lacks external_port_range column (database needs upgrade?)
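For repeated verification runs, the check for these warnings could be scripted. A minimal sketch in Python, assuming the ovn-northd log keeps the exact warning format shown above; the name `missing_columns` is hypothetical and not part of OVN:

```python
import re

# Matches the ovsdb_idl schema warning as it appears in the log excerpt above.
SCHEMA_WARNING = re.compile(
    r"\|ovsdb_idl\|WARN\|(?P<table>\S+) table in (?P<database>\S+) database "
    r"lacks (?P<column>\S+) column"
)


def missing_columns(log_lines):
    """Return a (database, table, column) tuple for each schema warning found."""
    found = []
    for line in log_lines:
        match = SCHEMA_WARNING.search(line)
        if match:
            found.append((match["database"], match["table"], match["column"]))
    return found
```

An empty result for a log captured after the upgrade would correspond to the schema warnings being gone.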
After installing the Groovy package from -proposed we can confirm the schema warnings are gone:
2021-02-15T17:11:32.828Z|00010|reconnect|INFO|ssl:10.247.39.51:6641: connected
2021-02-15T17:11:32.829Z|00011|ovsdb_idl|INFO|ssl:10.247.39.51:6641: clustered database server is not cluster leader; trying another server
2021-02-15T17:11:32.830Z|00012|ovsdb_idl|INFO|ssl:10.247.39.149:16642: clustered database server is not cluster leader; trying another server
2021-02-15T17:11:32.830Z|00013|reconnect|INFO|ssl:10.247.39.51:6641: connection attempt timed out
2021-02-15T17:11:32.830Z|00014|reconnect|INFO|ssl:10.247.39.149:16642: connection attempt timed out
2021-02-15T17:11:32.830Z|00015|reconnect|INFO|ssl:10.247.39.127:6641: connecting...
2021-02-15T17:11:32.830Z|00016|reconnect|INFO|ssl:10.247.39.127:16642: connecting...
2021-02-15T17:11:32.834Z|00017|reconnect|INFO|ssl:10.247.39.127:6641: connected
2021-02-15T17:11:32.836Z|00018|reconnect|INFO|ssl:10.247.39.127:16642: connected
2021-02-15T17:11:32.836Z|00019|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2021-02-15T17:11:32.839Z|00020|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
** Tags removed: verification-needed verification-needed-groovy
** Tags added: verification-done verification-done-groovy
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1907081
Title:
Clustered OVN database is not upgraded on package upgrade
Status in charm-ovn-central:
Fix Released
Status in ovn package in Ubuntu:
Fix Released
Status in ovn source package in Focal:
Fix Committed
Status in ovn source package in Groovy:
Fix Committed
Status in ovn source package in Hirsute:
Fix Released
Bug description:
[Impact]
On upgrade of the OVN packages it may be necessary to perform an upgrade of the Northbound and Southbound databases.
Failure to do so may lead to loss of connectivity between
participating nodes as the software components will attempt to make
use of columns that are not available in the database.
The upgrade process has been performed automatically by the upstream
init script by default since inception, both for a local and clustered
setup. But as discussed below, recent changes have inadvertently omitted
this behavior for clustered databases.
[Test Case]
Non-clustered scenario as reference test:
Install the ovn-central package in a container using the in-release Focal package and start the database and ovn-northd services.
Upgrade the container to the OVN packages from in-release Groovy and
observe the package performing the database upgrade and, subsequently,
the ovn-northd service not complaining about missing columns in the
database.
Clustered scenario:
Install the ovn-central charm and its necessary dependencies across three containers. Perform the package upgrade as outlined above and compare how the in-release and proposed packages behave.
[Regression Potential]
As we are restoring the intended behavior, the regression potential is minimal.
[Original Bug Report]
In the systemd service we make use of the `ovn-ctl` script `run_nb_ovsdb` and `run_sb_ovsdb` sub-commands introduced in [0]. These sub-commands fit nicely with systemd's expectation that modern daemons no longer detach and run in the background.
However, the change in [0] has the side effect of disabling automatic
upgrading of clustered databases. Previously this would have been done
on every startup [1].
A recent commit to master [2] addresses this and uses the presence of
`--db-*-cluster-local-addr` combined with the absence of
`--db-*-cluster-remote-addr` to determine whether the upgrade should
be run.
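The condition described above can be sketched as follows. This is an illustration in Python of the decision as worded here, not the actual `ovn-ctl` shell code from [2]; the function and parameter names are hypothetical:

```python
def should_upgrade_clustered_db(cluster_local_addr, cluster_remote_addr):
    """Hypothetical helper mirroring the condition described for [2].

    The upgrade runs only when a cluster local address is set and no
    cluster remote address is set. Empty strings and None both count
    as "not set" in this sketch.
    """
    return bool(cluster_local_addr) and not cluster_remote_addr
```

Under this reading, a unit started with both arguments set would skip the upgrade, which is why the charm's handling of `--db-*-cluster-remote-addr` on the leader unit matters.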
We should backport [2] to our supported OVN packages to prepare for
supporting upgrades that require database schema changes. We may also
need to change the behavior of the ovn-central charm to not set the
`--db-*-cluster-remote-addr` argument on the leader unit.
0: https://github.com/ovn-org/ovn/commit/6444059b5f9444ce06634794d275257f945a6ce5
1: https://github.com/ovn-org/ovn/blob/5c2d311b8b7b4d5c3a619de72be6a433aa4c44db/utilities/ovn-ctl#L312-L314
2: https://github.com/ovn-org/ovn/commit/67e2f386cc838d0b0f9b4b5da7fe611e1113b70c
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ovn-central/+bug/1907081/+subscriptions