[Bug 1972648] [NEW] Schema upgrade unsuccessful at first attempt, succeeds at second attempt
Frode Nordahl
1972648 at bugs.launchpad.net
Mon May 9 13:25:35 UTC 2022
Public bug reported:
When upgrading the OVN package on the unit that was first used to create
a DB cluster (i.e. `--db-Xb-cluster-remote-addr` is empty in
/etc/default/ovn-central), the `ovn-ctl` script will perform a schema
upgrade of the cluster.
This sometimes fails on the first attempt, but succeeds on the second
attempt after simply restarting the appropriate OVN DB systemd service.
This was observed when upgrading the package only on the lead unit,
leaving the other nodes available in the cluster.
No trace of the failure is to be found in the logs, so we can only guess
what is happening.
Looking at the CTL library code [0], I suspect that the default
30-second timeout [1] for the entire `ovsdb-client` invocation is
insufficient if the system is slow or busy, has a large DB, or needs
compaction or a snapshot.
0: https://github.com/openvswitch/ovs/blob/9dd3031d2e0e9597449e95428320ccaaff7d8b3d/utilities/ovs-lib.in#L490
1: https://github.com/openvswitch/ovs/blob/9dd3031d2e0e9597449e95428320ccaaff7d8b3d/lib/timeval.c#L257-L271
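To illustrate the suspected failure mode and the manual workaround, here
is a minimal sketch that runs a command under a timeout and retries with
a longer one, logging each failure so that it leaves a trace.
`run_with_retry` and all of its parameters are hypothetical names made up
for this example; nothing here is part of OVS or OVN, and the command to
run is left to the caller.

```python
import subprocess
import time


def run_with_retry(cmd, timeout_s=30.0, attempts=2, backoff=2.0, pause_s=1.0):
    """Run cmd under a timeout; on failure, retry with a longer timeout.

    Hypothetical helper for illustration only (not an OVS/OVN API).
    Returns (attempt_number, timeout_used) on success and raises
    RuntimeError once all attempts have failed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # check=True turns a non-zero exit status into CalledProcessError;
            # timeout kills the child and raises TimeoutExpired.
            subprocess.run(cmd, check=True, timeout=timeout_s)
            return attempt, timeout_s
        except subprocess.TimeoutExpired:
            last_error = f"attempt {attempt}: timed out after {timeout_s}s"
        except subprocess.CalledProcessError as exc:
            last_error = f"attempt {attempt}: exit status {exc.returncode}"
        # Report every failure explicitly so that it is visible in the logs.
        print(last_error)
        # Give the next attempt more time, in case the system is slow or busy.
        timeout_s *= backoff
        if attempt < attempts:
            time.sleep(pause_s)
    raise RuntimeError(last_error)
```

In a real test, `cmd` would be the schema upgrade command that `ovn-ctl`
runs via the library code in [0]. If the first attempt reports a timeout
while a retry with a longer timeout succeeds, that would support the
theory above.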
** Affects: ovn (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1972648