[Bug 2122551] Re: [SRU] Backport feature for disabling migration to Noble and Plucky
Quang Ngo
2122551 at bugs.launchpad.net
Mon Mar 2 23:48:29 UTC 2026
Hi Paride, thanks for the review!
> Some comments mention "merge the patch into UCA Epoxy", but that's
> unrelated to what ubuntu-sponsor can (and should) do. I hope we are
> on the same page on this?
Yes, I think we're on the same page regarding the scope. As for UCA
Epoxy, Myles's comment above shows the patch is already uploaded to
epoxy-staging, so I assume the UCA team is handling it and no
ubuntu-sponsors action is required. Sorry for the confusion; we're
tracking both efforts in one place.
> You would like the attached lp2122551_noble.demo.patch sponsored to
> Noble, correct?
Yes, that's correct. CI failed for it, though; I believe the failure is
understandable, as I explained in comment #17.
> Right now, I find the description of the problem to be a bit lacking.
For detailed context, please refer to:
https://specs.openstack.org/openstack/watcher-specs/specs/2025.2/implemented/host-maintenance-strategy-disable-migration.html#problem-description
TL;DR: when putting a compute node into maintenance mode, Watcher's
host_maintenance strategy attempts to migrate ALL instances, both active
and inactive. However, if the deployment doesn't support a certain
migration type (for example, inactive instances require cold migration),
the maintenance operation fails entirely with an error, blocking users
from doing necessary maintenance.
I updated the SRU description with this as well, thank you!
** Description changed:
Watcher upstream has added a new feature to the Host Maintenance
strategy that can disable live or cold migration and safely stop active
instances when migration cannot proceed. This feature is planned for
Watcher 15.0.0, shipped with OpenStack 2025.2, but it would be useful to
backport it to Ubuntu Plucky and Noble.
For example, currently Sunbeam/Canonical OpenStack is mainly using
Watcher rock images which are pulled from Ubuntu Cloud Archive Epoxy
(watcher 2:14.0.0) and Caracal (watcher 2:12.0.0). Having this new
feature will help address this known issue:
https://canonical-openstack.readthedocs-hosted.com/en/latest/how-to/operations/maintenance-mode/#known-issues
Upstream commit:
https://opendev.org/openstack/watcher/commit/cc26b3b334e5d60bf04c927c771d572445e4a8bc
[ Impact ]
- * User problem: Current host maintenance strategy forces migration operations that may not be suitable for all deployment scenarios.
+ * User problem: Current host maintenance strategy forces migration operations that may not be suitable for all deployment scenarios. For detailed context, please refer to: https://specs.openstack.org/openstack/watcher-specs/specs/2025.2/implemented/host-maintenance-strategy-disable-migration.html#problem-description
+
+ TL;DR: when putting a compute node into maintenance mode, Watcher's
+ host_maintenance strategy attempts to migrate ALL instances, both active
+ and inactive. However, if the deployment doesn't support a certain
+ migration type (for example, inactive instances require cold migration),
+ the maintenance operation fails entirely with an error, blocking users
+ from doing necessary maintenance.
* Functional Enhancement: Introduces two new input parameters to the Host Maintenance strategy:
- `disable_live_migration`: When True, forces cold migration instead of live migration
- `disable_cold_migration`: When True, prevents cold migration of inactive instances
- Combined usage: When both are True, only stop actions are performed on active instances
* Backward Compatibility: All changes are additive with sensible
defaults (both new parameters default to False), ensuring existing Host
Maintenance strategy deployments continue working unchanged.
* No API Changes: The feature adds only configuration parameters to
current schema and internal action handling - no API modifications or
breaking changes to existing interfaces.
* Target Releases:
- Ubuntu 25.04 (Plucky) with watcher 2:14.0.0 -> enables UCA Epoxy
- Ubuntu 24.04 (Noble) with watcher 2:12.0.0
[ Test Case ]
Prerequisite:
* OpenStack cluster with Watcher enabled.
* At least two compute nodes in the cluster
* Test instances running on the maintenance target node
1. Test Case 1: Backward Compatibility
# Verify existing behavior is unchanged
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p backup_node=compute02
# Expected: Traditional live/cold migration behavior (no stop actions)
openstack optimize actionplan list --audit <audit_uuid>
2. Test Case 2: Both Migrations Disabled
# Test stop-only behavior (the new stop action)
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_live_migration=True \
-p disable_cold_migration=True
# Expected: Action plan contains only "stop" actions for instances
openstack optimize actionplan list --audit <audit_uuid>
3. Test Case 3: Live Migration Disabled Only
# Test cold migration fallback
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_live_migration=True
# Expected: Both active and inactive instances use cold migration
openstack optimize actionplan list --audit <audit_uuid>
4. Test Case 4: Cold Migration Disabled Only
# Test live migration with no cold migration
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_cold_migration=True
# Expected: Active instances use live migration, inactive instances remain untouched
openstack optimize actionplan list --audit <audit_uuid>
Testing can also be done through the Ubuntu OpenStack CI system, using
Tempest to verify backward compatibility.
[ Regression Potential / Where problems could occur ]
Configuration Conflicts:
* Risk: Administrators might misconfigure parameters, leading to unexpected behavior
* Manifestation: Instances stopped when migration was intended, or vice versa
* Detection: Review action plans before execution; monitor Watcher logs for parameter validation
Stop Action Failures:
* Risk: New stop action might fail on instances with complex configurations (attached volumes, special networking, etc.)
* Manifestation: Action plan execution failures; instances left in inconsistent states
* Detection: Failed action plan execution; Nova API errors in Watcher applier logs
Otherwise, because the API is unchanged and the new parameters default
to False, the regression risk is low. All existing upstream CI jobs have
passed.
[ Discussion ]
N/A
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/2122551
Title:
[SRU] Backport feature for disabling migration to Noble and Plucky
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive epoxy series:
New
Status in watcher package in Ubuntu:
Fix Released
Status in watcher source package in Noble:
New
Status in watcher source package in Plucky:
Won't Fix
Status in watcher source package in Questing:
Fix Released
Bug description:
Watcher upstream has added a new feature to the Host Maintenance
strategy that can disable live or cold migration and safely stop active
instances when migration cannot proceed. This feature is planned for
Watcher 15.0.0, shipped with OpenStack 2025.2, but it would be useful to
backport it to Ubuntu Plucky and Noble.
For example, currently Sunbeam/Canonical OpenStack is mainly using
Watcher rock images which are pulled from Ubuntu Cloud Archive Epoxy
(watcher 2:14.0.0) and Caracal (watcher 2:12.0.0). Having this new
feature will help address this known issue:
https://canonical-openstack.readthedocs-hosted.com/en/latest/how-to/operations/maintenance-mode/#known-issues
Upstream commit:
https://opendev.org/openstack/watcher/commit/cc26b3b334e5d60bf04c927c771d572445e4a8bc
[ Impact ]
* User problem: Current host maintenance strategy forces migration operations that may not be suitable for all deployment scenarios. For detailed context, please refer to: https://specs.openstack.org/openstack/watcher-specs/specs/2025.2/implemented/host-maintenance-strategy-disable-migration.html#problem-description
TL;DR: when putting a compute node into maintenance mode, Watcher's
host_maintenance strategy attempts to migrate ALL instances, both
active and inactive. However, if the deployment doesn't support a
certain migration type (for example, inactive instances require cold
migration), the maintenance operation fails entirely with an error,
blocking users from doing necessary maintenance.
* Functional Enhancement: Introduces two new input parameters to the Host Maintenance strategy:
- `disable_live_migration`: When True, forces cold migration instead of live migration
- `disable_cold_migration`: When True, prevents cold migration of inactive instances
- Combined usage: When both are True, only stop actions are performed on active instances
* Backward Compatibility: All changes are additive with sensible
defaults (both new parameters default to False), ensuring existing
Host Maintenance strategy deployments continue working unchanged.
* No API Changes: The feature adds only configuration parameters to
current schema and internal action handling - no API modifications or
breaking changes to existing interfaces.
* Target Releases:
- Ubuntu 25.04 (Plucky) with watcher 2:14.0.0 -> enables UCA Epoxy
- Ubuntu 24.04 (Noble) with watcher 2:12.0.0
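To make the parameter semantics above concrete, here is a minimal Python sketch of how an action could be chosen for a single instance. It is not the upstream implementation: the `Action` enum and the `choose_action` function are hypothetical names invented for illustration, and the mapping only mirrors the behavior stated in this description and in the test cases below.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for what happens to one instance (illustration only)."""
    LIVE_MIGRATE = "live migration"
    COLD_MIGRATE = "cold migration"
    STOP = "stop"
    NONE = "left untouched"


def choose_action(instance_active: bool,
                  disable_live_migration: bool = False,
                  disable_cold_migration: bool = False) -> Action:
    """Pick an action for one instance on the node entering maintenance.

    Both flags default to False, which reproduces the existing behavior:
    active instances are live migrated, inactive ones are cold migrated.
    """
    if instance_active:
        if not disable_live_migration:
            return Action.LIVE_MIGRATE
        if not disable_cold_migration:
            # Live migration disabled: fall back to cold migration.
            return Action.COLD_MIGRATE
        # Neither migration type is allowed: stop the active instance.
        return Action.STOP
    if not disable_cold_migration:
        return Action.COLD_MIGRATE
    # Inactive instance with cold migration disabled: nothing to do.
    return Action.NONE
```

With the defaults, `choose_action(True)` and `choose_action(False)` give live and cold migration respectively, which is the backward-compatible case.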
[ Test Case ]
Prerequisite:
* OpenStack cluster with Watcher enabled.
* At least two compute nodes in the cluster
* Test instances running on the maintenance target node
1. Test Case 1: Backward Compatibility
# Verify existing behavior is unchanged
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p backup_node=compute02
# Expected: Traditional live/cold migration behavior (no stop actions)
openstack optimize actionplan list --audit <audit_uuid>
2. Test Case 2: Both Migrations Disabled
# Test stop-only behavior (the new stop action)
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_live_migration=True \
-p disable_cold_migration=True
# Expected: Action plan contains only "stop" actions for instances
openstack optimize actionplan list --audit <audit_uuid>
3. Test Case 3: Live Migration Disabled Only
# Test cold migration fallback
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_live_migration=True
# Expected: Both active and inactive instances use cold migration
openstack optimize actionplan list --audit <audit_uuid>
4. Test Case 4: Cold Migration Disabled Only
# Test live migration with no cold migration
openstack optimize audit create -g cluster_maintaining -s host_maintenance \
-p maintenance_node=compute01 -p disable_cold_migration=True
# Expected: Active instances use live migration, inactive instances remain untouched
openstack optimize actionplan list --audit <audit_uuid>
Testing can also be done through the Ubuntu OpenStack CI system, using
Tempest to verify backward compatibility.
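The expected results of the four test cases can also be checked mechanically. The sketch below assumes the actions of a plan have already been exported as a list of dicts with an `action_type` key and, for migrations, an `input_parameters` dict holding `migration_type` ("live" or "cold"). That record shape, the `EXPECTED` table, and both helper functions are assumptions made for illustration, not output guaranteed by the Watcher CLI. Action types other than "migrate" and "stop" are ignored, since a plan may contain node-level actions as well.

```python
from collections import Counter

# Instance-level action types considered by this check (assumed names).
INSTANCE_ACTIONS = {"migrate", "stop"}

# Allowed instance-level action kinds per test case, taken from the
# "Expected" comments in the test cases above.
EXPECTED = {
    "backward_compatibility": {"migrate:live", "migrate:cold"},
    "both_disabled": {"stop"},
    "live_disabled_only": {"migrate:cold"},
    "cold_disabled_only": {"migrate:live"},
}


def summarize_actions(actions):
    """Count instance-level actions by kind, e.g. 'stop' or 'migrate:cold'."""
    counts = Counter()
    for action in actions:
        kind = action["action_type"]
        if kind not in INSTANCE_ACTIONS:
            continue  # skip node-level or unrelated actions
        if kind == "migrate":
            params = action.get("input_parameters", {})
            kind += ":" + params.get("migration_type", "unknown")
        counts[kind] += 1
    return counts


def matches_expectation(actions, test_case):
    """Return True if every instance-level action is allowed for the test case."""
    return set(summarize_actions(actions)) <= EXPECTED[test_case]
```

For example, a plan holding only stop actions satisfies `matches_expectation(plan, "both_disabled")` but not `matches_expectation(plan, "live_disabled_only")`.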
[ Regression Potential / Where problems could occur ]
Configuration Conflicts:
* Risk: Administrators might misconfigure parameters, leading to unexpected behavior
* Manifestation: Instances stopped when migration was intended, or vice versa
* Detection: Review action plans before execution; monitor Watcher logs for parameter validation
Stop Action Failures:
* Risk: New stop action might fail on instances with complex configurations (attached volumes, special networking, etc.)
* Manifestation: Action plan execution failures; instances left in inconsistent states
* Detection: Failed action plan execution; Nova API errors in Watcher applier logs
Otherwise, because the API is unchanged and the new parameters default
to False, the regression risk is low. All existing upstream CI jobs have
passed.
[ Discussion ]
N/A
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2122551/+subscriptions