[Bug 2088620] Re: [SRU] Deprecated usage of cpu_util

Bryan Fraschetti 2088620 at bugs.launchpad.net
Fri Apr 4 20:18:52 UTC 2025


Hi Robie, apologies for the delay. I've been waiting several weeks for
the customer to clarify the problem. Although they have not yet
confirmed the functional issue, here is what I believe the impact to be.
The cpu_util metric was deprecated in Rocky and fully removed in
Victoria. As a result, the Watcher metric instance_cpu_usage, which is
obtained from Ceilometer's cpu_util, is reported as "None" rather than
as a percentage. Various Watcher strategies rely on instance_cpu_usage
to determine action plans (e.g. the workload_balancing audit, which
recommends VM host migrations to balance the cluster's CPU usage).
Since instance_cpu_usage was reporting None, these strategies were
rendered ineffective.

This would be a better description of the impact, though as I said I'm
still verifying with the customer that this is what they are
experiencing. The purpose of this patch is to calculate in place what
cpu_util would report if it still existed. My verification was based on
this, but admittedly that was not clear in my SRU template. I thought
the metric's deprecation would be sufficient justification for an SRU,
but I now understand that the impact needs to describe the functional
issue(s). Once I hear back from the customer I'll rewrite the template.
Thanks!

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/2088620

Title:
  [SRU] Deprecated usage of cpu_util

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive antelope series:
  Fix Committed
Status in Ubuntu Cloud Archive bobcat series:
  Fix Committed
Status in Ubuntu Cloud Archive caracal series:
  Fix Released
Status in Ubuntu Cloud Archive dalmatian series:
  Fix Released
Status in Ubuntu Cloud Archive epoxy series:
  Fix Released
Status in Ubuntu Cloud Archive yoga series:
  New
Status in Ubuntu Cloud Archive zed series:
  Won't Fix
Status in watcher package in Ubuntu:
  Fix Released
Status in watcher source package in Focal:
  Confirmed
Status in watcher source package in Jammy:
  Confirmed
Status in watcher source package in Noble:
  Fix Released
Status in watcher source package in Oracular:
  Fix Released

Bug description:
  [ Impact ]

    * The watcher releases targeted by this SRU are using a deprecated
  ceilometer metric, cpu_util, which reported cpu utilization as a
  percentage. This metric was deprecated in OpenStack Rocky in favor of
  the Gnocchi rate calculation equivalent [1].

    * Upstream Watcher continued to use cpu_util until the commit at [2]
  landed on master for 2024.1. This commit correctly performs the cpu
  calculation and removes the deprecated metric. The calculation is
  summarized in the next bullet point.

    * The gnocchi calculation uses the cumulative cpu time in ns
  (reported by the cpu metric), taken as a rate (the difference between
  the two most recent cumulative samples) to find the total cpu time
  consumed during the previous sampling period. Dividing that cpu time
  by the product of the period duration (converted to ns) and the number
  of vcpus gives the cpu utilization as a percentage:
  cpu_usage = [cpu_time / (period * 10^9 * nvcpus)] * 100%. A sample
  calculation is provided in the original commit message.

    * I cherry-picked to stable/2023.2 [3], but the other branches have
  gone unmaintained.
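
  The calculation summarized above can be sketched in Python as follows.
  This is only an illustration of the formula, not the code from the
  upstream commit: the function name cpu_util_percent and the sample
  readings are made up, and the period is assumed to be in seconds.

```python
NS_PER_SECOND = 10**9


def cpu_util_percent(cpu_time_ns, period_s, vcpus):
    """Return cpu utilization (%) for one sampling period.

    cpu_time_ns: cpu time consumed during the period, in nanoseconds
    period_s:    length of the sampling period, in seconds (assumed unit)
    vcpus:       number of vcpus of the instance
    """
    if period_s <= 0 or vcpus <= 0:
        raise ValueError("period_s and vcpus must be positive")
    return 100.0 * cpu_time_ns / (period_s * NS_PER_SECOND * vcpus)


# Two made-up cumulative readings of the cpu metric, taken 60 s apart.
previous_ns = 120 * NS_PER_SECOND
current_ns = 150 * NS_PER_SECOND

# The rate is the cpu time consumed between the two readings.
delta_ns = current_ns - previous_ns
usage = cpu_util_percent(delta_ns, period_s=60, vcpus=2)
print(f"{usage:.1f}%")  # prints "25.0%"
```

  Here 30 s of cpu time over a 60 s period on 2 vcpus gives
  30 / (60 * 2) = 25%.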

  [ Test Plan ]

    * Deploy OpenStack Yoga on Jammy with watcher and gnocchi services

    * Launch a server and take note of its resource id. Then find the
  gnocchi cpu metric associated with the instance

    * Create a watcher audit based on a goal that previously depended on
  instance cpu utilization, for example the workload_balancing goal with
  the workload_balance strategy [4].
      Ex. openstack optimize audit create -t CONTINUOUS -i 60 -g workload_balancing -s workload_balance --auto-trigger
      Without the patch, instance_cpu_usage appears as None in the audits.
  With the patch, the correct cpu utilization percentage appears in
  watcher-decision-engine.log.

    * Wait for at least one sampling period to elapse and check
  /var/log/watcher/watcher-decision-engine.log for entries showing
  "instance_cpu_usage" - this is the cpu utilization as a percentage.

    * To verify the percentage with a manual calculation, run gnocchi
  measures show <metric uuid> --aggregation "rate:mean" and perform the
  calculation instance_cpu_usage = 100 * [<value> / (period * 10^9 *
  nvcpus)] using the cpu time from the corresponding sampling period.
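
  The manual check can also be scripted. The sketch below assumes the
  gnocchi command above was run with "-f json" and that each entry
  carries timestamp, granularity and value fields; it further assumes
  the granularity equals the sampling period. The sample data and the
  helper name usage_from_measures are made up for illustration.

```python
import json

# Made-up sample in the assumed JSON shape of the gnocchi output.
sample_output = """
[
  {"timestamp": "2025-01-01T00:01:00+00:00", "granularity": 60.0, "value": 30000000000.0},
  {"timestamp": "2025-01-01T00:02:00+00:00", "granularity": 60.0, "value": 90000000000.0}
]
"""


def usage_from_measures(measures_json, vcpus):
    """Return (timestamp, cpu utilization %) for each rate:mean measure."""
    results = []
    for measure in json.loads(measures_json):
        # Assumption: granularity (seconds) is the sampling period.
        period_ns = measure["granularity"] * 10**9
        percent = 100.0 * measure["value"] / (period_ns * vcpus)
        results.append((measure["timestamp"], percent))
    return results


for timestamp, percent in usage_from_measures(sample_output, vcpus=2):
    print(timestamp, f"{percent:.1f}%")
```

  Each printed percentage can then be compared with the
  instance_cpu_usage value logged for the same sampling period.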

  [ What can go wrong ]

    * While this is replacing a deprecated methodology and metric and
  should lead to improvements, any custom strategies relying on cpu_util
  may be affected.

  [1] https://docs.openstack.org/releasenotes/ceilometer/rocky.html
  [2] https://review.opendev.org/c/openstack/watcher/+/898791
  [3] https://review.opendev.org/c/openstack/watcher/+/934181
  [4] https://docs.openstack.org/watcher/rocky/strategies/workload_balance.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2088620/+subscriptions



