[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart
Jorge Niedbalski
1885430 at bugs.launchpad.net
Thu Mar 18 21:26:45 UTC 2021
---> Installed version
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# dpkg -l |grep -i ceilometer
ii ceilometer-agent-compute 1:12.1.1-0ubuntu1~cloud1 all ceilometer compute agent
ii ceilometer-common 1:12.1.1-0ubuntu1~cloud1 all ceilometer common files
ii python3-ceilometer 1:12.1.1-0ubuntu1~cloud1 all ceilometer python libraries
Run through 2 cases
1) Service restart
2) Reboot
---> Service restart case
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 21:20:01 UTC; 2min 35s ago
Main PID: 27650 (ceilometer-poll)
Tasks: 6 (limit: 4702)
CGroup: /system.slice/ceilometer-agent-compute.service
├─27650 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/cei
└─27735 ceilometer-polling: AgentManager worker(0)
Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.
Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:20:03 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[27650]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 18:46:56 UTC; 2h 35min ago
Main PID: 2199 (nova-compute)
Tasks: 22 (limit: 4702)
CGroup: /system.slice/nova-compute.service
└─2199 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log
Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
--
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl stop nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl disable nova-compute.service
Synchronizing state of nova-compute.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2021-03-18 21:23:30 UTC; 7s ago
Main PID: 2199 (code=exited, status=0/SUCCESS)
Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping OpenStack Compute...
Mar 18 21:23:30 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped OpenStack Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu#
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2021-03-18 21:23:24 UTC; 29s ago
Main PID: 761 (code=exited, status=0/SUCCESS)
Mar 18 21:23:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:23:14 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[761]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping Ceilometer Agent Compute...
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# /etc/init.d/ceilometer-agent-compute restart
[ ok ] Restarting ceilometer-agent-compute (via systemctl): ceilometer-agent-compute.service.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 21:24:13 UTC; 2s ago
Main PID: 1549 (ceilometer-poll)
Tasks: 6 (limit: 4702)
CGroup: /system.slice/ceilometer-agent-compute.service
├─1549 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
└─1604 ceilometer-polling: AgentManager worker(0)
Mar 18 21:24:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:24:14 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[1549]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT"
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 21:24:13 UTC; 6s ago
Main PID: 1548 (nova-compute)
Tasks: 22 (limit: 4702)
CGroup: /system.slice/nova-compute.service
└─1548 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log
Mar 18 21:24:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
---> Reboot testing
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl disable nova-compute.service
Synchronizing state of nova-compute.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl stop nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# reboot
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# uptime
21:25:44 up 0 min, 1 user, load average: 1.60, 0.38, 0.13
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 21:25:32 UTC; 13s ago
Main PID: 3099 (nova-compute)
Tasks: 22 (limit: 4702)
CGroup: /system.slice/nova-compute.service
└─3099 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log
Mar 18 21:25:32 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2021-03-18 21:25:32 UTC; 15s ago
Main PID: 3100 (ceilometer-poll)
Tasks: 6 (limit: 4702)
CGroup: /system.slice/ceilometer-agent-compute.service
├─3100 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
└─3229 ceilometer-polling: AgentManager worker(0)
Mar 18 21:25:32 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:25:35 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[3100]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT"
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu#
** Tags removed: verification-stein-needed
** Tags added: verification-stein-done
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceilometer in Ubuntu.
https://bugs.launchpad.net/bugs/1885430
Title:
[Bionic/Stein] Ceilometer-agent fails to collect metrics after restart
Status in OpenStack ceilometer-agent charm:
Confirmed
Status in Ubuntu Cloud Archive:
Fix Committed
Status in Ubuntu Cloud Archive stein series:
Fix Committed
Status in Ubuntu Cloud Archive train series:
Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
Fix Committed
Status in Ubuntu Cloud Archive victoria series:
Fix Committed
Status in ceilometer package in Ubuntu:
Fix Released
Status in ceilometer source package in Focal:
Fix Committed
Status in ceilometer source package in Groovy:
Fix Committed
Status in ceilometer source package in Hirsute:
Fix Released
Bug description:
Bionic/Stein - stable 20.05 charms
Juju 2.7.6
I am aware of: https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1850846
Decided to open a new bug since there was no activity on the previous one and it expired.
After rebooting my cloud (rack-by-rack), I got into a situation where
I could not collect memory.usage from VMs anymore.
Looking into: openstack metric resource --type instance <ID>
I could not see memory.usage there.
I accessed the ceilometer-agent unit and could see that the services were in active/running status, but the following log was present:
Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory
Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: message repeated 33 times: [ libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory]
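Because the service still reports active/running in this state, the failure only shows up in the journal. As a rough illustration, assuming log lines shaped like the excerpt above, a short Python sketch could count the libvirt socket errors, including the collapsed "message repeated N times" form. The function name `count_socket_errors` and both patterns are hypothetical helpers written for this example, not part of ceilometer or libvirt.

```python
import re

# Matches the libvirt connection failure seen in the journal excerpt and
# captures the socket path it complains about.
SOCKET_ERROR = re.compile(
    r"libvirt: XML-RPC error : Failed to connect socket to '([^']+)'"
)
# Duplicate messages are collapsed into "message repeated N times: [ ... ]".
REPEATED = re.compile(r"message repeated (\d+) times:")


def count_socket_errors(lines):
    """Return {socket_path: occurrences}, expanding 'message repeated' lines."""
    counts = {}
    for line in lines:
        match = SOCKET_ERROR.search(line)
        if not match:
            continue
        repeat = REPEATED.search(line)
        occurrences = int(repeat.group(1)) if repeat else 1
        path = match.group(1)
        counts[path] = counts.get(path, 0) + occurrences
    return counts
```

A non-empty result for a running agent would suggest it started before libvirt's socket was available.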
stat on that /var/run file shows me:
stat /var/run/libvirt/libvirt-sock-ro
File: /var/run/libvirt/libvirt-sock-ro
Size: 0 Blocks: 0 IO Block: 4096 socket
Device: 17h/23d Inode: 1289 Links: 1
Access: (0777/srwxrwxrwx) Uid: ( 0/ root) Gid: ( 118/ libvirt)
Access: 2020-06-28 14:28:47.292838669 +0000
Modify: 2020-06-27 22:34:11.010520529 +0000
Change: 2020-06-27 22:34:11.010520529 +0000
Birth: -
So, I guess there is a race condition here: libvirt opens the socket after ceilometer-agent-compute has already tried to connect to it, and the agent then gives up and stops working.
Restarting it restores memory.usage back to normal.
However, I still cannot see all the metrics as shown in:
https://bugzilla.redhat.com/show_bug.cgi?id=1437927
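The suspected race in the description (the agent trying libvirt's read-only socket before libvirt has created it) can be illustrated with a small Python sketch that polls a Unix socket until a connection succeeds. This is only a sketch under stated assumptions: it is not the fix that was committed for this bug, and the function name `wait_for_unix_socket` and its `timeout` and `interval` parameters are hypothetical.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=60.0, interval=1.0):
    """Return True once connecting to the Unix socket at `path` succeeds.

    Returns False if no connection succeeds before `timeout` seconds elapse.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.connect(path)
            return True
        except OSError:
            # Covers both "No such file or directory" (socket not created
            # yet) and "Connection refused" (nothing listening yet).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A wrapper could call `wait_for_unix_socket("/var/run/libvirt/libvirt-sock-ro")` before starting the polling agent, and retry or fail loudly when it returns False, rather than leaving the agent running without a libvirt connection.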
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions