[Bug 1607694] Re: Exclusive-Lock Issue
Daniel Kraft
daniel.kraft at d9t.de
Fri Aug 12 20:43:25 UTC 2016
Got it. This is the log output leading up to the dead kvm process.
2016-08-12 22:29:44.499285 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db400b020 C_NotifyAck start: id=234071422725522, handle=140096867143248
2016-08-12 22:29:44.518673 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db4003b60 image header updated
2016-08-12 22:29:44.518680 7f6dcffff700 20 librbd::ImageState: refresh_seq = 5, last_refresh = 4
2016-08-12 22:29:44.518689 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db400b020 C_NotifyAck finish: id=234071422725522, handle=140096867143248
2016-08-12 22:29:44.573373 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db400b020 C_NotifyAck start: id=234071422725523, handle=140096867143248
2016-08-12 22:29:44.573386 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db4003b60 image header updated
2016-08-12 22:29:44.573389 7f6dcffff700 20 librbd::ImageState: refresh_seq = 6, last_refresh = 4
2016-08-12 22:29:44.573393 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db400b020 C_NotifyAck finish: id=234071422725523, handle=140096867143248
2016-08-12 22:29:45.249895 7f6dcffff700 -1 librbd::ImageWatcher: 0x7f6db4003b60 image watch failed: 140096867143248, (107) Transport endpoint is not connected
2016-08-12 22:29:45.262607 7f6dcffff700 10 librbd::ImageWatcher: 0x7f6db4003b60 current lock owner: [0,0]
2016-08-12 22:29:45.319110 7f6dcd6f9700 10 librbd::ImageWatcher: 0x7f6db4003b60 re-registering image watch
2016-08-12 22:29:45.319145 7f6dcd6f9700 10 librbd::ExclusiveLock: 0x7f6db40029f0 release_lock
2016-08-12 22:29:45.319150 7f6dcd6f9700 10 librbd::ExclusiveLock: 0x7f6db40029f0 send_release_lock
2016-08-12 22:29:45.319251 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: send_cancel_op_requests
2016-08-12 22:29:45.319263 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: handle_cancel_op_requests: r=0
2016-08-12 22:29:45.319267 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: send_block_writes
2016-08-12 22:29:45.319271 7f6dcf7fe700 5 librbd::AioImageRequestWQ: block_writes: 0x55ee44904f50, num=1
2016-08-12 22:29:45.319285 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: handle_block_writes: r=0
2016-08-12 22:29:45.319352 7f6dcf7fe700 10 librbd::ExclusiveLock: 0x7f6db40029f0 handle_releasing_lock
2016-08-12 22:29:45.319361 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: send_flush_notifies
2016-08-12 22:29:45.319366 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: handle_flush_notifies
2016-08-12 22:29:45.319368 7f6dcf7fe700 10 librbd::exclusive_lock::ReleaseRequest: send_close_object_map
2016-08-12 22:29:45.319378 7f6dcf7fe700 10 librbd::object_map::UnlockRequest: 0x7f6dc407b8c0 send_unlock: oid=rbd_object_map.40c5612ae8944a
2016-08-12 22:29:45.320613 7f6dcffff700 10 librbd::object_map::UnlockRequest: 0x7f6dc407b8c0 handle_unlock: r=-108
2016-08-12 22:29:45.320626 7f6dcffff700 -1 librbd::object_map::UnlockRequest: failed to release object map lock: (108) Cannot send after transport endpoint shutdown
2016-08-12 22:29:45.320639 7f6dcffff700 10 librbd::exclusive_lock::ReleaseRequest: handle_close_object_map: r=0
2016-08-12 22:29:45.320647 7f6dcffff700 10 librbd::exclusive_lock::ReleaseRequest: send_unlock
2016-08-12 22:29:45.321490 7f6dcffff700 10 librbd::exclusive_lock::ReleaseRequest: handle_unlock: r=-108
2016-08-12 22:29:45.321498 7f6dcffff700 -1 librbd::exclusive_lock::ReleaseRequest: failed to unlock: (108) Cannot send after transport endpoint shutdown
2016-08-12 22:29:45.321551 7f6dcf7fe700 10 librbd::ExclusiveLock: 0x7f6db40029f0 handle_release_lock: r=0
2016-08-12 22:29:45.321565 7f6dcf7fe700 10 librbd::ImageWatcher: 0x7f6db4003b60 notify released lock
2016-08-12 22:29:45.321568 7f6dcf7fe700 10 librbd::ImageWatcher: 0x7f6db4003b60 current lock owner: [0,0]
2016-08-12 22:29:45.324256 7f6dcf7fe700 20 librbd::image_watcher::Notifier: notify: pending=1
2016-08-12 22:29:45.896206 7f6dcffff700 20 librbd::image_watcher::Notifier: handle_notify: pending=0
2016-08-12 22:29:45.896273 7f6dcd6f9700 -1 librbd::ImageWatcher: 0x7f6db4003b60 failed to re-register image watch: (108) Cannot send after transport endpoint shutdown
2016-08-12 22:29:53.107299 7f6e051adc00 20 librbd::AioImageRequestWQ: aio_write: ictx=0x55ee44904f50, completion=0x55ee45331a70, off=75430789120, len=8192, flags=0
2016-08-12 22:29:53.107326 7f6e051adc00 20 librbd::AioImageRequestWQ: queue: ictx=0x55ee44904f50, req=0x55ee4593c850
2016-08-12 22:29:53.107334 7f6e051adc00 20 librbd::ExclusiveLock: 0x7f6db40029f0 is_lock_owner=0
2016-08-12 22:29:53.107337 7f6e051adc00 10 librbd::ExclusiveLock: 0x7f6db40029f0 request_lock
2016-08-12 22:29:53.107340 7f6e051adc00 10 librbd::ExclusiveLock: 0x7f6db40029f0 send_acquire_lock
HTH
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1607694
Title:
Exclusive-Lock Issue
Status in ceph package in Ubuntu:
Confirmed
Bug description:
After a few days of running VMs over RBD under heavy load, the VMs
freeze and we get this log:
2016-07-26 16:20:47.069+0000: starting up libvirt version: 1.3.1, package: 1ubuntu10~cloud0 (Openstack Ubuntu Testing Bot <openstack-testing-bot at ubuntu.com> Fri, 15 Apr 2016 08:05:47 +0000), qemu version: 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.24), hostname: tckvmhv01.todoencloud
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name opennebula -S -machine pc-i440fx-trusty,accel=kvm,usb=off -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 7733f4c6-c72d-af36-2e84-cbc44c5c8da0 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-opennebula/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot order=c,menu=on,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 'file=rbd:sata/opennebula.img:id=sata:key=AQB69U1Xq+bxHhAAcw7ZWZlOjE0Bf88QMsTqMQ==:auth_supported=cephx\;none:mon_host=10.255.128.238\:6789,format=raw,if=none,id=drive-virtio-disk0' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/usr/src/systemrescuecd.iso,format=raw,if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:07:7f,bus=pci.0,addr=0x3 -netdev tap,fd=35,id=hostnet1,vhost=on,vhostfd=36 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:42:b6:74,bus=pci.0,addr=0x6 -netdev tap,fd=37,id=hostnet2,vhost=on,vhostfd=38 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:2b:bb:89,bus=pci.0,addr=0x7 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:2 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
char device redirected to /dev/pts/1 (label charserial0)
librbd/ExclusiveLock.cc: In function 'std::string librbd::ExclusiveLock<ImageCtxT>::encode_lock_cookie() const [with ImageCtxT = librbd::ImageCtx; std::string = std::basic_string<char>]' thread 7fded06a2a00 time 2016-07-28 23:44:09.845595
librbd/ExclusiveLock.cc: 197: FAILED assert(m_watch_handle != 0)
ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
1: (()+0x24c4bb) [0x7fdebcbde4bb]
2: (()+0x7cf54) [0x7fdebca0ef54]
3: (()+0x7dadb) [0x7fdebca0fadb]
4: (()+0x7e827) [0x7fdebca10827]
5: (()+0x65d46) [0x7fdebc9f7d46]
6: (()+0x66218) [0x7fdebc9f8218]
7: (rbd_aio_write()+0x29) [0x7fdebc9ec629]
8: (()+0xde816) [0x7fded0108816]
9: (()+0xde910) [0x7fded0108910]
10: (()+0xb69b0) [0x7fded00e09b0]
11: (()+0xbbc44) [0x7fded00e5c44]
12: (()+0xbbc44) [0x7fded00e5c44]
13: (()+0xbc8fc) [0x7fded00e68fc]
14: (()+0xfc36a) [0x7fded012636a]
15: (()+0x49800) [0x7fdeb84ac800]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-07-28 21:44:10.296+0000: shutting down
root at tckvmhv01:/var/log/libvirt/qemu# dpkg -l | grep ceph
ii ceph-common 10.2.0-0ubuntu0.16.04.1~cloud0 amd64 common utilities to mount and interact with a ceph storage cluster
ii libcephfs1 10.2.0-0ubuntu0.16.04.1~cloud0 amd64 Ceph distributed file system client library
ii python-ceph 10.2.0-0ubuntu0.16.04.1~cloud0 all Meta-package for python libraries for the Ceph libraries
ii python-cephfs 10.2.0-0ubuntu0.16.04.1~cloud0 amd64 Python libraries for the Ceph libcephfs library
root at tckvmhv01:/var/log/libvirt/qemu# dpkg -l | grep libvirt
ii libvirt-bin 1.3.1-1ubuntu10.1~cloud0 amd64 programs for the libvirt library
ii libvirt0:amd64 1.3.1-1ubuntu10.1~cloud0 amd64 library for interfacing with different virtualization systems
ii python-libvirt 1.3.1-1ubuntu1~cloud0 amd64 libvirt Python bindings
root at tckvmhv01:/var/log/libvirt/qemu# uname -a
Linux tckvmhv01 3.13.0-86-generic #131-Ubuntu SMP Thu May 12 23:33:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1607694/+subscriptions
More information about the Ubuntu-openstack-bugs mailing list