[Bug 1227327] [NEW] ceph osd repair fails with assert(missing.num_missing() == 0)
James Troup
james.troup at canonical.com
Wed Sep 18 19:20:57 UTC 2013
Public bug reported:
After an unfortunate incident with dhcpd going away, we lost 3/6 of
our ceph cluster and had to remotely power cycle them to get them
back. Now that everything is back up, the ceph cluster has mostly
recovered, but we had a couple of PGs stuck in an inconsistent state,
so I ran 'ceph osd repair' on one of the OSDs involved in the
inconsistent PGs. It ran for a while and fixed some things, and then
exploded with this:
2013-09-18 18:52:24.116439 7fdf4e2d9700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::recover_got(hobject_t, eversion_t)' thread 7fdf4e2d9700 time 2013-09-18 18:52:24.035055
osd/ReplicatedPG.cc: 5351: FAILED assert(missing.num_missing() == 0)
ceph version 0.48.3argonaut (commit:920f82e805efec2cae05b79c155c07df0f3ed5dd)
1: (ReplicatedPG::recover_got(hobject_t, eversion_t)+0x4d4) [0x7fdf60c29794]
2: (ReplicatedPG::submit_push_complete(ObjectRecoveryInfo&, ObjectStore::Transaction*)+0x490) [0x7fdf60c2c950]
3: (ReplicatedPG::handle_pull_response(std::tr1::shared_ptr<OpRequest>)+0x4c6) [0x7fdf60c4ac26]
4: (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0x96) [0x7fdf60c4ba66]
5: (ReplicatedPG::do_sub_op(std::tr1::shared_ptr<OpRequest>)+0x3f7) [0x7fdf60c4bf17]
6: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0xa7) [0x7fdf60d03a07]
7: (OSD::dequeue_op(PG*)+0x23a) [0x7fdf60cc156a]
8: (ThreadPool::worker()+0x4c4) [0x7fdf60e86dd4]
9: (ThreadPool::WorkThread::entry()+0xd) [0x7fdf60cdab2d]
10: (()+0x7e9a) [0x7fdf604aee9a]
11: (clone()+0x6d) [0x7fdf5e9baccd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This came along with 10K more lines of spew about what it was doing. This is
ceph 0.48.3-0ubuntu1~cloud0 from the Folsom pocket of the Ubuntu Cloud
Archive and the machine is running Ubuntu 12.04 LTS.
** Affects: ceph (Ubuntu)
     Importance: Undecided
         Status: New
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1227327