A cautionary tale - mgo asserts
Tim Penhey
tim.penhey at canonical.com
Wed Jun 8 09:05:29 UTC 2016
Hi folks,
tl;dr: not use structs in transaction asserts
I have spent the last two days looking at this bug:
https://bugs.launchpad.net/juju-core/+bug/1537585
It was one where the instance poller got itself in a knot, and couldn't
get out. Transaction asserts were failing, and often hard to reproduce.
Nate spent last week on it, and I pulled Menno in for his mgo expertise
yesterday and today because I was getting stumped.
Here is what was happening:
The machine document was being read, and we were wanting to update the
preferred public and private addresses. An assert was added to make sure
that the value was either what we expected it to be (from reading the
doc), or unset. Now, really, it wouldn't be unset, but that is a
different story.
The problem is that the assert was failing. Logging was added to check
what the db had, and what the asserts were transformed to with the multi
model layer. All the asserts matched. The data was there, everything was
good, except it wasn't.
This morning, I pulled the data out with a bson.D, and it became readily
apparent that the address structure ordering was different in the times
it failed. Now normally we are guaranteed ordering. When the struct is
marshalled to bson, it is as a bson.D with the order defined in by the
order of the members. However when using the struct in the assert, it
was matching different orders, so the assert failed.
Now, if mgo is not under load, the transaction can be performed
immediately, so the values stored use the bson.D.
However, when under load, the whole transaction may be stashed
temporarily in a collection, and then loaded for execution at a later
date. Reading the operations back out of the collection end up in a
bson.M, and it loses all ordering. The ordering doesn't matter when
unmarshalling back into a struct, but it sure does matter when asserting.
The solution is to not use a field struct equality, even though it is
easy to write, but to use the dotted field notation to check the
embedded field values.
Yet another thing to add to the list of things to check when doing reviews.
Tim
More information about the Juju-dev
mailing list