A cautionary tale - mgo asserts

Tim Penhey tim.penhey at canonical.com
Wed Jun 8 09:05:29 UTC 2016


Hi folks,

tl;dr: not use structs in transaction asserts

I have spent the last two days looking at this bug:
https://bugs.launchpad.net/juju-core/+bug/1537585

It was one where the instance poller got itself in a knot, and couldn't 
get out. Transaction asserts were failing, and often hard to reproduce.

Nate spent last week on it, and I pulled Menno in for his mgo expertise 
yesterday and today because I was getting stumped.

Here is what was happening:

The machine document was being read, and we were wanting to update the 
preferred public and private addresses. An assert was added to make sure 
that the value was either what we expected it to be (from reading the 
doc), or unset. Now, really, it wouldn't be unset, but that is a 
different story.

The problem is that the assert was failing. Logging was added to check 
what the db had, and what the asserts were transformed to with the multi 
model layer. All the asserts matched. The data was there, everything was 
good, except it wasn't.

This morning, I pulled the data out with a bson.D, and it became readily 
apparent that the address structure ordering was different in the times 
it failed. Now normally we are guaranteed ordering. When the struct is 
marshalled to bson, it is as a bson.D with the order defined in by the 
order of the members.  However when using the struct in the assert, it 
was matching different orders, so the assert failed.

Now, if mgo is not under load, the transaction can be performed 
immediately, so the values stored use the bson.D.

However, when under load, the whole transaction may be stashed 
temporarily in a collection, and then loaded for execution at a later 
date. Reading the operations back out of the collection end up in a 
bson.M, and it loses all ordering.  The ordering doesn't matter when 
unmarshalling back into a struct, but it sure does matter when asserting.

The solution is to not use a field struct equality, even though it is 
easy to write, but to use the dotted field notation to check the 
embedded field values.

Yet another thing to add to the list of things to check when doing reviews.

Tim



More information about the Juju-dev mailing list