Wish: configurable write concern for juju when connecting to mongod

Mario Splivalo mario.splivalo at canonical.com
Mon Jan 11 16:14:04 UTC 2016


Hello.

I have a customer who ended up with a broken juju database; the issue
and its possible cause are explained here: http://pad.lv/1528261

I don't have enough data to verify exactly what happened, but the
sequence was most likely something like this:

1. jujud wrote to the PRIMARY with a write concern of 'majority'.
2. Because the SECONDARYs were lagging (usually the customers run them
inside VMs), mongod returned 'all good, carry on' to jujud as soon as
the changes had replicated to just ONE of the SECONDARYs; on a
three-node replicaset, 'majority' means two nodes, i.e. the PRIMARY
plus one SECONDARY.
3. The PRIMARY lost connectivity with the rest of the replicaset.
4. The SECONDARYs voted for a new PRIMARY, and the SECONDARY that
hadn't had all the data replicated to it was chosen.
5. The former PRIMARY rejoined the replicaset and rolled back all of
its unreplicated changes.
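For context, this is roughly how that write concern gets set with the
mgo driver juju uses. A minimal sketch only, not juju's actual code,
and the addresses are made up:

    package main

    import (
            "log"

            "gopkg.in/mgo.v2"
    )

    func main() {
            // Hypothetical replicaset addresses.
            session, err := mgo.Dial("state0:37017,state1:37017,state2:37017")
            if err != nil {
                    log.Fatal(err)
            }
            defer session.Close()

            // 'majority' on a three-node replicaset is satisfied by two
            // nodes, so mongod acknowledges the write while one SECONDARY
            // may still be lagging behind.
            session.SetSafe(&mgo.Safe{WMode: "majority"})
    }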

And now we have a situation where juju thinks the data has been
written to the database when in fact it doesn't exist.

Now, if we could tell juju to use a write concern of 3, the situation
above couldn't happen, as we would always be sure that every change is
written to all of the servers (assuming, of course, a three-node
replicaset).
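Something like this, say (the helper name is mine, and it reuses the
session from the sketch above; juju exposes no such knob today):

    // setFullWriteConcern is a hypothetical helper. W: 3 means every
    // write must be confirmed by all three replicaset members before it
    // is acknowledged, so an acknowledged write can never be rolled
    // back. WTimeout (in ms) keeps writes from hanging forever when a
    // member is unreachable.
    func setFullWriteConcern(session *mgo.Session) {
            session.SetSafe(&mgo.Safe{W: 3, WTimeout: 10000})
    }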
In the event of one server going down, writes to mongodb would stop:
there would be only two servers left to write to while mongo is being
asked to confirm writes on three. But we would be safe data-wise, and
no data would be lost. With the option to reconfigure jujud to use a
write concern of 2, we could re-enable writes to mongod, at least
until the broken SECONDARY is brought back to life.
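Again a sketch only; the degraded mode would just lower W on the same
session:

    // setDegradedWriteConcern is equally hypothetical. With one
    // SECONDARY down, W: 2 matches the two surviving members, so writes
    // can proceed again, and a write still can never be acknowledged by
    // the PRIMARY alone.
    func setDegradedWriteConcern(session *mgo.Session) {
            session.SetSafe(&mgo.Safe{W: 2, WTimeout: 10000})
    }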

Does this make any sense?

I'm asking because in real-life deployments the situation described
above is not so uncommon. Especially when the jujud state servers (and
their mongodb nodes) run inside VMs, there is a real possibility that
the SECONDARYs will lag behind the PRIMARY.

	Mario


