Coordinating actions in a service

Stuart Bishop stuart.bishop at canonical.com
Fri May 8 12:17:31 UTC 2015


Hi.

I have several potentially long-running and expensive database
operations I'd like to wrap as actions, which will generally be run
either on just one unit or on all units of the service.

The problem I have is running an action on all units of the service.
For HA, I need to ensure that, if there is more than one unit, the
action may only run on (num_units/2)-1 units at a time, i.e. if I fire
off an action on all units of a 5-unit cluster, then only two units at
a time may run the action and the other units will block until they
are done.
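
To make the arithmetic above concrete, here is a minimal sketch of the
concurrency limit. It rests on an assumption that is not stated in the
text: num_units/2 is rounded up, since that is the reading that gives
two units for a 5-unit cluster (rounding down would give one). It also
applies the max(1, ...) floor so that a small cluster can still run the
action at all. The function name max_concurrent is hypothetical.

```python
def max_concurrent(num_units: int) -> int:
    """Upper bound on units allowed to run the action at once.

    Assumption: num_units/2 is rounded up, which matches the
    5-unit example (two at a time). The result is floored at 1 so
    that 1- and 2-unit services are not blocked entirely.
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    half_rounded_up = (num_units + 1) // 2
    return max(1, half_rounded_up - 1)


if __name__ == "__main__":
    # Print the limit for a few cluster sizes.
    for n in (1, 2, 3, 5, 7):
        print(n, max_concurrent(n))
```

Under this reading a majority of units always stays out of the action
once the service has three or more units.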

Leadership is needed to do coordination like this, but I can't see how
to use it with actions. The action has no way to request permission
from the leader and no way to get a response.

Can anyone tell me how to do this with the current model?

I don't think it can be done, which I guess makes this a feature
request for Actions 2.0. Or perhaps this is another use case for a
general locking service providing semaphores (which would need some
tricky semantics, since I'd need a semaphore that can be acquired by
max(1,(num_units/2)-1) units at a time and num_units might change
while waiting for the lock and before releasing it).
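
The tricky semantics can be illustrated with a small in-process
sketch. This is not a Juju API and not a distributed lock; it only
shows one possible behaviour for a semaphore whose capacity is
recomputed from num_units on every acquire. The class and method names
are hypothetical, and the limit assumes num_units/2 is rounded up (the
reading that gives two holders for five units). One design choice made
here: if the capacity shrinks below the number of current holders,
nobody is preempted; new acquirers simply wait until enough holders
have released.

```python
import threading


class ResizableSemaphore:
    """In-process illustration only; not a Juju API."""

    def __init__(self, num_units):
        self._cond = threading.Condition()
        self._num_units = num_units
        self._holders = set()

    def _limit(self):
        # Assumed rounding: num_units/2 rounded up, floored at 1.
        return max(1, (self._num_units + 1) // 2 - 1)

    def set_num_units(self, num_units):
        """Record a change in cluster size and wake any waiters."""
        with self._cond:
            self._num_units = num_units
            self._cond.notify_all()

    def acquire(self, unit, timeout=None):
        """Block until a slot is free; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: len(self._holders) < self._limit(), timeout)
            if ok:
                self._holders.add(unit)
            return ok

    def release(self, unit):
        """Give up the slot held by this unit, if any."""
        with self._cond:
            self._holders.discard(unit)
            self._cond.notify_all()
```

For example, with five units two holders are admitted and a third
waits. If the service then shrinks to three units, the limit drops to
one, so the third unit keeps waiting until both earlier holders have
released.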

Or is this just out of scope, and the operator needs to do this sort
of coordination themselves? There is plenty of other coordination that
can't be embedded in the charm as it is (e.g. don't drop an old unit
if it is still being used to bootstrap a new unit into the cluster).

-- 
Stuart Bishop <stuart.bishop at canonical.com>


