relation departure timing changes

Gustavo Niemeyer gustavo at niemeyer.net
Fri Aug 23 10:56:34 UTC 2013


On Fri, Aug 23, 2013 at 7:37 AM, William Reade
<william.reade at canonical.com> wrote:
> Currently, we *guarantee* that destroying server_service/0 will cause its
> relationship with client_service units to be torn down before those units
> become aware. At this stage, I'm narrowly proposing that we move the
> synchronization point such that it's *possible* (not guaranteed) for units
> of client_service to respond to server_service/0's destruction before it
> materially affects them.

I agree it's suboptimal, but unless we turn the alternative scenario
into a pretty strong likelihood rather than mere chance, there would
be little point in investing much time in this. Saying "oh, perhaps it
will run if you're having a good day" isn't any better in terms of API
than "you cannot depend on this". It would make people spend time
trying to follow the pattern, and then wonder why it doesn't work.

> It's important to realise that the *guarantees* made by the system do not in
> fact become any stronger under the proposed model. If a unit of
> client_service is (say) running a slow config-changed hook when (2a) comes
> to pass, server_service/0 will *not* wait for that unit to handle depart
> before cutting off access. It *would* in fact be possible to do this; but
> the tradeoff in play there is whether we want an unresponsive or missing
> unit of client_service to be capable of blocking the shutdown of
> server_service/0. I'm +1 in theory, but nervous in practice; without
> implementing `destroy-unit --force`, which is not entirely trivial (largely
> but not entirely because it's blocked on "" [0]), that change could lead to
> deadlocked environments. If you'd like the system to make this guarantee as
> well, please let me know: I can't promise anything about scheduling
> decisions there, but it will be useful input into those decisions ;).

If we're investing in this, it definitely sounds like we should at
least have a clear and well-defined behavior for when it will or
will not run, and the cases where it will not run should be mapped to
understandable events; otherwise we're not improving the situation.


gustavo @ http://niemeyer.net


