juju is slow to do anything

John Arbash Meinel john at arbash-meinel.com
Fri Aug 30 18:15:30 UTC 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2013-08-30 14:28, Peter Waller wrote:
> For the record, I sent the link privately. The run took about 22s 
> but I have measured 30s to 1m.

Some thoughts, nothing that I can give absolute confirmation on.

1) Next week we have a group sprinting on moving a lot of the command
line operations from being evaluated by the client into being
evaluated by the API server (running in the cloud) and then returned.
The explicit benefits that I've seen in other commands are pretty good.

'juju status' should see a rather large improvement, because it makes
round-trip queries for a lot of things (what machines are there, what
are the details of each machine, what are the units running on each
one, etc.).

I've prototyped doing those queries in parallel, or trying to do bulk
ops, which actually helped a lot in testing (this was for hundreds of
units/machines).
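
To illustrate why issuing the queries in parallel helps, here is a
minimal sketch. It is in Python and is not the actual prototype:
fetch_machine, ROUND_TRIP and the returned fields are hypothetical
stand-ins for one client-to-cloud query, and the latency is an assumed
value.

```python
import time
from concurrent.futures import ThreadPoolExecutor

ROUND_TRIP = 0.05  # assumed latency per query in seconds (illustrative only)


def fetch_machine(machine_id):
    """Hypothetical stand-in for one client-to-cloud round trip."""
    time.sleep(ROUND_TRIP)
    return {"id": machine_id, "state": "started"}


def status_sequential(machine_ids):
    # One round trip after another: total time grows with the machine count.
    return [fetch_machine(m) for m in machine_ids]


def status_parallel(machine_ids, workers=8):
    # Issue the queries concurrently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_machine, machine_ids))


def timed(fn, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(16))
    _, seq = timed(status_sequential, ids)
    _, par = timed(status_parallel, ids)
    print(f"sequential: {seq:.2f}s  parallel: {par:.2f}s")
```

With 16 queries and 8 workers, the parallel version waits for roughly
two round trips instead of sixteen. A bulk operation would go further
and fold all of them into a single request.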

Doing it on the API server means any round trips are "local" rather
than from your machine out to Amazon.

2) From here, 'time juju status' with a single instance running on ec2
takes 10s, which breaks down into roughly 4s to look up the IP address,
2s to establish the state connection, and 4s to "finish up". (These
timings have 1s granularity.)

Similarly, "time juju-1.13.2 get not-service" takes 8.5s to run: 4s to
look up the address, 2s to connect, and 3s to give the final 'not
found' result.

With trunk, "time ./juju get not-service" takes 4.6s: 2s to look up the
IP address, 2s to connect, and the not-found result is instantaneous.

So I would expect the 10s of a generic "juju status" to easily drop
below 5s, regardless of any round-trip issues.
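
As a rough cross-check, the per-phase numbers quoted above can be
tabulated. This only sums and compares the figures already given (at 1s
granularity, so the totals differ slightly from the measured 8.5s and
4.6s). The dictionary keys and function names are mine, for
illustration.

```python
# Phase timings in seconds, as quoted above (1s granularity).
runs = {
    "juju status": {"lookup": 4, "connect": 2, "finish": 4},
    "juju-1.13.2 get not-service": {"lookup": 4, "connect": 2, "finish": 3},
    "trunk get not-service": {"lookup": 2, "connect": 2, "finish": 0},
}


def total(phases):
    # Sum of the per-phase timings for one run.
    return sum(phases.values())


def phase_savings(old, new):
    # Seconds saved in each phase when going from `old` to `new`.
    return {name: old[name] - new[name] for name in old}


if __name__ == "__main__":
    for name, phases in runs.items():
        print(f"{name}: {total(phases)}s")
    print(phase_savings(runs["juju-1.13.2 get not-service"],
                        runs["trunk get not-service"]))
```

For "get", the savings come from the address lookup and the final
phase, and the connect time is unchanged.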

3) We are also looking to cache the IP address of the API server, to
shave off another ~2-4s for the common case that the address hasn't
changed. (We'll fall back to our current discovery mechanism.)
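
The cache-then-fall-back idea could look something like the sketch
below. It is in Python and is not the planned implementation: the cache
file format, the function names, and the connect/discover callables are
all assumptions made for illustration.

```python
import json
import os


def load_cached_address(cache_path):
    # Return the cached address, or None if the cache is missing or unreadable.
    try:
        with open(cache_path) as f:
            data = json.load(f)
    except (OSError, ValueError):
        return None
    return data.get("address") if isinstance(data, dict) else None


def save_cached_address(cache_path, address):
    with open(cache_path, "w") as f:
        json.dump({"address": address}, f)


def connect_with_cache(cache_path, connect, discover):
    """Try the cached address first; fall back to the slow discovery.

    connect(address) is assumed to return a connection or raise OSError;
    discover() is assumed to be the existing (slow) address lookup.
    """
    address = load_cached_address(cache_path)
    if address is not None:
        try:
            return connect(address)
        except OSError:
            pass  # cached address is stale; fall through to discovery
    address = discover()
    conn = connect(address)
    save_cached_address(cache_path, address)  # remember it for next time
    return conn
```

In the common case only connect() runs, so the lookup cost is paid
again only when the cached address stops working.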

4) The timing of your screencast seems odd. It reports 22s, which
matches the times reported in the debug output, but the total length of
the video is 20s including typing. Is it just playing back at 2:1
speed?

You can see from the debug output that it takes 7s to look up the
address to connect to, and then about 1s to connect. The rest is time
spent gathering the information.

I expect it to get a whole lot faster in a couple more weeks, but I'm
not going to guarantee that until we've finished the work.

5) If I counted correctly, you have about 23 "machines" being
considered, a bunch of them down/pending/errored.

For the errored ones, I would think you could run "juju
destroy-machine" on them. It might make things better (less time spent
checking on machines you don't care about).

What happens when you try it? (There may be other issues that make us
think we are still waiting for something to happen with a machine, so
we don't want to destroy it.)


Anyway, in summary, this should be getting better, but I won't have
explicit numbers until the work is done.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (Cygwin)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlIg4UIACgkQJdeBCYSNAAOfxQCeMaRQdqvdyQ11WyRnJ/WPAccp
IysAniDrUq6IDtM0fu9SuZg+2AQto8rP
=JaZw
-----END PGP SIGNATURE-----


