Notes from Scale testing

Nate Finch nate.finch at canonical.com
Wed Oct 30 14:11:45 UTC 2013


On Wed, Oct 30, 2013 at 9:23 AM, John Arbash Meinel
<john at arbash-meinel.com> wrote:

> 2) Agents seem to consume about 17MB resident according to 'top'. That
> should mean we can run ~450 agents on an m1.large. Though in my
> testing I was running ~450 and still had free memory, so I'm guessing
> there might be some copy-on-write pages (17MB is very close to the
> size of the jujud binary).
>

17MB seems just fine for an agent. I don't think it's worth worrying much
about that size, since it's fairly static and you generally aren't going to
run 450 copies on the same machine :)


> 3) On the API server, with 5k active connections resident memory was
> 2.2G for jujud (about 400kB/conn), and only about 55MB for mongodb. DB
> size on disk was about 650MB.
>

400kB per connection seems atrocious.  Goroutines take about 4k on their
own.  I have a feeling we're keeping copies of a lot of stuff in memory per
connection that doesn't really need to be copied for each connection.  It
would be good to get some profiling on that, to see if we can get it down
to something like 1/10th that size, which would be more along the lines of
what I'd expect per connection.


> The log file could grow pretty big (up to 2.5GB once everything was up
> and running though it does compress to 200MB), but I'll come back to
> that later.
>

Interesting question: are our log calls asynchronous, or do we wait for them
to be written to disk before continuing?  I wonder if that might cause some
slowdowns.


> 4) If I bring up the units one by one (for i in `seq 500`; do for j in
> `seq 10`; do juju add-unit --to $j & done; time wait; done), it ends up
> triggering O(N^2) behavior in the system. Each unit agent seems to
> have a watcher for other units of the same service. So when you add 1
> unit, it wakes up all existing units to let them know about it.


I tried to talk about this in the hangout this morning, but I'm not sure if
I got my point across.  I don't know that this really qualifies as O(N^2),
given that no single machine sends or receives more than N messages.  The
network as a whole takes an N^2 hit, but it's really only O(N) per unit
agent.  It might be N^2 for the state server if each agent pings the state
server when it receives the unit-added message... but it seems unlikely
that we'd do that (and if we do, we should fix that).



> 8) We do end up CPU throttled fairly often (especially if we don't set
> GOMAXPROCS). It is probably worth spending some time profiling what
> jujud is doing. I have the feeling all of those calls to CharmURL are
> triggering DB reads from Mongo, which is a bit inefficient.
>
> I would be fine doing max(1, NumCPUs()-1) or something similar. I'd
> rather do it inside jujud rather than in the cloud-init script,
> because computing NumCPUs is easier there. But we should have *a* way
> to scale up the central node that isn't just scaling out to more API
> servers.
>

It seems as though GOMAXPROCS = NumCPUs is probably better; just let the OS
handle scheduling.



>
> 9) We also do seem to hit MongoDB limits. I ended up at 100% CPU for
> mongod, and I certainly was never above 100%. I didn't see any way to
> configure mongo to use more CPU. I wonder if it is limited to 1 CPU
> per connection, or if it is just always 1 CPU.
>

http://stackoverflow.com/questions/9773606/is-mongodb-somehow-limited-to-a-single-core



> I certainly think we need a way to scale Mongo as well. If it is just
> 1 CPU per connection then scaling horizontally with API servers should
> get us around that limit.
>