millions of warnings per day, state DB grows by 4GB/day
John Meinel
john at arbash-meinel.com
Wed Aug 26 15:55:13 UTC 2015
Juju 1.23 has some known issues with Lease operations. All of those have
been fixed in the 1.24.4 (5?) release. It is good practice to take a backup
before upgrading, and if you want we can try to help sort out the specific
growth to make sure it gets cleaned out after the upgrade. (Even just
getting mongo stats on the collections as a whole is likely to help us
understand where the size is coming from.)
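For reference, a minimal sketch of the kind of stats I mean (it assumes the
usual setup where Juju's MongoDB listens on port 37017 with SSL and that you
authenticate with the credentials from the state machine's agent.conf; all of
that is deployment-specific, so adjust as needed):

    mongo --port 37017 --ssl juju --eval '
      printjson(db.stats());
      db.getCollectionNames().forEach(function (c) {
        var s = db.getCollection(c).stats();
        print(c + "\t" + s.count + " docs\t" + s.size + " bytes");
      });'

Per-collection document counts and sizes like that are usually enough to tell
which collection is responsible for the growth.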
If it is what I think it is, then I would expect the txns collection to be
growing rapidly. IIRC there is already code in the 1.24 series to prune old
transactions after they've been successfully applied. But that *might* be a
1.25 thing. I do believe we have a program that can do the work directly if
it isn't integrated already.
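To get a feel for whether that is the culprit, a quick check (again only a
sketch, with the same connection caveats as above) is to count how many
transaction documents are sitting in the mgo/txn collections and how many of
them are already in the applied state -- state code 6, if I remember the
constants right -- which is what pruning would remove:

    mongo --port 37017 --ssl juju --eval '
      print("txns total:    " + db.txns.count());
      print("txns applied:  " + db.txns.count({s: 6}));
      print("txns.log docs: " + db.txns.log.count());'

If the vast majority of the txns documents are already applied, pruning after
the upgrade should let most of that data be cleaned out.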
As an aside, how could we have messaged you better about upgrading away from
1.23? We never made it an official release in Trusty, so you must have been
pointed at our PPA at some point. That means there was some sort of
communication between Canonical and your group, but I don't think our team
was aware that you were running 1.23.
John
=:->
On Aug 26, 2015 10:05 AM, "Peter Grandi" <pg at juju.list.sabi.co.uk> wrote:
> Looking at a MAAS+Juju collection of 12 hosts, with 3 Juju
> "state" nodes. The collection runs OpenStack and Ceph,
> apparently still working. It was mostly installed in May, 3
> months ago, on Ubuntu 14.04 LTS with these package versions:
>
> ii  juju-core          1.23.2-0ubuntu  amd64  Juju is devops distilled - client
> ii  juju-deployer      0.4.3-0ubuntu1  all    Deploy complex stacks of services using J
> ii  juju-local         1.23.2-0ubuntu  all    dependency package for the Juju local pro
> ii  juju-mongodb       2.4.9-0ubuntu3  amd64  MongoDB object/document-oriented database
> ii  juju-quickstart    2.0.1+bzr124+p  all    Easy configuration of Juju environments
> ii  python-jujuclient  0.17.5-0ubuntu  all    Python API client for juju
>
> The 3 "state" nodes are 'node01' (machine 0), 'node02' (machine
> 9), 'node09' (machine 10). There are some worrying symptoms:
>
> * The MongoDB database size is 204GiB on 'node01', 199GiB on
> 'node02', 208GiB on 'node09' (roughly the same, of course), and it
> grows by around 4GiB per day. That is, the number of 'juju.NNN'
> data files grows constantly and is currently around 800.
>
> * Probably correlated with this, there are several MB/s of
> traffic among the "state" nodes on the Juju port, currently
> mostly from 'node02' to 'node01' and 'node09'.
>
> * 'jujud' consumes anywhere from 50% of a (rather speedy, recent)
> Xeon CPU to 2-3 CPUs, except on 'node09'.
>
> * 'mongod' consumes 1-3 CPUs on 'node01' and 'node02'.
>
> * The 'machine-0.log*', 'machine-9.log*' and 'machine-10.log*'
> files are large, and in particular there are often millions of
> lines per day of this warning:
>
> juju.lease lease.go:301 A notification timed out after 1m0s
>
> Sometimes it appears at a rate of thousands per second.
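>
> (For a rough per-day count of those, something along these lines works,
> assuming the stock /var/log/juju layout; adjust the path and the message
> text as needed:
>
>   grep 'A notification timed out' /var/log/juju/machine-0.log |
>     awk '{ print $1 }' | sort | uniq -c
>
> since the date is the first field of each log line.)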
>
> As to the logs I have prepared these statistical summaries:
>
> Number of notification time outs per day, worst days, per node:
> http://paste.ubuntu.com/12199755/ 'node01'
> http://paste.ubuntu.com/12199756/ 'node02'
> http://paste.ubuntu.com/12199757/ 'node09'
>
> Most popular non-timeout warnings and errors, by worst day,
> per node:
>
> http://paste.ubuntu.com/12199759/ 'node01'
> http://paste.ubuntu.com/12199760/ 'node02'
> http://paste.ubuntu.com/12199761/ 'node09'
>
> Apparently there have been a couple of operational mishaps, but
> the dates on which they are reported to have happened are not
> quite those on which I see the most logged errors, or later ones.
> Some colleagues think that some endpoints are "dangling".
>
> Please let me know what to look at and ideally where there is
> some internals documentation, as I am not at all familiar with
> the internals of the Juju state system.
>
>