[Bug 1078217] [NEW] zookeeper connection is not using exponential backoff
Zygmunt Krynicki
zygmunt.krynicki at canonical.com
Tue Nov 13 09:30:18 UTC 2012
Public bug reported:
My juju cluster had some connection issues to zookeeper. While I was
reading the charm.log of my jenkins-slave unit I noticed that juju had
logged many thousands of exceptions such as this one:
2012-11-09 06:51:07,514: twisted@ERROR: Traceback (most recent call last):
2012-11-09 06:51:07,514: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/txzookeeper/managed.py", line 319, in _cb_created
2012-11-09 06:51:07,514: twisted@ERROR: if self._check_result(result_code, d):
2012-11-09 06:51:07,514: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/txzookeeper/client.py", line 219, in _check_result
2012-11-09 06:51:07,514: twisted@ERROR: self, error)
2012-11-09 06:51:07,515: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 134, in maybeDeferred
2012-11-09 06:51:07,515: twisted@ERROR: result = f(*args, **kw)
2012-11-09 06:51:07,515: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1181, in unwindGenerator
2012-11-09 06:51:07,515: twisted@ERROR: return _inlineCallbacks(None, gen, Deferred())
2012-11-09 06:51:07,516: twisted@ERROR: --- <exception caught here> ---
2012-11-09 06:51:07,516: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1039, in _inlineCallbacks
2012-11-09 06:51:07,516: twisted@ERROR: result = g.send(result)
2012-11-09 06:51:07,516: twisted@ERROR: File "/usr/lib/python2.7/dist-packages/txzookeeper/managed.py", line 257, in _cb_connection_error
2012-11-09 06:51:07,517: twisted@ERROR: raise error
2012-11-09 06:51:07,517: twisted@ERROR: zookeeper.ConnectionLossException: connection loss
I can see about 300 such exceptions _every second_. This is very bad on two levels:
1) It quickly fills the log with pointless exceptions, consuming disk space and saturating slow virtual I/O.
2) It goes against the proven networking practice of using exponential backoff when retrying failed communication.
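For illustration only, here is a minimal sketch of the kind of retry schedule meant by "exponential backoff". The function name `backoff_delays` and its parameters (`base`, `factor`, `cap`, `jitter`) are hypothetical and are not part of juju or txzookeeper; the idea is just that each failed reconnect waits longer than the previous one, up to a ceiling, instead of retrying hundreds of times per second.

```python
import random
from itertools import islice


def backoff_delays(base=0.5, factor=2.0, cap=60.0, jitter=0.0, rng=random):
    """Yield retry delays in seconds: base, base*factor, ..., never above cap.

    All names and defaults here are illustrative assumptions.
    """
    delay = base
    while True:
        if jitter:
            # Spread retries so many clients do not reconnect in lockstep.
            spread = delay * jitter
            yield delay + rng.uniform(-spread, spread)
        else:
            yield delay
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)


# First eight delays with the default settings.
first_delays = list(islice(backoff_delays(), 8))
print(first_delays)  # [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```

With a schedule like this, a client that has lost its connection would make a handful of attempts in the first minute and then settle at one attempt per minute, and it would log at most one traceback per attempt.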
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: juju 0.5+bzr531-0ubuntu1.3
ProcVersionSignature: Ubuntu 3.2.0-32.51-virtual 3.2.30
Uname: Linux 3.2.0-32-virtual x86_64
ApportVersion: 2.0.1-0ubuntu14
Architecture: amd64
Date: Tue Nov 13 09:26:31 2012
Ec2AMI: ami-000000bf
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
PackageArchitecture: all
ProcEnviron:
 TERM=xterm-256color
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: juju
UpgradeStatus: No upgrade log present (probably fresh install)
** Affects: juju (Ubuntu)
     Importance: Undecided
         Status: New
** Tags: amd64 apport-bug ec2-images precise
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to juju in Ubuntu.
https://bugs.launchpad.net/bugs/1078217
Title:
zookeeper connection is not using exponential backoff