add-unit relation question

Stuart Bishop stuart.bishop at canonical.com
Mon Jun 3 18:56:08 UTC 2013


On Tue, Jun 4, 2013 at 12:35 AM, Andreas Hasenack <andreas at canonical.com> wrote:
> On Mon, Jun 3, 2013 at 2:25 PM, David Britton <david.britton at canonical.com>
> wrote:
>>
>> Stuart --
>>
>> This may be true, but for my particular case, I'm only dealing with one
>> postgresql unit.  What is breaking is the addition of a second *client* to
>> one postgresql server on the admin relation.  I mean, it's racy, I actually
>> did test successfully, but most of the time it fails.  Simply retrying the
>> relation works since by that time the pg_hba.conf file certainly has had a
>> chance to be updated already.
>>
>
> Here are some logs:
>
> This is from landscape/2, the unit that was added:
> 2013/05/29 19:39:41 DEBUG worker/uniter/jujuc: hook context id
> "landscape/2:db-admin-relation-joined:2896530511008022547"; dir
> "/var/lib/juju/agents/unit-landscape-2/charm"
> 2013/05/29 19:39:41 INFO worker/uniter: HOOK Error connecting to database as
> db_admin_4_landscape_admin
> 2013/05/29 19:39:41 ERROR worker/uniter: hook failed: exit status 1

Is this the db-admin-relation-joined hook? You will never be able to
connect to the database there, as the credentials have not yet been
set up. Clients can ignore the -joined hook entirely and only implement
the -changed hook.

My guess is you are running the same code for db-admin-relation-joined
and db-admin-relation-changed. It works the first time because your
code is smart enough to back off if the credentials are not available,
and the first unit happily connects when the -changed hook is invoked.
The second unit, however, is less lucky: it sees incorrect details on
the relation in the -joined hook and fails, so the -changed hook, which
is the first point at which it would see the (hopefully) correct
connection details, never gets invoked.
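
As a rough illustration of the client side, here is a minimal sketch of
a -changed-only hook that backs off cleanly when the server has not
published everything yet. The function names, the REQUIRED_KEYS tuple
and the idea of dumping the relation with "relation-get --format=json -"
are assumptions made for this example, not code from the landscape
charm.

```python
import json
import subprocess

# Settings the client needs before it can attempt a connection.
# (Assumed set, taken from the keys the server snippet publishes.)
REQUIRED_KEYS = ("host", "port", "user", "password")


def connection_settings(relation_data):
    """Return the connection settings, or None if any are still missing."""
    if not all(relation_data.get(key) for key in REQUIRED_KEYS):
        return None
    return {key: relation_data[key] for key in REQUIRED_KEYS}


def db_admin_relation_changed():
    """Body of a hypothetical db-admin-relation-changed hook."""
    # Assumption: relation-get dumps every remote setting as JSON.
    raw = subprocess.check_output(["relation-get", "--format=json", "-"])
    settings = connection_settings(json.loads(raw) or {})
    if settings is None:
        # Not an error: exit 0 and wait for the next -changed invocation.
        print("Connection details not published yet; waiting.")
        return 0
    # A real charm would connect to the database with `settings` here.
    return 0
```

The point of returning 0 when details are missing is that the hook is
not marked as failed, so a later -changed invocation still gets a
chance to run once the server has published its settings.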

Here is the relevant code in the PostgreSQL charm, invoked by both the
-joined and -changed hooks:

def db_admin_relation_joined_changed(user, database='all'):
    if not user_exists(user):
        password = create_user(user, admin=True)
        run("relation-set user='%s' password='%s'" % (user, password))
    host = get_unit_host()
    config_data = config_get()
    run("relation-set host='%s' port='%s'" % (
        host, config_data["listen_port"]))
    generate_postgresql_hba(postgresql_hba)

The 'if not user_exists(user)' block is where we only publish the
username and password if the user did not already exist. This is the
bug I cited - users will often already exist because they are in the
cluster scope, not the database scope.
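
To make the failure mode concrete, here is a small in-memory model of
the snippet above. Everything in it is a stand-in invented for
illustration: existing_users replaces the real cluster, the host and
port are fixed dummy values, and publish_always assumes the charm has
some way to recover (or reset) the password of an existing user, which
the real charm may not.

```python
# In-memory stand-in for the users known to the PostgreSQL cluster.
existing_users = {}  # user name -> password

HOST, PORT = "10.0.0.1", "5432"  # dummy values for the sketch


def publish_buggy(user):
    """Mirror the snippet above: credentials only go out for new users."""
    settings = {}
    if user not in existing_users:
        existing_users[user] = "secret-for-" + user
        settings.update(user=user, password=existing_users[user])
    settings.update(host=HOST, port=PORT)
    return settings


def publish_always(user):
    """One possible fix: publish credentials whether or not the user existed."""
    if user not in existing_users:
        existing_users[user] = "secret-for-" + user
    return dict(user=user, password=existing_users[user], host=HOST, port=PORT)
```

Calling publish_buggy twice with the same user returns user and
password only the first time; the second caller gets just host and
port, which matches the symptom of a client that never sees usable
credentials.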

generate_postgresql_hba(...) is the code that is supposed to be
iterating over all the clients and generating the pg_hba.conf file so
they can all connect.
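
For readers unfamiliar with the file, the idea can be sketched as
follows. This is not the charm's generate_postgresql_hba; the
render_hba name, the client dict keys and the choice of md5
authentication with a /32 IPv4 mask are assumptions for the example.
Only the "host database user address method" record layout comes from
PostgreSQL itself.

```python
def render_hba(clients):
    """Return pg_hba.conf text with one 'host' rule per client.

    Each client is a dict with 'database', 'user' and 'address' keys
    (hypothetical names chosen for this sketch).
    """
    lines = ["# TYPE  DATABASE  USER  ADDRESS  METHOD"]
    for client in clients:
        # /32 restricts the rule to exactly one IPv4 address.
        lines.append("host {database} {user} {address}/32 md5".format(**client))
    return "\n".join(lines) + "\n"
```

Regenerating the whole file from the full list of clients on every
hook run, rather than appending one line per new client, keeps the
result independent of the order in which units joined.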

If you are getting invalid credentials in your
db-admin-relation-changed hook, then that is a bug in the PostgreSQL
charm. This would not surprise me - I'm already seeing suspicious
bits. It needs a good refactoring, which is why I want tests so it can
be done safely.

(I would test but the openstack has run out of internal DHCP addresses
and I'm having trouble with lxc...)

-- 
Stuart Bishop <stuart.bishop at canonical.com>


