opaque ids vs. natural keys

Thu May 30 00:39:59 UTC 2013

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 29/05/13 20:48, William Reade wrote:
> FWIW, we do currently have opaque, unique (in-environment, anyway)
> keys for one entity: the relation. That has Key and Id fields
> (mapped in mongo, ever so helpfully, to "_id" and "id"
> respectively) where Key is the human-readable one (space-separated
> endpoint names in canonical order) and Id is just an
> environment-unique int [0]. It should be noted that we're probably
> *not* free to change the form of the "id" field, because it's
> exposed to charms; I think that I do favour env-unique entity ids
> in general (that is, plain old ints rather than uuids), because we
> can always combine an env uuid with an env-unique entity id to get
> a universally unique entity id.

I have no strong feelings as to using an int vs. a uuid.
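
For what it's worth, the two ideas in the quoted paragraph can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the actual state code: the function names, the
"uuid/id" layout, the example endpoint names, and the use of plain
lexical sorting to stand in for "canonical order" are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Relation mirrors the two fields described above: Key is the
// human-readable form, Id is an environment-unique int.
type Relation struct {
	Key string
	Id  int
}

// relationKey joins endpoint names with spaces. Lexical sorting is an
// assumption standing in for whatever "canonical order" really means.
func relationKey(endpoints ...string) string {
	sorted := append([]string(nil), endpoints...) // copy; leave caller's slice alone
	sort.Strings(sorted)
	return strings.Join(sorted, " ")
}

// globalID combines an environment uuid with an env-unique entity id.
// The "uuid/id" layout is hypothetical.
func globalID(envUUID string, id int) string {
	return fmt.Sprintf("%s/%d", envUUID, id)
}

func main() {
	rel := Relation{Key: relationKey("wordpress:db", "mysql:server"), Id: 0}
	fmt.Println(rel.Key)
	fmt.Println(globalID("env-uuid-placeholder", rel.Id))
}
```

The point is only that a small int is enough inside one environment,
and uniqueness across environments can be layered on top when needed.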

> Service just has Name -> _id, as does Unit, and those are
> problematic because while they're env-unique at any given *point*
> in time that property is not guaranteed over any given time
> *period*, and this is problematic for the GUI [1]. Machine has Id
> -> _id, and extending the semantics of Id simply puts it in the
> same situation as Service and Unit.

So we generally agree that we should have an environment-unique
value that is distinct from Name?
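
To make the point-in-time vs. time-period distinction concrete, here
is a minimal sketch. It assumes a hypothetical in-memory Env type with
a never-reused counter; none of these names come from the real code.
A name can be removed and reused, so it identifies different entities
over time, while the counter-assigned ID does not.

```go
package main

import "fmt"

// Service carries both identifiers. The ID field is hypothetical.
type Service struct {
	ID   int    // opaque, assigned once, never reused
	Name string // human-readable, unique only among live services
}

// Env is a toy stand-in for an environment's service collection.
type Env struct {
	nextID int
	live   map[string]Service
}

func NewEnv() *Env { return &Env{live: make(map[string]Service)} }

// Add rejects a name that is currently in use, so names stay unique
// at any given point in time.
func (e *Env) Add(name string) (Service, error) {
	if _, ok := e.live[name]; ok {
		return Service{}, fmt.Errorf("service %q already exists", name)
	}
	svc := Service{ID: e.nextID, Name: name}
	e.nextID++ // never decremented, so old IDs are never handed out again
	e.live[name] = svc
	return svc, nil
}

// Remove frees the name for reuse; the ID is not recycled.
func (e *Env) Remove(name string) { delete(e.live, name) }

func main() {
	env := NewEnv()
	first, _ := env.Add("wordpress")
	env.Remove("wordpress")
	second, _ := env.Add("wordpress")
	fmt.Println(first.Name == second.Name) // true: same name, different entity
	fmt.Println(first.ID == second.ID)     // false: the IDs tell them apart
}
```

A client that cached "wordpress" across the remove/add cannot tell the
two services apart by name alone, which is the situation described for
the GUI.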

> So, basically, I'm making a consistency argument for extending
> machine _id in this way: that we currently use poor "primary keys",
> that encode important information primarily for human benefit, for
> every other entity. I don't think there's any argument *against*
> also providing parallel "primary keys" that are actually opaque and
> unique... *except* that it's a schema change for which we are not
> currently prepared (it's high on the list but not being actively
> developed). So, pragmatically, we can make valid progress on
> containerization without blocking on MV upgrades, and the only cost
> we thereby take on is to slightly increase the size of the
> "fix-the-ids" task we know we'll have to undertake before too long
> anyway.

Fair enough.  I'm not suggesting that the containerisation work is
blocked, just that adopting the proposed naming scheme without an
agreement to add the unique value later would make me extremely sad.

> I'm still open to arguments that it's *fundamentally* bad to encode
> this information in a string rather than to break it out into
> separate fields -- and if you can convince me of that I concede we
> have no option but to develop this in a parallel branch and suck up
> a merge once we have MV upgrades on trunk -- but I think that the
> PK argument is misplaced in our current situation.

I have no concerns about using a semantically rich ID field as long
as it isn't the "primary unique key", as that is *fundamentally* bad
in my experience.  However, I'm OK with continuing as planned under
the agreement that we will add this unique value later, as part of
the major version upgrade work.

I do have some ideas, though, about upgrades in general and how to do
schema changes with every upgrade (major or minor), but that will be
the topic of another email (and probably not today).

Cheers,
Tim
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGmn98ACgkQd1fvI4G7WRDC/ACfd+FV601efIAx8nVFdAGZ4a+a
wrAAoLoSVVK8gYvL/z5XMrCgJ7uroJGm
=9D8b
-----END PGP SIGNATURE-----