Propagating state change to multiple peers
Adam Gandelman
adamg at canonical.com
Wed Jul 13 20:10:32 UTC 2011
Hi-
I've begun trying to deploy OpenStack Swift via ensemble and I've run
into an issue I do not know how to tackle. If functionality already
exists in ensemble, I'd appreciate any pointers. If not, maybe this
would be a good time to brainstorm.
To configure swift's replication, you configure "rings", which set up
storage nodes in various zones and set policy as to how replication
should work. When the configuration changes (a node or zone is added or
removed), the rings need to be "balanced" and the result propagated to
all nodes in the cluster. The re-balancing of the rings takes place
centrally, and the updated configuration is copied to the corresponding
nodes.
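As a rough illustration of the rebalance-then-copy cycle described above, here is a toy Python model. It is not Swift's real ring-building algorithm: the ToyRing class, the round-robin assignment, the propagate helper and the node names are all hypothetical, chosen only to show that every membership change produces a new ring version that each node must receive.

```python
# Toy model only: this is NOT Swift's real ring-building algorithm.
# All class, function and node names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyRing:
    """Hypothetical stand-in for a ring: maps partitions to storage nodes."""
    partitions: int = 8
    nodes: list = field(default_factory=list)       # (node_name, zone) tuples
    version: int = 0
    assignment: dict = field(default_factory=dict)  # partition -> node_name

    def add_node(self, name, zone):
        self.nodes.append((name, zone))

    def remove_node(self, name):
        self.nodes = [n for n in self.nodes if n[0] != name]

    def rebalance(self):
        """Reassign every partition round-robin across the current nodes."""
        if not self.nodes:
            self.assignment = {}
        else:
            self.assignment = {
                p: self.nodes[p % len(self.nodes)][0]
                for p in range(self.partitions)
            }
        self.version += 1
        return self.version


def propagate(ring, node_configs):
    """Copy the centrally computed ring to every node's local config."""
    for name, _zone in ring.nodes:
        node_configs[name] = {
            "version": ring.version,
            "assignment": dict(ring.assignment),
        }


ring = ToyRing()
configs = {}

ring.add_node("storage-0", zone=1)
ring.rebalance()
propagate(ring, configs)

# Adding a second node changes the assignment, so storage-0 needs the
# new ring as well, not only the node that just joined.
ring.add_node("storage-1", zone=2)
ring.rebalance()
propagate(ring, configs)
```

After the second rebalance, both entries in configs carry the same ring version, which is the property the real deployment has to preserve.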
In terms of deploying via ensemble, I would like to have a central
swift-proxy node that manages ring configuration. When a new storage
node joins, it relates to swift-proxy. swift-proxy updates the ring and
the new storage node receives updated configuration [1], presumably via
the relation-changed hook. This works fine when it's a 1-to-1 relation.
But when I begin adding more new swift storage nodes, the rings need to
be balanced for each new member and new configuration propagated to
*all* nodes relating to swift-proxy, not only the new node. It seems
ensemble needs to have some notion of global state/relation changes that
fire corresponding hooks on the central server and all its peers.
Perhaps global-relation-changed and global-relation-joined hooks that
fire in addition to the current relation-changed/joined hooks? The
global hooks can be skipped if they do not exist, or perhaps they do not
need to be fired by default and are instead triggered from within
another hook?
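To make the proposal concrete, the following Python sketch simulates the difference between the limitation as described above and the suggested global hook. It does not use any ensemble API: the Unit and ProxyRelation classes, the run_hook method and the ring_version setting are hypothetical, and global-relation-changed is only the hook name proposed in this message, not an existing feature.

```python
# Hypothetical simulation; none of these classes are real ensemble APIs.
class Unit:
    """A storage node that records which hooks fired and what it received."""

    def __init__(self, name):
        self.name = name
        self.ring_version = None
        self.fired = []

    def run_hook(self, hook, settings):
        self.fired.append(hook)
        self.ring_version = settings["ring_version"]


class ProxyRelation:
    """Stand-in for the central swift-proxy side of the relation."""

    def __init__(self, global_hooks=False):
        self.global_hooks = global_hooks
        self.units = []
        self.ring_version = 0

    def join(self, unit):
        self.units.append(unit)
        self.ring_version += 1  # stands in for re-balancing the rings
        settings = {"ring_version": self.ring_version}
        # The limitation as described above: only the joining unit is updated.
        unit.run_hook("relation-changed", settings)
        if self.global_hooks:
            # Proposed behaviour: every related unit sees the new state.
            for peer in self.units:
                peer.run_hook("global-relation-changed", settings)


def demo(global_hooks):
    relation = ProxyRelation(global_hooks)
    units = [Unit(f"storage-{i}") for i in range(3)]
    for unit in units:
        relation.join(unit)
    return [unit.ring_version for unit in units]


without_global = demo(False)  # [1, 2, 3]: earlier nodes hold stale rings
with_global = demo(True)      # [3, 3, 3]: all nodes hold the latest ring
```

In the first run each node keeps whatever ring version existed when it joined; in the second, the fan-out brings every node to the latest version after each join.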
Swift is the first use case I've run into that requires state to be
synchronized between many nodes. I think others will run into similar
issues when using ensemble to deploy other multi-node clustered services.
Thoughts?
Adam
[1] What gets copied to other nodes are several gzip archives that are
generated when the rings get re-balanced. We'll need a way to pass
files between nodes similar to how KEY=VALUEs are sent via
relation-set/relation-get, but that's a topic for another thread.