LXC Directions

Mon Jul 18 17:03:54 UTC 2011

LXC has been something I've heard mentioned hand in hand with Ensemble
since the very first time I heard about Ensemble at UDS-M. I think I
understand the directions that have been proposed, and I think it's time
we had the discussion so that we can enable LXC for local dev at the very
least, and hopefully also for many other purposes in Ensemble.

Here are the ways I see LXC being useful in Ensemble:

::Unit isolation

Multiple units per machine isn't just about saving money, but also about
being able to serve multiple services with a single machine. "Cloud in the
cloud". :) As some people have suggested, the wordpress example does not
need two machines.  (Or even 4 if you were to go High-Availability). But
putting the db server and the app server on the same box means there's a
potential for privilege escalation.  The app server is probably going to
be exposed to the internet, at which point it will be quite vulnerable. If
it's ever compromised, the attacker may find the database available
locally and go on to compromise its data as well.

LXC at least would hide the existence of the database server from the
web app server. As far as the attacker is concerned, it's just another
host that will have to be compromised over the network, instead of
locally. LXC's security model is not actually much stronger than chroot's,
but chroot *is* a best practice and will prevent many attacks simply by
disallowing direct file access. Written as a user story:

"As a systems administrator I want to contain any compromised services
as much as possible without severely impacting the maintainability of
the system."

I think Ensemble is a big win here, and there's a fair argument that
chroot should actually be implemented even if LXC is not.

The really confusing part for me is how to do the network. Right now you
have a hostname, and that hostname is useful, as it resolves to the IP
that one can contact your machine on.

But if you create a container, it needs an IP. You can't use NAT, because
now you have to make sure ports don't collide. On EC2, I'm not aware of
a way for a single machine to get more than one internal IP. There are
elastic IPs for public access, but if you use these from inside EC2,
you are going to pay bandwidth charges.
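
To make the port collision problem concrete, here is a minimal Python
sketch. Nothing in it is Ensemble code: NatTable, forward and the unit
names are hypothetical, and it only models a single host IP whose ports
are forwarded to containers behind NAT.

```python
class PortCollision(Exception):
    """Raised when two containers want the same port on the host's one IP."""


class NatTable(object):
    """Toy model of port forwarding from one host IP to many containers."""

    def __init__(self):
        # host port -> (container name, container port)
        self.forwards = {}

    def forward(self, host_port, container, container_port):
        # A host port can only be forwarded to one container at a time.
        if host_port in self.forwards:
            owner = self.forwards[host_port][0]
            raise PortCollision(
                "host port %d already forwarded to %s" % (host_port, owner))
        self.forwards[host_port] = (container, container_port)


nat = NatTable()
nat.forward(80, "wordpress-0", 80)
try:
    # A second web unit on the same machine also wants port 80.
    nat.forward(80, "mediawiki-0", 80)
    collided = False
except PortCollision:
    collided = True
```

The second unit would have to be moved to some other host port, and
every service relating to it would need to learn that port, which is
the bookkeeping a per-container IP avoids.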

Another option I see is to create a VPN between all of the machines
so that all of the containers can directly address all of the other
containers on a network. This seems overly complex. Because it's being
implemented at layer 4, it would require clever network topology schemes
whereas right now one can just let the magic of EC2 or Eucalyptus or
OpenStack's layer 2 handle that.

::Machine Re-use

Being able to host a bunch of low-impact service units on a single machine
is, I think, an essential part of the Ensemble story for development, testing,
and also for hosting. Written as a "user story":

"As a SaaS provider of web based applications, I want to be able to
deploy a new app service for clients rapidly, while retaining the ability to
scale it up quickly with their traffic and/or payment."

If you're in EC2, sure, you can keep spawning new m1.small's and then move them
to m1.xlarge when you're ready. However, if you already have an m1.xlarge that has
just been freed up, re-using it will save you $0.68, and probably a few minutes.

Chroots and/or snapshots would also accomplish this goal, if the
machine agent simply reverted the system back to the way it was after
removing the unit.
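
As a rough illustration of re-use, here is a small Python sketch of a
pool that hands back a freed machine before launching a new one. The
MachinePool class, its methods and the machine ids are all hypothetical
and not part of Ensemble. It assumes release() is only called after the
machine agent has reverted the system to a clean state.

```python
class MachinePool(object):
    """Prefer an idle machine of the right type over launching a new one."""

    def __init__(self):
        self.idle = {}      # instance type -> list of idle machine ids
        self.launched = 0   # count of machines we had to launch fresh

    def release(self, machine_id, instance_type):
        # Assumption: the machine has already been reverted to a clean state.
        self.idle.setdefault(instance_type, []).append(machine_id)

    def acquire(self, instance_type):
        """Return (machine_id, reused) for the requested instance type."""
        idle = self.idle.get(instance_type)
        if idle:
            return idle.pop(), True
        # Nothing to re-use, so stand in for a real provider launch call.
        self.launched += 1
        return "new-%s-%d" % (instance_type, self.launched), False


pool = MachinePool()
pool.release("machine-7", "m1.small")
first, first_reused = pool.acquire("m1.small")    # re-uses machine-7
second, second_reused = pool.acquire("m1.small")  # pool is empty, launches
```

Only the second acquire() costs a new instance-hour and the wait for
the cloud to allocate it.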

::Local Repeatability

Using LXC locally means saving money and time. It should also mean working
while disconnected from your cloud, if you're using Ensemble to drive a private
cloud. Being able to iterate *rapidly* is a key feature. Written as a user story:

"As a systems administrator I want to be able to develop new features in my
ensemble formulas without affecting production or waiting for cloud resources
to be allocated."

"As a developer I want to be able to write and test my cloud based
application without waiting for the latency of a cloud provider or
worrying about overpaying for over-utilizing my IaaS account."

One could potentially run Eucalyptus locally, but that seems like a huge barrier
that we can knock down easily with local LXC support.

::: Two proposals :::

We have, I think, two opportunities here and now, to provide some of
these benefits.

* LXC as a Provider - This is the way my branch works. Containers are
  at the same level as instances on EC2.

* LXC as Unit Isolation - This would be realized by the machine agent,
  which would create a container and spawn the unit agent inside it.
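
For the second proposal, the machine agent's side could look roughly
like the Python sketch below. It only builds the command lines rather
than running them. container_name and isolation_commands are
hypothetical helpers, the "ubuntu" template is an assumption, and how
the unit agent actually gets started inside the container is left open.

```python
def container_name(service, unit_number):
    """Name the container after the unit it isolates, e.g. wordpress-0."""
    return "%s-%d" % (service, unit_number)


def isolation_commands(service, unit_number, template="ubuntu"):
    """Commands a machine agent might run to give one unit its own container.

    Returns argv lists only; nothing is executed here.
    """
    name = container_name(service, unit_number)
    return [
        # Create the container's root filesystem from a template.
        ["lxc-create", "-n", name, "-t", template],
        # Boot the container in the background.
        ["lxc-start", "-n", name, "-d"],
    ]


commands = isolation_commands("wordpress", 0)
```

Under the provider proposal the same two commands would be issued by
the provider in place of launching an EC2 instance, which is why much
of the LXC control code could be shared between the approaches.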

Here are, I think, the pros and cons, as succinctly as possible, for
both:

:: LXC as a Provider

Pros:

* Experimental branch already implemented: lp:~clint-fewbar/ensemble/lxc-container
* Known to work, and solves the Local Repeatability case
* Adds support for LXC which can be used later for other needs.
* Isolated from the rest of Ensemble, as it mostly adds a single provider on the
  same level as EC2.

Cons:

* Does not provide unit isolation
* Does not provide machine re-usability

:: LXC as Unit Isolation

Pros:

* Provides unit isolation
* Provides machine re-usability
* Enhances EC2 experience
* Can take advantage of some of the LXC control work from the LXC provider approach.

Cons:

* No implementation
* Network complexity is very high
* Does not directly solve the local dev story.

::: Recommendation

I feel that the network complexity issue is *giant*, and until it is
resolved, LXC for isolating units is going to be stuck. If that is solved,
then we should move forward with an implementation. We can change the
LXC provider I did into a "local" provider which simply runs everything
on the one local machine agent.

However, if that problem is not easily solvable, then we should
move forward with a clear specification for an LXC provider for local
development, and accept that until the network complexity involved with
unit isolation is resolved, we will need to use this provider for the
local dev story. Once we figure that out, we can deprecate the provider,
and still make use of a large portion of the code and work that we did
with it.

::: Comments

Please do respond here soon on the mailing list with thoughts and
ideas. Work will commence on a specification for whatever direction comes
from this discussion very soon. We have a container sprint in Austin
starting August 8 where we need to at least have some big questions,
if not some big task-lists to work on during that week.