twisted much?
Robert Collins
robertc at robertcollins.net
Wed Jun 15 03:50:41 BST 2005
On Wed, 2005-06-08 at 08:54 -0400, Aaron Bentley wrote:
>
> Magnus Therning wrote:
> | On Wed, Jun 08, 2005 at 01:32:24PM +1000, Martin Pool wrote:
> |
> |>(Maybe some of the Canonical people have an opinion on this.)
> |>
> |>To get really good HTTP download performance it seems that we need to
> |>do parallel/overlapped downloads.
>
> Yes, I think so, at least as a lowest-common-denominator. Pipeline
> support isn't universal, but I don't know how widely it is supported.
> Maybe we can find out from Netcraft?
Any web server installed these days will support HTTP/1.1, which supports
pipelining. You should be able to get 99% bandwidth efficiency out of a
single TCP connection via that. For fixed length files you can get
reasonable pipelining out of an HTTP/1.0 server for that matter.
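
To make the pipelining idea concrete, here is a rough sketch (not anything from bzr itself): several GET requests are written back to back on one connection, and the fixed-length responses are then split apart using their Content-Length headers. The names build_pipeline and split_responses are made up for illustration, and chunked encoding is deliberately ignored.

```python
def build_pipeline(host, paths):
    """Return the bytes for several GET requests sent back to back."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        )
    # All requests go out in one write, before any response is read.
    return b"".join(requests)


def split_responses(data, count):
    """Split `count` fixed-length responses out of one byte stream.

    Assumption: every response carries a Content-Length header. Chunked
    encoding and connection-close framing are not handled in this sketch.
    """
    bodies = []
    for _ in range(count):
        head, _, rest = data.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        bodies.append(rest[:length])
        data = rest[length:]  # whatever follows is the next response
    return bodies
```

The point is that the client never waits for one response before sending the next request, so the round-trip latency is paid once rather than once per file.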
> | There are a few other packages that provide asynchronous communication
> | besides Twisted:
> |
> | http://squirl.nightmare.com/medusa/
> | asyncore (part of the Python distro)
>
> Cool. This looks more like the scale we're looking for-- the
> documentation isn't overwhelming, and it knows it's a library, not a
> framework. It looks like this is the basis for the http client Fredrik
> Lundh offers.
I think that before anything async is introduced, some thought should be
put into the programming needs for Windows & GTK GUIs - GUIs are
inherently event based, and that is hella-easier to build on top of an
event based library than a synchronous one. See for instance the
gymnastics pybaz had to do to provide even rudimentary events on top of
baz. With respect to library vs framework - I don't think that that
should even be a consideration: if we want to start doing async
activity, it's pretty well established that event based programming is
the most efficient way to do that for performance (see Dan Kegel's C10K
page for instance), and also for debugging/programmer time (single
threaded race conditions are much easier to debug than multithreaded
ones). That said, asyncore is the root level required to build an event
framework on top of - something that twisted has already done. More
compelling for me are the existing protocol support and the win32 and
gtk specific 'reactors' - twisted can directly coordinate all async
activity on all the platforms where we would want GUIs or command line
programs. The reactor can also be run in the same limited fashion as
the asyncore loop, for cautious or casual introduction.
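
For anyone who hasn't seen the style, a minimal sketch of a single threaded, callback driven loop follows. It uses the standard library's selectors module as a stand-in, not asyncore or Twisted themselves, and run_loop, demo and on_readable are names invented for the example. A connected socket pair plays the part of a network connection.

```python
import selectors
import socket


def run_loop(sel, should_stop):
    """Dispatch ready sockets to their callbacks until should_stop() is true."""
    while not should_stop():
        for key, _events in sel.select(timeout=1.0):
            callback = key.data  # the handler registered for this socket
            callback(key.fileobj)


def demo():
    # A connected socket pair stands in for a real network connection.
    left, right = socket.socketpair()
    sel = selectors.DefaultSelector()
    received = []

    def on_readable(sock):
        # Runs only when data is waiting, so recv() does not block.
        received.append(sock.recv(1024))

    sel.register(right, selectors.EVENT_READ, on_readable)
    left.sendall(b"ping")
    run_loop(sel, lambda: bool(received))

    sel.unregister(right)
    sel.close()
    left.close()
    right.close()
    return received
```

Everything happens in one thread: the loop waits for readiness and hands control to whichever callback is due, which is the shape a GUI toolkit's main loop has as well.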
Given how much of a PITA HTTP/1.1 conformance is, I /really/ recommend
against rolling our own, or using one that has been rolled on the cheap.
(And yes, we need conformance to be able to get the pipelining that is
desired).
To answer Aaron's question about why Twisted would be better for a smart
server, there are several reasons:
* If the bzrlib core is event driven, a smart server can essentially
just broker requests and responses - it becomes a protocol
implementation only, which is very small (probably under 100 lines) or
even less if it's mapped into the ftp/sftp/http namespace, as then you
just answer request events, no protocol overhead required at all.
* Threads in Python suck quite badly; being forced to introduce them in
a smart server would be a great way to make it slow. (Python is
effectively single threaded: threads must acquire a single global
interpreter lock to execute bytecode.)
* There are a bunch of existing protocol implementations in twisted that
a smart server could trivially use - for instance IRC for control or
reporting, ssh straight into the server for administration,
sftp/ftp/http for serving data.
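
As a rough illustration of the "protocol implementation only" point in the first bullet, here is a sketch of a broker that does nothing but turn incoming lines into handler calls and write the replies back. The wire format (one request per line), the RequestBroker class and its method names are all hypothetical; nothing here is taken from bzrlib or Twisted.

```python
class RequestBroker:
    """Hypothetical line-based protocol: one request per line, one reply per line."""

    def __init__(self, handlers, send):
        self.handlers = handlers  # maps a command name to a callable
        self.send = send          # callable that writes reply bytes out
        self.buffer = b""

    def data_received(self, data):
        # Called by the event loop whenever bytes arrive; it never blocks.
        self.buffer += data
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self.handle_line(line.decode("utf-8").strip())

    def handle_line(self, line):
        command, _, argument = line.partition(" ")
        handler = self.handlers.get(command)
        if handler is None:
            reply = "error unknown command"
        else:
            reply = "ok " + handler(argument)
        self.send(reply.encode("utf-8") + b"\n")
```

The broker holds no threads and no sockets of its own: the event loop feeds it bytes and the core library does the real work, which is why such a layer can stay small.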
Apache have moved away from the multiple child process model - they now
have threaded and event based MPMs, which are a much better choice for
high performance, robust & reliable networking. Again, see Dan Kegel's
page for a bunch of useful info.
Anyway, my 2c summary:
* Don't put anything async in today without due care: switching between
async programming styles later is not as easy as going from sync to
async is in the first place.
* The pipelining performance shouldn't need asyncore or twisted today,
though I don't know if urllib et al have the right API to do without it.
* Twisted does offer a very significant set of features for both gui and
smart server development, all of which one *can build* on asyncore, but
why would we want to start from scratch?
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.