Rev 4072: (mbp) hpss streaming design docs in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Tue Mar 3 04:24:14 GMT 2009
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 4072
revision-id: pqm at pqm.ubuntu.com-20090303042409-qa96pox029nf2zus
parent: pqm at pqm.ubuntu.com-20090303034049-faaink61hujui1sy
parent: mbp at sourcefrog.net-20090119101414-tj4r8rhmnzofs2lz
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Tue 2009-03-03 04:24:09 +0000
message:
(mbp) hpss streaming design docs
modified:
doc/developers/network-protocol.txt networkprotocol.txt-20070903044232-woustorrjbmg5zol-1
------------------------------------------------------------
revno: 3944.1.1
revision-id: mbp at sourcefrog.net-20090119101414-tj4r8rhmnzofs2lz
parent: pqm at pqm.ubuntu.com-20090119030630-3xdyyi4xj69md8e4
committer: Martin Pool <mbp at sourcefrog.net>
branch nick: doc-hpss
timestamp: Mon 2009-01-19 21:14:14 +1100
message:
Notes with Andrew about hpss streaming
modified:
doc/developers/network-protocol.txt networkprotocol.txt-20070903044232-woustorrjbmg5zol-1
=== modified file 'doc/developers/network-protocol.txt'
--- a/doc/developers/network-protocol.txt 2008-05-16 07:15:57 +0000
+++ b/doc/developers/network-protocol.txt 2009-01-19 10:14:14 +0000
@@ -2,7 +2,7 @@
Network Protocol
================
-:Date: 2007-09-03
+:Date: 2009-01-07
.. contents::
@@ -221,19 +221,24 @@
The underlying message format is::
- MESSAGE := "bzr message 3 (bzr 1.6)" NEWLINE HEADERS MESSAGE_PARTS
+ MESSAGE := MAGIC NEWLINE HEADERS CONTENTS END_MESSAGE
+ MAGIC := "bzr message 3 (bzr 1.6)"
HEADERS := LENGTH_PREFIX bencoded_dict
- MESSAGE_PARTS := MESSAGE_PART [MORE_MESSAGE_PARTS]
- MORE_MESSAGE_PARTS := END_MESSAGE_PARTS | MESSAGE_PARTS
- END_MESSAGE_PARTS := "e"
+ END_MESSAGE := "e"
+ CONTENTS := MESSAGE_PART+
MESSAGE_PART := ONE_BYTE | STRUCTURE | BYTES
ONE_BYTE := "o" byte
STRUCTURE := "s" LENGTH_PREFIX bencoded_structure
BYTES := "b" LENGTH_PREFIX bytes
+(Where ``+`` indicates one or more.)
+
This format allows an arbitrary sequence of message parts to be encoded
-in a single message.
+in a single message. The contents of a MESSAGE carry a higher-level
+structure, but knowing just this framing it's possible to deserialize
+and consume a message, so that implementations can respond to messages
+sent by later versions.
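To illustrate the point that the framing alone is enough to consume a message, here is a minimal Python sketch that splits one MESSAGE into its raw headers and parts without decoding any bencoded payload. It is not bzrlib code: the names ``split_message`` and ``_read_prefixed`` are hypothetical, and it assumes LENGTH_PREFIX is a 4-byte big-endian unsigned integer, which this section does not itself state.

```python
import struct

MAGIC = b"bzr message 3 (bzr 1.6)"


def _read_prefixed(data, pos):
    """Read a length prefix and its payload starting at pos.

    Assumption: LENGTH_PREFIX is a 4-byte big-endian unsigned integer.
    """
    if pos + 4 > len(data):
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack("!L", data[pos:pos + 4])
    start, end = pos + 4, pos + 4 + length
    if end > len(data):
        raise ValueError("truncated payload")
    return data[start:end], end


def split_message(data):
    """Return (raw_headers, parts, end_offset) for one complete MESSAGE.

    Bencoded payloads are left as opaque bytes; only the framing is parsed.
    """
    prefix = MAGIC + b"\n"
    if not data.startswith(prefix):
        raise ValueError("bad magic")
    headers, pos = _read_prefixed(data, len(prefix))
    parts = []
    while True:
        if pos >= len(data):
            raise ValueError("missing END_MESSAGE")
        kind = data[pos:pos + 1]
        pos += 1
        if kind == b"e":
            # END_MESSAGE: everything before it has been consumed.
            return headers, parts, pos
        if kind == b"o":
            if pos >= len(data):
                raise ValueError("truncated ONE_BYTE part")
            parts.append(("o", data[pos:pos + 1]))
            pos += 1
        elif kind in (b"s", b"b"):
            payload, pos = _read_prefixed(data, pos)
            parts.append((kind.decode("ascii"), payload))
        else:
            raise ValueError("unknown message part %r" % kind)
```

A receiver built this way can skip a message whose higher-level meaning it does not understand, because it always knows where the message ends.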
Headers
~~~~~~~
@@ -254,36 +259,54 @@
describes how such messages are encoded. All requests and responses
defined by earlier protocol versions must be encoded in this way.
-Conventional requests will send a sequence of:
-
-* Arguments (a STRUCTURE of a tuple)
-
-* (Optional) body
-
- * Single body (BYTES), or
-
- * Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)
-
- * if status is "E", followed by an Error (STRUCTURE)
-
-Conventional responses will send a sequence of:
-
-* Status (ONE_BYTE)
-
-* Arguments (a STRUCTURE of a tuple)
-
-* (Optional) body
-
- * Single body (BYTES), or
-
- * Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)
-
- * if status is "E", followed by an Error (STRUCTURE)
-
-In all cases, the ONE_BYTE status is either "S" for Success or "E" for
-Error. Note that the streamed body from version two is now just multiple
+Conventional requests will send CONTENTS of ::
+
+ CONV_REQ := ARGS SINGLE_OR_STREAMED_BODY?
+ SINGLE_OR_STREAMED_BODY := BYTES
+ | BYTES+ TRAILER
+
+ ARGS := STRUCTURE(argument_tuple)
+ TRAILER := SUCCESS_STATUS | ERROR
+ SUCCESS_STATUS := ONE_BYTE("S")
+ ERROR := ONE_BYTE("E") STRUCTURE(argument_tuple)
+
+Conventional responses will send CONTENTS of ::
+
+ CONV_RESP := RESP_STATUS ARGS SINGLE_OR_STREAMED_BODY?
+ RESP_STATUS := ONE_BYTE("S") | ONE_BYTE("E")
+
+If the RESP_STATUS is success ("S"), the arguments are the
+method-dependent result.
+
+For errors (where the Status byte of a response or a streamed body is
+"E"), the situation is analogous to requests. The first item in the
+encoded sequence must be a string of the error name. The other arguments
+supply details about the error, and their number and types will depend on
+the type of error (as identified by the error name).
+
+Note that the streamed body from version two is now just multiple
BYTES parts.
+The end of the request or response is indicated by the lower-level
+END_MESSAGE. If there's only one BYTES element in the body, the TRAILER
+may or may not be present, depending on whether it was sent as a single
+chunk or as a stream that happens to have one element.
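The CONV_RESP grammar above can be read as a small state machine. The following Python sketch interprets an already-framed list of parts; it is illustrative only. The function name and the ``(kind, payload)`` tuple representation are assumptions, and bencoded structures are left undecoded. It is deliberately lenient: it accepts a trailer with no preceding BYTES parts, which the grammar does not list.

```python
def read_conventional_response(parts):
    """Interpret parts as CONV_RESP.

    parts is a list of (kind, payload) tuples where kind is "o", "s" or "b"
    (a hypothetical representation of ONE_BYTE, STRUCTURE and BYTES).
    Returns (ok, raw_args, body_chunks, trailer_error).
    """
    if len(parts) < 2 or parts[0][0] != "o" or parts[1][0] != "s":
        raise ValueError("expected RESP_STATUS followed by ARGS")
    status = parts[0][1]
    if status not in (b"S", b"E"):
        raise ValueError("bad status byte %r" % status)
    raw_args = parts[1][1]

    # Collect the body: one BYTES part (single) or several (streamed).
    chunks = []
    i = 2
    while i < len(parts) and parts[i][0] == "b":
        chunks.append(parts[i][1])
        i += 1

    # Optional TRAILER: a success byte, or an error byte plus a STRUCTURE.
    trailer_error = None
    if i < len(parts):
        kind, payload = parts[i]
        if kind != "o" or payload not in (b"S", b"E"):
            raise ValueError("expected TRAILER status byte")
        i += 1
        if payload == b"E":
            if i >= len(parts) or parts[i][0] != "s":
                raise ValueError("error trailer needs a STRUCTURE")
            trailer_error = parts[i][1]
            i += 1
    if i != len(parts):
        raise ValueError("unexpected parts after TRAILER")
    return status == b"S", raw_args, chunks, trailer_error
```

Note that a body of exactly one chunk is handled the same way whether or not a trailer follows, matching the paragraph above.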
+
+ *(Discussion)* The success marker at the end of a streamed body seems
+ redundant; it doesn't have space for any arguments, and the end of the
+ body is marked anyhow by the end of the message. Recipients shouldn't
+ take any action on it, though they should map an error into raising an
+ error locally.
+
+ 1.10 clients don't assert that they get a status byte at the end of the
+ message. They will complain (in
+ ``ConventionalResponseHandler.byte_part_received``) if they get an
+ initial success and then another byte part with no intervening bytes.
+ If we stop sending the final success message and only flag errors
+ they'll only get one if the error is detected after streaming starts but
+ before any bytes are actually sent. Possibly we should wait until at
+ least the first chunk is ready before declaring success.
+
For new methods, these sequences are just a convention and may be varied
if appropriate for a particular request or response. However, each
request should at least start with a STRUCTURE encoding the arguments
@@ -292,11 +315,105 @@
bencoded. As a result, unlike previous protocol versions, arguments in
this version are 8-bit clean.)
-For errors (where the Status byte of a response or a streamed body is
-"E"), the situation is analogous to requests. The first item in the
-encoded sequence must be a string of the error name. The other arguments
-supply details about the error, and their number and types will depend on
-the type of error (as identified by the error name).
+ *(Discussion)* We're discussing having the byte segments be not just a
+ method for sending a stream across the network, but actually having them
+ be preserved in the rpc from end to end. This may be useful when
+ there's an iterator on one side feeding in to an iterator on the other,
+ if it avoids doing chunking and byte-counting at two levels, and if
+ those iterators are a natural place to get good granularity. Also, for
+ cases like ``insert_record_stream`` the server can't do much with the
+ data until it gets a whole chunk, and so it'll be natural and efficient
+ for it to be called with one chunk at a time.
+
+ On the other hand, there may be times when we've got some bytes from the
+ network but not a full chunk, and it might be worthwhile to pass it up.
+ If we promise to preserve chunks, then to do this we'd need two separate
+ streaming interfaces: "we got a chunk" and "we got some bytes but not
+ yet a full chunk". For ``insert_record_stream`` the second might not be
+ useful, but it might be good when writing to a file where any number of
+ bytes can be processed.
+
+ If we promise to preserve chunks, it'll tend to make some RPCs work only
+ in chunks, and others just on whole blocks, and we can't so easily
+ migrate RPCs from one to the other transparently to older
+ implementations.
+
+ The data inside those chunks will be serialized anyhow, and possibly the
+ data inside them will already be able to be serialized apart without
+ understanding the chunks. Also, we might want to use these formats e.g.
+ for pack files or in bundles, and so they don't particularly need
+ lower-level chunking. So the current (unmerged, unstable) record stream
+ serialization turns each record into a bencoded tuple and it'd be
+ feasible to parse one tuple at a time from a byte stream that contains a
+ sequence of them.
+
+ So we've decided that the chunks won't be semantic, and code should not
+ count on them being preserved from client to server.
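As a sketch of what "chunks are not semantic" means for a consumer, the Python below parses a sequence of bencoded records from a byte stream and yields the same records however the bytes happen to be chunked. It is an illustration, not the record stream serializer: ``iter_records`` and ``Incomplete`` are hypothetical names, the tiny decoder handles only integers, byte strings and lists (no dicts), and each record is assumed to be one bencoded list.

```python
class Incomplete(Exception):
    """Raised when the buffer ends before a value is complete."""


def _decode(buf, pos):
    """Decode one bencoded value at pos; return (value, next_pos)."""
    if pos >= len(buf):
        raise Incomplete
    c = buf[pos:pos + 1]
    if c == b"i":
        end = buf.find(b"e", pos)
        if end == -1:
            raise Incomplete
        return int(buf[pos + 1:end]), end + 1
    if c == b"l":
        pos += 1
        items = []
        while True:
            if pos >= len(buf):
                raise Incomplete
            if buf[pos:pos + 1] == b"e":
                return items, pos + 1
            item, pos = _decode(buf, pos)
            items.append(item)
    if c.isdigit():
        colon = buf.find(b":", pos)
        if colon == -1:
            raise Incomplete
        end = colon + 1 + int(buf[pos:colon])
        if end > len(buf):
            raise Incomplete
        return buf[colon + 1:end], end
    raise ValueError("unsupported bencode type %r" % c)


def iter_records(chunks):
    """Yield one tuple per bencoded list, ignoring chunk boundaries."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        pos = 0
        while pos < len(buf):
            try:
                record, pos = _decode(buf, pos)
            except Incomplete:
                break  # wait for more bytes; pos still marks the record start
            if not isinstance(record, list):
                raise ValueError("expected a bencoded list per record")
            yield tuple(record)
        buf = buf[pos:]
    if buf:
        raise ValueError("stream ended mid-record")
```

Feeding the same bytes as one chunk, as several uneven chunks, or one byte at a time produces identical records, so nothing above this layer needs to rely on where the sender split the stream.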
+
+Early error returns
+~~~~~~~~~~~~~~~~~~~
+
+ *(Discussion)* It would be nice if the server could notify the client of
+ errors even before a streaming request has finished. This could cover
+ situations such as the server not understanding the request, it being
+ unable to open the requested location, or it finding that some of the
+ revisions being sent are not actually needed.
+
+ Especially in the last case, we'd like to be able to gracefully notice
+ the condition while the client is writing, and then have it adapt its
+ behaviour. In any case, we don't want to have to drop and restart the
+ network stream.
+
+ It should be possible for the client to finish its current chunk and
+ then its message, possibly with an error to cancel what's already been
+ sent.
+
+ This relies on the client being able to read back from the server while
+ it's writing. This is technically difficult for http but feasible over
+ a socket or ssh.
+
+ We'd need a clean way to pass this back to the request method, even
+ though it's presumably in the middle of doing its body iterator.
+ Possibly the body iterator could be manually given a reference to the
+ request object, and it can poll it to see if there's a response.
+
+ Perhaps we need to distinguish error conditions, which should turn into
+ a client-side error regardless of the request code, from early success,
+ which should be handled only if the request code specifically wants to
+ do it.
+
+Full-duplex operation
+~~~~~~~~~~~~~~~~~~~~~
+
+ The code is not geared to do pipelined requests, and this might require
+ doing asynchrony within bzrlib. We might want to go fully pipelined and
+ asynchronous, but there might be a profitable middle ground.
+
+ The particular case where duplex communication would be good is in
+ working towards the common points in the graphs between the client and
+ server: we want to send speculatively, but detect as soon as they've
+ matched up.
+
+ So we could for instance have a synchronous core, but rely on the OS
+ network buffering to allow us to work on batches of say 64kB. We can
+ also pipeline requests and responses, without allowing for them
+ happening out of order, or mixed requests happening at the same time.
+
+ Wonder how our network performance would have turned out now if we'd
+ done full-duplex from the start, and ignored hpss over http. We have
+ pretty good (readonly) http support just over dumb http, and that may be
+ better for many users.
+
+
+
+APIs
+====
+
+On the client, the bzrlib code is "in charge": when it makes a request, or
+asks for data from the network, that causes network IO. The server is
+event driven: the network code tells the response handler when data has
+been received, and it takes back a Response object from the request
+handler that is then polled for body stream data.
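The event-driven server side described above might be pictured as follows. This is a toy Python sketch, not the bzrlib API: ``Response``, ``EchoHandler``, ``bytes_received``, ``end_received`` and ``drive_server`` are all hypothetical names chosen for illustration.

```python
class Response:
    """Hypothetical response: an argument tuple plus an optional body iterator."""

    def __init__(self, args, body_stream=None):
        self.args = args
        self.body_stream = body_stream


class EchoHandler:
    """Hypothetical request handler that echoes back whatever it is given."""

    def __init__(self):
        self._chunks = []
        self.response = None

    def bytes_received(self, data):
        # Called by the network code; the handler never reads for itself.
        self._chunks.append(data)

    def end_received(self):
        # The request is complete, so a Response can be handed back.
        self.response = Response((b"ok",), iter(self._chunks))


def drive_server(incoming, handler):
    """Stand-in for the network loop: push data in, then poll the body out."""
    for data in incoming:
        handler.bytes_received(data)
    handler.end_received()
    sent = [handler.response.args]
    if handler.response.body_stream is not None:
        for chunk in handler.response.body_stream:
            sent.append(chunk)
    return sent
```

The contrast with the client is that nothing in the handler initiates IO; the loop in ``drive_server`` plays the role of the network code and decides when data moves in each direction.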
Paths
=====