HTTP APPEND

Vincent LADEUIL v.ladeuil at alplog.fr
Wed Aug 30 10:36:37 BST 2006


>>>>> "jam" == John Arbash Meinel <john at arbash-meinel.com> writes:

    jam> Goffredo Baroncelli wrote:
    >> On Tuesday 22 August 2006 17:48, Vincent LADEUIL wrote:
    >> Hello Vincent,
    >> 
    >> [...]
    jam> Does WebDAV not have any sort of Append command?
    >>> 
    >>> Don't hold your breath. I don't have the reference handy, but a
    >>> draft exists (dated 07/2006), proposing APPEND and PATCH
    >>> additions to DAV.
    >> 
    >> On the basis of my search, in order to write a part of a file we have to use 
    >> the PUT method + the "Content-Range:" header directive. A typical request 
    >> should be:
    >> 

Yes. This is what I've done in the plugin (upgrade ! :)

It's nice to see that we agree on how to do it. Thanks for doing it.

    >> PUT /path/to/file HTTP/1.1
    >> Host: www.somehost.net
    >> Content-Range: bytes 5-14/15
    >> Content-Length: 10
    >> [...]
    >> 
    >> The example above shows how to append 10 bytes of data to a file with an
    >> (initial) length of 5 bytes.
    >> 
    >> The enclosed patch implements an example of the append method. The patch was
    >> tested on an Apache WebDAV implementation.

Which version of Apache were you using, and on which OS?

I ran tests on Mac OS X:

 Apache/1.3.33 (Darwin) mod_ssl/2.8.24 OpenSSL/0.9.7i DAV/1.0.3 PHP/4.4.1

and on Ubuntu 5.10:

Server: Apache/2.0.54 (Ubuntu) DAV/2

I'll welcome reports on any other combination!

    >> I don't know if it works with other web servers ( IIS ??
    >> ). If you want to understand the protocol better you can
    >> watch what curl does ( see the -C curl option ).

Or set the pycurl.VERBOSE option in webdav._set_curl_common_options.

    >> 
    >> Open point:
    >> 1) the append phase is performed in two steps:
    >>   a) get the length of the destination (via the HEAD
    >>      method)
    >>   b) append the content at the end of the file ( via PUT +
    >>      Content-Range: )
    >> We have to prevent another client from updating the file
    >> between "a" and "b"
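The two steps above can be sketched roughly as follows. This is only an illustration, not the code from the patch or from the plugin: append_bytes is a hypothetical name, conn is assumed to behave like Python's http.client.HTTPConnection, and the set of status codes accepted for the PUT is an assumption too.

```python
def append_bytes(conn, path, data):
    """Append data to path with HEAD, then PUT + Content-Range.

    conn is assumed to behave like http.client.HTTPConnection.
    Returns the expected length of the file after the append.
    """
    if not data:
        raise ValueError("nothing to append")

    # Step a: ask for the current length of the destination.
    conn.request("HEAD", path)
    response = conn.getresponse()
    response.read()  # drain the response so the connection can be reused
    if response.status != 200:
        raise IOError("HEAD %s failed: %d" % (path, response.status))
    length_header = response.getheader("Content-Length")
    if length_header is None:
        raise IOError("HEAD %s gave no Content-Length" % path)
    old_length = int(length_header)

    # Step b: write the new bytes just past the current end.
    first = old_length
    last = old_length + len(data) - 1
    total = old_length + len(data)
    headers = {"Content-Range": "bytes %d-%d/%d" % (first, last, total)}
    conn.request("PUT", path, body=data, headers=headers)
    response = conn.getresponse()
    response.read()
    # Assumption: any of these codes counts as a successful PUT.
    if response.status not in (200, 201, 204):
        raise IOError("PUT %s failed: %d" % (path, response.status))
    return total
```

With the standard library this could be called as append_bytes(http.client.HTTPConnection("www.somehost.net"), "/path/to/file", data). Nothing in it stops another client from writing between the HEAD and the PUT, which is exactly the open point.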

As John pointed out, the lock should cover us, but there is
another problem: Apache ignores an invalid Content-Range header
instead of issuing a '501 Not Implemented' error.

I plan to test that at the first connection before choosing
whether to use the Content-Range header, but I would like some
feedback on that (there is a TODO in test_webdav.py about it).

    >> 
    >> 2) test with other web server

Add an 's' to 'server' and I'll agree :-)

    >> 3) the meaning of the third parameter of the Content-Range
    >> option ( the one after the '/' ) is not very clear: is
    >> it the length of the old or the new file ?

http://www.rfc.net/rfc2616.html#s14.16 says:

,----
| A byte-content-range-spec with a byte-range-resp-spec whose
| last-byte-pos value is less than its first-byte-pos value, or
| whose instance-length value is less than or equal to its
| last-byte-pos value, is invalid. The recipient of an invalid
| byte-content-range-spec MUST ignore it and any content
| transferred along with it.
`----

A valid header needs 'instance-length > last-byte-pos', and for an
append last-byte-pos lies past the end of the old file, so it has
to be the length of the new file.
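To make the arithmetic concrete, here is a small sketch of the validity rule quoted above, applied to the 5 + 10 bytes example. Both function names are made up for the illustration.

```python
def is_valid_content_range(first, last, instance_length):
    """Apply the RFC 2616 section 14.16 rule quoted above."""
    if last < first:
        return False
    if instance_length <= last:
        return False
    return True


def append_content_range(old_length, data_length):
    """Return (first, last, instance_length) for an append."""
    first = old_length
    last = old_length + data_length - 1
    new_length = old_length + data_length
    return first, last, new_length


first, last, total = append_content_range(5, 10)
print("Content-Range: bytes %d-%d/%d" % (first, last, total))
# With the new length after the '/', the header is valid.
print(is_valid_content_range(first, last, total))
# With the old length (5) after the '/', it would be invalid.
print(is_valid_content_range(first, last, 5))
```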

Note also that the Apache developers rely on this definition to
ignore an invalid Content-Range header, even though
http://www.rfc.net/rfc2616.html#s9.6 says:

,----
| The recipient of the entity MUST NOT ignore any Content-*
| (e.g. Content-Range) headers that it does not understand or
| implement and MUST return a 501 (Not Implemented) response in
| such cases.
`----

hence the need to explicitly test that the Content-Range header is supported.
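One possible shape for such a test, as a sketch only (it is not what the plugin does): write a small scratch file, try to append to it with Content-Range, and read the result back. supports_content_range and probe_path are hypothetical names, conn is assumed to behave like http.client.HTTPConnection, and the accepted status codes are an assumption.

```python
def supports_content_range(conn, probe_path):
    """Return True if a PUT with Content-Range really appends.

    probe_path is a scratch file the caller may overwrite
    (an assumption of this sketch).
    """
    def send(method, body=None, headers=None):
        conn.request(method, probe_path, body=body, headers=headers or {})
        response = conn.getresponse()
        return response.status, response.read()

    ok = (200, 201, 204)

    # Create a 3-byte file, then try to append 3 more bytes to it.
    status, _ = send("PUT", b"abc")
    if status not in ok:
        raise IOError("cannot create %s: %d" % (probe_path, status))
    status, _ = send("PUT", b"def", {"Content-Range": "bytes 3-5/6"})
    if status not in ok:
        # Includes the 501 that RFC 2616 section 9.6 asks for.
        return False

    # A server that silently ignored the header leaves something
    # other than "abcdef" behind.
    status, content = send("GET")
    return status == 200 and content == b"abcdef"
```

The cost is three roundtrips, paid once at the first connection.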

    >> 
    >> Comments are welcome

As is your participation.

<snip/>

    jam> Interestingly enough, Vincent has also come across this
    jam> information, and was working on something similar.

<ad>
Available at:

https://launchpad.net/people/v-ladeuil/+branch/bzr.webdav/trunk

Test today !

</ad>

    jam> As far as the 'prevent someone else from appending
    jam> content', we do hold a lock. However, it is possible for
    jam> someone to break a lock underneath us.

Yes. The append is not atomic.

    jam> And if someone breaks the lock underneath us, and then
    jam> races with us to update files, it would be possible for
    jam> us to have data loss, and actual corruption.  Because
    jam> after we write, we would expect that the data exists, so
    jam> we would write the inventory referencing the texts, but
    jam> not all of the indexes may exist for those texts
    jam> (because some were overwritten).

a) We can check that our lock still exists. But I can't see
where the transport could find that information... (1
roundtrip and exception on error)

b) We can issue another HEAD command to check the length (1
roundtrip, then an exception on error, or retry the append until
we get the impression that we succeeded :)

c) We can use DAV locks to make the append atomic (2 roundtrips
but again someone could steal our lock and we are still not
guarded against other non-DAV access methods)

So I guess the question is: do we trust the lock enough to
prevent corruption ?

If not, I would prefer option b) with the retry dance. But we
are in a grey area here, because if the lock has been broken we
should stop updating anything as soon as possible.
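For the sake of discussion, option b) with the retry dance could look like the sketch below. Everything in it is hypothetical: get_length stands for one HEAD roundtrip, append_at for one PUT with Content-Range, and the retry count is arbitrary. Note the limits: a length check cannot see a concurrent write that leaves the length unchanged, and a blind retry may write our data twice.

```python
class AppendFailed(Exception):
    """The length check never matched the expected value."""


def append_checked(get_length, append_at, data, retries=3):
    """Append data, then check the resulting length, retrying on mismatch.

    get_length() returns the current length (one HEAD roundtrip);
    append_at(offset, data) writes data at offset (one PUT).
    """
    for attempt in range(retries):
        before = get_length()
        append_at(before, data)
        after = get_length()
        if after == before + len(data):
            return after
        # Someone else touched the file, or the server ignored us.
    raise AppendFailed("length still wrong after %d tries" % retries)
```

If we decide that a broken lock means we must stop as soon as possible, retries=1 turns the dance into a plain check that raises on the first mismatch.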

    jam> Now, if the portion after the '/' had the meaning: If
    jam> this is incorrect, then puke, and reject the append, we
    jam> could at least know right away that something changed,
    jam> and we could stop. Unfortunately HTTP doesn't seem to
    jam> have a track record of ensuring integrity. More of
    jam> trying to be helpful and doing what it can, even if it
    jam> is incorrect/incomplete.

Indeed... That's why the APPEND command still has some merit if
implemented in an atomic way.

    jam> (Well, at least HTML is that way, I won't say *too* much
    jam> about HTTP).


    jam> At least the spec says that if the server *doesn't*
    jam> support PUT with a Content-Range, then it must abort and
    jam> alert the client.

But Apache does not respect that.

    Vincent



