[RFC] Ways to make initial knit creation faster (both push and commit)

Robey Pointer robey at lag.net
Wed Aug 23 19:03:40 BST 2006


On 18 Aug 2006, at 12:02, John Arbash Meinel wrote:

> I spent a little time today doing an --lsprof of why 'bzr push' is so
> slow when creating a new remote branch. Here are a few notes I found:

[...]

> 4) Opening a file in append mode requires a round trip.
>    I'm not sure if we can do much better than this, but because
>    sftp.write() is actually a pwrite() call (it includes the offset
>    to write at), it has to do a stat of the remote file to figure out
>    where the append should start.

Something about this didn't seem "right" to me, so I dug into it in
paramiko.  I'm now pretty convinced that the stat() is unnecessary.

The later SFTP RFC drafts make it clear that the offset in a write is
ignored when a file is opened for append.  (That makes sense because
it follows POSIX: writes to an append-mode file go to the end, no
matter where the seek pointer is.)
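
As a quick illustration of that POSIX rule, here is a minimal local
sketch (plain os calls on a temporary file, nothing to do with SFTP or
paramiko itself; the function name is made up for the example).  It
seeks to offset 0 on an O_APPEND descriptor and writes anyway:

```python
import os
import tempfile


def append_ignores_seek_position():
    """Seek to offset 0 on an O_APPEND descriptor, write, return the contents."""
    fd, path = tempfile.mkstemp()
    try:
        # Put some initial data in the file, then close it.
        os.write(fd, b"first\n")
        os.close(fd)

        # Reopen in append mode and deliberately move the seek pointer
        # back to the start before writing.
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
        try:
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, b"second\n")
        finally:
            os.close(fd)

        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

If the seek pointer mattered, "second" would overwrite "first"; on an
append-mode descriptor the new data should land after the existing
contents instead.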

The earlier drafts -- the ones that describe the SFTP protocol
version in common use -- are shorter and don't say anything about it,
but from reading the sftp code in OpenSSH, it relies on the same
POSIX behavior.  So I think it's safe to assume that if both
OpenSSH *and* the RFC agree, that's the right thing to do.  (The
reason I'm hedging so much here is that the RFC tends, in some
places, to drift way out beyond what any actual implementation does.)

I've changed paramiko's trunk to avoid the stat() on files opened in
append mode, postponing it until (and unless) we actually need it (for
example, seek() on an O_APPEND|O_RDWR file).  If you think it might
still help, I'd appreciate any feedback.

robey




