sftp tests are slow
Robey Pointer
robey at lag.net
Tue Jan 10 19:57:49 GMT 2006
I changed the subject because a bunch of different threads had the
same meaningless subject line. :)
On 9 Jan 2006, at 18:05, John A Meinel wrote:
> Robert Collins wrote:
>> On Thu, 2006-01-05 at 11:01 -0600, John Arbash Meinel wrote:
>>
>>> Oh, and one other thing. For me, the sftp w/ server tests each take
>>> around 1.5s even if they only do something trivial. I noticed
>>> that you
>>> test adapter runs a function sequentially, so it run
>>> (test_put(LocalTransport), test_put(SftpTranport), ...).
>>> If you can make the adapter track resources and instantiate the sftp
>>> server for all tests rather than recreating it each time, I'll
>>> give you
>>> a +2. :) 10 tests = 15s just spawning the sftp server.
>>
>> Yes, this is slow. We chatted on IRC about this - its about 10 times
>> slower than starting openssh... I find it hard to believe that
>> this is
>> intrinsic to python - I have to regard it as a bug in the stub
>> server.
10 times is hard to believe -- what did you use to test that?
> Well, that would make it seem to be a paramiko issue.
> Robey- have you tried profiling paramiko by itself? It is startup
> time,
> or connection time, or something else?
Not to sound like a broken record ;) but starting doing the SSH
handshaking again for every test is going to be expensive no matter
how you slice it. The fact that using a stub sftp server means we do
both sides of the handshaking at once doesn't help.
On the train this morning I ran the "test_sftp" unit tests 3 ways: as
they exist in bzr.dev; without the stub sftp server (using 'sftp://
robey at localhost/%s/' % self.test_dir); and without the stub sftp
server, using openssh instead of paramiko. I ran "selftest
test_sftp". Two tests fail when not using the stub sftp server,
apparently because they make assumptions about the stub sftp server
behavior and logging -- but I don't think the failures shortened the
test significantly so I ignored it.
sftp method 1st 2nd 3rd avg
----------------------------------------------
current 32.9s 34.5s 35.1s 34.2s
no stub server 27.5s 25.6s 24.6s 25.9s (+24%)
w/openssh 23.8s 25.6s 23.5s 24.3s (+29%)
There would be significant gain from not using a stub sftp server, if
we could figure out some way to swing that. Actually the only reason
it's there is because I wanted us to test the sftp stuff as
completely as possible without external requirements -- maybe it'd be
okay to require that you have a localhost sftp server to run those
tests.
Last night I ran the (unmodified) "selftest test_sftp" tests on my
Linux box under lsprof and posted the results here:
http://www.lag.net/~robey/test_sftp.html
Apparently lsprof only tracks the main thread so a lot of work is
hidden away by "ask the work thread to do this and wait for the
results" -- including the entire stub sftp server. But
significantly, 13.3 seconds out of 21.9 (61%) are spent inside
paramiko.Transport.start_client -- which is where the SSH negotiation
happens. That does confirm to me that a large chunk of the time
spent in sftp tests is just spent negotiating new secure connections.
Now, there are ways to fix that, like having the sftp transport
plugin cache connections... but it will change the tests
significantly, and I'm not totally convinced that optimizing the unit
tests is a high priority yet. If it is, I can play around with some
caching strategies, or possibly we could rethink how we want to test
this stuff. I don't have any strong opinions yet.
robey
More information about the bazaar
mailing list