[RFC] question about different behavior of WorkingTree.iter_changes

Alexander Belchenko bialix at ukr.net
Fri Apr 11 19:03:12 BST 2008


Aaron Bentley wrote:
> Alexander Belchenko wrote:
>> I'm working hard to finish my line-endings support work ASAP.
>> With my eol changes (running from python sources):
> 
>> python bzr --no-plugins --no-aliases st
>> time: 1.656 sec.
> 
>> python bzr --no-plugins --no-aliases st -S
>> time: 5.828 sec.
> 
>> This difference seems a bit TOO much. Something is really wrong here.
>> Can somebody give me a hint why it behaves *so* differently?
> 
> I haven't had a chance to look at this, but Bazaar has profiling support
> built-in, with the --lsprof flag.  This can help you track down
> performance problems.

I did run the profiler, but it produces so much output that I can't really
tell where the bottleneck is. So I just inspected my code carefully.
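
One way to cut a profile down to something readable is to sort it by
cumulative time and print only the top few entries. Below is a minimal
sketch using the standard library's cProfile and pstats modules, not bzr's
own --lsprof. The functions slow_lookup and run_status are made-up
stand-ins for the real code being measured, and profile_top is a
hypothetical helper name.

```python
import cProfile
import io
import pstats


def slow_lookup(n):
    # Hypothetical stand-in for an expensive per-entry call.
    return sum(i * i for i in range(n))


def run_status(entries):
    # Hypothetical stand-in for the command being profiled.
    return [slow_lookup(2000) for _ in range(entries)]


def profile_top(func, *args, limit=5):
    """Run func under cProfile; return (result, short report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so callers of hot code appear first,
    # and print only the first `limit` rows.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, out.getvalue()


result, report = profile_top(run_status, 50)
print(report)
```

With only a handful of rows shown, a function that is called once per
entry tends to stand out by its call count.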

>> One suspicious point about my eol code and its interaction with dirstate:
>> to obtain the 'eol' versioned property value I call
>> tree.id2path(fileid) for every entry. I suspect that in the case
>> of status --short the inventory is not built, and every lookup ends up
>> building the inventory in memory or something similar. Could this be
>> the case?
> 
> id2path is a surprisingly expensive operation, and if you're doing it
> repeatedly, it's best to do something else instead.

The best I can do instead is to not call id2path at all. This is how
I solve the problem right now. But there are many other places where I don't
have this luxury, because I have only a file_id to work with.

Using id2path is unavoidable in these cases because I need to know the
file's basename, or at least its extension. Otherwise my versioned
properties code will not work as expected.

So I think sooner or later id2path (and path2id too) will need to be sped up.

> Note that we are trying to avoid *ever* generating a full inventory, but
> if we do generate one, we cache it.  I suppose it's possible that you're
> doing something that invalidates the cache, but I think it's unlikely.

Actually I don't need the full inventory, just a mapping between file_ids
and filenames and vice versa.
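
To illustrate what I mean, here is a rough sketch of such a two-way
mapping, built once and then queried with plain dictionary lookups. It does
not use bzrlib at all: the PathMap class, its method names, and the
(file_id, parent_id, name) tuple layout are assumptions made up for this
example, not the dirstate or inventory API.

```python
import posixpath


class PathMap(object):
    """Two-way mapping between file ids and paths, built in one pass."""

    def __init__(self, entries):
        # entries: iterable of (file_id, parent_id, name) tuples; the root
        # has parent_id None. This layout is an assumption for the sketch.
        self._by_id = {}
        self._by_path = {}
        parents = {}
        for file_id, parent_id, name in entries:
            parents[file_id] = (parent_id, name)
        for file_id in parents:
            self._resolve(file_id, parents)

    def _resolve(self, file_id, parents):
        # Compute each path once; later requests hit the dictionary.
        if file_id in self._by_id:
            return self._by_id[file_id]
        parent_id, name = parents[file_id]
        if parent_id is None:
            path = name
        else:
            path = posixpath.join(self._resolve(parent_id, parents), name)
        self._by_id[file_id] = path
        self._by_path[path] = file_id
        return path

    def id2path(self, file_id):
        return self._by_id[file_id]

    def path2id(self, path):
        return self._by_path[path]

    def extension(self, file_id):
        # The extension is what an 'eol'-style rule would typically key on.
        return posixpath.splitext(self._by_id[file_id])[1]


entries = [
    ("file-id", "dir-id", "README.txt"),
    ("dir-id", "root-id", "src"),
    ("root-id", None, ""),
]
paths = PathMap(entries)
print(paths.id2path("file-id"))
```

Building the map costs one pass over all entries, so it only pays off when
many lookups follow. For a single lookup it is no better than what we have
now.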

>> Even if I rework my code to provide lazy lookup for 'eol' property this
>> will not help for the case of many changes in the tree.
> 
> No, but scaling with the amount of change is the best we can hope to
> accomplish anyhow.

OK.


