[MERGE][#230567] Faster (local) branch

Thu May 29 01:47:20 BST 2008

Ian Clatworthy wrote:
> Aaron Bentley wrote:
>> Ian Clatworthy wrote:
>>> John Arbash Meinel wrote:
> 
>>>> I can say that if it is costing us a lot, then we might want to
>>>> reconsider how
>>>> we do it.
>>> On large trees like Mozilla and OOo, it certainly is costing a lot.
>>> The bigger the tree, the higher the cost.
>> Could you please explain *what* is costing us a lot?  Your patch is
>> fairly opaque.
> 
> Sorry. Our current algorithm - always checking for and managing
> any existing files in the working tree even though they can't exist
> when running bzr branch - is costing us a lot. For example:

The only code that needs to check for existing files is the conflict
resolution code.  In your example data, that takes up only 4.69 ticks.

Aside from find_conflicts, everything else should be scaling down to
zero when there are no existing files.

Perhaps you're referring to Tree.id2path.  We ought to expect that to be
fast.  Only one dictionary lookup should be needed.

If it's not fast, there are things we can do:
A. Use a different Tree API
B. Use the TreeTransform data
C. Add TreeTransform data

A. Use a different API
======================
We can use a different API on Tree.  We don't have an ids2paths API, but
we really should.  But iter_entries_by_dir can serve.

You can build up a dictionary mapping id to path using dict(e.id, for p,
e in tree.iter_entries_by_dir(specific_file_ids=ids))

If *that* isn't fast for an empty tree, something is terribly wrong.

(We could use iter_changes similarly, but I doubt it would be
faster.)

B. Use the TreeTransform data
=============================
We can tell that an id is either new, or has been moved to a different
file if the id is present in tt._r_new_id.  If it is in tt._r_new_id,
then we can tell that there was no previous file if it is not in
tt._removed_id.

So we could rewrite this portion as:
brand_new_ids = set(r.new_id.keys()).difference(tt._removed_id)
...
if new_entry.file_id in brand_new_ids:
    old_path = None
else:
    old_path = self._tree.id2path(new_entry.file_id)

Note that this also eliminates a ton of exception handling, which is
relatively expensive in Python.  So it may eliminate some of the 2.64
self ticks, as well as the 9.35 ticks attributed to id2path.

This is guaranteed to scale with the size of the change, not the size of
the tree, and relies on set operations, which are pretty darn fast.

C. Add TreeTransform data
=========================
It would be fairly simple to keep track of whether there were any
existing files as the transform was being built up.  Then that flag
could be used like this:

if not tt._any_existing:
    old_path = None
else:
    try:
        old_path = self._tree.id2path(new_entry.file_id)
    except NoSuchId:
        old_path = None

For accelerating build_tree, strategy B seems like a winner to me.  For
large-scale merges and reverts, strategy A would make sense.  Because in
those cases, there are lots of old paths to deal with, so getting lots
of paths quickly is important.  The two strategies could also be combined.

> I've attached a zip file with the 2 callgrind.out files.
> In summary, bzr.dev is taking 35% of its time in tt.apply()
> and resolve_conflicts(). In my branch, tt._apply_for_build_tree()
> takes 9.3% (and nearly all of that - 8% total - is taken in
> applying the inventory delta) and resolve_conflicts() isn't
> called at all when creating a branch.

It's a contract violation not to.  That's how we handle files on Windows
that case-insensitively have the same filename.

>> That TODO is stale.  I and Rob worked on that once, but there was no
>> observable performance improvement, so I never merged it.
> 
> Fair enough. Right now, we're chmod'ing each file where executable
> is not None. I changed this to only set the mode is it was True.

AFAIK, that's not accurate.  Without further data, you don't know
whether the execute bit is being set on newly-created files.

> Even with hundreds of executable bits set in the mozilla repo, the
> time drops from 5% of a big number (24 secs) to < 1% of a smaller
> number (15 secs). I can update the TODO if you like.

Hmm.  I can't seem to find that old branch.  But the basic strategy was
to use os.open instead of the builtin open.  os.open allows the
permissions to be specified at file creation time, so that we only have
to do one syscall, and can skip the chmod.

>>> This isn't an issue. One issue is walking 100K files to find that we
>>> only need to rename the top few from limbo across.
>> Okay, so how can we fix this in the general case?
> 
> My initial patch was using a faster code path if a parameter was
> passed to build_tree enabling it. John has suggested that detecting
> when that fast path is appropriate is better. I agree - detection
> ought to be safer and ensure that the fast path is adopted whenever
> it's applicable, not just when it's requested. The trick is ensuring
> that the detection check is the right one. I'm hoping you and/or Rob
> can help me there.

So far, the high-level changes seem to be
1. not calling id2path
2. better chmod handling
3. not calling resolve_conflicts

3 is not acceptable.  1 and 2 don't seem to require a new code path.  So
I think this idea of adopting a fast path when applicable isn't
necessary.  Are there other details I'm missing?

I'll freely admit I've left branch creation speed on the back burner,
because I was waiting for a new implementation of iter_files_bytes.
These numbers are a good kick in the pants, because they show room for
improvement even without it.

They've stimulated new thinking for me, and I hope my feedback is
helpful to you.  But yes, I am very much wedded to the conceptual
simplicity of a single codepath.  I want to our efficiency wins to apply
to merge and revert as well.  I want our behavior to be the same for all
operations.  And of course, less code is usually better.

Using a single codepath is a strategy that's worked well so far-- the
single-codepath implementation is only 1.6x slower than the fast-path
implementation.  Let's see how much farther we can push it!

Aaron