Slow inventory extraction from weavefile

John A Meinel john at arbash-meinel.com
Tue Oct 4 03:17:22 BST 2005


Johan Rydberg wrote:
> John A Meinel <john at arbash-meinel.com> writes:
>
>
>>>I don't know what the target speed is, but with the latest bzr.newformat
>>>tree, on my old 450MHz celeron, it takes >1.5 seconds per inventory that
>>>I want to extract. (For 20 revisions, it took 36 seconds; over the course
>>>of all revisions, it took between 1.8 and 2.0s per revision.)
>
>
> I have also noticed this.  Things like "bzr log FILE" have become
> _really_ slow.

To agree with you:
$ cd bzr.dev
$ time bzr log -r -100.. README > /dev/null
real    0m44.891s
user    0m43.941s
sys     0m0.637s

$ cd ../bzr.new-weaved
$ time bzr5 log -r -100.. README > /dev/null
real    6m22.017s
user    6m3.986s
sys     0m15.937s

That is about an 8.5x slowdown (roughly 382s versus 45s of wall-clock time).

>
>
>>>Is there a specific performance desired? Because a commit is going to
>>>need to insert an entry into the weave, which at the very least means
>>>extracting the old inventory, so this adds a sizeable amount of time to
>>>the commit, just to update the inventory.
>
>
> Have you tried this with your own append-only weave format?  Any
> differences in performance?

My append-only format will probably perform much worse than the default,
because currently I load the whole file into the same in-memory format,
which means I have to parse and process all the insertions each time.
The current format, by contrast, is pretty close to a plain dump of the
in-memory structure.
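
As a toy illustration of that cost (this is not the real on-disk format
of either branch): suppose the append-only file is nothing more than a
list of (position, text) insertion records. The record layout and the
replay() helper are hypothetical names made up for this sketch. Loading
then means replaying every record, no matter which revision is wanted:

```python
def replay(insertions):
    """Rebuild the in-memory line list from an append-only insertion log.

    `insertions` is a hypothetical log of (position, text) records, in the
    order they were appended to the file.
    """
    lines = []
    for position, text in insertions:
        # Every record has to be applied, in order, before any single
        # version can be read back out of the result.
        lines.insert(position, text)
    return lines


log = [(0, "first line"), (1, "third line"), (1, "second line")]
rebuilt = replay(log)
```

A format that is already a dump of the in-memory list skips this loop
entirely, which is why I would expect it to load faster.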

In theory mine could evolve into something that wouldn't need to read
the entire file, but since it looks like Aaron actually created the
WeaveDiff (which is supposed to do that) his might be the better branch
to follow.

I've also been pondering whether we could store only X revisions per
weavefile, and keep more than one weavefile at a time.
So you could do something like 100 revisions per file. Or maybe 200
revisions per file, sliding along by 100 revisions each time (1-200,
100-300, 200-400, etc).

That would need a marker for "from an ancestor not in this weave", plus a
reasonably fast and simple way of pulling two of these weave files
together into a major weave.
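
A rough sketch of the lookup side of the overlapping-window idea, just to
show which files a given revision would land in. The function name and
its defaults are assumptions for illustration, and I have tidied the
ranges to 1-200, 101-300, 201-400 so that every file holds exactly 200
revisions:

```python
def windows_for(revno, size=200, stride=100):
    """Return the (first, last) ranges of every weavefile holding revno.

    Window k is assumed to cover revisions k*stride + 1 .. k*stride + size.
    """
    if revno < 1:
        raise ValueError("revision numbers start at 1")
    found = []
    # Start at the newest window that could contain revno and walk back
    # until the window no longer reaches far enough forward.
    k = (revno - 1) // stride
    while k >= 0 and revno <= k * stride + size:
        found.append((k * stride + 1, k * stride + size))
        k -= 1
    found.reverse()
    return found


early = windows_for(50)    # only the first file exists this early
middle = windows_for(250)  # covered by two overlapping files
```

With these defaults every revision after the first 100 sits in two
files, so extracting it never needs more than one file, but a history
that spans a window boundary still needs the joining step described
above.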

Now, it might just be that the current implementation is slower than it
has to be. It might be faster if implemented in C, but it would
have to be quite a bit faster to be worth anything. (But if it were 10x
faster, that would be worth a lot.)

John
=:-

>
> ~j