[plugin] dirstate experiments
John Arbash Meinel
john at arbash-meinel.com
Wed Jun 14 23:41:18 BST 2006
John Arbash Meinel wrote:
> Robert Collins wrote:
>
> ...
>
>> Well, there are two cases here. The stat cache is just a validator for
>> the fs, so unpacking it does not make sense. However, the sha1 is stored
>> in the inventory in commit, and code expects to get at it, so I think
>> having the sha be accessible is a good idea - I'd rather pay a little
>> bit more in size for a format that is directly usable in the inventory
>> logic, as streaming reads are fast - it's seeks that are slow.
>>
>>> It most certainly is not editable in that form, but it is readable.
>>>
>>> Also, it turns out that the major overhead at this point is the giant
>>> 'text.decode()', plus the overhead of operating on unicode objects
>>> rather than operating on str() objects.
>> I was wondering when that would start to bite :(.
>
>
...
>
> This is the breakdown based on number of parent entries:
> num  str    unicode  file_size
> 0    100ms  150ms    3.1MB
> 1    177ms  268ms    5.7MB
> 2    275ms  352ms    5.9MB
> 3    370ms  470ms    6.1MB
>
I have good news and bad news about the above numbers, and both are the
same thing. I was unable to beat these numbers with a C++ extension.
str.split() is just really, really fast, and slicing the resulting list
is also super fast.
Maybe with an alternate format we could do better in C++, but lots of
calls to string.find (both C++ std::string and Python str) are just much
worse than building up the list with str.split() and then slicing into
that list.
I tried quite a few methods. But a list comprehension with slicing is
pretty much impossible to beat.
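
To make the comparison concrete, here is a rough sketch of the two approaches in Python. The on-disk layout is an assumption made only for illustration (three newline-terminated fields per entry: path, sha1, size), and the names FIELDS_PER_ENTRY, parse_split and parse_find are hypothetical, not the plugin's actual code.

```python
# Hypothetical layout: every entry is three newline-terminated fields.
FIELDS_PER_ENTRY = 3


def parse_split(text):
    """One str.split() call, then a list comprehension that slices the list."""
    fields = text.split('\n')
    # A trailing newline leaves one empty string at the end; drop it.
    if fields and fields[-1] == '':
        fields.pop()
    return [fields[i:i + FIELDS_PER_ENTRY]
            for i in range(0, len(fields), FIELDS_PER_ENTRY)]


def parse_find(text):
    """Walk the text with repeated str.find() calls, one per field."""
    entries = []
    entry = []
    pos = 0
    while True:
        end = text.find('\n', pos)
        if end == -1:
            break
        entry.append(text[pos:end])
        pos = end + 1
        if len(entry) == FIELDS_PER_ENTRY:
            entries.append(entry)
            entry = []
    return entries


sample = "a.txt\nabc123\n10\nb.txt\ndef456\n20\n"
```

Both functions return the same list of entries for the sample. The difference is that parse_split does its scanning inside a single C-level call, while parse_find pays for a Python-level call and a slice per field.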
Maybe if we weren't splitting on newline characters we could use some
sort of inner loop in C++ that would handle length-prefixed strings (like
the blob format I worked on).
But this is just so much better than anything else I was able to create.
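
For reference, a length-prefixed layout could look roughly like the sketch below. This is not the blob format mentioned above; the "decimal length, colon, payload" encoding and the names encode_fields and decode_fields are assumptions chosen only to show the idea that the reader jumps by a known length instead of searching for a delimiter.

```python
def encode_fields(fields):
    """Write each field as '<decimal length>:<payload>' (hypothetical format)."""
    return ''.join('%d:%s' % (len(f), f) for f in fields)


def decode_fields(data):
    """Read fields back by length, never scanning the payload itself."""
    fields = []
    pos = 0
    while pos < len(data):
        # Only the short length prefix is searched for its terminator.
        colon = data.index(':', pos)
        length = int(data[pos:colon])
        start = colon + 1
        fields.append(data[start:start + length])
        pos = start + length
    return fields
```

A side effect of this kind of layout is that payloads may contain newlines or colons without escaping, since the decoder never looks inside them. Whether an inner loop like this would actually beat str.split() in C++ is untested here.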
John
=:->