Rev 2643: (John Arbash Meinel) Implement DirState._read_dirblocks() in pyrex in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Fri Jul 20 20:48:25 BST 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 2643
revision-id: pqm at pqm.ubuntu.com-20070720194822-smqttk05w6efypf0
parent: pqm at pqm.ubuntu.com-20070720172520-i2ezksmrduaonojd
parent: john at arbash-meinel.com-20070720182620-948wu6weli9aupkq
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Fri 2007-07-20 20:48:22 +0100
message:
(John Arbash Meinel) Implement DirState._read_dirblocks() in pyrex
added:
bzrlib/_dirstate_helpers_c.pyx dirstate_helpers.pyx-20070503201057-u425eni465q4idwn-3
bzrlib/_dirstate_helpers_py.py _dirstate_helpers_py-20070710145033-90nz6cqglsk150jy-1
bzrlib/benchmarks/bench_dirstate.py bench_dirstate.py-20070503203500-gs0pz6zkvjpq9l2x-1
bzrlib/tests/test__dirstate_helpers.py test_dirstate_helper-20070504035751-jsbn00xodv0y1eve-2
modified:
.bzrignore bzrignore-20050311232317-81f7b71efa2db11a
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/benchmarks/__init__.py __init__.py-20060516064526-eb0d37c78e86065d
bzrlib/dirstate.py dirstate.py-20060728012006-d6mvoihjb3je9peu-1
bzrlib/tests/__init__.py selftest.py-20050531073622-8d0e3c8845c97a64
bzrlib/tests/test_dirstate.py test_dirstate.py-20060728012006-d6mvoihjb3je9peu-2
bzrlib/workingtree_4.py workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
setup.py setup.py-20050314065409-02f8a0a6e3f9bc70
------------------------------------------------------------
revno: 2474.1.74
merged: john at arbash-meinel.com-20070720182620-948wu6weli9aupkq
parent: john at arbash-meinel.com-20070720173448-cn7og836bl8dovwv
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-20 13:26:20 -0500
message:
Revert the accidental removal of the Unicode normalization check code.
It was done to profile how much it was costing us, but it wasn't meant to be removed.
------------------------------------------------------------
revno: 2474.1.73
merged: john at arbash-meinel.com-20070720173448-cn7og836bl8dovwv
parent: john at arbash-meinel.com-20070720170136-pa6kb99lxxmekyji
parent: pqm at pqm.ubuntu.com-20070720161548-nppg3mvd38gbuaid
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-20 12:34:48 -0500
message:
[merge] bzr.dev 2641
------------------------------------------------------------
revno: 2474.1.72
merged: john at arbash-meinel.com-20070720170136-pa6kb99lxxmekyji
parent: john at arbash-meinel.com-20070718204238-5gi11fx04q7zt72d
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-20 12:01:36 -0500
message:
Document a bit more what is going on in _dirstate_helpers_c.pyx, from Martin's comments
------------------------------------------------------------
revno: 2474.1.71
merged: john at arbash-meinel.com-20070718204238-5gi11fx04q7zt72d
parent: john at arbash-meinel.com-20070718203014-u8gpbqn5z9ftx1tu
parent: pqm at pqm.ubuntu.com-20070717180333-5smmeduk2q3sbzvw
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-18 15:42:38 -0500
message:
[merge] bzr.dev 2625
------------------------------------------------------------
revno: 2474.1.70
merged: john at arbash-meinel.com-20070718203014-u8gpbqn5z9ftx1tu
parent: john at arbash-meinel.com-20070713212835-m330r85zq4xwgipi
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-18 15:30:14 -0500
message:
Lots of fixes from Martin's comments.
Fix signed/unsigned character issues
Add lots of comments to help understand the code
Add tests for proper Unicode handling (we should abort if we get a Unicode string,
and we should correctly handle utf-8 strings)
------------------------------------------------------------
revno: 2474.1.69
merged: john at arbash-meinel.com-20070713212835-m330r85zq4xwgipi
parent: john at arbash-meinel.com-20070713175009-sylhp1kst6145v0f
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-13 16:28:35 -0500
message:
Thanks to Jan 'RedBully' Seiffert, some review cleanups
changes size_t to unsigned.
Check alignment on strings before using integer loops.
Just use a simple backwards checking loop for _memrchr
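For illustration, a simple backwards checking loop of the kind mentioned for _memrchr could look like the Python sketch below. The name memrchr_sketch and its signature are assumptions made here; the real helper is C-level code in the Pyrex module and may differ.

```python
def memrchr_sketch(buf, byte, length):
    """Return the index of the last occurrence of byte in buf[:length], or -1.

    buf is a bytes object and byte is an integer in range(256).
    Hypothetical helper, written only to illustrate the backwards scan.
    """
    pos = length - 1
    while pos >= 0:
        if buf[pos] == byte:
            return pos
        pos -= 1
    return -1


# Find the last '/' so a path can be cut into dirname and basename.
last_slash = memrchr_sketch(b"a/b/c", ord("/"), 5)
```

Walking backwards one byte at a time needs no alignment checks, which is presumably why it was preferred for this helper.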
------------------------------------------------------------
revno: 2474.1.68
merged: john at arbash-meinel.com-20070713175009-sylhp1kst6145v0f
parent: john at arbash-meinel.com-20070712181059-xnomv3tzzsb2hpx5
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-13 12:50:09 -0500
message:
Review feedback from Martin, mostly documentation updates.
------------------------------------------------------------
revno: 2474.1.67
merged: john at arbash-meinel.com-20070712181059-xnomv3tzzsb2hpx5
parent: john at arbash-meinel.com-20070712163402-lp91q157w5etslrj
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-07-12 13:10:59 -0500
message:
Add NEWS entries for performance improvements.
------------------------------------------------------------
revno: 2474.1.66
merged: john at arbash-meinel.com-20070712163402-lp91q157w5etslrj
parent: john at arbash-meinel.com-20070712052601-n0bcu3r5nlu1skj4
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-07-12 11:34:02 -0500
message:
Some restructuring.
Move bisect_path_* to private functions
Move cmp_path_by_dirblock to a private function,
since it is only used by the bisect_path functions.
Add tests that the compiled versions are actually used.
This catches cases when the import fails for the wrong reason.
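One way to test that a compiled version is actually used is to record which module a helper came from, so that a silent fallback becomes visible. The sketch below is hypothetical: load_helper and the module names passed to it are illustrative only and are not bzrlib's API.

```python
import importlib


def load_helper(compiled_name, fallback_name, attr):
    """Return (function, module_name), preferring the compiled module."""
    try:
        module = importlib.import_module(compiled_name)
    except ImportError:
        # Fall back to the pure Python implementation.
        module = importlib.import_module(fallback_name)
    return getattr(module, attr), module.__name__


# The compiled module name below does not exist, so the fallback is loaded.
# A "compiled version is used" test would flag this by checking origin.
func, origin = load_helper("no_such_compiled_helpers", "bisect", "bisect_left")
```

A test can then assert on origin when the extension is expected to be built, instead of trusting that the import succeeded.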
Move some code around to make it closer to sorted by name.
------------------------------------------------------------
revno: 2474.1.65
merged: john at arbash-meinel.com-20070712052601-n0bcu3r5nlu1skj4
parent: john at arbash-meinel.com-20070712051503-ntboo0z3prcrcg3t
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-07-12 00:26:01 -0500
message:
Found an import dependency bug when the compiled version is not available.
Basically, we need a constant from dirstate.py, but we can't import the module directly,
because it imports _dirstate_helper* before it finishes loading.
However, bzrlib.dirstate.DirState *has* been defined at that point,
so we can import it.
Now the tests pass both with and without running 'make' first.
------------------------------------------------------------
revno: 2474.1.64
merged: john at arbash-meinel.com-20070712051503-ntboo0z3prcrcg3t
parent: john at arbash-meinel.com-20070712051426-u9auufylv5cba940
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-07-12 00:15:03 -0500
message:
Fix dirstate benchmarks for new layout.
------------------------------------------------------------
revno: 2474.1.63
merged: john at arbash-meinel.com-20070712051426-u9auufylv5cba940
parent: john at arbash-meinel.com-20070711234520-do3h7zw8skbathpz
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-07-12 00:14:26 -0500
message:
Found a small bug in the python version of _read_dirblocks.
This reveals that the code is not as directly tested as it should be.
Consider refactoring all test_dirstate to use both implementations.
Or at least add more direct tests.
------------------------------------------------------------
revno: 2474.1.62
merged: john at arbash-meinel.com-20070711234520-do3h7zw8skbathpz
parent: john at arbash-meinel.com-20070711225935-llcal92udviwxfp4
parent: pqm at pqm.ubuntu.com-20070711162842-8fx9cc0c3ogyxudl
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-11 18:45:20 -0500
message:
[merge] bzr.dev 2601
------------------------------------------------------------
revno: 2474.1.61
merged: john at arbash-meinel.com-20070711225935-llcal92udviwxfp4
parent: john at arbash-meinel.com-20070711215705-x6l2fdioh050zxzp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-11 17:59:35 -0500
message:
Finish fixing DirState._bisect and the bisect tests
------------------------------------------------------------
revno: 2474.1.60
merged: john at arbash-meinel.com-20070711215705-x6l2fdioh050zxzp
parent: john at arbash-meinel.com-20070711214905-e2cxwnuoxr9r1o9r
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-11 16:57:05 -0500
message:
Get rid of strchr in favor of memchr
------------------------------------------------------------
revno: 2474.1.59
merged: john at arbash-meinel.com-20070711214905-e2cxwnuoxr9r1o9r
parent: john at arbash-meinel.com-20070711000154-4et8yf8si3jgxmgc
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-07-11 16:49:05 -0500
message:
Make sure to set basename_len. With that patch, the tests pass.
------------------------------------------------------------
revno: 2474.1.58
merged: john at arbash-meinel.com-20070711000154-4et8yf8si3jgxmgc
parent: john at arbash-meinel.com-20070710145123-jv3wcj10qdvkgmt8
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Tue 2007-07-10 19:01:54 -0500
message:
(broken) Try to properly implement DirState._bisect*
Involves rewriting some helper functions.
Currently something is wrong.
------------------------------------------------------------
revno: 2474.1.57
merged: john at arbash-meinel.com-20070710145123-jv3wcj10qdvkgmt8
parent: john at arbash-meinel.com-20070509152850-spj91ozbgzpgxmw7
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Tue 2007-07-10 09:51:23 -0500
message:
Move code around to refactor according to our pyrex extension design.
This creates a _dirstate_helpers_py.py next to _dirstate_helpers_c.pyx
Rather than having a 'bzrlib.compiled.*' directory.
------------------------------------------------------------
revno: 2474.1.56
merged: john at arbash-meinel.com-20070509152850-spj91ozbgzpgxmw7
parent: john at arbash-meinel.com-20070507231309-mtyzwjrascrg5tiq
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Wed 2007-05-09 10:28:50 -0500
message:
Remove a lot of unused definitions.
------------------------------------------------------------
revno: 2474.1.55
merged: john at arbash-meinel.com-20070507231309-mtyzwjrascrg5tiq
parent: john at arbash-meinel.com-20070507230047-53ozoz7og6n2j24i
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 18:13:09 -0500
message:
Remove an unused (and ugly) pyrex function.
------------------------------------------------------------
revno: 2474.1.54
merged: john at arbash-meinel.com-20070507230047-53ozoz7og6n2j24i
parent: john at arbash-meinel.com-20070507221117-l6pjpggfs9p2dtwy
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 18:00:47 -0500
message:
Optimize the simple case that the strings are the same object.
Add some TODO statements that we might consider.
------------------------------------------------------------
revno: 2474.1.53
merged: john at arbash-meinel.com-20070507221117-l6pjpggfs9p2dtwy
parent: john at arbash-meinel.com-20070507214233-czz6gaimsje4qka6
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 17:11:17 -0500
message:
Changing Reader.get_next_str (which returns a Python String)
into a c function saves a lot of time.
Specifically it avoids a GetAttr call, and a PyObject_CallObject
This drops the times down to:
...test__read_dirblocks_20k_tree_0_parents_c OK 122ms/ 2561ms
...test__read_dirblocks_20k_tree_0_parents_py OK 235ms/ 2606ms
...test__read_dirblocks_20k_tree_1_parent_c OK 175ms/ 2797ms
...test__read_dirblocks_20k_tree_1_parent_py OK 358ms/ 3014ms
...test__read_dirblocks_20k_tree_2_parents_c OK 259ms/ 2992ms
...test__read_dirblocks_20k_tree_2_parents_py OK 498ms/ 3232ms
We are close to being 2x faster than the python implementation.
------------------------------------------------------------
revno: 2474.1.52
merged: john at arbash-meinel.com-20070507214233-czz6gaimsje4qka6
parent: john at arbash-meinel.com-20070507213645-le9y48efqghhes86
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 16:42:33 -0500
message:
Add a benchmark timing how long it takes to add ~20k entries to a DirState object.
------------------------------------------------------------
revno: 2474.1.51
merged: john at arbash-meinel.com-20070507213645-le9y48efqghhes86
parent: john at arbash-meinel.com-20070507213102-i2nuwkr0vfj8u98u
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 16:36:45 -0500
message:
Fix one benchmark so it is actually writing data instead of a null block.
------------------------------------------------------------
revno: 2474.1.50
merged: john at arbash-meinel.com-20070507213102-i2nuwkr0vfj8u98u
parent: john at arbash-meinel.com-20070507211832-430v0s9bvjud3jeg
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 16:31:02 -0500
message:
Refactor a bit to make benchmark setup time faster.
------------------------------------------------------------
revno: 2474.1.49
merged: john at arbash-meinel.com-20070507211832-430v0s9bvjud3jeg
parent: john at arbash-meinel.com-20070507204345-plq5j2u2hfwm1q8v
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 16:18:32 -0500
message:
Add DirState.save() benchmarks.
At this point it doesn't seem to be a huge overhead
(857ms for 20k entries with 2 parents on a slow machine),
but it is something we might look into in the future.
------------------------------------------------------------
revno: 2474.1.48
merged: john at arbash-meinel.com-20070507204345-plq5j2u2hfwm1q8v
parent: john at arbash-meinel.com-20070507203816-0zk28og5dadjdj4l
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 15:43:45 -0500
message:
Just recording a benchmark on my fast machine
_read_dirblocks_20k_tree_0_parents_c OK 158ms/ 2632ms
_read_dirblocks_20k_tree_0_parents_py OK 247ms/ 2648ms
_read_dirblocks_20k_tree_1_parent_c OK 224ms/ 5493ms
_read_dirblocks_20k_tree_1_parent_py OK 324ms/ 5558ms
_read_dirblocks_20k_tree_2_parents_c OK 279ms/ 6675ms
_read_dirblocks_20k_tree_2_parents_py OK 435ms/ 6847ms
------------------------------------------------------------
revno: 2474.1.47
merged: john at arbash-meinel.com-20070507203816-0zk28og5dadjdj4l
parent: john at arbash-meinel.com-20070507202804-5w45ajlfp3xoc3kl
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 15:38:16 -0500
message:
Change the names of the functions from c_foo and py_foo to foo_c and foo_py
This makes it easier to search for 'def foo*' and means that benchmark results
are next to each other, rather than far apart.
------------------------------------------------------------
revno: 2474.1.46
merged: john at arbash-meinel.com-20070507202804-5w45ajlfp3xoc3kl
parent: john at arbash-meinel.com-20070507191244-ywyxg0ftlh6n297f
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 15:28:04 -0500
message:
Finish implementing _c_read_dirblocks for any number of parents.
bench_dirstate.BenchmarkDirState.test__c_read_dirblocks_20k_tree_0_parents OK 367ms/ 4353ms
bench_dirstate.BenchmarkDirState.test__c_read_dirblocks_20k_tree_1_parent OK 594ms/ 8958ms
bench_dirstate.BenchmarkDirState.test__c_read_dirblocks_20k_tree_2_parents OK 842ms/ 10490ms
bench_dirstate.BenchmarkDirState.test__py_read_dirblocks_20k_tree_0_parents OK 560ms/ 4298ms
bench_dirstate.BenchmarkDirState.test__py_read_dirblocks_20k_tree_1_parent OK 692ms/ 8658ms
bench_dirstate.BenchmarkDirState.test__py_read_dirblocks_20k_tree_2_parents OK 1006ms/ 10710ms
So overall the performance benefit is about 15-30%
------------------------------------------------------------
revno: 2474.1.45
merged: john at arbash-meinel.com-20070507191244-ywyxg0ftlh6n297f
parent: john at arbash-meinel.com-20070507183155-fzs5z1516gyf5lth
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 14:12:44 -0500
message:
Add benchmarks to see how reading the dirstate changes when you have parents.
Currently, the C implementation is slower than python, but partially that is
because it is not optimized (at all).
------------------------------------------------------------
revno: 2474.1.44
merged: john at arbash-meinel.com-20070507183155-fzs5z1516gyf5lth
parent: john at arbash-meinel.com-20070507182449-mm860vvdw9keyfx5
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 13:31:55 -0500
message:
Use cmp_by_dirs in _iter_changes, it saves a bit of time.
When I initially wrote it, I thought they wouldn't be called often,
but I realize now they are evaluated when we have unknown/ignored files
on disk.
------------------------------------------------------------
revno: 2474.1.43
merged: john at arbash-meinel.com-20070507182449-mm860vvdw9keyfx5
parent: john at arbash-meinel.com-20070507180840-e0r1jomaos7an93j
parent: pqm at pqm.ubuntu.com-20070507175017-mvwcdqzq0w4z36lr
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 13:24:49 -0500
message:
[merge] bzr.dev 2483
------------------------------------------------------------
revno: 2474.1.42
merged: john at arbash-meinel.com-20070507180840-e0r1jomaos7an93j
parent: john at arbash-meinel.com-20070507175701-b8c87exjybq31evq
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 13:08:40 -0500
message:
fix benchmark names, refactor to avoid 'create_path_names' overhead.
------------------------------------------------------------
revno: 2474.1.41
merged: john at arbash-meinel.com-20070507175701-b8c87exjybq31evq
parent: john at arbash-meinel.com-20070505132458-0fe0g2jfdoyg95mn
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 12:57:01 -0500
message:
Change the name of cmp_dirblock_strings to cmp_by_dirs
And refactor the test cases so that we test both the python version and the
C version. Also, add benchmarks for both.
It shows that the C version is approx 10x faster.
------------------------------------------------------------
revno: 2474.1.40
merged: john at arbash-meinel.com-20070505132458-0fe0g2jfdoyg95mn
parent: john at arbash-meinel.com-20070505050202-hmi7l9smckjrf2pa
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Sat 2007-05-05 08:24:58 -0500
message:
(python-only) Shave a bit of time off by calling binascii.b2a_base64
I should have looked closer: base64.encodestring() is a legacy API that
just wraps binascii.b2a_base64.
On 21k pack_stat calls, it drops us from around 784ms to 281ms
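As a rough illustration of calling binascii.b2a_base64 directly, here is a minimal Python 3 sketch. The function name pack_stat_sketch, the field order, and the 32-bit masking are assumptions made for this example, not necessarily what bzrlib's pack_stat does.

```python
import base64
import binascii
import struct


def pack_stat_sketch(size, mtime, ctime, dev, ino, mode):
    """Pack six stat values into a short base64 fingerprint.

    Assumption: every value is masked to 32 bits so struct.pack accepts it.
    """
    values = [v & 0xFFFFFFFF for v in (size, mtime, ctime, dev, ino, mode)]
    packed = struct.pack(">LLLLLL", *values)
    # b2a_base64 appends a newline; drop it.
    return binascii.b2a_base64(packed)[:-1]


fingerprint = pack_stat_sketch(1, 2, 3, 4, 5, 6)
```

For an input this short the result matches what the higher-level base64 functions produce, without the extra Python-level wrapper call.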
------------------------------------------------------------
revno: 2474.1.39
merged: john at arbash-meinel.com-20070505050202-hmi7l9smckjrf2pa
parent: john at arbash-meinel.com-20070505045753-1fwhap6q0jyb18vt
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Sat 2007-05-05 00:02:02 -0500
message:
Clean up and remove unused functions.
------------------------------------------------------------
revno: 2474.1.38
merged: john at arbash-meinel.com-20070505045753-1fwhap6q0jyb18vt
parent: john at arbash-meinel.com-20070505043606-lw7bjxwzcnjbls9v
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 23:57:53 -0500
message:
Finally, faster than text.split() (156ms)
By iterating over the fields directly, we don't have to create Python strings
for the dirname field (only when it changes), or for the size field or is_executable
fields.
A lot fewer python objects means faster parsing.
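The idea of walking the fields directly can be sketched in Python as below. This is only an illustration: the FieldReader class, the parse_entries function, and the three-field record (dirname, basename, size) are simplifications invented here, and the real dirstate format and Pyrex code differ. In Python the slices still allocate objects, so the savings described above really come from doing this at the C level.

```python
class FieldReader:
    """Step through NUL-terminated fields of a bytes buffer by offset."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._end = len(text)

    def has_more(self):
        return self._pos < self._end

    def next_span(self):
        """Return (start, end) of the next field and skip its terminator."""
        start = self._pos
        end = self._text.find(b"\0", start)
        if end == -1:
            end = self._end
        self._pos = end + 1
        return start, end


def parse_entries(text):
    """Parse hypothetical (dirname, basename, size) records."""
    reader = FieldReader(text)
    entries = []
    current_dir = None
    while reader.has_more():
        start, end = reader.next_span()
        same_dir = (
            current_dir is not None
            and end - start == len(current_dir)
            and text.startswith(current_dir, start)
        )
        if not same_dir:
            # Only build a new dirname object when the directory changes.
            current_dir = text[start:end]
        start, end = reader.next_span()
        basename = text[start:end]
        start, end = reader.next_span()
        size = int(text[start:end])
        entries.append((current_dir, basename, size))
    return entries


fields = [b"dir", b"a", b"10", b"dir", b"b", b"20", b"other", b"c", b"30"]
entries = parse_entries(b"\0".join(fields) + b"\0")
```

Entries in the same directory share a single dirname object here, rather than each holding its own copy.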
------------------------------------------------------------
revno: 2474.1.37
merged: john at arbash-meinel.com-20070505043606-lw7bjxwzcnjbls9v
parent: john at arbash-meinel.com-20070505015422-9dfed0e9uza2g7n9
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 23:36:06 -0500
message:
get_next() returns the length of the string,
preparing for having a _get_entry... which parses rather than
extracting to a list first
------------------------------------------------------------
revno: 2474.1.36
merged: john at arbash-meinel.com-20070505015422-9dfed0e9uza2g7n9
parent: john at arbash-meinel.com-20070504223428-d7vwvp3f7ypn9ivv
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 20:54:22 -0500
message:
Move functions into member functions on reader() class.
Drops time down to 212ms
------------------------------------------------------------
revno: 2474.1.35
merged: john at arbash-meinel.com-20070504223428-d7vwvp3f7ypn9ivv
parent: john at arbash-meinel.com-20070504222904-6f6i8yxr9qpf8lpw
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 17:34:28 -0500
message:
Read the entries one at a time, rather than all at the beginning.
------------------------------------------------------------
revno: 2474.1.34
merged: john at arbash-meinel.com-20070504222904-6f6i8yxr9qpf8lpw
parent: john at arbash-meinel.com-20070504221204-d9mjz2nl8fd5maxp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 17:29:04 -0500
message:
Delay reading fields until in parse loop
------------------------------------------------------------
revno: 2474.1.33
merged: john at arbash-meinel.com-20070504221204-d9mjz2nl8fd5maxp
parent: john at arbash-meinel.com-20070504220621-iwla6gmrtx7iy37s
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 17:12:04 -0500
message:
Using text.split() is down to 174ms
We'll need some work to get the Reader version faster.
------------------------------------------------------------
revno: 2474.1.32
merged: john at arbash-meinel.com-20070504220621-iwla6gmrtx7iy37s
parent: john at arbash-meinel.com-20070504214853-iqaht2z8963hdlr3
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 17:06:21 -0500
message:
Skip past the first entry while reading,
rather than while processing.
------------------------------------------------------------
revno: 2474.1.31
merged: john at arbash-meinel.com-20070504214853-iqaht2z8963hdlr3
parent: john at arbash-meinel.com-20070504214147-ckrxzu7bepvcs4ct
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 16:48:53 -0500
message:
Avoiding the string format unless there is actually a problem
saves us almost 50ms (down to 242ms)
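The general pattern can be shown with a small Python sketch. The function names and the error message are made up for illustration and do not come from the bzrlib code.

```python
def check_field_count_eager(fields, expected):
    # Builds the message on every call, even when nothing is wrong.
    message = "expected %d fields, found %d" % (expected, len(fields))
    if len(fields) != expected:
        raise ValueError(message)


def check_field_count_lazy(fields, expected):
    # Formats the message only on the failure path.
    if len(fields) != expected:
        raise ValueError(
            "expected %d fields, found %d" % (expected, len(fields))
        )
```

Both functions behave the same for callers; the lazy form just skips the formatting work in the common case where the check passes.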
------------------------------------------------------------
revno: 2474.1.30
merged: john at arbash-meinel.com-20070504214147-ckrxzu7bepvcs4ct
parent: john at arbash-meinel.com-20070504210438-cvtzgzh4xbad7kww
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 16:41:47 -0500
message:
Start working towards a parser which uses a Reader (producer)
rather than working on a list of fields. Currently slower than text.split('\0'),
but should be possible to avoid the intermediate list entirely.
------------------------------------------------------------
revno: 2474.1.29
merged: john at arbash-meinel.com-20070504210438-cvtzgzh4xbad7kww
parent: john at arbash-meinel.com-20070504200015-yli1te8jfhk3xpjc
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 16:04:38 -0500
message:
Refactor so that the inner _fields_to_entries function is the
one doing the path comparison, and it will re-use the dirname object,
rather than copying a new string each time.
This should have equivalent performance, but have a rather large
memory savings, because we don't maintain N copies of the dirname
for N files in that directory.
It (theoretically) will speed up some comparisons, too,
because the string hash, etc, will be properly cached.
------------------------------------------------------------
revno: 2474.1.28
merged: john at arbash-meinel.com-20070504200015-yli1te8jfhk3xpjc
parent: john at arbash-meinel.com-20070504194612-ryl2chfi4dd53c2h
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 15:00:15 -0500
message:
Ask the field converter to determine the current directory
rather than parsing it out of the returned entry.
------------------------------------------------------------
revno: 2474.1.27
merged: john at arbash-meinel.com-20070504194612-ryl2chfi4dd53c2h
parent: john at arbash-meinel.com-20070504192637-1tzys0ugbgy21fw9
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 14:46:12 -0500
message:
Switching to direct access of members of the list drops us down to 305ms
------------------------------------------------------------
revno: 2474.1.26
merged: john at arbash-meinel.com-20070504192637-1tzys0ugbgy21fw9
parent: john at arbash-meinel.com-20070504192326-5f9kzev4v57if01r
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 14:26:37 -0500
message:
Switch to using an offset rather than doing a list splice
------------------------------------------------------------
revno: 2474.1.25
merged: john at arbash-meinel.com-20070504192326-5f9kzev4v57if01r
parent: john at arbash-meinel.com-20070504190500-tq5wvnhmmd30m21y
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 14:23:26 -0500
message:
Refactor into a helper function to make implementation clearer
This also improves performance to 319ms
------------------------------------------------------------
revno: 2474.1.24
merged: john at arbash-meinel.com-20070504190500-tq5wvnhmmd30m21y
parent: john at arbash-meinel.com-20070504185936-1mjdoqmtz74xe5mg
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 14:05:00 -0500
message:
Unrolling into a direct loop drops us to 326ms
------------------------------------------------------------
revno: 2474.1.23
merged: john at arbash-meinel.com-20070504185936-1mjdoqmtz74xe5mg
parent: john at arbash-meinel.com-20070504181128-422svqlutnl3v43d
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 13:59:36 -0500
message:
A C implementation of _fields_to_entry_0_parents drops the time from 400ms to 330ms for a 21k-entry tree
------------------------------------------------------------
revno: 2474.1.22
merged: john at arbash-meinel.com-20070504181128-422svqlutnl3v43d
parent: john at arbash-meinel.com-20070504180557-iaitatth56jygggl
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 13:11:28 -0500
message:
Do the same renaming => py_ and c_ for _read_dirblocks
------------------------------------------------------------
revno: 2474.1.21
merged: john at arbash-meinel.com-20070504180557-iaitatth56jygggl
parent: john at arbash-meinel.com-20070504174616-4kdi7zi32h7ev4f9
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 13:05:57 -0500
message:
Clean up the multiple testing.
Change the function names from both being 'bisect_dirblocks' to being
py_bisect_dirblocks and c_bisect_dirblocks.
And enable using the compiled form when it is available.
------------------------------------------------------------
revno: 2474.1.20
merged: john at arbash-meinel.com-20070504174616-4kdi7zi32h7ev4f9
parent: john at arbash-meinel.com-20070504173600-5reyrpo013nk17sr
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 12:46:16 -0500
message:
Apply all of the tests for DirState.bisect_dirblock to the compiled function.
------------------------------------------------------------
revno: 2474.1.19
merged: john at arbash-meinel.com-20070504173600-5reyrpo013nk17sr
parent: john at arbash-meinel.com-20070504163523-69dypgt24ipo26p2
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 12:36:00 -0500
message:
Clean up _cmp_dirblock_strings_alt to make it the default.
This improves bisect_dirblock_compiled by another 2x.
So far the improvement is now 800ms => 100ms => 50ms with the current
function.
------------------------------------------------------------
revno: 2474.1.18
merged: john at arbash-meinel.com-20070504163523-69dypgt24ipo26p2
parent: john at arbash-meinel.com-20070504161941-7n3we92jhxnczl5a
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 11:35:23 -0500
message:
Add an integer-size comparison loop at the beginning, and
update the test suite to make sure we are properly exercising it.
------------------------------------------------------------
revno: 2474.1.17
merged: john at arbash-meinel.com-20070504161941-7n3we92jhxnczl5a
parent: john at arbash-meinel.com-20070504161120-wyplkl21ctqbq2ka
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 11:19:41 -0500
message:
Using a custom loop seems to be the same speed, but is probably
easier to understand.
------------------------------------------------------------
revno: 2474.1.16
merged: john at arbash-meinel.com-20070504161120-wyplkl21ctqbq2ka
parent: john at arbash-meinel.com-20070504160330-jai9q6h8ts1ddb2i
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 11:11:20 -0500
message:
Shave off maybe 10% by using the PyString_* macros instead of functions.
------------------------------------------------------------
revno: 2474.1.15
merged: john at arbash-meinel.com-20070504160330-jai9q6h8ts1ddb2i
parent: john at arbash-meinel.com-20070504160216-v19b36wj16g0awwi
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 11:03:30 -0500
message:
No need to benchmark bisect_dirblock_compiled_cached
The cache isn't used in the compiled form.
------------------------------------------------------------
revno: 2474.1.14
merged: john at arbash-meinel.com-20070504160216-v19b36wj16g0awwi
parent: john at arbash-meinel.com-20070504155015-l31mrfviixrrf277
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 11:02:16 -0500
message:
Switch bisect_dirblocks to remove the extra .split('/').
This is a massive improvement (approx 8x),
since we avoid all the temporary lists, dictionary lookups, etc.
Now we just have a custom string comparison, which is quite fast.
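A custom comparison that gives the same ordering as comparing path.split('/') lists, but without building the lists, might look like the Python sketch below. The names cmp_by_dirs_split, cmp_by_dirs_scan, and bisect_dirblock_sketch are illustrative, and the code assumes that directory-by-directory ordering is the intended one; the compiled helper itself is written in Pyrex and is not reproduced here.

```python
SLASH = ord("/")


def cmp_by_dirs_split(path1, path2):
    """Reference ordering: compare the lists of path components."""
    a = path1.split(b"/")
    b = path2.split(b"/")
    return (a > b) - (a < b)


def cmp_by_dirs_scan(path1, path2):
    """Same ordering in one pass, with no temporary lists."""
    if path1 is path2:
        return 0  # cheap exit when both names are the same object
    for c1, c2 in zip(path1, path2):
        if c1 != c2:
            # A '/' ends a component, so it sorts before any other byte.
            if c1 == SLASH:
                return -1
            if c2 == SLASH:
                return 1
            return -1 if c1 < c2 else 1
    # One path is a prefix of the other: the shorter one sorts first.
    return (len(path1) > len(path2)) - (len(path1) < len(path2))


def bisect_dirblock_sketch(dirblocks, dirname):
    """Return the index where dirname belongs in the sorted dirblocks."""
    lo, hi = 0, len(dirblocks)
    while lo < hi:
        mid = (lo + hi) // 2
        if cmp_by_dirs_scan(dirblocks[mid][0], dirname) < 0:
            lo = mid + 1
        else:
            hi = mid
    return lo


dirblocks = [(b"", []), (b"a", []), (b"a/b", []), (b"a-b", []), (b"b", [])]
index = bisect_dirblock_sketch(dirblocks, b"a/c")
```

Note that b"a/b" sorts before b"a-b" even though '-' is a smaller byte than '/', which is why a plain byte-wise comparison would not be enough.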
------------------------------------------------------------
revno: 2474.1.13
merged: john at arbash-meinel.com-20070504155015-l31mrfviixrrf277
parent: john at arbash-meinel.com-20070504154346-fgz2nrtwtd8u9w6a
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 10:50:15 -0500
message:
Now that we have bisect_dirblock working again, bring back cmp_dirblock_strings.
------------------------------------------------------------
revno: 2474.1.12
merged: john at arbash-meinel.com-20070504154346-fgz2nrtwtd8u9w6a
parent: john at arbash-meinel.com-20070504044714-xgbrxg27p83yis89
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-05-04 10:43:46 -0500
message:
Clean up bisect_dirstate to not use temporary variables.
------------------------------------------------------------
revno: 2474.1.11
merged: john at arbash-meinel.com-20070504044714-xgbrxg27p83yis89
parent: john at arbash-meinel.com-20070504043751-5unx865kqw9scyyu
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 23:47:14 -0500
message:
Avoid a Py_INCREF by using a void *
------------------------------------------------------------
revno: 2474.1.10
merged: john at arbash-meinel.com-20070504043751-5unx865kqw9scyyu
parent: john at arbash-meinel.com-20070504041902-r5vxd4xpkduhbd0b
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 23:37:51 -0500
message:
Explicitly calling Py_INCREF makes things happier again.
------------------------------------------------------------
revno: 2474.1.9
merged: john at arbash-meinel.com-20070504041902-r5vxd4xpkduhbd0b
parent: john at arbash-meinel.com-20070504041242-lnhinwkv7wvsejg0
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 23:19:02 -0500
message:
Revert the pyrex implementation to its most basic form.
The fancier ones were causing segfaults.
------------------------------------------------------------
revno: 2474.1.8
merged: john at arbash-meinel.com-20070504041242-lnhinwkv7wvsejg0
parent: john at arbash-meinel.com-20070504035829-orbif7nnkim9md1t
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 23:12:42 -0500
message:
Fix the benchmarks to test what I thought I was testing earlier
------------------------------------------------------------
revno: 2474.1.7
merged: john at arbash-meinel.com-20070504035829-orbif7nnkim9md1t
parent: john at arbash-meinel.com-20070503234531-xt0tpuxwqgjn10l8
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 22:58:29 -0500
message:
Add some tests for a helper function that lets us
compare 2 paths in 'dirblock' mode, without splitting the strings.
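
A minimal sketch of what such a comparison could look like, in plain Python rather than the Pyrex used on this branch. The name cmp_by_dirs is assumed for illustration, and "dirblock mode" is taken here to mean the same ordering as comparing path1.split('/') with path2.split('/').

```python
from functools import cmp_to_key


def cmp_by_dirs(path1, path2):
    """Compare two paths component-wise without building split lists.

    Returns -1, 0 or 1. Assumed to match comparing path.split('/') lists.
    """
    for c1, c2 in zip(path1, path2):
        if c1 != c2:
            # The path whose component ends first (at '/') sorts first,
            # even though '/' has a higher byte value than e.g. '-'.
            if c1 == '/':
                return -1
            if c2 == '/':
                return 1
            return -1 if c1 < c2 else 1
    # No difference in the shared prefix: the shorter path sorts first.
    return (len(path1) > len(path2)) - (len(path1) < len(path2))


ordered = sorted(['a-b', 'a/b', 'a', 'b'], key=cmp_to_key(cmp_by_dirs))
# ordered == ['a', 'a/b', 'a-b', 'b']
```

A plain string comparison would put 'a-b' before 'a/b'; the special case for '/' is what keeps the contents of a directory ahead of sibling names that merely share its prefix.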
------------------------------------------------------------
revno: 2474.1.6
merged: john at arbash-meinel.com-20070503234531-xt0tpuxwqgjn10l8
parent: john at arbash-meinel.com-20070503234105-xwv4fcxn26d97d6u
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 18:45:31 -0500
message:
Use 10x the directories so that the timing falls around the 1s mark
------------------------------------------------------------
revno: 2474.1.5
merged: john at arbash-meinel.com-20070503234105-xwv4fcxn26d97d6u
parent: john at arbash-meinel.com-20070503233549-445n015iomhc8ppm
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 18:41:05 -0500
message:
Implement explicit handling of the no-cache version, which is even faster.
------------------------------------------------------------
revno: 2474.1.4
merged: john at arbash-meinel.com-20070503233549-445n015iomhc8ppm
parent: john at arbash-meinel.com-20070503233314-btj1vbd2qtod34kq
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 18:35:49 -0500
message:
Add benchmarks for dirstate.bisect_dirblocks, and implement bisect_dirblocks in pyrex.
Shows about a 2x performance improvement being in compiled C.
Also, at least on my Mac, it is faster without extra caching.
------------------------------------------------------------
revno: 2474.1.3
merged: john at arbash-meinel.com-20070503233314-btj1vbd2qtod34kq
parent: john at arbash-meinel.com-20070503211741-b51wshh2i5ecw50i
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 18:33:14 -0500
message:
remove the .c file for now, so it doesn't clutter things
------------------------------------------------------------
revno: 2474.1.2
merged: john at arbash-meinel.com-20070503211741-b51wshh2i5ecw50i
parent: john at arbash-meinel.com-20070503201137-qiijh6rvjo9p14wy
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 16:17:41 -0500
message:
Add benchmark tests for a couple DirState functions.
------------------------------------------------------------
revno: 2474.1.1
merged: john at arbash-meinel.com-20070503201137-qiijh6rvjo9p14wy
parent: pqm at pqm.ubuntu.com-20070430223205-x4uyrteryh0230fp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Thu 2007-05-03 15:11:37 -0500
message:
Create a Pyrex extension for reading the dirstate file.
Diff too large for email (2930 lines, the limit is 1000).