Rev 2617: Improve merge speed in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Fri Jul 13 08:46:30 BST 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 2617
revision-id: pqm at pqm.ubuntu.com-20070713074627-93zxs9uh528y0fki
parent: pqm at pqm.ubuntu.com-20070713060449-rydsxz28x12l2ksm
parent: aaron.bentley at utoronto.ca-20070713061227-0pawzjlh1hr26kua
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Fri 2007-07-13 08:46:27 +0100
message:
Improve merge speed
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/merge.py merge.py-20050513021216-953b65a438527106
bzrlib/transform.py transform.py-20060105172343-dd99e54394d91687
bzrlib/workingtree_4.py workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
------------------------------------------------------------
revno: 2590.2.25
merged: aaron.bentley at utoronto.ca-20070713061227-0pawzjlh1hr26kua
parent: aaron.bentley at utoronto.ca-20070713061029-9ebz58a3yvvax5du
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Fri 2007-07-13 02:12:27 -0400
message:
Update NEWS
------------------------------------------------------------
revno: 2590.2.24
merged: aaron.bentley at utoronto.ca-20070713061029-9ebz58a3yvvax5du
parent: aaron.bentley at utoronto.ca-20070712235607-86ira1zktfq61w43
parent: pqm at pqm.ubuntu.com-20070713060449-rydsxz28x12l2ksm
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Fri 2007-07-13 02:10:29 -0400
message:
Merge bzr.dev
------------------------------------------------------------
revno: 2590.2.23
merged: aaron.bentley at utoronto.ca-20070712235607-86ira1zktfq61w43
parent: aaron.bentley at utoronto.ca-20070712234321-s54cgbnfpecqjs54
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Thu 2007-07-12 19:56:07 -0400
message:
Fix ensure_null redundancy
------------------------------------------------------------
revno: 2590.2.22
merged: aaron.bentley at utoronto.ca-20070712234321-s54cgbnfpecqjs54
parent: abentley at panoramicfeedback.com-20070712204351-d1vl9j526yscbfq3
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Thu 2007-07-12 19:43:21 -0400
message:
Remove cruft
------------------------------------------------------------
revno: 2590.2.21
merged: abentley at panoramicfeedback.com-20070712204351-d1vl9j526yscbfq3
parent: abentley at panoramicfeedback.com-20070712192754-30qegoe02w0uosra
parent: pqm at pqm.ubuntu.com-20070712124245-vaw0ajlwrexg8d0m
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 16:43:51 -0400
message:
Merge bzr.dev
------------------------------------------------------------
revno: 2590.2.20
merged: abentley at panoramicfeedback.com-20070712192754-30qegoe02w0uosra
parent: abentley at panoramicfeedback.com-20070712184237-caa2rq3zhg2949pt
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 15:27:54 -0400
message:
Fix handling of ghost base trees
------------------------------------------------------------
revno: 2590.2.19
merged: abentley at panoramicfeedback.com-20070712184237-caa2rq3zhg2949pt
parent: abentley at panoramicfeedback.com-20070712181917-jjz09lu21tdtu5qw
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 14:42:37 -0400
message:
Avoid fetch within a repository
------------------------------------------------------------
revno: 2590.2.18
merged: abentley at panoramicfeedback.com-20070712181917-jjz09lu21tdtu5qw
parent: abentley at panoramicfeedback.com-20070712174652-vof6knvyoca27gsk
parent: aaron.bentley at utoronto.ca-20070712175111-boeqhuztvhasep34
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 14:19:17 -0400
message:
Merge is_ancestor fix
------------------------------------------------------------
revno: 2590.2.10.1.2
merged: aaron.bentley at utoronto.ca-20070712175111-boeqhuztvhasep34
parent: aaron.bentley at utoronto.ca-20070712021619-fje8zjrqmsgu1wyr
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:51:11 -0400
message:
Avoid redundant is_ancestor checks
------------------------------------------------------------
revno: 2590.2.17
merged: abentley at panoramicfeedback.com-20070712174652-vof6knvyoca27gsk
parent: abentley at panoramicfeedback.com-20070712173938-x9v9abluk03n8kln
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:46:52 -0400
message:
Avoid redundant conflict check
------------------------------------------------------------
revno: 2590.2.16
merged: abentley at panoramicfeedback.com-20070712173938-x9v9abluk03n8kln
parent: abentley at panoramicfeedback.com-20070712171633-98rpuw7updpnxpxt
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:39:38 -0400
message:
Shortcut duplicate_entries conflict check if no new names introduced
------------------------------------------------------------
revno: 2590.2.15
merged: abentley at panoramicfeedback.com-20070712171633-98rpuw7updpnxpxt
parent: abentley at panoramicfeedback.com-20070712171411-k7kuotpy11aqwnoo
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:16:33 -0400
message:
Remove unused private function
------------------------------------------------------------
revno: 2590.2.14
merged: abentley at panoramicfeedback.com-20070712171411-k7kuotpy11aqwnoo
parent: abentley at panoramicfeedback.com-20070712171112-zvtvd5j0hceoheda
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:14:11 -0400
message:
Remove ancient inventory regeneration code
------------------------------------------------------------
revno: 2590.2.13
merged: abentley at panoramicfeedback.com-20070712171112-zvtvd5j0hceoheda
parent: abentley at panoramicfeedback.com-20070712170702-kefnhuo926w6cvhz
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:11:12 -0400
message:
Make find_base implement the base_finding code
------------------------------------------------------------
revno: 2590.2.12
merged: abentley at panoramicfeedback.com-20070712170702-kefnhuo926w6cvhz
parent: abentley at panoramicfeedback.com-20070712170611-rj70hbjdk5titkqt
parent: aaron.bentley at utoronto.ca-20070712021619-fje8zjrqmsgu1wyr
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:07:02 -0400
message:
Merge bzr.dev
------------------------------------------------------------
revno: 2590.2.10.1.1
merged: aaron.bentley at utoronto.ca-20070712021619-fje8zjrqmsgu1wyr
parent: abentley at panoramicfeedback.com-20070710181401-lfw19orwavwyk603
parent: pqm at pqm.ubuntu.com-20070712021235-0a3tqhdt9nxk0x9y
committer: Aaron Bentley <aaron.bentley at utoronto.ca>
branch nick: changes-merge
timestamp: Wed 2007-07-11 22:16:19 -0400
message:
Merge from bzr.dev
------------------------------------------------------------
revno: 2590.2.11
merged: abentley at panoramicfeedback.com-20070712170611-rj70hbjdk5titkqt
parent: abentley at panoramicfeedback.com-20070710181401-lfw19orwavwyk603
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: changes-merge
timestamp: Thu 2007-07-12 13:06:11 -0400
message:
Aggressively cache trees, use dirstate. re-implement _add_parent.
=== modified file 'NEWS'
--- a/NEWS 2007-07-13 04:12:12 +0000
+++ b/NEWS 2007-07-13 06:12:27 +0000
@@ -23,6 +23,9 @@
read from the repository, such as a 1s => 0.75s improvement in
``bzr diff`` when there are changes to be shown. (John Arbash Meinel)
+ * Merge is now faster. Depending on the scenario, it can be more than 2x
+ faster. (Aaron Bentley)
+
LIBRARY API BREAKS:
* Deprecated dictionary ``bzrlib.option.SHORT_OPTIONS`` removed.
=== modified file 'bzrlib/merge.py'
--- a/bzrlib/merge.py 2007-07-12 07:37:18 +0000
+++ b/bzrlib/merge.py 2007-07-12 23:56:07 +0000
@@ -20,6 +20,7 @@
import warnings
from bzrlib import (
+ errors,
osutils,
registry,
revision as _mod_revision,
@@ -53,44 +54,6 @@
# TODO: Report back as changes are merged in
-def _get_tree(treespec, local_branch=None):
- from bzrlib import workingtree
- location, revno = treespec
- if revno is None:
- tree = workingtree.WorkingTree.open_containing(location)[0]
- return tree.branch, tree
- branch = Branch.open_containing(location)[0]
- if revno == -1:
- revision_id = branch.last_revision()
- else:
- revision_id = branch.get_rev_id(revno)
- if revision_id is None:
- revision_id = NULL_REVISION
- return branch, _get_revid_tree(branch, revision_id, local_branch)
-
-
-def _get_revid_tree(branch, revision_id, local_branch):
- if revision_id is None:
- base_tree = branch.bzrdir.open_workingtree()
- else:
- if local_branch is not None:
- if local_branch.base != branch.base:
- local_branch.fetch(branch, revision_id)
- base_tree = local_branch.repository.revision_tree(revision_id)
- else:
- base_tree = branch.repository.revision_tree(revision_id)
- return base_tree
-
-
-def _get_revid_tree_from_tree(tree, revision_id, local_branch):
- if revision_id is None:
- return tree
- if local_branch is not None:
- if local_branch.base != tree.branch.base:
- local_branch.fetch(tree.branch, revision_id)
- return local_branch.repository.revision_tree(revision_id)
- return tree.branch.repository.revision_tree(revision_id)
-
def transform_tree(from_tree, to_tree, interesting_ids=None):
merge_inner(from_tree.branch, to_tree, from_tree, ignore_zero=True,
@@ -123,14 +86,36 @@
self.pp = None
self.recurse = recurse
self.change_reporter = change_reporter
-
- def revision_tree(self, revision_id):
- return self.this_branch.repository.revision_tree(revision_id)
+ self._cached_trees = {}
+
+ def revision_tree(self, revision_id, branch=None):
+ if revision_id not in self._cached_trees:
+ if branch is None:
+ branch = self.this_branch
+ try:
+ tree = self.this_tree.revision_tree(revision_id)
+ except errors.NoSuchRevisionInTree:
+ tree = branch.repository.revision_tree(revision_id)
+ self._cached_trees[revision_id] = tree
+ return self._cached_trees[revision_id]
+
+ def _get_tree(self, treespec):
+ from bzrlib import workingtree
+ location, revno = treespec
+ if revno is None:
+ tree = workingtree.WorkingTree.open_containing(location)[0]
+ return tree.branch, tree
+ branch = Branch.open_containing(location)[0]
+ if revno == -1:
+ revision_id = branch.last_revision()
+ else:
+ revision_id = branch.get_rev_id(revno)
+ revision_id = ensure_null(revision_id)
+ return branch, self.revision_tree(revision_id, branch)
def ensure_revision_trees(self):
if self.this_revision_tree is None:
- self.this_basis_tree = self.this_branch.repository.revision_tree(
- self.this_basis)
+ self.this_basis_tree = self.revision_tree(self.this_basis)
if self.this_basis == self.this_rev_id:
self.this_revision_tree = self.this_basis_tree
@@ -166,45 +151,40 @@
raise BzrCommandError("Working tree has uncommitted changes.")
def compare_basis(self):
- changes = self.this_tree.changes_from(self.this_tree.basis_tree())
+ try:
+ basis_tree = self.revision_tree(self.this_tree.last_revision())
+ except errors.RevisionNotPresent:
+ basis_tree = self.this_tree.basis_tree()
+ changes = self.this_tree.changes_from(basis_tree)
if not changes.has_changed():
self.this_rev_id = self.this_basis
def set_interesting_files(self, file_list):
self.interesting_files = file_list
- def _set_interesting_files(self, file_list):
- """Set the list of interesting ids from a list of files."""
- if file_list is None:
- self.interesting_ids = None
- return
-
- interesting_ids = set()
- for path in file_list:
- found_id = False
- # TODO: jam 20070226 The trees are not locked at this time,
- # wouldn't it make merge faster if it locks everything in the
- # beginning? It locks at do_merge time, but this happens
- # before that.
- for tree in (self.this_tree, self.base_tree, self.other_tree):
- file_id = tree.path2id(path)
- if file_id is not None:
- interesting_ids.add(file_id)
- found_id = True
- if not found_id:
- raise NotVersionedError(path=path)
- self.interesting_ids = interesting_ids
-
def set_pending(self):
- if not self.base_is_ancestor:
- return
- if self.other_rev_id is None:
- return
- ancestry = set(self.this_branch.repository.get_ancestry(
- self.this_basis, topo_sorted=False))
- if self.other_rev_id in ancestry:
- return
- self.this_tree.add_parent_tree((self.other_rev_id, self.other_tree))
+ if not self.base_is_ancestor or not self.base_is_other_ancestor:
+ return
+ self._add_parent()
+
+ def _add_parent(self):
+ new_parents = self.this_tree.get_parent_ids() + [self.other_rev_id]
+ new_parent_trees = []
+ for revision_id in new_parents:
+ try:
+ tree = self.revision_tree(revision_id)
+ except errors.RevisionNotPresent:
+ tree = None
+ else:
+ tree.lock_read()
+ new_parent_trees.append((revision_id, tree))
+ try:
+ self.this_tree.set_parent_trees(new_parent_trees,
+ allow_leftmost_as_ghost=True)
+ finally:
+ for _revision_id, tree in new_parent_trees:
+ if tree is not None:
+ tree.unlock()
def set_other(self, other_revision):
"""Set the revision and tree to merge from.
@@ -213,8 +193,7 @@
:param other_revision: The [path, revision] list to merge from.
"""
- self.other_branch, self.other_tree = _get_tree(other_revision,
- self.this_branch)
+ self.other_branch, self.other_tree = self._get_tree(other_revision)
if other_revision[1] == -1:
self.other_rev_id = _mod_revision.ensure_null(
self.other_branch.last_revision())
@@ -229,9 +208,9 @@
self.other_basis = self.other_branch.last_revision()
if self.other_basis is None:
raise NoCommits(self.other_branch)
- if self.other_branch.base != self.this_branch.base:
- self.this_branch.fetch(self.other_branch,
- last_revision=self.other_basis)
+ if self.other_rev_id is not None:
+ self._cached_trees[self.other_rev_id] = self.other_tree
+ self._maybe_fetch(self.other_branch,self.this_branch, self.other_basis)
def set_other_revision(self, revision_id, other_branch):
"""Set 'other' based on a branch and revision id
@@ -241,12 +220,29 @@
"""
self.other_rev_id = revision_id
self.other_branch = other_branch
- self.this_branch.fetch(other_branch, self.other_rev_id)
+ self._maybe_fetch(other_branch, self.this_branch, self.other_rev_id)
self.other_tree = self.revision_tree(revision_id)
self.other_basis = revision_id
+ def _maybe_fetch(self, source, target, revision_id):
+ if (source.repository.bzrdir.root_transport.base !=
+ target.repository.bzrdir.root_transport.base):
+ target.fetch(source, revision_id)
+
def find_base(self):
- self.set_base([None, None])
+ this_repo = self.this_branch.repository
+ graph = this_repo.get_graph()
+ revisions = [ensure_null(self.this_basis),
+ ensure_null(self.other_basis)]
+ if NULL_REVISION in revisions:
+ self.base_rev_id = NULL_REVISION
+ else:
+ self.base_rev_id = graph.find_unique_lca(*revisions)
+ if self.base_rev_id == NULL_REVISION:
+ raise UnrelatedBranches()
+ self.base_tree = self.revision_tree(self.base_rev_id)
+ self.base_is_ancestor = True
+ self.base_is_other_ancestor = True
def set_base(self, base_revision):
"""Set the base revision to use for the merge.
@@ -255,29 +251,9 @@
"""
mutter("doing merge() with no base_revision specified")
if base_revision == [None, None]:
- try:
- pb = ui.ui_factory.nested_progress_bar()
- try:
- this_repo = self.this_branch.repository
- graph = this_repo.get_graph()
- revisions = [ensure_null(self.this_basis),
- ensure_null(self.other_basis)]
- if NULL_REVISION in revisions:
- self.base_rev_id = NULL_REVISION
- else:
- self.base_rev_id = graph.find_unique_lca(*revisions)
- if self.base_rev_id == NULL_REVISION:
- raise UnrelatedBranches()
- finally:
- pb.finished()
- except NoCommonAncestor:
- raise UnrelatedBranches()
- self.base_tree = _get_revid_tree_from_tree(self.this_tree,
- self.base_rev_id,
- None)
- self.base_is_ancestor = True
+ self.find_base()
else:
- base_branch, self.base_tree = _get_tree(base_revision)
+ base_branch, self.base_tree = self._get_tree(base_revision)
if base_revision[1] == -1:
self.base_rev_id = base_branch.last_revision()
elif base_revision[1] is None:
@@ -285,11 +261,13 @@
else:
self.base_rev_id = _mod_revision.ensure_null(
base_branch.get_rev_id(base_revision[1]))
- if self.this_branch.base != base_branch.base:
- self.this_branch.fetch(base_branch)
+ self._maybe_fetch(base_branch, self.this_branch, self.base_rev_id)
self.base_is_ancestor = is_ancestor(self.this_basis,
self.base_rev_id,
self.this_branch)
+ self.base_is_other_ancestor = is_ancestor(self.other_basis,
+ self.base_rev_id,
+ self.this_branch)
def do_merge(self):
kwargs = {'working_tree':self.this_tree, 'this_tree': self.this_tree,
@@ -349,75 +327,6 @@
return len(merge.cooked_conflicts)
- def regen_inventory(self, new_entries):
- old_entries = self.this_tree.read_working_inventory()
- new_inventory = {}
- by_path = {}
- new_entries_map = {}
- for path, file_id in new_entries:
- if path is None:
- continue
- new_entries_map[file_id] = path
-
- def id2path(file_id):
- path = new_entries_map.get(file_id)
- if path is not None:
- return path
- entry = old_entries[file_id]
- if entry.parent_id is None:
- return entry.name
- return pathjoin(id2path(entry.parent_id), entry.name)
-
- for file_id in old_entries:
- entry = old_entries[file_id]
- path = id2path(file_id)
- if file_id in self.base_tree.inventory:
- executable = getattr(self.base_tree.inventory[file_id], 'executable', False)
- else:
- executable = getattr(entry, 'executable', False)
- new_inventory[file_id] = (path, file_id, entry.parent_id,
- entry.kind, executable)
-
- by_path[path] = file_id
-
- deletions = 0
- insertions = 0
- new_path_list = []
- for path, file_id in new_entries:
- if path is None:
- del new_inventory[file_id]
- deletions += 1
- else:
- new_path_list.append((path, file_id))
- if file_id not in old_entries:
- insertions += 1
- # Ensure no file is added before its parent
- new_path_list.sort()
- for path, file_id in new_path_list:
- if path == '':
- parent = None
- else:
- parent = by_path[os.path.dirname(path)]
- abspath = pathjoin(self.this_tree.basedir, path)
- kind = osutils.file_kind(abspath)
- if file_id in self.base_tree.inventory:
- executable = getattr(self.base_tree.inventory[file_id], 'executable', False)
- else:
- executable = False
- new_inventory[file_id] = (path, file_id, parent, kind, executable)
- by_path[path] = file_id
-
- # Get a list in insertion order
- new_inventory_list = new_inventory.values()
- mutter ("""Inventory regeneration:
- old length: %i insertions: %i deletions: %i new_length: %i"""\
- % (len(old_entries), insertions, deletions,
- len(new_inventory_list)))
- assert len(new_inventory_list) == len(old_entries) + insertions\
- - deletions
- new_inventory_list.sort()
- return new_inventory_list
-
class Merge3Merger(object):
"""Three-way merger that uses the merge3 text merger"""
@@ -507,7 +416,7 @@
for conflict in self.cooked_conflicts:
warning(conflict)
self.pp.next_phase()
- results = self.tt.apply()
+ results = self.tt.apply(no_conflicts=True)
self.write_modified(results)
try:
working_tree.add_conflicts(self.cooked_conflicts)
=== modified file 'bzrlib/transform.py'
--- a/bzrlib/transform.py 2007-07-06 17:21:34 +0000
+++ b/bzrlib/transform.py 2007-07-12 17:46:52 +0000
@@ -717,6 +717,8 @@
def _duplicate_entries(self, by_parent):
"""No directory may have two entries with the same name."""
conflicts = []
+ if (self._new_name, self._new_parent) == ({}, {}):
+ return conflicts
for children in by_parent.itervalues():
name_ids = [(self.final_name(t), t) for t in children]
name_ids.sort()
@@ -783,17 +785,21 @@
return True
return False
- def apply(self):
+ def apply(self, no_conflicts=False):
"""Apply all changes to the inventory and filesystem.
If filesystem or inventory conflicts are present, MalformedTransform
will be thrown.
If apply succeeds, finalize is not necessary.
+
+ :param no_conflicts: if True, the caller guarantees there are no
+ conflicts, so no check is made.
"""
- conflicts = self.find_conflicts()
- if len(conflicts) != 0:
- raise MalformedTransform(conflicts=conflicts)
+ if not no_conflicts:
+ conflicts = self.find_conflicts()
+ if len(conflicts) != 0:
+ raise MalformedTransform(conflicts=conflicts)
inv = self._tree.inventory
inventory_delta = []
child_pb = bzrlib.ui.ui_factory.nested_progress_bar()
=== modified file 'bzrlib/workingtree_4.py'
--- a/bzrlib/workingtree_4.py 2007-07-12 07:37:18 +0000
+++ b/bzrlib/workingtree_4.py 2007-07-12 20:43:51 +0000
@@ -1490,6 +1490,10 @@
return parent_details[1]
return None
+ def get_weave(self, file_id):
+ return self._repository.weave_store.get_weave(file_id,
+ self._repository.get_transaction())
+
def get_file(self, file_id):
return StringIO(self.get_file_text(file_id))
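
Note on the caching change in bzrlib/merge.py: the new Merger.revision_tree consults a per-merge dictionary first, then asks the working tree, and falls back to the repository only when the working tree does not hold that revision. The following is a minimal standalone sketch of that lookup order, not the bzrlib code itself. FakeSource, TreeCache, and the local NoSuchRevisionInTree class are hypothetical stand-ins so the example runs without bzrlib.

```python
class NoSuchRevisionInTree(Exception):
    """Hypothetical stand-in for bzrlib.errors.NoSuchRevisionInTree."""


class FakeSource(object):
    """Hypothetical tree provider that counts how often it is queried."""

    def __init__(self, trees):
        self._trees = trees
        self.calls = 0

    def revision_tree(self, revision_id):
        self.calls += 1
        try:
            return self._trees[revision_id]
        except KeyError:
            raise NoSuchRevisionInTree(revision_id)


class TreeCache(object):
    """Cache trees by revision id, preferring the working tree as a source."""

    def __init__(self, working_tree, repository):
        self._working_tree = working_tree
        self._repository = repository
        self._cached_trees = {}

    def revision_tree(self, revision_id):
        if revision_id not in self._cached_trees:
            try:
                # Cheap path: the working tree may already hold this tree.
                tree = self._working_tree.revision_tree(revision_id)
            except NoSuchRevisionInTree:
                # Fallback: read the tree from the repository.
                tree = self._repository.revision_tree(revision_id)
            self._cached_trees[revision_id] = tree
        return self._cached_trees[revision_id]


working_tree = FakeSource({"rev-a": "tree-a"})
repository = FakeSource({"rev-a": "tree-a-from-repo", "rev-b": "tree-b"})
cache = TreeCache(working_tree, repository)
first = cache.revision_tree("rev-a")   # served by the working tree
second = cache.revision_tree("rev-b")  # falls back to the repository
third = cache.revision_tree("rev-b")   # served from the cache
```

In this sketch the repeated request for "rev-b" touches neither source, which is the effect the patch relies on when the same base, this, and other trees are requested several times during one merge.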