Rev 3744: (jam) When packing create a single new pack, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Fri Sep 26 22:11:34 BST 2008


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 3744
revision-id: pqm at pqm.ubuntu.com-20080926211130-ojyixbni0jpqoify
parent: pqm at pqm.ubuntu.com-20080926203628-5yzejrl21kt639nk
parent: john at arbash-meinel.com-20080926202824-934bshs1o0evsec4
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Fri 2008-09-26 22:11:30 +0100
message:
  (jam) When packing create a single new pack,
  	fixes bugs #242510 and #172644
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
  bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
    ------------------------------------------------------------
    revno: 3711.4.3
    revision-id: john at arbash-meinel.com-20080926202824-934bshs1o0evsec4
    parent: john at arbash-meinel.com-20080919154746-g07uetnih1s5flil
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: autopack_bug_242510
    timestamp: Fri 2008-09-26 15:28:24 -0500
    message:
      Small cleanups from Robert
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
    ------------------------------------------------------------
    revno: 3711.4.2
    revision-id: john at arbash-meinel.com-20080919154746-g07uetnih1s5flil
    parent: john at arbash-meinel.com-20080919150355-14s3zy2jndtlmc2q
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: autopack_bug_242510
    timestamp: Fri 2008-09-19 10:47:46 -0500
    message:
      Change the logic to solve it in a different way.
      
      Now autopack will always write out a single pack file. It uses
      the same 'fitting' logic, to determine which packs are
      sub-optimally sized and should be combined. However, when it
      finally comes to packing them, it just puts them in the same file.
      It is the same amount of I/O, it just leads to fewer final
      pack files.
      
      Some analysis should be done on the long-term effects of this.
      
      As a side effect, it seems immune to bug #242510. The only way
      to trigger it was to have a large pack built, and then wanting
      a smaller pack. However, if you are genuinely building a pack,
      then there is something to be combined with.
    modified:
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
    ------------------------------------------------------------
    revno: 3711.4.1
    revision-id: john at arbash-meinel.com-20080919150355-14s3zy2jndtlmc2q
    parent: pqm at pqm.ubuntu.com-20080917230446-p0wippqwikt511sp
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: autopack_bug_242510
    timestamp: Fri 2008-09-19 10:03:55 -0500
    message:
      Fix bug #242510, when determining the autopack sequence,
      if the last entry is in a pack by itself, don't repack it.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
=== modified file 'NEWS'
--- a/NEWS	2008-09-26 15:57:40 +0000
+++ b/NEWS	2008-09-26 21:11:30 +0000
@@ -87,6 +87,11 @@
     * Make the first line of the manpage preamble a comment again.
       (David Futcher, #242106)
 
+    * The autopacking logic will now always create a single new pack from
+      all of the content which it deems is worth moving. This avoids the
+      'repack a single pack' bug and should result in better packing
+      overall.  (John Arbash Meinel, #242510, #172644)
+
   DOCUMENTATION:
 
     * Explain revision/range identifiers. (Daniel Clemente)

=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py	2008-09-02 03:17:08 +0000
+++ b/bzrlib/repofmt/pack_repo.py	2008-09-26 20:28:24 +0000
@@ -1276,8 +1276,8 @@
         # plan out what packs to keep, and what to reorganise
         while len(existing_packs):
             # take the largest pack, and if its less than the head of the
-            # distribution chart we will include its contents in the new pack for
-            # that position. If its larger, we remove its size from the
+            # distribution chart we will include its contents in the new pack
+            # for that position. If its larger, we remove its size from the
             # distribution chart
             next_pack_rev_count, next_pack = existing_packs.pop(0)
             if next_pack_rev_count >= pack_distribution[0]:
@@ -1300,8 +1300,18 @@
                     # this pack is used up, shift left.
                     del pack_distribution[0]
                     pack_operations.append([0, []])
-        
-        return pack_operations
+        # Now that we know which pack files we want to move, shove them all
+        # into a single pack file.
+        final_rev_count = 0
+        final_pack_list = []
+        for num_revs, pack_files in pack_operations:
+            final_rev_count += num_revs
+            final_pack_list.extend(pack_files)
+        if len(final_pack_list) == 1:
+            raise AssertionError('We somehow generated an autopack with a'
+                ' single pack file being moved.')
+            return []
+        return [[final_rev_count, final_pack_list]]
 
     def ensure_loaded(self):
         # NB: if you see an assertion error here, its probably access against

=== modified file 'bzrlib/tests/test_repository.py'
--- a/bzrlib/tests/test_repository.py	2008-09-04 21:10:07 +0000
+++ b/bzrlib/tests/test_repository.py	2008-09-19 15:47:46 +0000
@@ -744,13 +744,16 @@
     def get_format(self):
         return bzrdir.format_registry.make_bzrdir('pack-0.92')
 
+    def get_packs(self):
+        format = self.get_format()
+        repo = self.make_repository('.', format=format)
+        return repo._pack_collection
+
     def test__max_pack_count(self):
         """The maximum pack count is a function of the number of revisions."""
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
         # no revisions - one pack, so that we can have a revision free repo
         # without it blowing up
+        packs = self.get_packs()
         self.assertEqual(1, packs._max_pack_count(0))
         # after that the sum of the digits, - check the first 1-9
         self.assertEqual(1, packs._max_pack_count(1))
@@ -772,21 +775,16 @@
         self.assertEqual(25, packs._max_pack_count(112894))
 
     def test_pack_distribution_zero(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         self.assertEqual([0], packs.pack_distribution(0))
 
     def test_ensure_loaded_unlocked(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
+        packs = self.get_packs()
         self.assertRaises(errors.ObjectNotLocked,
-                          repo._pack_collection.ensure_loaded)
+                          packs.ensure_loaded)
 
     def test_pack_distribution_one_to_nine(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         self.assertEqual([1],
             packs.pack_distribution(1))
         self.assertEqual([1, 1],
@@ -808,9 +806,7 @@
 
     def test_pack_distribution_stable_at_boundaries(self):
         """When there are multi-rev packs the counts are stable."""
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         # in 10s:
         self.assertEqual([10], packs.pack_distribution(10))
         self.assertEqual([10, 1], packs.pack_distribution(11))
@@ -825,9 +821,7 @@
         self.assertEqual([100, 100, 10, 1], packs.pack_distribution(211))
 
     def test_plan_pack_operations_2009_revisions_skip_all_packs(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         existing_packs = [(2000, "big"), (9, "medium")]
         # rev count - 2009 -> 2x1000 + 9x1
         pack_operations = packs.plan_autopack_combinations(
@@ -835,9 +829,7 @@
         self.assertEqual([], pack_operations)
 
     def test_plan_pack_operations_2010_revisions_skip_all_packs(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         existing_packs = [(2000, "big"), (9, "medium"), (1, "single")]
         # rev count - 2010 -> 2x1000 + 1x10
         pack_operations = packs.plan_autopack_combinations(
@@ -845,15 +837,28 @@
         self.assertEqual([], pack_operations)
 
     def test_plan_pack_operations_2010_combines_smallest_two(self):
-        format = self.get_format()
-        repo = self.make_repository('.', format=format)
-        packs = repo._pack_collection
+        packs = self.get_packs()
         existing_packs = [(1999, "big"), (9, "medium"), (1, "single2"),
             (1, "single1")]
         # rev count - 2010 -> 2x1000 + 1x10 (3)
         pack_operations = packs.plan_autopack_combinations(
             existing_packs, [1000, 1000, 10])
-        self.assertEqual([[2, ["single2", "single1"]], [0, []]], pack_operations)
+        self.assertEqual([[2, ["single2", "single1"]]], pack_operations)
+
+    def test_plan_pack_operations_creates_a_single_op(self):
+        packs = self.get_packs()
+        existing_packs = [(50, 'a'), (40, 'b'), (30, 'c'), (10, 'd'),
+                          (10, 'e'), (6, 'f'), (4, 'g')]
+        # rev count 150 -> 1x100 and 5x10
+        # The two size 10 packs do not need to be touched. The 50, 40, 30 would
+        # be combined into a single 120 size pack, and the 6 & 4 would
+        # becombined into a size 10 pack. However, if we have to rewrite them,
+        # we save a pack file with no increased I/O by putting them into the
+        # same file.
+        distribution = packs.pack_distribution(150)
+        pack_operations = packs.plan_autopack_combinations(existing_packs,
+                                                           distribution)
+        self.assertEqual([[130, ['a', 'b', 'c', 'f', 'g']]], pack_operations)
 
     def test_all_packs_none(self):
         format = self.get_format()




More information about the bazaar-commits mailing list