Checkouts? Or just light bound branches?

Sun Jan 29 23:28:16 GMT 2006

On Fri, 2006-01-27 at 21:38 +0100, Jan Hudec wrote:
> On Fri, Jan 27, 2006 at 14:29:07 -0500, Aaron Bentley wrote:
> > Hi Jan,
> > 
> > I think you're misreading me.  When I talk about checkouts below, I'm
> > referring to the concept of working-tree-plus-last-revision, not
> > bound-branch-with-shared-repository.
> 
> We seem to be talking past each other.
> 
> There are two ways to do 'checkouts':
>  1. Add concept of current revision to working tree.
>  2. Use a bound branch to put that concept in.

I don't think these are a matter of choice: a working tree *has* a
property of current revision; currently we have bugs that relate to
failing to record that separately.

It's been covered in this thread already, but here's the core again:

Our current behaviour:
Start with a branch 'a' (with working tree) on my local disk. 
Bind a branch to that over sftp. ('b')
Commit in 'b', which results in a warning about the working tree in 'a'
not being updated.
Now, locally cd to 'a' and run bzr diff.
What you get is a combination of any local changes plus the inverse
delta of the commit you did in 'b'.

This is a bug - if the branches are really bound then doing a diff in
'a' should only show user introduced local changes. Saying that 'a'
should not have a working tree is in this case inappropriate - we'd be
telling the user how to work, and I can certainly think of multiple use
cases for this exact layout [for example, a website under vcs, or using
my laptop and desktop interchangeably].

So before we talk model, we need to see what is required to solve this.
And I think it's clear that to solve it we need a last-revision property
of the working tree that is left at its previous value during that sftp
operation that the 'commit' performs.
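
To make the failure concrete, here is a toy model of the scenario. The
Branch and WorkingTree classes below are hypothetical stand-ins written for
this mail, not bzrlib's API; the only point is that a tree which records
its own last revision is unaffected when the branch head moves underneath
it.

```python
# Toy model only: these classes are hypothetical and are not bzrlib's API.

class Branch:
    def __init__(self):
        self.revisions = []          # list of (revision_id, text) snapshots

    def last_revision(self):
        return self.revisions[-1][0] if self.revisions else None

    def text_at(self, revision_id):
        return dict(self.revisions)[revision_id]

    def append(self, revision_id, text):
        self.revisions.append((revision_id, text))


class WorkingTree:
    def __init__(self, branch):
        self.branch = branch
        self.text = branch.text_at(branch.last_revision())
        # Recorded separately, so a remote commit cannot move it.
        self._last_revision = branch.last_revision()

    def last_revision(self):
        return self._last_revision

    def changed_against_branch_tip(self):
        # Behaviour described as the bug: compare with the branch's current head.
        return self.text != self.branch.text_at(self.branch.last_revision())

    def changed_against_own_basis(self):
        # Proposed behaviour: compare with the revision the tree was last updated to.
        return self.text != self.branch.text_at(self._last_revision)


a = Branch()
a.append("rev-1", "hello\n")
tree_a = WorkingTree(a)

# A commit arrives from bound branch 'b' over sftp; the files in 'a' are untouched.
a.append("rev-2", "hello world\n")

buggy = tree_a.changed_against_branch_tip()   # reports a change the user never made
fixed = tree_a.changed_against_own_basis()    # reports only user-made changes
```

With no local edits, the comparison against the branch head reports a
difference (the inverse of the remote commit), while the comparison
against the tree's own recorded revision reports none.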

Now, in terms of model, if we do this via reusing bound branches, then
the branch 'a' above will actually need to be two branches - a local
lightly bound branch representing the working tree content, and a normal
branch with storage representing the logical branch, and both of these
in the same place.

So now Occam's razor comes in: we have two proposals:
 a) Serialize the [currently implicit] last revision that represents the
text the working tree was last updated to by bzr.
 b) Use a bound branch everywhere that a working tree is present to
record the same implicit last revision.

b) seems *far* more complex to me.

So, let's go with (a) to fix the bug with the user story I presented at
the top of the email.

Now, there is a separate feature request, that we be able to stop
emitting the 'Warning: working tree is not being updated' when pushing
over sftp if there is no working tree. The current plan to implement
this is to have a metaformat where working trees may be not present, as
determined by the existence of a control directory. This means that due
to a different design goal, a branch logically does not have to have a
working tree.

Now we can move onto checkouts. Checkouts are defined not in terms of
implementation but of behaviour:
 * Only as large as the current source code
 * Cannot commit or do other operations offline
 * Can be out of sync with the branch.

We have multiple options for implementing these: we can add a Checkout
class, we can reuse bound branches (by extending them to have no local
storage but only when there is a working tree), or we can reuse working
trees (by extending them to record the location the branch can be found
at).

Note that in *none* of these cases do we have to add a last-revision property
to any objects: we have it in both Branch and WorkingTree already. Also
note that the issues with handling 'update' when the last-revision the
user created/updated to is not == the last revision on the branch are
not altered by the plan to implement checkouts, because we have to have
it anyway to handle the working tree's last-revision being out of date.

Given the above, the question becomes - which is simpler/better:
 a) add a new class
 b) add a flag to bound branches to say 'storage is at the
bound-to branch' AND 'I must have a working tree'.
 c) add a marker to working tree recording where the branch is rather
than have it always at '.'. (And this could be recorded in the branch
control dir now that I think of it - as a reference object [see my other
mail for details])

Here, I think (c) is clearly better because rather than introducing an
'either-or' behaviour to branch, we do not need to change the working
tree code at all, simply by adding a pointer to the branch in the
control dir.
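
As a rough illustration of (c), assuming the control dir can carry an
optional reference to where the branch lives: the key name
'branch_reference' and the dict standing in for the control dir are both
invented here and are not a real bzr format.

```python
# Hypothetical sketch: 'branch_reference' and the dict layout are invented
# for illustration and are not a real bzr control-dir format.

def branch_location(tree_dir, control_dir):
    """Return where the branch for the tree at tree_dir lives.

    control_dir is a plain dict standing in for the control directory.
    Without a reference, the branch is assumed to sit next to the tree.
    """
    return control_dir.get("branch_reference", tree_dir)


# Ordinary branch with working tree: branch and tree share a directory.
standalone = branch_location("/home/me/a", {})

# Checkout: the tree is local, the branch is found elsewhere.
checkout = branch_location(
    "/home/me/co", {"branch_reference": "sftp://host/srv/a"}
)
```

The working tree code only ever asks for "my branch"; whether that
resolves to '.' or to a remote location is decided in one place.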

Now there is some possibly duplicate code, which started this whole
design discussion. I think it boils down to a couple of cases:
 * aborting commits when the object's last-revision is not the same as
that of the branch we are committing to. [commit]
 * handling the merge of objects with an out-of-date
last-revision up to the current head last-revision. [update]

pull might have some code changes for bound branches, but they don't seem
the same in wt and branch to me.

It's important to note that bound branches have the same check on commit
as working tree - self.last_revision == self.branch.last_revision for
working tree, and self.last_revision == self.bound_branch.last_revision
for the bound branch check. 
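
Since the two checks have the same shape, they could in principle share
one helper. A minimal sketch, where OutOfDate and check_up_to_date are
hypothetical names and not anything in bzrlib:

```python
# Hypothetical sketch; names are illustrative, not bzrlib's API.

class OutOfDate(Exception):
    """Raised when a commit would be based on a stale last-revision."""


def check_up_to_date(own_last_revision, target_last_revision):
    """Abort a commit if our basis is not the target's current head."""
    if own_last_revision != target_last_revision:
        raise OutOfDate(
            f"basis {own_last_revision!r} does not match "
            f"target head {target_last_revision!r}"
        )


# Working tree committing to its branch: revisions match, commit may proceed.
check_up_to_date("rev-2", "rev-2")

# Bound branch committing to the branch it is bound to: the other side has
# moved on, so the commit is aborted.
try:
    check_up_to_date("rev-2", "rev-3")
    aborted = False
except OutOfDate:
    aborted = True
```

The caller supplies the pair of revisions, so the helper does not need to
know whether it is serving a working tree or a bound branch.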

For update however, there are significant differences: when a bound
branch has diverged, rather than being a prefix, we need to turn the
divergent commits into a pending-merge. This isn't something we need to
do with working trees, as they cannot diverge, being purely passive
trees.

Update in a working tree is something like:
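def update(self):
    # remember the branch head we are moving to
    new_revision = self.branch.last_revision()
    # merge from the tree's current basis up to the branch head
    merge_inner(self.branch, this_tree=self, basis=self.basis_tree(),
                other_tree=self.branch.basis_tree())
    self.set_last_revision(new_revision)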

And 'update' for a bound branch is something like:
# revisions in the bound-to branch's ancestry that the local branch lacks
new_merges = ancestry(branch.bound_to.last_revision()) - \
    ancestry(branch.last_revision())
tree = WorkingTree(branch.base, branch)
tree.set_pending_merges(new_merges)
# get the new history into the branch
branch.update_revisions(branch.bound_to, overwrite=True)
tree.update()
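
The ancestry arithmetic can be tried out on a toy revision graph. The
PARENTS dict and this ancestry() helper are assumptions made up for the
example; they only show how set difference separates "prefix" from
"diverged".

```python
# Toy revision graph: revision id -> list of parent ids. Hypothetical data.
PARENTS = {
    "r1": [],
    "r2": ["r1"],
    "m3": ["r2"],   # committed on the bound-to branch
    "l3": ["r2"],   # committed on the local bound branch
}


def ancestry(revision_id, parents=PARENTS):
    """Return the set containing revision_id and all of its ancestors."""
    seen = set()
    todo = [revision_id]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


local_tip, bound_to_tip = "l3", "m3"

# Revisions each side has that the other lacks.
only_on_bound_to = ancestry(bound_to_tip) - ancestry(local_tip)
only_local = ancestry(local_tip) - ancestry(bound_to_tip)

# Diverged means both sides have something the other lacks; if only_local
# were empty, the local branch would simply be a prefix of the other.
diverged = bool(only_on_bound_to) and bool(only_local)
```

A working tree never lands in the diverged case, because it has no
commits of its own to contribute.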

What concerns me a little is the layering - bound branches seem to be
somewhat dependent on the working tree to deliver the UI we want - but I
haven't looked deeply at John's branch yet, so maybe he's addressed that
there.

Anyway, I don't see large amounts of duplicate code - similar yes, but
not identical. So I'm not concerned at this point.

My vote is strongly:
 +1 Use a working tree to implement checkouts.
 -1 Reuse Branch to implement checkouts.

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.