[RFC] Community driven branch extensions
Ian Clatworthy
ian.clatworthy at canonical.com
Tue Nov 10 03:50:28 GMT 2009
Apologies up front for the length of this email. I hope the topic is
important enough to warrant it ...
One of the many things we're discussing at this week's sprint is how we
can deliver better imports for projects using other VCS tools such as
svn, git and hg. To provide some broader context, the initial vision of
Launchpad included it being a "project source code supermarket". In
other words, if you wanted to see/improve the source code for package
Foo because a bug in Ubuntu was annoying you, you could scratch your
itch like this:
1. Fetch the branch using lp:foo
2. Make the change, linking it to the bug.
3. Submit a merge proposal.
The upstream project may have its own bug tracker and its own VCS
server somewhere else but that wouldn't be stuff casual contributors
would need to care about. Launchpad would join all the dots by having
good code imports, smart syncing with the upstream bug tracker and
a neat way of turning merge proposals into the required upstream "patch
bundle".
Right now, the above works really nicely when the upstream is Bazaar but
we're not there yet for non-Bazaar projects. Technically, Bazaar is
still missing some features found in those tools (e.g. externals ala
Subversion, rules/attributes per directory ala Git, file copies ala
Mercurial), and their absence makes lossless round-tripping impossible. Hmm.
How do we get those features (and many more) in there faster? Three
things spring to my mind when I ponder that question:
* Bazaar plugins are a *huge* success story - around 120 now exist
and the number grows week by week
* The vcs-fastimport developers (across all the major tools) recently
agreed on a "feature" command that allows tools to declare local
enhancements present in the stream. This rocks and it's allowed
far better bzr-to-bzr roundtripping/partitioning/etc. inside
bzr-fastimport than ever before. I understand Git are planning
to use the technology for hg and bzr interoperability as well.
* My http://bazaar-vcs.org/DraftSpecs/BranchDependencies spec from
months ago.
Our past strategy for enhanced branch metadata has been to consider all
the issues, rigorously debate whether something could break over a chain
of patches, force a format bump and cautiously roll out the change.
There have been many positive consequences including good scalability
and tight control over performance drivers. There have also been some
large negatives including the "too many formats" and "we can't consider
Bazaar because it doesn't have per branch end-of-line handling"
complaints. To be rude/ironic about it, Bazaar core development feels
like a cathedral vs the excitement and innovation happening in our
plugin marketplace. Hmm.
The root of the problem in my mind is this simple fact: 90% of problems
are solved by a fraction of the code. Nested trees are an excellent
example. If all I want to do is pull various trees together into a whole
(for building the Bazaar Plugin Guide for example), the constraints on
that problem mean that I don't need to solve the generic nested trees
design and all the edge cases it contains. File copies are another. If
all I care about is round-tripping file copy data back to Mercurial, I
don't need to solve the complete File Copy problem (smart merging,
etc.). I can go on and on with examples (keywords, branch specific
rules) but here's the bottom line: the high priests of Bazaar
development (like me) can take years of elapsed time to do a generic
feature while casual contributors are churning out plugins solving
useful subsets of functionality in a fraction of the time. Maybe we
ought to put our energy into *enabling* the community to scratch their
itches better rather than serialising advanced feature development
inside the core team as much as we do? For the LEAN folk, we need more
set-based design and less BDUF.
It's a nice idea in theory but what does it mean and will it fly in
practice? Here's one idea for implementing it:
A. Branches can say "I need capability X in order to use me".
Users can register these via a simple UI ala bzr (q)tag.
B. Plugins can say "I deliver capability X".
C. We maintain a community-wide registry (wiki page say) of Xs.
Projects could use this feature to ensure developers had "team policy"
plugins installed, e.g. for always executing particular hooks.
bzr-git might provide a capability called "git-submodules". bzr-keywords
might declare a capability called "keywords". It might also decide that
keywords are only enabled by rules declared inside .bzrkeywords inside
the working tree. Indeed, it doesn't need totally generic branch
specific rules to be solved to everyone's satisfaction - it just needs
to work for that case. (BTW, there are good safety reasons why per-user
keyword enablement is actually a mis-feature, not a benefit.)
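To make points A to C a little more concrete, here is a minimal Python
sketch of the bookkeeping involved. Everything in it is hypothetical:
CapabilityRegistry, BranchRequirements and their methods are names made
up for illustration and are not existing bzrlib APIs. Only the plugin
and capability names (bzr-git, bzr-keywords, "git-submodules",
"keywords") come from the examples above.

```python
# Hypothetical sketch only: none of these classes exist in bzrlib.


class CapabilityRegistry:
    """Maps a capability name to the plugins that say they deliver it (B)."""

    def __init__(self):
        self._providers = {}  # capability name -> set of plugin names

    def register(self, plugin_name, capability):
        self._providers.setdefault(capability, set()).add(plugin_name)

    def providers(self, capability):
        """Return the plugin names delivering `capability`, sorted."""
        return sorted(self._providers.get(capability, ()))

    def available(self):
        """Return the set of capabilities delivered by registered plugins."""
        return set(self._providers)


class BranchRequirements:
    """Capabilities a branch says it needs in order to be used (A)."""

    def __init__(self):
        self.required = set()

    def require(self, capability):
        self.required.add(capability)

    def missing(self, registry):
        """Return required capabilities that no registered plugin delivers."""
        return sorted(self.required - registry.available())


registry = CapabilityRegistry()
registry.register("bzr-git", "git-submodules")
registry.register("bzr-keywords", "keywords")

branch = BranchRequirements()
branch.require("keywords")
branch.require("team-policy")  # made-up capability with no provider

print(branch.missing(registry))
```

The community-wide list of Xs (C) would live outside the code, e.g. on
a wiki page. The in-process registry here only knows about the plugins
that happen to be installed.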
Just as Bazaar checks the format name now before accessing data and says
"sorry, I don't understand format foo - please use a later version", it
would check the format AND check that all the capabilities were present.
If not, it would say "sorry, I can't handle capability X - please see
the (registry) for the plugins to install". In a sense, the "format"
would change from being a name to being a name + a set of keywords.
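A rough Python sketch of that two-step check follows. It assumes the
branch's format name and its required capabilities have already been
read from disk. The function, the exception classes and the
"example-format" name are invented for illustration and are not part
of bzrlib.

```python
# Hypothetical sketch only: these names are not real bzrlib APIs.


class UnsupportedFormat(Exception):
    pass


class MissingCapability(Exception):
    pass


def check_can_open(format_name, required, known_formats, available):
    """Check the format name AND that every required capability is present."""
    if format_name not in known_formats:
        raise UnsupportedFormat(
            "sorry, I don't understand format %s - please use a later version"
            % format_name)
    missing = sorted(set(required) - set(available))
    if missing:
        raise MissingCapability(
            "sorry, I can't handle capability %s - please see the registry "
            "for the plugins to install" % ", ".join(missing))
    return True


# "example-format" is a placeholder, not a real Bazaar format name.
known = {"example-format"}
try:
    check_can_open("example-format", {"keywords"}, known, available=set())
except MissingCapability as e:
    print(e)
```

The format check stays first so that older clients keep failing the
way they do today. The capability check only adds a second reason to
refuse.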
Does that make sense? Am I smoking crazy stuff and ought to be locked
up? Are there better ways of achieving the same outcome? If not, would
this empower you and what would you use it for?
Don't get me wrong: there is *lots* of devil in the detail and this is
far harder than it sounds. To begin with, the Launchpad guys might lynch
me because it may be impossible for them to track all the required
plugins for all the required "ad hoc branch formats" that would
effectively be created. Perhaps some branches would become "store only"
and not viewable via Loggerhead say. Ugly but not a disaster in *my*
opinion. On the upside, the best case outcome is that this could be game
changing in a very positive way, yes?
Ian C.