The Origins of Trunk Based Development
Learn more about trunk-based development and its influence on source control management systems, DevOps, and software development as a whole.
I’ve been trying to track down the pioneers of trunk-based development, and find out what their rationale and influences were.
A big-ass trunk: “General Sherman”, in California (from Wikipedia).
Merging wasn’t always smooth
If you ignore the “only one person can edit a file at a time” era of source control, merging has always been a factor. It doesn’t matter if you’re merging to a working copy, or to/from a branch: merging is in your tool-chain. The advances at the end of the 90’s were more effective three-way merges, and better and better merge-point tracking.
At some level, if merging is buggy, as it was in the early days, you’re forced into a trunk model. Everyone in a team would sync to (pull from) the head revision of the single branch many times a day, make small changes, and check them back in. That way the pain from co-workers’ changes would be minimized. Good merging and branching allowed you to step away from that model, but a minority of experienced developers wonder if you should. At the least, they wonder how long those branches should live, and how many developers should be allowed to contribute to them.
Trunk is just one mode of operation
CVS came before Subversion. The Wikipedia page says a handful of scripts, created in 1986, were fashioned into an initial release in 1990, with the last release being in 2008. It was created to overcome deficiencies in RCS (1982 to date). When CVS’s limitations were determined to be insurmountable, the Subversion project kicked off (the initial 0.x release was in 2000). Subversion took influences from commercial packages like Perforce too. Subversion has a default layout for a new repo of “trunk, tags and branches”, but that isn’t forcing the trunk-based development model that I promote at all. People who use Subversion in the enterprise often say “we do trunk-based development”, but that’s only true if the only branches are for releases, and they are made on a just-in-time basis for the release itself.
Perforce was around in the 90’s too, and very solid. While trunk-based development was one mode of operation, it wasn’t the only way of using Perforce, and the Perforce people didn’t focus on it as a bedrock practice. Not like I do, at least. They did describe it, to some degree, in 1998’s “High-level Best Practices in Software Configuration Management” paper. I was head of development at a startup in London in 2001, and that paper was the bible I made my lieutenants read before flipping the team from CVS to Perforce (and trunk).
Thoughts from SCM pioneers, Karl Fogel and Jim Blandy
Karl was one of the CVS developers who helped kick off Subversion. Casting his mind back to the late 90’s, he says:
"We were mainly influenced by CVS. I guess I’d say Subversion’s branching design comes from “CVS + atomicity”:
- a branch should just be a lightweight copy (copy-on-write);
- a tag is just a read-only branch;
- since branches and tags are just copies, and copying is a versioned
event, it follows that branching and tagging events are versioned.
That last point is still one of the most frustrating things about Git for me: that in Git, creating or removing a branch is not an event in the version history. A branch is just an un-versioned moveable pointer to a particular commit. (There’s a lot I do like about Git, I hasten to add; just there are a few things where I think “dang, that’s not right”.)
As for development models, we had ones we were accustomed to, but we wanted a system flexible enough to accommodate different models. Hence the “it’s just a copy” starting point: you can support a lot of different models on top of a consistent branching primitive.
Well, I know Jim and I were using that model for open source development (then called “free software development” of course) well before SVN, and in fact even before WAN-enabled CVS had been released; we were using it in projects versioned under RCS. The earliest that I’m sure I was collaborating with others using that basic model in RCS was 1993."
As for the usage of the “everyone on one trunk, with occasional short-lived branches for bug-fixing or experimental development” model:
"I’m pretty sure that’s what was going on in the Emacs tree when Emacs was still on stock RCS/CVS, at least as of 1992, when I got involved, but Jim could confirm or deny that that model was going on earlier than that.
Again, I don’t think it was unique to Emacs. It’s a pretty obvious model to use when one has a competent group of developers who can agree to not break (or not break too often) the master line of development.
I hope that’s helpful, Paul, even though it probably can’t count as the most rigorous historical sourcing ever! (I suppose an actual investigation of the Emacs source code’s history, which I believe has been preserved pretty well through all the VC system migrations, could confirm, although that would be a lot of work.)"
(Karl started Question Copyright some time ago, and you should really check that out.)
Jim also helped design Subversion (and was a long-term GDB maintainer). On how they wrote Subversion itself, he says:
"We all worked on a single trunk, and we didn’t use branches much at all, let alone long-lived branches."
On the back history, towards where trunk-only models were first used:
"I’d been the maintainer of GNU Emacs before (1990 to 1993, taking over from Joe Arceneaux), where we just used a single source tree, without version control. Then, I’d been working on GDB for Red Hat, where we used CVS and used the trunk for almost all work, and made releases from branches."
Thoughts from Craig Silverstein, Google’s first hire
Says I: who were the chief architects of the “we’re all in one big trunk, sharing code at source level” approach inside Google? Also: what were the influences for such a design?
"I believe that I am as responsible as anyone for the policy of having our entire codebase in one big repository. My philosophy was that it would make it as easy as possible to share code across projects. It also helped avoid problems with versioning, since the expectation was that everyone would always run their code at the tip of the trunk. I don’t remember there being any outside influences that prompted that decision.
This had its plusses and minuses, and while I feel overall it was a win, it definitely had a cost. As the company grew, the base libraries would get modified in ad hoc ways that did not represent a coherent vision; eventually we had to put gatekeepers on checkins to the base/ directory. And we had to write lots of bespoke tooling to allow people to only check out a subset of the repo, since it was too large to check out the whole thing.
If I were doing it again I don’t know how I would arrange things. At Khan Academy, where we use Git, we’ve organized things using lots of submodules, which again has plusses and minuses. Basically, my conclusion is that this is a surprisingly hard problem."
Thus, Craig was the designer/engineer of Google’s mega-trunk. Craig now works for the phenomenal Khan Academy, as he mentions.
Jez Humble, co-author of Continuous Delivery
Jez can’t actually remember the 90’s, but says:
"Version control is fundamentally a communication tool, and you can’t communicate what you’re doing, or get feedback from the deployment pipeline, if you’re working off in long-lived feature branches that don’t get merged regularly into trunk. Feature branches work fine on small, experienced teams, but they don’t scale because to release you have to merge the feature branches with each other, which is a combinatorial problem, which is why organizations like Google work off trunk and at head."
Jez has exclusively encountered trunk branching models in his career. By contrast, that’s only mostly the case for me; where it wasn’t, I got the client there after a short struggle. As he’d never encountered the truly shit ClearCase-style multi-branch scenario, I donated a branching diagram or two for the CD book (page 349 or thereabouts). They were originally used on a ThoughtWorks mission at a US bank, where we took them to trunk from that hell.
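Jez’s remark that merging feature branches with each other is “a combinatorial problem” can be made concrete with a little arithmetic. The Python sketch below is only an illustration under a simplifying assumption that is not from Jez or Google: every pair of unmerged feature branches is treated as one potential conflict to reconcile at release time, while on a single trunk each developer integrates against one shared line. The function names are hypothetical.

```python
from math import comb


def pairwise_interactions(branches: int) -> int:
    """Count distinct pairs of unmerged branches (n choose 2).

    Assumption for illustration: each pair is one potential conflict
    that has to be reconciled when the branches are merged for release.
    """
    return comb(branches, 2)


def trunk_integration_points(developers: int) -> int:
    """On a single trunk, each developer integrates against one shared line."""
    return developers


if __name__ == "__main__":
    # Compare how the two counts grow as the team gets bigger.
    for n in (2, 5, 10, 20):
        print(
            f"{n:>2} branches: {pairwise_interactions(n):>3} branch pairs, "
            f"versus {trunk_integration_points(n):>2} integrations against trunk"
        )
```

Under that assumption the pair count grows quadratically while trunk integrations grow linearly, which is one way to read why feature branches “work fine on small, experienced teams” but do not scale.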
Jez frequently reminds me that the practice of continuous integration (CI) really requires trunk usage. To some extent, he says, “CI is trunk-based development”. Wikipedia authors have refined an opening snippet on CI which really rams home this point:
"Continuous integration (CI) is the practice, in software engineering, of merging all developer working copies with a shared mainline several times a day. It was first named and proposed by Grady Booch in his method (see book ref ), but he did not advocate integrating several times per day. It was adopted as part of extreme programming (XP), which did advocate integrating more than once per day, perhaps as many as tens of times per day…"
They’ve used “mainline” in the Wikipedia article, but I like to disambiguate from the very different hell, also called mainline, that ClearCase made popular in the mid-noughties. Hence trunk, and trunk-based development.
At the time of writing, I have 33 articles on trunk-based development.
Published at DZone with permission of Paul Hammant. See the original article here.