Scality’s Release Engineering team has completed the integration of all Zenko-related repositories in its GitWaterFlow delivery model, like all other Scality’s products. The GitWaterFlow model was introduced years ago at Scality to increase release quality and increase development speed.
You may have noticed that Cloudserver and Backbeat repositories now default to a branch called development/8.0. The repositories also contain new directories called eve and pull requests contain comments from the bot Bert-E. What’s going on? That’s the GitWaterFlow process in action. To understand it, we need a little history …
At the start, the small-scale team of Scality engineers working to develop the RING product employed CVS and later Subversion in an ad-hoc fashion, and collaboration happened ‘on the spot’. The engineering team pushed features and bug fixes into a shared trunk branch. This branch was slated to become the next ‘major’ release of the product, though overeager integration of partially-delivered features often resulted in the branch being in a non-shippable state.
The process to port bug fixes to relevant branches (‘back-porting’) was fully manual. When the change rate of the codebase reached a certain level, this turned out to be a bottleneck in the development process. As with all manual processes, this also was prone to introduce accidental bugs or regressions. Creation of backport commits on various version branches also destroyed relationships between semantically equivalent changesets, which could only be recovered through information kept in commit messages or the ticketing system, again relying on humans doing the right thing.
Introducing the GitWaterFlow model
That approach had too many flaws and didn’t scale. The Release Engineering team investigated options to radically change the approach, easing the workflow for developers as well as ensuring correctness of meta-information. The results were announced at the end of 2016 and kept improving since then.
GitWaterFlow (GWF) is a combination of a branching model and its associated tooling, featuring a transactional view on multi-branch changesets supported by none of the tools and models previously described. GWF tends to ban “backporting” in favor of “(forward) porting”. The term “porting” is employed to describe the act of developing a changeset on an old — yet active — version branch and subsequently merging it on newer ones. It is considered better than “backporting” for multiple reasons. “Porting” also makes merge automation trivial. In fact, changes that are merged in an old version branch, whether fixes or improvements, must also land in newer ones, otherwise there is a risk of regression. A bot can use this assumption to prepare and then execute the merge on newer branches, thus offloading the developer.
GWF comes with a versioning scheme that is inspired by semantic versioning (semver). Basically, version numbers are in the form major.minor.patch. patch is incremented only when backward compatible bug fixes are being added, minor is incremented when backward-compatible features are added, and major is incremented with major backward incompatible changes.
In GWF, every living minor version has a corresponding development/major.minor branch, each of which must be included in newer ones. In fig. 1 a development/1.0 is included into development/1.1, which in turn is included in development 2.0. Consequently, a GWF-compliant repository has a waterfall-like representation, hence the name “GitWaterFlow”.
As GWF is based on ‘porting’, feature branches do not necessarily start from the latest development branch. In fact, prior to start coding a developer must determine the oldest development/* branch his code should land upon (refer to fig.1.a). Once ready to merge, the developer creates a pull request that targets the development branch from which he started. A gating and merging bot will ensure that the feature branch will be merged not only on the destination but also on all the subsequent development branches.
Transactional Multi-Branch Changes
The fact that every pull request can concurrently target more than one mainline branch can dramatically affect the approach that developers take in addressing issues. For instance, it is not uncommon that conflicts exist between the feature branch targeting version n, and version n+1. In our setup, this class of conflicts must be detected and fixed prior to merging the pull request. The code that resolves such conflicts is considered part of the change, in fact, and must be reviewed at the same time. Also, it is a requirement that a pull request be merged only once it has passed the tests on all targeted versions.
In short, the changes brought to the software on multiple branches is a single entity and should be developed, reviewed, tested, and merged as such.
Only The Bot Can Merge
Bert-E is the gatekeeping and merging bot Scality developed in-house to automate GWF, its purpose being to help developers merge their feature branches on multiple development branches. The tool is written in Python and designed to function as a stateless idempotent bot. It is triggered via Bitbucket/GitHub webhooks after each pull request change occurrence (creation, commit, peer approval, comment, etc.).
Bert-E helps the developer prepare his pull request for merging. It interacts directly with the developer through GitHub’s (or Bitbucket) comment system via the pull-request timeline, pushing contextualized messages on the current status and next expected actions. In Scality’s case, Bert-E ensures that the pull request has at least two approvals from peers before it merges the contribution. In the future, Bert-E will also check that the JIRA fixVersion field is correct for the target branches to help product managers keep track of progress. Bert-E usually replies in less than 50 seconds, thus creating a trial-and-error process with a fast feedback loop that is ideal in onboarding newcomers to the ticketing process.
In parallel with the previously described process, Bert-E begins trying to merge on the subsequent development branches by creating integration branches named w/major.minor/feature/foo, after both the originating feature branch and the target development branch (refer to fig.1.b). Every time Bert-E is triggered, it checks to ensure that the w/* branches are ahead of both the feature branch and the corresponding development branches (updating them following the same process when this is not the case).
Every change on a w/* branch triggers a build/test session. When the pull request fulfills all the requirements previously described, and when the builds are green on all the w/* branches, Bert-E fast-forwards all the development branches to point to the corresponding w/* branches in an atomic transaction, as depicted in fig.1.c.
Note that if another pull request is merged in the interim, Bert-E will not be able to push and must re-update its w/* branches and repeat the build/test process.
Better ‘Definition of DONE’ and Smoother Developer Process
In use at Scality for over two years, we can testify that the main GWF benefit is its atomic multi-branch merge property. In this context, ‘DONE’ means merged and fully tested on all target branches, and there is no additional backporting phase wherein it is discovered that the backport is more complex than the fix itself. Target branch conflicts are detected early and are dealt with prior to merging.
Peer reviews/approvals aside, the development process is smoother and allows the developer to push his changeset to completion without depending on third parties to merge. Changeset ownership reverts back to the author and does not vary across time. Thus, the developer is responsible for it up until the merge.
Also, the metadata in the git repository is much clearer, and now a simple git branch –contains <commit> will indicate within which branch a change has been merged. Due to gatekeeping, the development branches are always in a shippable state, which has greatly improved Scality’s accuracy in predicting delivery dates. Hand-in-hand with that, the amount of overall engineering work in progress has been reduced due to the GWF deployment, and as a direct result Scality is shipping faster.