Abandoning Gitflow and GitHub in favour of Gerrit
GitHub is the go-to place to host your open source projects, that much is well known. A lot of companies also use their paid plans to get the ecosystem around GitHub for their own code. Why would you want to use anything else? We took the decision to move away from GitHub and in the end we benefitted hugely!
We adopted Gitflow early on when we still used GitHub as our main hub for organization repositories. It served us well for a long time and as a developer (at the time) it was easy to use and didn’t add a lot of extra work to incorporate in the day-to-day coding.
Time passed and we started to adopt an agile development cycle, pushing new releases every week. All of a sudden we had more code reviews to work on and more time spent in QA, and the time spent on actual coding was steadily decreasing.
At the time we had consultants working with us to speed up the development process, resulting in even more time spent on code reviews for the senior developers. We found ourselves more and more frustrated with GitHub and its pull request setup. As soon as someone added changes to their pull request (either by rebasing in the new changes or by adding them as a new commit) you lost track of the comments in the code, and seeing what had actually changed since the last update became really hard (almost impossible if the new push was rebased with the new changes). With some git-fu you could do a manual diff locally, but doing this across multiple pull requests day in and day out added unnecessary stress on the code reviewers.
Surely there had to be a better alternative?
There’d been a lot of noise around a tool called Gerrit and how it made code reviewing easier and more efficient. With the rise of Docker it was easy to get a test server up and running.
At first it was overwhelming. The interface was clunky and configuring the service was complex. Coming from GitHub it seemed to miss a lot of the niceness we were used to, mainly due to the simplified UI. The benefits weren’t obvious initially. Project setup and overall maintenance were well documented, but the flexibility of Gerrit comes at the cost of complexity.
Setting up permissions (ACLs) for projects that could inherit left and right was not easy and caused a lot of lockouts when playing with it locally. Luckily we decided to push forward, forcing ourselves to move at least a few of our main git repositories over and fully use Gerrit for them as an evaluation. This is when we started to see the benefits Gerrit brought to the table.
While GitHub supports multiple commits in one pull request, Gerrit does not. The fundamentals of Gerrit are closer to the git request-pull way of thinking (the built-in way to create “pull requests” within git, not to be confused with GitHub pull requests). One code change, be it a feature or a bugfix, is one commit. Once you’ve created your commit and it’s ready for review you simply push it to a special git reference on the Gerrit server: git push gerrit feature/x:refs/for/master. The commit needs to be tagged with a Change-Id, which can be generated automatically with a commit hook.
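The push described above can be sketched as two small shell helpers. This is only an illustration: the function names review_refspec and has_change_id are made up for this example, and the check assumes the usual trailer format written by Gerrit's commit-msg hook (the letter I followed by 40 hex digits).

```shell
#!/bin/sh
# Build the refspec that asks Gerrit to open a review of <local-ref>
# against <target-branch>, e.g. feature/x:refs/for/master.
# (Hypothetical helper name, not part of git or Gerrit.)
review_refspec() {
    printf '%s:refs/for/%s\n' "$1" "$2"
}

# Succeed if the commit message read from stdin carries a Change-Id
# trailer of the form "Change-Id: I<40 hex digits>".
# (Hypothetical helper name; the format is an assumption stated above.)
has_change_id() {
    grep -Eq '^Change-Id: I[0-9a-f]{40}$'
}

# Possible use, assuming a remote named "gerrit" as in the text:
#   git log -1 --format=%B feature/x | has_change_id &&
#       git push gerrit "$(review_refspec feature/x master)"
```

Checking for the trailer before pushing is optional, since Gerrit will normally reject a commit that lacks one, but it gives a quicker and clearer failure.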
Success! You’ve just created your first Gerrit code review. The process might seem more complex initially, but think of it like this: if you add a new member to your team, they would have to fork the repositories on GitHub, clone them locally, make the changes, push to their own fork and then create the pull request. With Gerrit, they can clone the main repository, make their change and then push it directly to the same remote they cloned it from.
It’s now time for the reviewers to step in. They will be presented with a simple list of open code reviews that have not yet been merged. The list covers all available projects (git repositories), which gives you a nice unified view where you can start your day. By default Gerrit is set up to require a score for each review, ranging from -2 to +2 (where 0 is the default).
The -2..+2 score gives you a quick overview of the status of a code review and what your peers think of it. A -2 normally means they don’t accept the code change and veto it from being merged. Giving it a -1 shows that you would prefer it not to be merged as is. It’s still valid to give comments and feedback and keep the score at 0, meaning that you don’t weigh in on whether it should be merged or not and simply wanted to give your two cents in a comment. On the positive side of the scale, +1 means that you find the code review acceptable but want someone else to also approve it. The highest score (+2) simply means: looks good to me, approved.
For a code review to be merged, it requires at least one reviewer rating it as +2. A code review with two +1’s will not be allowed to merge into the master branch. One -2 will block merging altogether, no matter how many +2’s it has. This over time results in better code reviews as it only requires that one person finds an issue with the code, gives it -2 and effectively blocks merging it until a consensus has been reached regarding the issue.
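The merge rule above can be written down as a short shell function. This is a sketch of the rule as described in this post, not Gerrit's actual implementation; the function name can_submit is hypothetical.

```shell
#!/bin/sh
# Decide whether a change may be merged, given every Code-Review score
# it has received. Mirrors the rule in the text: at least one +2 is
# required, and a single -2 blocks the merge regardless of other votes.
# (Hypothetical helper, not part of Gerrit.)
can_submit() {
    approved=1                    # 1 = no +2 seen yet (shell "false")
    for score in "$@"; do
        case "$score" in
            -2)   return 1 ;;     # veto: blocks immediately
            +2|2) approved=0 ;;   # at least one approval
        esac
    done
    return "$approved"
}

# Examples:
#   can_submit +2 -1      succeeds (a -1 does not block)
#   can_submit +1 +1      fails (two +1s do not add up to a +2)
#   can_submit +2 +2 -2   fails (one -2 overrides any number of +2s)
```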
We decided early that we wanted to have feedback on both the code itself as well as a flag stating if the verification was successful or not. Extending the default Code-Review score we also added a Verified score (ranging from -1 to +1). The verified score is meant to be used as an “I’ve tested this locally and it works according to the specification” but it is also used for continuous integration with Jenkins CI. As soon as a new code review is pushed, Jenkins will automatically start the build process and test-suite for the project. If the build fails or any of the tests don’t pass it will be reported back to Gerrit as: Verified -1. If everything is green from Jenkins side it will send a +1 for verified. Even though a Code-Review +2 with Verified +1 is enough, we have internally decided that the Jenkins +1 doesn’t count for the Verified score. It requires a manual +1 (this could be enforced within Gerrit but we wanted to avoid adding too many flags and instead put the trust in the developers to handle this in the decided way).
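The mapping from build result to Verified vote can be sketched as follows. The function name verified_vote is hypothetical, and the host, user and variables in the commented reporting step are placeholders. In practice a Jenkins plugin usually does the reporting, so this only illustrates the idea.

```shell
#!/bin/sh
# Map a build's exit status to the vote a CI job would report:
# 0 (success) becomes Verified=+1, anything else Verified=-1.
# (Hypothetical helper, not part of Gerrit or Jenkins.)
verified_vote() {
    if [ "$1" -eq 0 ]; then
        printf 'Verified=+1\n'
    else
        printf 'Verified=-1\n'
    fi
}

# Possible reporting step over Gerrit's SSH interface. The host, user,
# $status and $commit are placeholders, not values from our setup:
#   ssh -p 29418 jenkins@gerrit.example.com \
#       gerrit review --label "$(verified_vote "$status")" "$commit"
```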
Updating a code review
When feedback starts rolling in on one of your code reviews it’s time to take action on it. In Gerrit you’ll find a customized dashboard showing your open code reviews, code reviews you’ve given feedback to as well as recently closed ones. Compared to GitHub, the feedback loop is easier to follow and it’s very easy to get up to speed with what’s happened since last time you visited Gerrit.
Suppose you got a -2 on the Code-Review part. You’ll easily see comments in the diff view where the reviewer with great flexibility can mark single characters, words, lines or bigger blocks to pinpoint exactly what they want to comment on.
With that feedback it’s time to fix the issue. You open up your editor (neovim, because vim is love) and code away. When you’re ready to update the code review on Gerrit, instead of creating a new commit locally, you do a git commit --amend. This adds the changes to your existing commit (HEAD), so your feature is still only one commit. Now you do a simple push to Gerrit: git push gerrit HEAD:refs/for/master, and your code review gets updated (this is where the Change-Id comes into play: using the Change-Id, Gerrit knows that you are updating an existing code review instead of creating a new one).
But wait a moment?! Didn’t you say the reason for leaving GitHub was that you lost track of the changes that happen between pull request updates? Just amending would remove all the commit history! Well, not quite. It’s time to go down the path of patch sets!
What is a patch set?
A patch set in Gerrit is related to a specific code review. Each time you do a git commit --amend it rewrites the commit, resulting in a new commit hash. Locally you still have access to your previous hash (it is no longer associated with any branch, so it might be “hidden”), which allows you to diff between them locally: git diff <old hash> <hash after amend>. This is exactly the same thing that Gerrit does. It even goes one step further and stores every push you make as its own patch set, kept as a git reference on the remote server.
With the patch sets, Gerrit allows you to diff any two patch sets within a code review. By default you will look at the diff between the Base (the parent commit you started from) and the latest pushed patch set. However, if you left a comment on patch set 2 and there have been updates so that the latest one is patch set 6, you might not be interested in checking them one by one but would rather see the sum of the changes. You would then simply diff between patch set 2 and 6. All the comments you added would be shown in the left side of the split, and on the right you would see what has changed since that point in time. When that’s been reviewed you go back to the base-to-current diff and verify the changes again.
To sell this concept even more: Gerrit even keeps track of the files you’ve reviewed (even if you haven’t sent your drafted comments/feedback yet). For huge code reviews with changes in many files you can view the first three, leave the code review, and the next day when you pick it up again Gerrit will show them as “seen”. This holds as long as no new patch sets have been pushed, or the new patch sets didn’t modify the files you viewed. Let me repeat that: if no changes have been made to files you’ve seen, then Gerrit will explicitly tell you that there’s “nothing new to see”.
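The local half of this can be tried out in a throwaway repository. The sketch below assumes git is installed; the file name, commit message and identity are made up for the demonstration, and no Gerrit server is involved.

```shell
#!/bin/sh
# Throwaway repository so the demonstration leaves the real work alone.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "dev@example.com"   # placeholder identity
git -C "$repo" config user.name "Dev"

# First version of the change: what would become patch set 1.
echo "first draft" > "$repo/feature.txt"
git -C "$repo" add feature.txt
git -C "$repo" commit -q -m "Add feature"
old_hash=$(git -C "$repo" rev-parse HEAD)

# Address review feedback by amending: what would become patch set 2.
echo "after review" > "$repo/feature.txt"
git -C "$repo" commit -q -a --amend --no-edit
new_hash=$(git -C "$repo" rev-parse HEAD)

# The amended commit replaced the old one on the branch, but the old
# commit object is still in the repository and can be diffed against.
git -C "$repo" diff "$old_hash" "$new_hash"
```

If you did not save the old hash, git reflog will usually still list it.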
Working with patch sets
With GitHub you would often add the repository of the one creating the pull request as a remote and then check out their feature-branch using git checkout <username of pull-request creator>/feature/<name of feature>. This worked but you ran the risk of checking out code that wasn’t in the pull request (if the PR was locked to a specific hash and not the branch). GitHub does have pull request references so a safer bet would be to use that (git fetch github refs/pull/<id>/head && git checkout FETCH_HEAD).
Gerrit allows the same flexibility as GitHub with references. Instead of refs/pull you’ll find the code reviews under refs/changes. If I’m working on code review id 3 and the first patch set (1), I would fetch it using: git fetch gerrit refs/changes/03/3/1. Since this is an operation done often, it makes sense to have refs/changes/* fetched together with the remote (git config --add remote.gerrit.fetch '+refs/changes/*:refs/remotes/gerrit/changes/*'). Doing so will fetch all changes in Gerrit with all their patch sets. With that setup you can easily check changes between patch sets locally with something like git diff gerrit/changes/00/300/5 gerrit/changes/00/300/16. So if I had just verified patch set 5 and it worked as expected, I can see the diff of the changes and easily check out the new patch set to verify that. The diff gives a good hint on where to focus the verification process.
Gerrit has certainly been the right tool for our team. It allows us to work with code reviews in an efficient manner again, resulting in more time developing, which is what every developer wants. Even though we’ve just scratched the surface of what Gerrit has to offer, this should give an idea of the main benefits it can bring when following up on existing code reviews over time.
Gerrit is used by many large open source projects, each with its own setup in terms of what a code review must pass before it is ready to be submitted into master. Check them out to get a feel for how Gerrit works and how it can be used:
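The reference names above follow a pattern: the last two digits of the change number (zero-padded), then the change number, then the patch set number. A small helper can build them. The function name change_ref is hypothetical, and it expects plain decimal numbers without leading zeros.

```shell
#!/bin/sh
# Print the git reference for <change-number> <patch-set>, e.g.
# change 3, patch set 1 becomes refs/changes/03/3/1.
# (Hypothetical helper, not part of git or Gerrit.)
change_ref() {
    printf 'refs/changes/%02d/%d/%d\n' "$(( $1 % 100 ))" "$1" "$2"
}

# Possible use, assuming a remote named "gerrit" as in the text:
#   git fetch gerrit "$(change_ref 300 16)" && git checkout FETCH_HEAD
```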
- Android – https://android-review.googlesource.com/
- LibreOffice – https://gerrit.libreoffice.org/
- Wikimedia – https://gerrit.wikimedia.org/
- Golang – https://go-review.googlesource.com/
- Tuleap – https://gerrit.tuleap.net/
- Star Wars Galaxies Emulator – http://review.swgemu.com/