GitLab Book: Git Branching Strategies 2020

Pretty good book, branching just isn't that important

GitLab released a dozen-page book about Git branching strategies to serve as a high-level overview of the strategies and how they work.

The flow of the book is pretty smooth and it is a quick read with lots of important topics. It’s also nice that they use the tiny font size in places where the text gets weedsy so a team leader or manager shouldn’t feel bad about skimming those.

  • Intro to development workflows.
  • Intro to source control.
  • Intro to git terms: commit, branch, and remote.
  • Then they dive into the branching strategies.

Introduction

Nothing too surprising in the introduction, though there was one stage-setting omission.

My introduction would include something like:

Code is both text and applications

  • git manages the text
  • IDEs (packagers and compilers) manage the application

Git works like a low-level API that operates on text in some very novel and helpful ways. This doesn’t mean it has a pleasant interface or is comprehensible to everyone. Just as Visual Studio or Eclipse mediates the developer’s experience with msbuild or java, GitLab and others have created mediation layers to operate git.

GitLab’s mediation approach enables many process improvements beyond what git provides and I wrote a bit about it.

If you haven’t taken the time to skim the book, you may want to before proceeding. The content below does not even attempt to express the knowledge from the book. This is more of a set of affirmations and grievances.

Central workflow

I think this one would have landed more clearly if folks saw it as the “local vs remote” workflow, since the expectation is that a commit will be an entire completed unit of work. It leans on decentralization heavily but requires external collaboration and discussion to keep the Git aspects simple.

They also cover how automatic merge commits are generated during certain actions and why not to be scared of them. It does get dense for several lines and it’s basically impossible to conceptualize stash vs merge commit without doing it.

Feature branching

Good outline and the finishing touch of “you don’t have to start here, just outgrow the central strategy and follow this one”.

Personal branching

This should have just had a big warning saying that the benefit is so minimal and complexity increase so high that it should be avoided. The one positive example of a developer creating a feature branch off their personal branch doesn’t even sound positive to me. They now have 2 sets of unusable work in their personal branch and have to figure out when each piece is okay to combine and then also merge to master. How would they just merge that feature to master if the personal branch isn’t done?

Furthermore, the personal branch is what everyone already has in their local repo vs remote repo so if you’re working on a feature branch locally and do something boneheaded, the decentralization already provides a measure of isolation there. If you don’t push it, you can change things to a new feature branch or rebase off a different master commit. The personal branch approach reads like a fundamental misunderstanding of the decentralization aspects.

Not that it isn’t a fine workflow when it’s identified as the best way to minimize coordination costs and improve efficiency. I am just not imaginative enough to come up with an example.

GitFlow

Again, this section is far too diplomatic and positive for what should be seen as a legacy approach: it mitigates problems that are inherent to the basic git operations but are entirely resolved by GitLab’s mediation. The use of merge requests, git tags, and pipelines which map to environments means that using GitFlow is at least duplicating if not triplicating coordination work.

The other big problem with GitFlow that they didn’t touch on is how many meetings and discussions are necessary to decide whether a change comes from or goes into a develop or feature or master or release branch.

GitLab Flow

This section walks through a couple of examples and then cranks the complexity back up to match that of GitFlow. The re-introduction of a pre-prod environment made me sad because we have solved user acceptance and release testing with dynamic review apps and environments. Even production roll-outs are good with canary and blue/green or progressive roll-outs.

The lack of reference to git tags also sets up the casual reader with a blind spot. Any place where “this branch only reflects x” is the case, it should be a tag instead of a branch. The fact that one can create a hotfix branch from a tag means you have all the capabilities of GitFlow without needing to manage a branch over the long term.
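As a rough sketch of the hotfix-from-a-tag idea, the snippet below only assembles the git commands rather than running them. The tag names `v1.2.0` and `v1.2.1`, the `hotfix/` branch prefix, and the `origin` remote are assumptions for illustration and do not come from the book.

```python
import shlex


def hotfix_plan(release_tag, fix_tag, remote="origin"):
    """Return the git commands for patching a tagged release.

    Nothing is executed here; the list is meant to be read or reviewed.
    The branch prefix and default remote are hypothetical conventions.
    """
    branch = f"hotfix/{fix_tag}"
    return [
        # Start a short-lived branch at the exact commit that was released.
        ["git", "checkout", "-b", branch, release_tag],
        # After committing the fix on that branch, mark the result with a new tag.
        ["git", "tag", "-a", fix_tag, "-m", f"Hotfix on top of {release_tag}"],
        # Publish the tag; the branch itself does not need to live on.
        ["git", "push", remote, fix_tag],
    ]


if __name__ == "__main__":
    for command in hotfix_plan("v1.2.0", "v1.2.1"):
        print(shlex.join(command))
```

The point of the sketch is that the only long-lived artifacts are the two tags; the hotfix branch exists just long enough to carry the fix.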

Production branch

Why though?! When someone is looking at the commit log, even if the intention is that every commit in the production branch was deployed, was it? No. And when was it deployed? The second it went in? Are rollbacks in the production environment followed by an old commit hash being added back to the head? And who knows what the head-1 or head-2 situation is at any given time for safe rollback?

None of that information goes into the git repo so it makes no sense to dedicate a whole branch to trying to show it.
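To make that concrete, here is a minimal sketch of the kind of record that would answer those questions, kept outside the repository. The `Deployment` and `DeployLog` names are hypothetical, and in practice this history would more likely come from the deployment tooling itself than from hand-rolled code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    environment: str
    commit: str
    deployed_at: datetime
    succeeded: bool


@dataclass
class DeployLog:
    entries: List[Deployment] = field(default_factory=list)

    def record(self, environment, commit, succeeded=True, when=None):
        """Append one deployment attempt, stamped with the current UTC time by default."""
        when = when or datetime.now(timezone.utc)
        entry = Deployment(environment, commit, when, succeeded)
        self.entries.append(entry)
        return entry

    def _successful(self, environment):
        return [e for e in self.entries
                if e.environment == environment and e.succeeded]

    def current(self, environment) -> Optional[Deployment]:
        """Latest successful deployment, i.e. what is actually running."""
        good = self._successful(environment)
        return good[-1] if good else None

    def rollback_target(self, environment) -> Optional[Deployment]:
        """Most recent earlier successful deployment of a different commit."""
        good = self._successful(environment)
        if not good:
            return None
        running = good[-1].commit
        for entry in reversed(good[:-1]):
            if entry.commit != running:
                return entry
        return None
```

A log like this can say whether a commit was deployed, when, and what to roll back to. A branch pointer cannot express any of those.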

Please send me an email if you have some example of this working properly and can show me that there’s more value in that than the RC tag approach.

Complexity increases that are worth it and when

I’d set the complexity increases at:

Complexity | Branch solution | Better solution
--- | --- | ---
Single user | Commit to default branch | Commit to default branch
Multiple users | Branch per user? Branch per feature? | Branch per feature
Change management: How do we know what’s deployed? | Master should always reflect production. Create a new develop branch, change everything about everyone’s workflow. | Tag commits that are supposed to be deployed, if it goes wrong, fix and tag the next one. The git hash is immutable so send it to production.
Product has 3 previous versions supported | Create stable-1, stable-2, stable-3 branches and merge fixes to all of them | Tag the releases since you do that anyway, checkout from the tags for patches
SaaS offering with multiple deployments | A branch per deployment per environment? Maybe branches plus tags. | Multiple repositories with base code in one place and each deployment having a separate repo. Each repo follows simple branching strategy and uses SemVer or git hashes to reference the base repo.
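
For the case of several supported versions, one way to picture “checkout from the tags for patches” is a small helper that picks the newest tag in each supported release line as the starting point for a patch. This is only a sketch: it assumes tags look like `vMAJOR.MINOR.PATCH` and that a release line means a MAJOR.MINOR pair, and neither convention is stated in the book.

```python
import re

# Assumed tag convention: v1.2.3. Anything else is ignored.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def patch_bases(tags, supported=3):
    """Return the newest tag of each of the `supported` newest release lines."""
    latest = {}  # (major, minor) -> (patch, tag)
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if not match:
            continue
        major, minor, patch = (int(part) for part in match.groups())
        line = (major, minor)
        # Compare patch numbers as integers so v2.0.10 beats v2.0.3.
        if line not in latest or patch > latest[line][0]:
            latest[line] = (patch, tag)
    newest_lines = sorted(latest, reverse=True)[:supported]
    return [latest[line][1] for line in newest_lines]


if __name__ == "__main__":
    example = ["v1.0.0", "v1.0.1", "v1.1.0", "v2.0.3", "v2.0.10", "v2.1.0", "nightly"]
    print(patch_bases(example))
```

The input list could come from the output of `git tag --list`. Each returned tag is a commit to branch from when a patch is needed, so no `stable-N` branch has to be kept alive between patches.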

Seamless workflow, but seams are leverage points

“Seamless workflow” is a nice idea, but the seams are helpful for opening things up and taking fabric in or letting it out depending on the need. I never want someone to treat a workflow as though it’s being designed to be seamless and perfect. Workflows must consistently evolve and experiment with newer tools and better ways, measure those things, and improve.

GitFlow has dozens of seams but many of those are burlap which don’t even belong in the DevOps dress.

GitLab Flow designs a few deliberate seams in and brings flexibility to them depending on the type of organization and maturity of the product.

One note on total value

One of the smaller-print passages states the most important point:

With GitLab Flow, commits only flow downstream, ensuring that every line of code is tested in all environments. It’s suitable for teams of any size and has the flexibility to adapt to unique needs and challenges.

The reason I tend to educate folks on branching strategies is to help them get to the conclusion that it should be as minimal as possible and they’re better served relying on solutions to the problems they see.

For example, a lot of GitFlow branching is not about text management but about environment management or test management. There are better ways to do these things without dragging your developers and their git branches through those things.

Branches for input, tags and environments for output.

Also, saying “this is the organizational branching strategy” is a huge problem because apps and teams will work differently, even within GitLab. If there is a reason to negatively impact how everyone in the organization works, it better be a really good reason. Something like “so our BI tool can draw a chart” or “so we can move PMs around” shouldn’t be acceptable.

In the GitLab way of things, we have created some Value Stream Analytics that tie a bit into the simple branching strategy but don’t require anything more complex. The stakeholders involved can get a good sense of how things are going and where to focus additional scrutiny without mandating a workflow.