Linear git history, Part II

Jun Sheng
Aug 17, 2023

In part I, I discussed why a linear git history is useful and, in general, how we can get one.

In this part, I am going to talk about one more thing that can make our lives with a linear git history easier.

Tracking the commits

In git, a commit is identified by its id. The commit id is the digest (Git uses SHA-1 by default) of the commit object, which has 3 main parts: the commit message, the id of the tree object and the id(s) of the commit’s parent(s). The object also records the author and the committer, each with a timestamp. If any of these changes, the commit id changes.
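
To see this concretely, here is a minimal sketch that creates a throwaway repository, prints the raw commit object and re-hashes it. The repository location, file name and author identity are placeholders chosen for illustration. Note that the printed object also contains author and committer lines, which feed into the digest as well.

```shell
#!/bin/sh
# Sketch: a commit id is the digest of the commit object.
set -e
repo=$(mktemp -d)                      # throwaway repository
cd "$repo"
git init -q
git config user.name "Example Author"  # placeholder identity
git config user.email "author@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "Add file.txt"

# Print the commit object: tree id, parent id(s) if any,
# author, committer, then the commit message.
git cat-file -p HEAD

# Hash the same object content again; the result equals the commit id.
commit_id=$(git rev-parse HEAD)
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "commit id: $commit_id"
echo "rehashed:  $rehashed"
```

The two printed ids should be identical, since `git hash-object -t commit` applies the same digest to the same bytes.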

In a rebase or cherry-pick operation, the commit id will almost always change because the parent id changes, and usually the tree object changes as well. Rebase operations usually happen when a PR is accepted, while cherry-pick is usually used when creating a release branch.

To overcome this, we can add a trait to the commit message, for example a unique identity. If every rebase and cherry-pick operation is done in a way that keeps these unique identities, we can track where each commit has ended up.
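
The following sketch illustrates the idea in a throwaway repository: a commit carrying a trailer is cherry-picked onto another branch, its id changes, and the trailer travels with the message. The branch names, file names and identity are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: cherry-pick changes the commit id but keeps the trailer.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"   # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse HEAD)

# A feature commit whose message ends with a unique trailer.
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature" \
    -m "Change-Id: I0287ae2ce24a876e13dd0863fa65c82dd847ebf5"
original_id=$(git rev-parse HEAD)

# A release branch with its own commit, so the pick gets a new parent.
git checkout -q -b release "$base"
echo release > release.txt
git add release.txt
git commit -q -m "Prepare release"
git cherry-pick feature >/dev/null
picked_id=$(git rev-parse HEAD)

# Compare ids and trailers of the original and the picked commit.
original_trailer=$(git log -1 --format=%B "$original_id" | grep '^Change-Id: ')
picked_trailer=$(git log -1 --format=%B "$picked_id" | grep '^Change-Id: ')
echo "$original_id  $original_trailer"
echo "$picked_id  $picked_trailer"
```

The two lines printed at the end show different commit ids followed by the same `Change-Id` line.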

What the traits can be and where to add them

The traits themselves can be anything unique. For simplicity, a trait can just be a random string.
To keep processing by programs simple, a good place for these traits is the trailer of the commit message. Each trait also needs a meaningful token (the key in front of the value) so that it can be recognized.

Here is an example:

Change-Id: I0287ae2ce24a876e13dd0863fa65c82dd847ebf5

How to add these traits

The traits should be added to the commit message no later than when the pull-request is created.

If you follow the commit --amend workflow, the traits can be added by a commit hook such as commit-msg.
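
As an illustration, here is a minimal sketch of such a hook, assuming the `commit-msg` hook is used and the trait is a random hex string under the `Change-Id` token. It is not the hook that Gerrit ships; it only shows the mechanism. The script installs the hook in a throwaway repository, commits, then amends, so that both messages can be compared.

```shell
#!/bin/sh
# Sketch: a commit-msg hook that adds a Change-Id trailer once.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"   # placeholder identity
git config user.email "author@example.com"

# Git runs commit-msg with the path of the message file as $1.
mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# 20 random bytes rendered as 40 hex characters.
random_id=$(od -An -N20 -tx1 /dev/urandom | tr -d ' \n')
# Append the trailer unless a Change-Id trailer already exists.
git interpret-trailers --in-place --if-exists doNothing \
    --trailer "Change-Id: I$random_id" "$1"
EOF
chmod +x .git/hooks/commit-msg

echo one > file.txt
git add file.txt
git commit -q -m "Add file.txt"
first=$(git log -1 --format=%B | grep '^Change-Id: ')

# Amending re-runs the hook, but the existing trailer is kept.
echo two >> file.txt
git add file.txt
git commit -q --amend --no-edit
second=$(git log -1 --format=%B | grep '^Change-Id: ')
echo "$first"
echo "$second"
```

Because of `--if-exists doNothing`, the amended commit keeps the trailer generated for the first commit, which is what the commit --amend workflow relies on.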

If you follow the branch-then-squash workflow, the traits should be added when the squashed commit is generated. I have a tool, git-gerrit-subcommand, to support the branch-then-squash workflow. With this subcommand, the workflow looks like this:

  • Make sure you are on the main/master branch
git gerrit whereami
  • Start a new feature:
git gerrit start-work
  • Edit-commit loop
git add MY_EDITED_FILES
git commit
  • Create a new change
git gerrit submit
  • If not approved, go back to the edit-commit loop
  • (After getting approval) Accept the change

This tool can be used with GitHub. When the git config parameter gerrit.remoteIsNotGerrit is set to true, the tool pushes the branch holding the squashed commit to GitHub, allowing you to create a pull-request from it.

In both workflows, the trait should remain the same until the pull-request is accepted.

How to use these traits

Unfortunately, git doesn’t have a way to structurally search the commit logs. However, if the traits were added as a trailer of the commit message, searching with --grep in the git-log command is straightforward:

git log --grep "^Change-Id: I0287ae2ce24a876e13dd0863fa65c82dd847ebf5$"
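
Building on that command, the sketch below shows one way to find every commit carrying a given trait across all branches and to list the branches containing each of them. The repository, branch names and identity are invented for the example; `--all` and `git branch --contains` are ordinary Git options.

```shell
#!/bin/sh
# Sketch: locate all commits with a given Change-Id and their branches.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"   # placeholder identity
git config user.email "author@example.com"
change_id=I0287ae2ce24a876e13dd0863fa65c82dd847ebf5

# One fix on the starting branch, cherry-picked onto a release branch.
git commit -q --allow-empty -m "Initial commit"
git branch release
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix a bug" -m "Change-Id: $change_id"
fix_commit=$(git rev-parse HEAD)
git checkout -q release
git commit -q --allow-empty -m "Start release"
git cherry-pick "$fix_commit" >/dev/null

# Search every branch for the trailer line, anchored at both ends.
matches=$(git log --all --format=%H --grep "^Change-Id: $change_id\$")
for commit in $matches; do
    printf '%s is on: ' "$commit"
    git branch --contains "$commit" --format='%(refname:short)' | tr '\n' ' '
    echo
done
match_count=$(echo "$matches" | wc -l | tr -d ' ')
echo "commits found: $match_count"
```

Here the search finds two commits with different ids, one per branch, which is exactly the tracking that the commit id alone cannot provide.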

All-in-one solution

The real all-in-one solution is to use the Gerrit Code Review system. Gerrit Code Review (Gerrit for short) is software that provides git hosting and code review for software developers. Android and Chromium development is organized around Google-managed Gerrit services.

In Gerrit, the equivalent concept to a pull-request is the change. A change in Gerrit is a commit (and only one commit) identified by the "Change-Id" in the commit message trailer. Code review is held around the change. After being reviewed, the change is accepted and merged into the main branch, and Gerrit supports "rebase" merge. So Gerrit covers all 3 points of linear git history I mentioned: one-commit-per-pull-request, rebase merge and traits in commit messages.

Summary

One-commit-per-pull-request and rebase-merge alone are not adequate for a linear git history, because commit ids change along the way. Adding traits to commit messages helps track each commit in the repository. Such traits can be just random strings.

Jun Sheng is an expert in various areas. GitOps, Big Data and AI are his current focus.
