Have you ever found yourself trawling through a history of commits to see when a particular change was made? Perhaps you’ve found a bug in your code and want to work out exactly when it was introduced? How do you find it?
Do you open the commit history on GitHub and open 10 random historical commits in new tabs, hoping one of them has the change you’re looking for?
I’ve got some good news for you: there’s a much more efficient way to do this.
Git bisect is a command that allows you to leverage the power of binary search to find changes in your code.
When you start the bisect process, you give git an example of a ‘good’ commit (a commit hash where your code doesn’t contain the bug) and a ‘bad’ commit (a commit hash where your code contains the bug). Git will then pick a commit that occurred between the ‘good’ and the ‘bad’ commits. You can check if the regression is present at that point and tell git. You then repeat this process, at each step discarding half of the remaining commits between the ‘good’ and the ‘bad’ commits, until you find the specific commit that the regression you’re looking for was introduced at.
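The halving process described above can be sketched as a small simulation. This is only an illustration, not how git is implemented: it assumes a linear history of commits numbered 1 to n, where commit 1 is known good, commit n is known bad, and every commit from `first_bad` onward contains the regression. The function name `simulate_bisect` and its arguments are made up for this example.

```shell
# Simulate bisecting a linear history of commits numbered 1..n.
# Usage: simulate_bisect <n> <first_bad>
# Assumes commit 1 is good, commit n is bad, and all commits >= first_bad are bad.
simulate_bisect() {
  good=1          # newest commit known to be good
  bad=$1          # oldest commit known to be bad
  first_bad=$2    # the commit we are hunting for (unknown in real life)
  checks=0
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))   # pick a commit halfway between good and bad
    checks=$((checks + 1))
    if [ "$mid" -ge "$first_bad" ]; then
      bad=$mid     # regression present: the culprit is mid or earlier
    else
      good=$mid    # regression absent: the culprit is later than mid
    fi
  done
  echo "first bad commit: $bad (found after $checks checks)"
}

simulate_bisect 100 42
# prints: first bad commit: 42 (found after 7 checks)
```

In a real repository, the comparison against `first_bad` is replaced by you (or a script) inspecting the checked-out code and answering ‘good’ or ‘bad’.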
Without git bisect, you would need to check out each commit in turn and check whether the regression is present. With even 100 commits, this would take quite a while. With a few thousand commits, it isn’t reasonable at all. Checking each commit in turn is a linear search, which doesn’t scale as the number of commits grows.
Using a binary search is much more efficient (especially as the number of elements you’re searching over increases) because you can discard half of the elements at each step.
In terms of time complexity, if we’ve got n commits to search through, a linear search has a time complexity of O(n) whereas the binary search that git bisect uses has a time complexity of O(log(n)).
Hypothetically, if you have a list of a billion commits to search through, a binary search can find a specific commit in a maximum of 30 steps. That’s incredible.
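To check the arithmetic, here is a rough sketch that counts how many times a range of n candidates has to be halved before a single commit is left, which is the ceiling of log2(n). The helper name `bisect_steps` is made up for this example, and it models an idealised linear history; the estimate git prints during a real bisect can differ slightly, particularly when the history contains merges.

```shell
# Worst-case number of halving steps needed to narrow n candidates down to one,
# i.e. the smallest number of steps such that 2^steps >= n.
bisect_steps() {
  n=$1
  steps=0
  reach=1                      # how many candidates 'steps' halvings can cover
  while [ "$reach" -lt "$n" ]; do
    reach=$((reach * 2))
    steps=$((steps + 1))
  done
  echo "$steps"
}

bisect_steps 100          # prints 7
bisect_steps 1000000000   # prints 30
```

A linear search over the same billion commits could need up to a billion checks, which is the difference between O(n) and O(log(n)) in practice.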
Check out a branch that has a regression in it. Alternatively, if you just want to experiment - pick a commit from the past that changed something in your codebase and make it your goal to find that specific commit.
Initiate a bisect:
git bisect start
Now let git know that the commit you’ve currently checked out contains a regression (is ‘bad’). If you don’t pass a hash here, git will assume the currently checked out commit is the ‘bad’ commit:
git bisect bad <optional commit hash here>
Next, let git know a commit hash in the past where the regression was not present (a ‘good’ commit):
git bisect good ff739dc2ad3df7d8656b88558eaf0f8b9ad3d3d3
> Bisecting: 235 revisions left to test after this (roughly 8 steps)
> [cc8c12f4822ea7bb45b10fcc3cd8d13a65e0f60d] fix: update var name
Git will now check out a commit roughly halfway between the good and the bad commits. In this example, there are 235 revisions left to test after this one, and git estimates that we can narrow them down in roughly 8 steps!
In this case, git has checked out the commit cc8c12f4822ea7bb45b10fcc3cd8d13a65e0f60d so now I can look at my code to see if the regression is present or not.
The regression is still there in this commit, so I tell git this by typing git bisect bad:
git bisect bad
Bisecting: 118 revisions left to test after this (roughly 7 steps)
[39624fe3e07038abe3658967bece9d16430f7764] chore: update helper
Now just repeat this process. At each step, git will check out a commit halfway through the remaining revisions. You tell git whether that commit is ‘good’ (git bisect good, if the regression isn’t present) or ‘bad’ (git bisect bad, if the regression is present).
Eventually, you’ll find the exact commit that introduced the regression!
❯ git bisect bad
a13093056e162df176cc66d9ab53c34322d2e939 is the first bad commit
commit a13093056e162df176cc66d9ab53c34322d2e939
Author: Seán Barry <firstname.lastname@example.org>
Date: Wed Mar 20 13:59:03 2021 +0000
fix: update endpoint url true
.../endpoints/urls.ts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Now that we’ve found the exact commit (a13093056e162df176cc66d9ab53c34322d2e939) that introduced the regression, the final step is to clean up the bisection state and return to the original HEAD with the command:
git bisect reset
Commits aren’t necessarily ‘bad’ or ‘good’, so git bisect allows you to use alternative terms.
By default, you can use ‘old’ and ‘new’ instead:
git bisect start
git bisect old ff739dc2ad3df7d8656b88558eaf0f8b9ad3d3d3
git bisect new <optional hash here or current one used>
You can also use your own custom terms:
git bisect start --term-old this-works --term-new broke-af
And if you’ve got a bad memory and can’t remember the terms you used, git will remind you:
git bisect terms
> Your current terms are this-works for the old state
> and broke-af for the new state.
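If the commit git has checked out can’t be tested (for example, it doesn’t build), or you already know it isn’t the one you’re looking for, you can skip it and git will pick a nearby commit instead: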
git bisect skip
There are more options around skipping commits that give you much more control. Find out about them in the git bisect docs.
If you can programmatically check whether or not a commit is good (perhaps by running a test or detecting the presence of a keyword or a specific file), you can write a script to check the commits for you. Git bisect will run it automatically on each commit it checks out until the first ‘bad’ commit is found.
Write a script that exits with the code 0 if the checked out commit is ‘good’ and exits with a code between 1 and 127 (inclusive), except 125, if the checked out commit is ‘bad’. The exit code 125 is reserved for commits that can’t be tested and tells git to skip them.
Then just run:
git bisect run debug_script <arguments>
This is super useful if you’re one of those people who just loves automating things, or if you’ve got a huge number of revisions to check.
Check out the documentation for a more in-depth description of how to use this: git bisect docs.
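If you’d like to try git bisect run without touching a real project, here is a minimal sandbox sketch. Everything in it is made up for illustration: it builds a throwaway repository of 20 commits in a temporary directory, plants a fake ‘bug’ (the word BUG in a file called app.txt) in commit 13, and uses a hypothetical check script that follows the exit code rules described above. It assumes git is installed and that mktemp is available.

```shell
#!/bin/sh
set -e

# Build a throwaway repository with 20 commits; commit 13 introduces the "bug".
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.com"

i=1
while [ "$i" -le 20 ]; do
  echo "line $i" >> app.txt
  if [ "$i" -eq 13 ]; then echo "BUG" >> app.txt; fi
  git add app.txt
  git commit -q -m "commit $i"
  i=$((i + 1))
done

# Hypothetical check script, kept outside the repository.
# Exit 0 = good, exit 1 = bad, exit 125 = cannot test this commit (skip).
check=$(mktemp)
cat > "$check" <<'EOF'
#!/bin/sh
[ -f app.txt ] || exit 125
if grep -q BUG app.txt; then exit 1; else exit 0; fi
EOF

# Mark HEAD as bad and the very first commit as good, then let git do the rest.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" > /dev/null
git bisect run sh "$check" > /dev/null

# While the bisect session is still active, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit"

git bisect reset > /dev/null 2>&1
```

The output is discarded here to keep the demo quiet; drop the redirections to watch git test each commit in turn. In a real project, the check script would typically build the code and run the test that exposes the regression.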
I hope this article has taught you something new, and made you a more efficient software engineer. If it has - please reach out to me on Twitter - @SeanBarryUK so I know about it!