Using Git the Right Way

2024-10-11
,

I’ll first describe 3 different levels of Git proficiency. Then I’ll discuss ways to actually use Git “The Right Way” to get the most value out of it, and end with some tips. Hopefully by the end of this post you’ll want to become an advanced Git user if you’re not one already.

Beginner

You use Git as a necessary evil, because it’s mandated by your company to use it. It’s the only way for you to contribute code, so you put up with it.

You use git commit . to commit everything in the (top-level?) directory, because you don’t know how to use git rebase -i effectively.

Intermediate

You realize that Git is a useful tool for organizing code changes. You have multiple branches and know how to keep each one up-to-date. You might also know how to use git rebase -i to clean up history, for example to separate out formatting changes from semantic changes.

You use git add -p to help avoid needing to reach for git rebase -i.

Advanced

You appreciate Git commit messages as an important tool for archiving historical context and knowledge about every code change. These commit messages are written in stone forever, so you take care to craft each one with the utmost care. You care about giving a good reason (the why) in each commit message. And sometimes there are multiple “why” reasons: why you are authoring the commit at all in the first place, as well as why you are choosing the particular approach in the code.

This makes it pleasant for others to review your code, because you’ve taken the time to explain everything in each commit, perhaps even preemptively answering questions that code reviewers might have.

You try your best to help others understand the benefit of clean, beautiful Git history, because you know that future maintainers of the codebase will appreciate the story behind how it evolved over time.

Using Git “The Right Way” as an Advanced User

But what does it mean to use Git as an advanced user? How do commits actually get created in an advanced way? This is what that looks like in practice:

The last point may be the dealbreaker for most Git users, because it requires git rebase -i familiarity. However there are at least two fantastic benefits that you get from a codebase where Git was used in “The Right Way”:

The benefit to code review alone should be reason enough for you to get better at using Git if you’re not comfortable with it already.

Tips for getting better at Git

Most useful Git commands

If I had to recommend 2 commands (which are most often neglected), it would be:

These two commands help you craft small, short-and-sweet commits that do one thing well. It makes code review a breeze because the reviewer only has to look at each of these small commits, one at a time.

Don’t create meaningless merges

Another thing you can do is to avoid merging in the latest main (or master) branch into your PR branch. GitHub shockingly encourages this horrible behavior, encouraging newcomers to create many useless merge commits in a PR before it is merged, making a real mess of the history.

Never merge main into your branch. Instead, every PR should have only 1 merge commit — when the PR is merged into main. This keeps the history clean.

Rich commit messages

Lastly, I recommend using the active voice in present tense (imperative mood) to keep commit messages as direct as possible. Here’s an example of a noisy commit message not adhering to these rules:

fixed string buffer to have dynamically allocated memory instead of statically allocated limit of 255 characters sometimes we might need more than this esp on production

Terrible, right? Here’s why:

In summary, there are all of these speed bumps that just get in your way. It has a very low signal-to-noise ratio. Well, at least it’s better than “fix string bug” or some other lazy gibberish.

Now consider how the same commit message could have been written:

module foo: use dynamic strings to avoid truncation

The use of the 255-character limit for strings in module foo was first
introduced in 0a6f593 (use fixed sizes for strings, 2022-03-14). This
worked for a long time because we always knew (at program init time)
what each string looked like.

But then 51b475c (allow changing strings during runtime, 2024-06-29)
introduced the ability for strings to be reassigned during runtime, not
just at program initialization. This means that sometimes users were
assigning strings longer than 255 characters without realizing that this
was the case, because we don't perform input validation during string
reassignments. This led to these strings getting silently truncated,
leading to a jarring experience later on.

Drop the 255-character limit, and instead use dynamic strings. We could
probably get away with a larger 1024-character limit, but then given how
most strings are far less than 1024 characters, using a larger hardcoded
limit would lead to a large amount of memory being wasted in practice.
If we do need to go back to a larger hardcoded limit in the future, we
could consider moving toward a model of multiple hardcoded limits,
perhaps 255, 512, and 1024. For now, using dynamic allocation everywhere
is simpler, so that's what we do here.

And here’s why it’s better:

Of course, it is longer. And it took more effort to write. I’m arguing that such effort is worth it for future maintainers (maybe it’ll be the same person who authored the commit 6 months later who has forgotten everything about “module foo”).

Conclusion

Hopefully I’ve convinced you that using Git is much more nuanced than simply knowing the right commands. If you want to get exposure to a project that follows the above recommendations (full of advanced Git users), I recommend the Git project itself.

Please use Git to its full potential. Don’t settle!