a journal
25 September, 2013
When it comes to understanding git, I've found myself struggling to provide beginners with a good overview of how I intuitively picture what's happening when I use various git commands.
Usually, I end up scribbling a pile of dots on a bit of paper and saying something like the following:
It's really simple: each dot is a commit, and each dot is linked to at least one "parent" dot. A branch is just a label attached to one of the dots. "Committing a change to a branch" really just means "adding a dot and moving the label to that new dot". git reset really just means "move the label to a specific dot", and git merge just means "add a new dot which links back to two other dots, then move the label to that new dot".
I sense the eyes begin to glaze over around the point where I claim this is "simple".
It turns out Sam Livingston-Gray had a similar explanation, but was smart enough to realise that this looks exactly like graph theory.
Seriously, go and read his amazing Think like (a) git. I mean it, it'll take like twenty minutes.
I'll wait.
Go. I'll wait.
Isn't it awesome? I love it! It perfectly captures my intuition about git when I realised that the base units are commits, not branches or tags or files or anything like that.
git really is quite simple: nodes and edges. Easy!
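If you'd like to poke at the dots-and-labels picture yourself, here's a minimal sketch in a throwaway repo. The temp directory, the identity settings and the commit messages are all made up for illustration; the only point is that the master label resolves to a commit, and that committing moves it.

```shell
# Throwaway repo so nothing real gets touched (all names here are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "first dot"
first=$(git rev-parse master)             # the dot the master label points at

git commit -q --allow-empty -m "second dot"
second=$(git rev-parse master)            # the label has moved to the new dot

parent=$(git rev-parse 'master^')         # the new dot links back to its parent
echo "master moved from $first to $second (parent: $parent)"
```

The parent of the second commit is the first one, which is the "edge" in the nodes-and-edges picture.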
I really do love almost everything about that resource, but there's one part that I think could be streamlined a little. Part way through, Sam has this great moment where he fesses up:
… before I tried something I was a little uncertain about, I would back up the entire directory.
We've all been there, right? That moment when we're not sure, so we take a backup so that we can always get back if it goes horribly wrong. No harm in that, for sure, but if there's one lesson I have learned about git, it's that it's likely already a step ahead of me.
Buckle up. This is going to get awesome.
The issue Sam is highlighting here is the fear that "when I merge, there's no simple way for me to see what the result of that merge will be, in absolute terms. How can I preview the result so I can be comfortable?"
The solution he outlines, which is perfectly reasonable, is to create a new test branch from the merge target (probably master), and then to merge the changes down to the new branch. If it all looks good, we can merge the changes down to master. If not, we just blow away the test branch and figure out what went wrong.
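For reference, that rehearsal might look roughly like this. It's a sketch of the general idea rather than Sam's exact commands, run in a throwaway repo; the merge_test branch name is my own invention.

```shell
# Throwaway repo with a master branch and a feature branch (setup is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "adds README"

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

before=$(git rev-parse master)

# Rehearse the merge on a scratch branch cut from master.
git checkout -q -b merge_test master
git merge -q --no-edit awesome_feature
git diff --stat master            # what the merge would change on master

# Happy or not, master itself was never touched; drop the scratch branch.
git checkout -q master
git branch -q -D merge_test
after=$(git rev-parse master)
```

Note that master ends exactly where it began, so the real merge still has to be done afterwards.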
Thing is, this is similar to the pattern of backing up before we go forward, but unlike that pattern, it doesn't really buy us much.
The great thing about git is the fact that it very rarely throws anything away. Let's consider this simple workflow:
git checkout -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -m "adds feature checklist"
So this has passed peer review, and I'm ready to merge down to master. What's my usual flow? Let's assume master is up-to-date with all our remotes, and dive right in:
git checkout master
git merge awesome_feature
Wait, what?! That's terrifying!
What did I just do? Well, I merged my branch down to master, and nothing went wrong.
Or, at least, I think nothing went wrong. How can I know for sure?
The simplest way I can think of, if you're dealing with a remote (on GitHub, say), is to diff with that remote. Assuming your remote is called origin:
git diff origin/master
This will show you all the changes that your merge has introduced compared with the remote version of master (as of your last fetch), and obviates the need to create a "copy" of master to check against "actual" master. Your local master already is that copy: it's a separate label from origin/master, which stays put until you next fetch or push.
That's all well and good, but what if the merge is a complete clustercuss? My checked-out master is now a bombsite. How do I fix this mess?
Well, this is where reset comes in real handy. See, reset says "take the current branch (label), and point it at a specific commit". In this case, we want to take the master "label", and have it point at what it was before we merged. With a remote repo, this is trivial:
git reset --hard origin/master
Boom. Our checked out master branch is now pointing at the same commit as origin/master is pointing at: in other words, we're back to where we started. Clean as a whistle.
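If you want to try the whole merge, diff, reset round trip somewhere safe, here's a rough sketch. It assumes a local bare repository standing in for GitHub as origin, and it forces a merge commit with --no-ff; both are my choices for the demo, not part of the workflow above.

```shell
# A bare repo plays the part of the remote; the clone is our working copy.
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git clone -q "$work/origin.git" "$work/clone" 2>/dev/null
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "adds README"
git push -q origin master                  # origin/master now marks this commit

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

git checkout -q master
git merge -q --no-ff --no-edit awesome_feature

changed=$(git diff --name-only origin/master)   # files the merge introduced
echo "changed since origin/master: $changed"

git reset -q --hard origin/master          # move the master label back
```

One thing to keep in mind: --hard also throws away any uncommitted changes in the working directory, so it's best used on a clean checkout.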
Okay, smarty: what happens if it's a purely local repo? You don't have an origin/master to fall back to, so you're buggered. Hah!
Not so fast. Let's say you don't have a remote. reset will still do the right thing, but we'll need to give it the actual SHA commit reference of where we were at. You wrote that down before you merged, right?
No?
NEVER FEAR. git totally has your back.
You might be thinking that you could just use git log to see where we were before. Here's my git log output:
git log --oneline
473578b Merge branch 'awesome_feature'
ebe0f08 adds feature checklist
9ea423f adds README
I mean, it's not bad, but it might not be clear which commit belongs to which branch. I could use a visualiser:
git log --oneline --graph
* 473578b Merge branch 'awesome_feature'
|\
| * ebe0f08 adds feature checklist
|/
* 9ea423f adds README
but where's the fun in that? git has a far cooler way of keeping track of where you've been:
git reflog
473578b HEAD@{0}: merge awesome_feature: Merge made by the 'recursive' strategy.
9ea423f HEAD@{1}: checkout: moving from awesome_feature to master
ebe0f08 HEAD@{2}: commit: adds feature checklist
9ea423f HEAD@{3}: checkout: moving from master to awesome_feature
9ea423f HEAD@{4}: commit (initial): adds README
Man, I love the reflog.
The reflog is a local log of every commit HEAD has pointed at (recently, at least: old entries do eventually expire), along with some narrative of how you got there. Note the second line of the output:
9ea423f HEAD@{1}: checkout: moving from awesome_feature to master
It tells me quite clearly that I moved from my awesome_feature branch to master, and when I was done I was looking at commit 9ea423f. After that, I merged:
473578b HEAD@{0}: merge awesome_feature: Merge made by the 'recursive' strategy.
So 9ea423f is the right commit to compare with:
git diff 9ea423f
and the right one to reset to if it's all gone horribly, horribly Pete Tong:
git reset --hard 9ea423f
and there we go: dodgy merge aborted!
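If you'd rather not read the SHA off the screen, the same lookup can be scripted. This is only a sketch under a few assumptions: a throwaway local repo, a merge forced with --no-ff so there is a merge commit to undo, and a grep for the checkout message as it appears in the reflog output above.

```shell
# Purely local throwaway repo: no remote to fall back on.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "adds README"

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

git checkout -q master
before=$(git rev-parse HEAD)               # kept only so we can check the answer
git merge -q --no-ff --no-edit awesome_feature

# %H is the full commit hash, %gs the reflog message for that entry.
previous=$(git reflog --format='%H %gs' \
  | grep 'checkout: moving from awesome_feature to master' \
  | head -n 1 | cut -d' ' -f1)

git diff --stat "$previous"                # what the merge brought in
git reset -q --hard "$previous"            # dodgy merge aborted
```

Matching on the message text is fragile if you've hopped between the same two branches several times; head -n 1 just takes the most recent hop.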
This might seem a little convoluted, but hey: git itself seems "convoluted" compared to just backing up folders before you make any changes you're not sure of. The point here is that, because all you're really doing is making commits and applying labels, regardless of what you do, there's very little that ever gets lost. This frees you up to wield that git chainsaw with a little less fear.
Take this, for example: let's say you've been working on a branch, and decide it's time to merge down to master. You check out, and then get called into a meeting. You come back and run git merge my_cool_feature… and then realise you weren't actually on master. You've just merged your cool feature into some other branch, and it's a complete mess. No worries:
git reset --hard HEAD@{1}
This is a shortcut for saying that you want to reset the current branch to where HEAD was one step back in the reflog, i.e. before the last git action that moved it: in this case, to where it was before your crazy merge.
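Here's that accident replayed in a throwaway repo, as a sketch. The branch some_other_branch and the file cool.txt are invented for the demo, and the merge is forced with --no-ff so it leaves a merge commit behind. It only works this neatly because the merge was the very last thing to move HEAD.

```shell
# Throwaway repo with master, a feature branch, and an unrelated branch.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "adds README"

git checkout -q -b my_cool_feature
echo "cool" > cool.txt
git add cool.txt
git commit -q -m "adds cool feature"

git checkout -q -b some_other_branch master   # oops: this is not master
before=$(git rev-parse HEAD)
git merge -q --no-ff --no-edit my_cool_feature

git reset -q --hard 'HEAD@{1}'                # back to where HEAD was before the merge
after=$(git rev-parse some_other_branch)
```

If you've run other commands since the merge, check git reflog first and pick the right HEAD@{n} instead of assuming it's 1.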
git is not just an awesome source control system, it's also silently keeping a little undo log running in the background, a meta log of all the commits you've looked at and how you got to each one.
git is not only a safety net for your code, it's a safety net for your process.
Now, seriously this time, go and read Think like (a) git.
Update: Thanks to Sam for the helpful feedback on a couple of points in this article.