Rewriting the History#
Learning Objectives#
Learn the appropriate situation for rewriting commit history and the risks involved
Understand how to use git reflog to recover from mistakes and access dangling commits
Learn how to amend the most recent commit to correct mistakes or update the commit message
Practice using git reset to move the HEAD of a branch to a specific commit, including the differences between --soft and --hard resets
Master the technique of rebasing to clean up commit history and integrate changes linearly
Learn how to use interactive rebase to reorder, edit, squash, or delete commits for a cleaner history
In previous lessons we have seen how to create commits, branch our work and merge changes from multiple branches. In this lesson we are going to look at the options available for rewriting the commits in our history.
When changing the history, we can lose sight of commits, so these operations aren’t completely without risk and some care must be taken. We’ll cover some precautions to take, and also see how to recover from mistakes.
When to Rewrite the History#
As a rule of thumb, don’t rewrite any history on a main
branch.
For a more nuanced approach, the more people who have a version of a commit, the less inclined you should be to rewrite the history.
The safest time to rewrite the history is on a local branch that hasn’t made it to a remote repository.
Once a tracked branch has been pushed to a repository, if we modify the history and then try to re-push, we will get an error message.
We can tell the remote to replace the commits with the ones we are supplying with the --force
argument, if we’re sure this is what we want to do.
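The rejected push and the forced push can be tried safely against a throwaway local "remote". This is a minimal sketch: the directory names, the file notes.txt and the demo identity are all made up for illustration, and it assumes git is installed.

```shell
# Throwaway bare "remote" and working repository (all names are illustrative)
cd "$(mktemp -d)"
remote="$PWD/remote.git"
git init -q --bare "$remote"
git init -q work && cd work
git symbolic-ref HEAD refs/heads/main          # make sure the branch is called main
git config user.name "Demo User" && git config user.email "demo@example.com"
git remote add origin "$remote"

echo "first draft" > notes.txt
git add notes.txt && git commit -q -m "Add notes"
git push -q --set-upstream origin main

# Rewrite a commit that has already been pushed
git commit -q --amend -m "Add notes (reworded)"

# A plain push is now rejected because local and remote histories differ
if git push -q origin main 2>/dev/null; then
    push_result="accepted"
else
    push_result="rejected"
fi
echo "plain push: $push_result"

# Only if we are sure: replace the remote commits with ours
git push -q --force origin main
```

After the forced push the remote branch points at the amended commit, and anyone who had fetched the old commit now holds a version of history that no longer matches the remote.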
Getting Unstuck#
In all the cases presented below, making changes to the commit will actually create new commits with new SHAs.
If the previous commits have no references pointing towards them (branches, tags, remote branches or detached HEAD) then they will no longer show up in our log
(even with the --all
flag).
These dangling commits are still reachable from their SHA, but it is possible they could be cleaned up at some point by git’s garbage collector. If you perform an operation here and lose a commit you wanted (step 1, Don’t Panic), you can still see and reach the most recent commits visited:
git reflog
These commits can be accessed either by their SHA or by the special references, e.g., HEAD@{1}
for the previous commit location.
In the first instance, it would be best to attach a branch or tag to the chain of misplaced commits (e.g., git tag whoops HEAD@{1}
or git branch whoopsy HEAD@{1}
).
Following this you will be at your leisure to search the internet for how to recover your previous state (typically involving git reset
).
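One possible recovery is sketched below in a throwaway repository. The mistake is simulated with a hard reset; the file name and commit messages are hypothetical, and the branch name whoopsy is borrowed from the example above.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "one" > file.txt && git add file.txt && git commit -q -m "First"
echo "two" >> file.txt && git commit -q -am "Second"
lost_sha=$(git rev-parse HEAD)

# Simulate the mistake: move the branch back, leaving "Second" dangling
git reset -q --hard HEAD~1
git --no-pager log --oneline --all     # "Second" no longer appears

# The reflog still records where HEAD was before the reset
git --no-pager reflog

# Attach a branch to the misplaced commit so it is referenced again
git branch whoopsy "HEAD@{1}"

# Restore the current branch to its previous state
git reset -q --hard whoopsy
```

Creating the branch first means the commit stays reachable even if the final reset is not what we wanted.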
Amend Commit#
The smallest (and arguably safest if done before pushing) change we can make to the history is to update the previous commit. This is useful just after we’ve made a commit, then realised we missed something, or made a mistake in the files or the commit message.
To modify the previous commit, use git add
to stage the changes we missed out (if any), then create an amend
commit:
git commit --amend
An editor window will open populated with our previous commit message, giving us the chance to update it.
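As a small illustration in a throwaway repository, the sketch below fixes both a file and a commit message in one amend. The file name and messages are invented, and -m is used here only to supply the new message without opening the editor.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

# A commit with a typo in both the file and the message
echo "helo" > greeting.txt
git add greeting.txt && git commit -q -m "Add greting"
before=$(git rev-parse HEAD)

# Stage the fix, then fold it into the previous commit with a corrected message
echo "hello" > greeting.txt
git add greeting.txt
git commit -q --amend -m "Add greeting"
after=$(git rev-parse HEAD)

# Still a single commit, but with a new SHA
echo "before: $before"
echo "after:  $after"
git --no-pager log --oneline
```

To keep the existing message and only add the staged changes, git commit --amend --no-edit can be used instead.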
Exercise: Amend#
Try this out and change the commit message, then have a look at the output of git log --oneline --graph --all
, followed by git reflog
.
Add a tag to the “missing” commit then rerun the above git log
command.
Reset the Branch#
Another relatively simple change is to move the head of a branch. This can be achieved with
git reset <commit-sha>
The arguments supplied to reset
change whether the workspace, the index or both are updated.
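The command git reset --soft
moves the branch head but leaves both the index and the working directory unchanged, so the undone changes remain staged.
The default mode, git reset --mixed
, also resets the index but leaves the working directory unchanged.
This mode is particularly useful for unstaging changes from the index.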
In contrast, git reset --hard
will change the HEAD
and also change the files in the working directory to match.
We have to be especially careful with this command, since we can lose changes that haven’t been committed anywhere.
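The effect of each mode can be compared with git status in a throwaway repository. This is a sketch with a made-up file; the status codes in the comments are the short format printed by git status --porcelain.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "v1" > file.txt && git add file.txt && git commit -q -m "First"
echo "v2" > file.txt && git commit -q -am "Second"

# --soft: only the branch head moves; the "v2" change is left staged
git reset -q --soft HEAD~1
soft_status=$(git status --porcelain)     # "M  file.txt" (staged)

# --mixed (the default): the index is reset too; the change stays in the working directory
git reset -q --mixed HEAD
mixed_status=$(git status --porcelain)    # " M file.txt" (unstaged)

# --hard: index and working directory are both reset; the uncommitted "v2" edit is gone
git reset -q --hard HEAD
hard_status=$(git status --porcelain)     # empty: nothing left to commit

printf 'soft:  [%s]\nmixed: [%s]\nhard:  [%s]\n' "$soft_status" "$mixed_status" "$hard_status"
```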
Aside: The Stash#
A useful command to quickly store any unsaved changes is
git stash
This takes the updates to our tracked files and saves them into a local stash.
(It could be used as a safer equivalent to git reset --hard HEAD
.)
We can then recover our changes with
git stash pop
This action can even be performed on a different branch, providing us with a simple quick way of moving changes (say if we created a branch but forgot to switch to it).
Multiple sets of changes can be stored in the stash; they can be annotated with --message
, and they can all be viewed with git stash list
.
See git stash --help
for extra features such as stashing untracked files, or moving stashed changes straight to a new branch.
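The "forgot to switch branch" case might look like the following sketch, run in a throwaway repository. The branch name feature, the file and the stash message are illustrative only.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "base" > file.txt && git add file.txt && git commit -q -m "Base"
git branch feature

# Oops: this edit was meant for the feature branch, but we are still on main
echo "new work" >> file.txt

# Store the uncommitted change with an annotation, leaving main clean
git stash push -q --message "work meant for feature"
git --no-pager stash list

# Switch to the intended branch and recover the change there
git checkout -q feature
git stash pop -q
git status --short
```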
Rebase Instead of Merge#
We have seen how we can use a merge to create a new commit with the changes from two branches:
  D--E feature
 /
A--B--C main
From main
, git merge feature
would result in
  D----E feature
 /      \
A--B--C--F main
where F
has the changes from B
, C
, D
and E
.
An alternative strategy would be to rebase the feature
branch to main
.
In a rebase, the commits unique to the current branch are applied at the head of the specified branch.
Returning to the previous example before the merge:
  D--E feature
 /
A--B--C main
If we are on the feature
branch, then calling git rebase main
would result in
        D'--E' feature
       /
A--B--C main
Here the two original commits unique to feature
(D
and E
) have been discarded, and two new commits with the same changes (D'
and E'
) have been applied starting at the head of main
.
The result should be the same as the merge with a “simpler” history (assuming any conflicts are resolved in the same way).
However, we have rewritten the history to get here: if any references to the previous commits remain, the same changes will appear in the history twice.
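The rebase above can be reproduced in a throwaway repository. In this sketch the helper make_commit is hypothetical, each commit adds its own file so that no conflicts arise, and feature is assumed to branch off at commit A.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, named after the commit
make_commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

make_commit A
git checkout -q -b feature
make_commit D; make_commit E
git checkout -q main
make_commit B; make_commit C

# Replay the commits unique to feature on top of the head of main
git checkout -q feature
old_tip=$(git rev-parse HEAD)
git rebase -q main

# The history is now linear: A, B, C, then the new copies of D and E
git --no-pager log --oneline --graph --all
```

The original E is still reachable through the SHA saved in old_tip (and through the reflog), but no branch points to it any more.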
Merging or rebasing comes down to a matter of choice for the situation. As with all rewrites to the history, it should be avoided when other people might have work on top of the commits which will be removed.
An alternative to rebase
is cherry-pick
, which applies the changes of commits but leaves the original branch unchanged.
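As a brief sketch of cherry-pick in a throwaway repository, the commands below copy a single commit from feature onto main. The helper make_commit and the commit names are hypothetical.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, named after the commit
make_commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

make_commit A
git checkout -q -b feature
make_commit D
picked=$(git rev-parse HEAD)
make_commit E
feature_tip=$(git rev-parse HEAD)

git checkout -q main
make_commit B

# Apply only the changes from D on top of main; feature itself is not touched
git cherry-pick "$picked"
git --no-pager log --oneline --graph --all
```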
Aside: rerere
#
If you decide to make significant use of rebase
, you may find yourself being asked to resolve the exact same conflicts multiple times.
In this case, enabling the tool rerere
, or reuse recorded resolution, will save the user’s fix of a conflict and reapply it each time it comes up.
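Enabling rerere is a single configuration setting. The sketch below turns it on for one throwaway repository; adding --global to the git config command would instead apply it to every repository for the current user.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo

# Record conflict resolutions and reuse them when the same conflict reappears
git config rerere.enabled true

# Confirm the setting
git config --get rerere.enabled
```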
Interactive Rebase#
Another use of rebase
is to review part of the history and to get to choose whether to reorder, edit, squash or delete each commit in turn.
This is known as an interactive rebase.
This is simplest when we pick a number of commits unique to the current branch and rebase against the parent of the oldest one (here, the last four commits):
git rebase --interactive HEAD~4
An editor will be opened with all the commits pick
ed for inclusion in the rebase.
We can change the first column for each commit to one of the keywords prompted.
As an example, we could edit
one commit message, and squash
a number of commits into a single commit.
This interactive rebase can be done at the same time as moving the commits to the head of another branch. However, we could more simply do the interactive rebase followed by a separate rebase.
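To show what a squash does without opening an editor, the sketch below scripts the edit of the todo list in a throwaway repository. Normally we would change the keywords by hand; here sed stands in for the editor through the GIT_SEQUENCE_EDITOR environment variable, and GIT_EDITOR=true accepts the combined commit message as offered. The file name and commit messages are invented.

```shell
# Throwaway repository so nothing real is touched
cd "$(mktemp -d)" && git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User" && git config user.email "demo@example.com"

for n in 1 2 3 4; do
    echo "line $n" >> notes.txt
    git add notes.txt && git commit -q -m "Commit $n"
done

# The todo list that would open in the editor starts with (oldest first):
#   pick <sha> Commit 2
#   pick <sha> Commit 3
#   pick <sha> Commit 4
# The sed command changes "pick" to "squash" from the second line onwards,
# so Commits 3 and 4 are folded into Commit 2.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q --interactive HEAD~3

# Two commits remain: Commit 1 and the combined commit
git --no-pager log --oneline
```

The file contents are unchanged by the squash; only the number of commits recording those changes is reduced.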
Summary#
We have seen a few options for rewriting the history, commit --amend
, reset
and rebase
.
We have also seen reflog
, the main tool for finding our way back to “lost” commits as a result of rewrites.
To end the course, we shall briefly see ten things we did not have time to cover, that you might want to read up on in your own time.