Can someone tell me the difference between HEAD, working tree and index, in Git?
From what I understand, they are all names for different branches. Is my assumption correct?
I found this:
A single git repository can track an arbitrary number of branches, but your working tree is associated with just one of them (the "current" or "checked out" branch), and HEAD points to that branch.
Does this mean that HEAD and working tree are always the same?
HEAD is the commit at the tip of the current branch. If you've just checked out the branch, i.e. have no modified files, then its content matches the working tree. As soon as you modify anything, it no longer matches.
Staging Area to that list. What is HEAD, Working Tree, Index and a Staging Area
A few other good references on those topics:
My Git Workflow
https://i.stack.imgur.com/cZkcV.jpg
I use the index as a checkpoint.
When I'm about to make a change that might go awry — when I want to explore some direction that I'm not sure if I can follow through on or even whether it's a good idea, such as a conceptually demanding refactoring or changing a representation type — I checkpoint my work into the index. If this is the first change I've made since my last commit, then I can use the local repository as a checkpoint, but often I've got one conceptual change that I'm implementing as a set of little steps. I want to checkpoint after each step, but save the commit until I've gotten back to working, tested code.
Notes: the workspace is the directory tree of (source) files that you see and edit. The index is a single, large, binary file in
Why Git is better than X
https://i.stack.imgur.com/CYO5D.jpg
Git Is Your Friend not a Foe Vol. 3: Refs and Index
They are basically named references for Git commits. There are two major types of refs: tags and heads. Tags are fixed references that mark a specific point in history, for example v2.6.29. On the contrary, heads are always moved to reflect the current position of project development.
(note: as commented by Timo Huovinen, those arrows are not what the commits point to; they show the workflow order, basically 1 -> 2 -> 3 -> 4, where 1 is the first commit and 4 is the last)
Now we know what is happening in the project. But to know what is happening right here, right now there is a special reference called HEAD. It serves two major purposes: it tells Git which commit to take files from when you checkout, and it tells Git where to put new commits when you commit. When you run git checkout ref it points HEAD to the ref you’ve designated and extracts files from it. When you run git commit it creates a new commit object, which becomes a child of current HEAD. Normally HEAD points to one of the heads, so everything works out just fine.
The difference between HEAD (current branch or last committed state on current branch), index (aka. staging area) and working tree (the state of files in checkout) is described in "The Three States" section of the "1.3 Git Basics" chapter of Pro Git book by Scott Chacon (Creative Commons licensed).
Here is the image illustrating it from this chapter:
https://i.stack.imgur.com/NxTUz.png
In the above image, "working directory" is the same as "working tree", the "staging area" is an alternate name for the git "index", and HEAD points to the currently checked out branch, whose tip points to the last commit in the "git directory (repository)".
Note that git commit -a would stage changes to tracked files and commit in one step.
working tree seems to be preferred to working directory nowadays. See github.com/git/git/commit/…
Your working tree is what is actually in the files that you are currently working on.
HEAD is a pointer to the branch or commit that you last checked out, and which will be the parent of a new commit if you make it. For instance, if you're on the master branch, then HEAD will point to master, and when you commit, that new commit will be a descendant of the revision that master pointed to, and master will be updated to point to the new commit.
The index is a staging area where the new commit is prepared. Essentially, the contents of the index are what will go into the new commit (though if you do git commit -a, this will automatically add all changes to files that Git knows about to the index before committing, so it will commit the current contents of your working tree). git add will add or update files from the working tree into your index.
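The relationship described above can be checked in a throwaway repository. This is only a sketch: the temporary directory, the file name file.txt and the demo identity are made up for illustration, and the name of the initial branch depends on your Git configuration.

```shell
#!/bin/sh
set -e
# Throwaway repository; file name and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > file.txt
git add file.txt                 # working tree -> index
git commit -q -m "first commit"  # index -> new commit; the current branch moves to it

head_ref=$(git symbolic-ref HEAD)          # the branch HEAD points to, for example refs/heads/master
head_commit=$(git rev-parse HEAD)          # the commit reached through HEAD
branch_commit=$(git rev-parse "$head_ref") # the commit the branch itself points to

echo "HEAD -> $head_ref -> $head_commit"
```

Both lookups resolve to the same commit, because HEAD normally just names the current branch and the branch names the commit.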
New (untracked) files are not picked up by git commit -a (you need to add them with git add), so your working tree may have extra files that your index, your local repo, or your remote repo do not have.
HEAD refers to the most recent commit, so when you commit, you are updating HEAD to your new commit, which matches the index. Pushing doesn't have much to do with it - it makes branches in the remote match branches in your local repo.
Working tree
Your working tree is the set of files that you are currently working on.
Git index
The git "index" is where you place files you want to commit to the git repository.
The index is also known as cache, directory cache, current directory cache, staging area, staged files.
Before you "commit" (check in) files to the git repository, you need to first place the files in the git "index".
The index is not the working directory: you can type a command such as git status, and git will tell you what files in your working directory have been added to the git index (for example, by using the git add filename command).
The index is not the git repository: files in the git index are files that git would commit to the git repository if you used the git commit command.
git reset --hard HEAD to make sure that your index == your working tree, and then: mkdir history && git checkout-index --prefix history/ -a
The result is a duplication of your entire working tree in your history/ directory. Ergo git index >= git working directory
echo untracked-data > untracked-file, before or after the git reset --hard and git checkout-index steps. You will find that the untracked file is not in the history directory. You can also modify both index and work-tree independently, although modifying the index without first touching the work-tree is hard (requires using git update-index --index-info).
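The experiment from these two comments can be run end to end roughly as follows. It is a sketch in a scratch repository: tracked-file and the demo identity are hypothetical names added for the demonstration, while untracked-file and history/ come from the comments above.

```shell
#!/bin/sh
set -e
# Scratch repository; tracked-file and the identity are made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "tracked" > tracked-file
git add tracked-file
git commit -q -m "add tracked-file"

echo untracked-data > untracked-file   # never added, so it is not in the index
git reset -q --hard HEAD               # index and tracked files now match HEAD
mkdir history
git checkout-index --prefix=history/ -a  # export every index entry under history/

ls history
```

Only tracked-file is exported, which shows that the index holds tracked content and says nothing about untracked files in the working tree.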
This is an inevitably long yet easy to follow explanation from ProGit book:
Note: For reference you can read Chapter 7.7 of the book, Reset Demystified
Git as a system manages and manipulates three trees in its normal operation:
HEAD: Last commit snapshot, next parent
Index: Proposed next commit snapshot
Working Directory: Sandbox
The HEAD
HEAD is the pointer to the current branch reference, which is in turn a pointer to the last commit made on that branch. That means HEAD will be the parent of the next commit that is created. It’s generally simplest to think of HEAD as the snapshot of your last commit on that branch.
What does it contain? To see what that snapshot looks like, run the following in the root directory of your repository:
git ls-tree HEAD
It would result in something like this:
$ git ls-tree HEAD
100644 blob a906cb2a4a904a152... README
100644 blob 8f94139338f9404f2... Rakefile
040000 tree 99f1a6d12cb4b6f19... lib
The Index
Git populates this index with a list of all the file contents that were last checked out into your working directory and what they looked like when they were originally checked out. You then replace some of those files with new versions of them, and git commit converts that into the tree for a new commit.
What does it contain?
Use git ls-files -s to see what it looks like. You should see something like this:
100644 a906cb2a4a904a152e80877d4088654daad0c859 0 README
100644 8f94139338f9404f26296befa88755fc2598c289 0 Rakefile
100644 47c6340d6459e05787f644c2447d2595f5d3a54b 0 lib/simplegit.rb
The Working Directory
This is where your files reside and where you can try changes out before committing them to your staging area (index) and then into history.
Visualized Sample
Let's see how these three trees (as the Pro Git book refers to them) work together. Git’s typical workflow is to record snapshots of your project in successively better states, by manipulating these three trees. Take a look at this picture:
https://i.stack.imgur.com/NjC3A.png
To visualize this, consider the following scenario. Say you go into a new directory with a single file in it. Call this v1 of the file. It is indicated in blue. Running git init will create a Git repository with a HEAD reference which points to the unborn master branch.
https://i.stack.imgur.com/FdHxp.png
At this point, only the working directory tree has any content. Now we want to commit this file, so we use git add to take content in the working directory and copy it to the index.
https://i.stack.imgur.com/zxC4h.png
Then we run git commit, which takes the contents of the index and saves it as a permanent snapshot, creates a commit object which points to that snapshot, and updates master to point to that commit.
https://i.stack.imgur.com/m47S1.png
If we run git status, we’ll see no changes, because all three trees are the same.
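The init, add, commit sequence above can be replayed as a small script. This is a sketch under assumptions: the scratch directory, the file name file.txt and the demo identity are invented for the example.

```shell
#!/bin/sh
set -e
# Scratch repository; file name and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > file.txt            # so far only the working tree has content

# HEAD names a branch that has no commit yet (the "unborn" branch).
if git rev-parse -q --verify HEAD >/dev/null; then born=yes; else born=no; fi

git add file.txt                # copy the content into the index
staged=$(git ls-files -s)       # the index now lists file.txt

git commit -q -m "v1"           # snapshot the index and move the branch to the new commit
status=$(git status --porcelain)  # empty when all three trees agree

echo "branch had a commit before the first commit: $born"
echo "index after git add: $staged"
```

After the commit, git status --porcelain prints nothing, which is the machine-readable form of "no changes".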
The beautiful point
git status shows the difference between these trees in the following manner:
If the working tree is different from the index, then git status will show changes not staged for commit.
If the working tree is the same as the index, but both are different from HEAD, then git status will list files under the changes to be committed section.
If the working tree is different from the index, and the index is different from HEAD, then git status will list some files under the changes not staged for commit section and some under the changes to be committed section.
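These three cases can be reproduced with git status --porcelain, whose two status columns stand for "index vs HEAD" and "working tree vs index". The sketch below uses a scratch repository with a hypothetical file.txt and demo identity.

```shell
#!/bin/sh
set -e
# Scratch repository; file name and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"

# Case 1: working tree differs from the index (not staged).
echo "v2" > file.txt
unstaged=$(git status --porcelain)   # " M file.txt"

# Case 2: working tree equals the index, both differ from HEAD (staged).
git add file.txt
staged=$(git status --porcelain)     # "M  file.txt"

# Case 3: working tree differs from the index, and the index differs from HEAD.
echo "v3" > file.txt
both=$(git status --porcelain)       # "MM file.txt"

printf '%s\n%s\n%s\n' "$unstaged" "$staged" "$both"
```

The first column is the index compared with HEAD (changes to be committed) and the second is the working tree compared with the index (changes not staged for commit).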
For the more curious
Note about the git reset command
Hopefully, knowing how the reset command works will further clarify why these three trees exist.
The reset command is your time machine in Git: it can take you back in time and bring back old snapshots for you to work on. In this picture, HEAD is the wormhole through which you travel in time. Let's see how it works with an example from the book:
Consider the following repository which has a single file and 3 commits which are shown in different colours and different version numbers:
https://i.stack.imgur.com/0pAJy.png
The state of the trees is shown in the next picture:
https://i.stack.imgur.com/cBZyr.png
Step 1: Moving HEAD (--soft):
The first thing reset will do is move what HEAD points to. This isn’t the same as changing HEAD itself (which is what checkout does). reset moves the branch that HEAD is pointing to. This means if HEAD is set to the master branch, running git reset 9e5e6a4 will start by making master point to 9e5e6a4. If you call reset with the --soft option it will stop here, without changing the index or the working directory. Our repo will look like this now:
Notice: HEAD~ is the parent of HEAD
https://i.stack.imgur.com/aXBkk.png
Looking a second time at the image, we can see that the command essentially undid the last commit. As the working tree and the index are the same but different from HEAD, git status will now show changes in green, ready to be committed.
Step 2: Updating the index (--mixed):
This is the default option of the command.
Running reset with the --mixed option updates the index with the contents of whatever snapshot HEAD points to currently, leaving the working directory intact. Your repository will then look as if you had done some work that is not yet staged, and git status will show it as changes not staged for commit, in red. This option undoes the last commit and also unstages all the changes. It's as if you made changes but have not run git add yet. Our repo would look like this now:
https://i.stack.imgur.com/FusDJ.png
Step 3: Updating the Working Directory (--hard):
If you call reset with the --hard option, it goes one step further and copies the contents of the snapshot HEAD is pointing to into both the index and the working directory. After reset --hard, it is as if you had gone back to a previous point in time and done nothing since. Uncommitted changes to tracked files are overwritten. See the picture below:
https://i.stack.imgur.com/ZjR0K.png
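The three steps can be observed one at a time in a scratch repository. The sketch below is not the book's repository: the file name, commit messages and demo identity are assumptions made for illustration. It applies --soft, then --mixed, then --hard in sequence, so the effect of each stage shows up in git status --porcelain.

```shell
#!/bin/sh
set -e
# Scratch repository with three commits; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done

# Step 1 (--soft): only the branch that HEAD points to moves back to v2.
git reset -q --soft HEAD~
soft=$(git status --porcelain)    # "M  file.txt": index and working tree still hold v3

# Step 2 (--mixed): the index is also made to match HEAD (v2).
git reset -q --mixed HEAD
mixed=$(git status --porcelain)   # " M file.txt": only the working tree still holds v3

# Step 3 (--hard): the working tree is overwritten as well.
git reset -q --hard HEAD
hard=$(git status --porcelain)    # empty: all three trees hold v2

printf 'soft:  %s\nmixed: %s\nhard:  %s\n' "$soft" "$mixed" "$hard"
```

Running git reset --hard HEAD~ directly from the third commit would perform all three steps at once. They are split here only to make each stage visible.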
Conclusion
I hope now you have a better understanding of these trees and have a great idea of the power they bring to you by enabling you to change your files in your repository to undo or redo things you have done mistakenly.