Last updated on 26th Sep 2020, Blog, Tutorials
What is Git?
Git is currently the most popular implementation of a distributed version control system.
Git originates from the Linux kernel development and was created in 2005 by Linus Torvalds. Nowadays it is used by many popular open source projects, e.g., the Android or the Eclipse developer teams, as well as many commercial organizations.
The core of Git was originally written in the programming language C, but Git has also been re-implemented in other languages, e.g., Java, Ruby and Python.
A Git repository contains the history of a collection of files starting from a certain directory. The process of copying an existing Git repository via the Git tooling is called cloning. After cloning a repository the user has the complete repository with its history on his local machine. Of course, Git also supports the creation of new repositories.
If you want to delete a Git repository, you can simply delete the folder which contains the repository.
If you clone a Git repository, by default, Git assumes that you want to work in this repository as a user. Git also supports the creation of repositories targeting the usage on a server.
- bare repositories are supposed to be used on a server for sharing changes coming from different developers. Such repositories do not allow the user to modify files locally and to create new versions for the repository based on these modifications.
- non-bare repositories target the user. They allow you to create new changes through modification of files and to create new versions in the repository. This is the default type which is created if you do not specify any parameter during the clone operation.
A local non-bare Git repository is typically called a local repository.
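As a rough sketch of the difference between the two repository types, the following commands create a scratch repository and clone it twice, once as a normal (non-bare) repository and once as a bare one. The directory names (origin, user-copy, server-copy.git) are made up for illustration.

```shell
# Scratch directory; all names below are illustrative
work=$(mktemp -d)

# Create a new, empty repository
git init -q "$work/origin"

# Default clone: a non-bare repository with a working tree
# (stderr is silenced because Git warns when cloning an empty repository)
git clone -q "$work/origin" "$work/user-copy" 2>/dev/null

# Bare clone: repository data only, no working tree
git clone -q --bare "$work/origin" "$work/server-copy.git" 2>/dev/null

# Ask Git which kind each clone is
user_is_bare=$(git -C "$work/user-copy" rev-parse --is-bare-repository)
server_is_bare=$(git -C "$work/server-copy.git" rev-parse --is-bare-repository)
echo "user-copy bare: $user_is_bare, server-copy.git bare: $server_is_bare"

# Deleting the folder deletes the repositories:
# rm -rf "$work"
```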
A local repository provides at least one collection of files which originate from a certain version of the repository. This collection of files is called the working tree. It corresponds to a checkout of one version of the repository with potential changes done by the user.
The user can change the files in the working tree by modifying existing files and by creating and removing files.
A file in the working tree of a Git repository can have different states. These states are the following:
- untracked: the file is not tracked by the Git repository. This means that the file was never staged nor committed.
- tracked: committed and not staged
- staged: staged to be included in the next commit
- dirty / modified: the file has changed but the change is not staged
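The states above can be observed with git status. The following is a minimal sketch in a throwaway repository; the file names and the user identity are placeholders. In the short status format, ?? marks an untracked file, A in the first column a newly staged file, and M in the second column a modification that is not staged.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored only in this scratch repository
git config user.name "Example User"
git config user.email "user@example.org"

# Commit one file so that it is tracked
echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "two" >> tracked.txt      # modified (dirty), not staged
echo "new" > staged.txt
git add staged.txt             # staged for the next commit
echo "tmp" > untracked.txt     # untracked

status=$(git status --short)
echo "$status"
```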
After doing changes in the working tree, the user can add these changes to the Git repository or revert these changes.
Adding to a Git repository via staging and committing
After modifying your working tree you need to perform the following two steps to persist these changes in your local repository:
- add the selected changes to the staging area (also known as index) via the git add command
- commit the staged changes into the Git repository via the git commit command
This process is depicted in the following graphic.
The git add command stores a snapshot of the specified files in the staging area. It allows you to incrementally modify files, stage them, modify and stage them again until you are satisfied with your changes.
Some tools and Git users prefer the term index instead of staging area. Both terms mean the same thing.
After adding the selected files to the staging area, you can commit these files to add them permanently to the Git repository. Committing creates a new persistent snapshot (called commit or commit object) of the staging area in the Git repository. A commit object, like all objects in Git, is immutable.
The staging area keeps track of the snapshots of the files until the staged changes are committed.
For committing the staged changes you use the git commit command.
If you commit changes to your Git repository, you create a new commit object in the Git repository. See Commit object (commit) for information about the commit object.
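The snapshot behavior of the staging area can be sketched as follows. The file is staged, then modified again without a second git add, so the commit contains the staged snapshot and not the later edit. The file name and identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

echo "first draft" > notes.txt
git add notes.txt                  # snapshot of "first draft" goes to the staging area
echo "second draft" > notes.txt    # working tree changes again, staging area does not

git commit -q -m "Add notes"

# The commit holds what was staged, not the current working tree content
committed=$(git show HEAD:notes.txt)
working=$(cat notes.txt)
echo "committed: $committed / working tree: $working"
```

To include the later edit as well, you would run git add notes.txt again before committing.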
Synchronizing with other Git repositories (remote repositories)
Git allows the user to synchronize the local repository with other (remote) repositories.
Users with sufficient authorization can send new versions in their local repository to remote repositories via the push operation. They can also integrate changes from other repositories into their local repository via the fetch and pull operations.
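A minimal sketch of push, fetch and pull, using a bare repository on the local file system as a stand-in for a remote server. The names server.git, alice and bob are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the shared remote
git init -q --bare server.git

# First developer clones (empty-repository warning silenced), commits and pushes
git clone -q server.git alice 2>/dev/null
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.org"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"
git push -q origin HEAD            # push the current branch under the same name
cd ..

# Second developer clones and receives the pushed history
git clone -q server.git bob

# Alice publishes another change
cd alice
echo "more" >> readme.txt
git commit -q -am "Extend readme"
git push -q origin HEAD
cd ../bob

# fetch only downloads; the local branch is now behind its upstream
git fetch -q origin
behind=$(git rev-list --count 'HEAD..@{upstream}')

# pull integrates the fetched commits into the current branch
git pull -q --ff-only
alice_head=$(git -C ../alice rev-parse HEAD)
bob_head=$(git rev-parse HEAD)
echo "behind before pull: $behind"
```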
The concept of branches
Git supports branching which means that you can work on different versions of your collection of files. A branch allows the user to switch between these versions so that he can work on different changes independently from each other.
For example, if you want to develop a new feature, you can create a branch and make the changes in this branch. This does not affect the state of your files in other branches. For example, you can work independently on a branch called production for bugfixes and on another branch called feature_123 for implementing a new feature.
Branches in Git are local to the repository. A branch created in a local repository does not need to have a counterpart in a remote repository. Local branches can be compared with other local branches and with remote-tracking branches. A remote-tracking branch proxies the state of a branch in another remote repository.
Git supports the combination of changes from different branches. The developer can use Git commands to combine the changes at a later point in time.
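The branch example can be sketched like this, reusing the branch names production and feature_123 from the text. The file name and identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
git commit -q --allow-empty -m "Initial commit"

# Create the two branches; checkout -b creates a branch and switches to it
git checkout -q -b production
git checkout -q -b feature_123

# Work on the feature branch
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Implement feature 123"

# Switching back: the feature file is not part of production
git checkout -q production
if [ -e feature.txt ]; then before=present; else before=absent; fi

# Combine the changes at a later point in time
git merge -q feature_123
if [ -e feature.txt ]; then after=present; else after=absent; fi
echo "before merge: $before, after merge: $after"
```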
The details of the commit objects
Commit object (commit)
Conceptually a commit object (short: commit) represents a version of all files tracked in the repository at the time the commit was created. Commits know their parent(s) and this way capture the version history of the repository.
Technical details of a commit object
This commit object is addressable via a hash (SHA-1 checksum). This hash is calculated based on the content of the files, the content of the directories, the complete history up to the new commit, the committer, the commit message, and several other factors.
This means that Git is safe: you cannot manipulate a file or the commit message in the Git repository without Git noticing that the corresponding hash no longer fits the content.
The commit object points to the individual files in this commit via a tree object. The files are stored in the Git repository as blob objects and might be packed by Git for better performance and more compact storage. Blobs are addressed via their SHA-1 hash.
Packing involves storing changes as deltas, compression and storage of many objects in a single pack file. Pack files are accompanied by one or multiple index files which speed up access to individual objects stored in these packs.
A commit object is depicted in the following picture.
The above picture is simplified. Tree objects point to other tree objects and file blobs. Objects which didn’t change between commits are reused by multiple commits.
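The object structure can be inspected with the git cat-file command. The following is a small sketch in a scratch repository (file name and identity are placeholders). It prints the commit and its tree, and checks that the blob hash stored in the commit equals the hash Git computes from the file content.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Print the commit object: tree, author, committer and message
git cat-file -p HEAD

# Print the tree the commit points to: one entry per file or subdirectory
git cat-file -p 'HEAD^{tree}'

# Object types of the commit, its tree and the file entry
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:greeting.txt)

# A blob is addressed by the hash of its content
blob_hash=$(git rev-parse HEAD:greeting.txt)
recomputed=$(git hash-object greeting.txt)
echo "$commit_type -> $tree_type -> $blob_type ($blob_hash)"
```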
Hash and abbreviated commit hash
A Git commit object is identified by its hash (SHA-1 checksum). SHA-1 produces a 160-bit (20-byte) hash value. A SHA-1 hash value is typically rendered as a hexadecimal number, 40 digits long.
In a typical Git repository you need fewer characters to uniquely identify a commit object. As a minimum you need 4 characters and in a typical Git repository 5 or 6 are sufficient. This short form is called the abbreviated commit hash or abbreviated hash. Sometimes it is also called the shortened SHA-1 or abbreviated SHA-1.
Several commands, e.g., the git log command, can be instructed to use the shortened SHA-1 for their output.
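A short sketch of full versus abbreviated hashes, assuming a repository with the default SHA-1 object format. The setup lines only create a throwaway commit to look at.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
git commit -q --allow-empty -m "Initial commit"

# Full hash and the abbreviated form chosen by Git
full=$(git rev-parse HEAD)
short=$(git rev-parse --short HEAD)
echo "full:  $full"
echo "short: $short"

# The abbreviated hash can be used wherever a commit is expected
resolved=$(git rev-parse "$short")

# git log with abbreviated hashes
git log --oneline
```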
Predecessor commits, parents and commit references
Each commit has zero or more direct predecessor commits. The first commit has zero parents, merge commits have two or more parents, most commits have one parent.
In Git you frequently want to refer to certain commits. For example, you want to tell Git to show you all changes which were done in the last three commits. Or you want to see the differences introduced between two different branches.
Git allows addressing commits via commit reference for this purpose.
A commit reference can be a simple reference (simple ref), in which case it points directly to a commit. This is the case for a commit hash or a tag. A commit reference can also be a symbolic reference (symbolic ref, symref). In this case it points to another reference (either simple or symbolic). For example, HEAD is a symbolic ref if it points to a branch: HEAD points to the branch pointer and the branch pointer points to a commit.
Branch references and the HEAD reference
A branch points to a specific commit. You can use the branch name as reference to the corresponding commit. You can also use HEAD to reference the corresponding commit.
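The relationship between HEAD, the branch pointer and the commit can be sketched as follows. The branch name is read from the repository instead of being assumed, because the default branch name depends on the Git configuration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
git commit -q --allow-empty -m "Initial commit"

# HEAD is a symbolic ref: it names the branch it points to
head_target=$(git symbolic-ref HEAD)
echo "HEAD -> $head_target"

# Resolving HEAD and resolving the branch yield the same commit
via_head=$(git rev-parse HEAD)
via_branch=$(git rev-parse "$head_target")
echo "commit: $via_head"
```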
Parent and ancestor commits
You can use ^ (caret) and ~ (tilde) to reference predecessor commit objects from other references. You can also combine the ^ and ~ operators. See Using caret and tilde for commit references for their usage.
The Git terminology is parent for ^ and ancestor for ~.
Using caret and tilde for commit references
[reference]~1 describes the first predecessor of the commit object accessed via [reference]. [reference]~2 is the first predecessor of the first predecessor of the [reference] commit. [reference]~3 is the first predecessor of the first predecessor of the first predecessor of the [reference] commit, etc.
- [reference]~ is an abbreviation for [reference]~1.
For example, you can use the HEAD~1 or HEAD~ reference to access the first parent of the commit to which the HEAD pointer currently points.
- [reference]^1 also describes the first predecessor of the commit object accessed via [reference].
For example HEAD^ is the same as HEAD~ and is the same as HEAD~1.
The difference is that [reference]^2 describes the second parent of a commit. A merge commit typically has two predecessors. HEAD^3 means ‘the third parent of a merge’ and in most cases this won’t exist (merges are generally between two commits, though more is possible).
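The following sketch builds a small history with one merge commit and resolves a few caret and tilde references. The commit messages (c1, c2, c3, side work) and the branch name side are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

git commit -q --allow-empty -m "c1"
git commit -q --allow-empty -m "c2"
git checkout -q -b side
git commit -q --allow-empty -m "side work"
git checkout -q -                          # back to the previous branch
git commit -q --allow-empty -m "c3"
git merge -q --no-ff -m "Merge side" side  # HEAD is now a merge commit

# ^1 and ~1 both select the first parent
first_parent=$(git rev-parse HEAD^1)
tilde_one=$(git rev-parse HEAD~1)

# ^2 selects the second parent, here the tip of the merged branch
second_parent=$(git rev-parse HEAD^2)
side_tip=$(git rev-parse side)

# ~2 follows first parents twice: merge -> c3 -> c2
grandparent_subject=$(git log -1 --format=%s HEAD~2)
echo "HEAD~2 is $grandparent_subject"
```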
Commit ranges with the double dot operator
You can also specify ranges of commits. This is useful for certain Git commands, for example, for seeing the changes between a series of commits.
The double dot operator allows you to select all commits which are reachable from a commit c2 but not from commit c1. The syntax for this is c1..c2. A commit A is reachable from another commit B if A is a direct or indirect parent of B.
- Think of c1..c2 as all commits as of c1 (not including c1) until commit c2
For example, you can ask Git to show all commits which happened between HEAD and HEAD~4.
- git log HEAD~4..HEAD
This also works for branches. To list all commits which are in the master branch but not in the testing branch, use the following command.
- git log testing..master
You can also list all commits which are in the testing but not in the master branch.
- git log master..testing
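To see the double dot operator at work, here is a sketch with a master and a testing branch that diverge after a common base commit. The commit messages are invented, and git rev-list --count is used to count the commits that git log would list for the same range.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

git commit -q --allow-empty -m "base"
git checkout -q -b testing
git commit -q --allow-empty -m "t1"
git commit -q --allow-empty -m "t2"
git checkout -q master
git commit -q --allow-empty -m "m1"

# Commits reachable from master but not from testing
git log --oneline testing..master
only_master=$(git rev-list --count testing..master)

# Commits reachable from testing but not from master
git log --oneline master..testing
only_testing=$(git rev-list --count master..testing)
echo "only in master: $only_master, only in testing: $only_testing"
```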
Commit ranges with the triple dot operator
The triple dot operator allows you to select all commits which are reachable either from commit c1 or commit c2 but not from both of them.
This is useful to show all commits in two branches which have not yet been combined.
- # show all commits which
- # can be reached by master or testing
- # but not both
- git log master...testing
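A sketch of the triple dot operator on two diverged branches. The commit messages are invented. The --left-right option of git log marks each commit with < or > to show from which side of the range it is reachable.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

git commit -q --allow-empty -m "base"
git checkout -q -b testing
git commit -q --allow-empty -m "t1"
git commit -q --allow-empty -m "t2"
git checkout -q master
git commit -q --allow-empty -m "m1"

# Commits reachable from master or testing, but not from both
git log --oneline --left-right master...testing
not_shared=$(git rev-list --count master...testing)
echo "commits not shared by both branches: $not_shared"
```

The base commit is reachable from both branches and is therefore not part of the result.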
The Git command line tools
The core Git development team provides tooling for the command line via the git command. Without any arguments, this command lists its options and the most common commands. You can get help for a certain Git command via the help command line option followed by the command.
- git help [command to get help for]
- To see all possible commands, use the git help --all command.
Git supports a short and a long version for several options, similar to other Unix commands. The short version uses a single hyphen and the long version uses two hyphens. The following two commands are equivalent.
- git commit -m "This is a message"
- git commit --message "This is a message"
Separating parameters and file arguments in Git commands
The double hyphens (--) in Git separate any references or other options from a path (usually file names). For example, HEAD has a special meaning in Git. Using double hyphens allows you to distinguish between looking at a file called HEAD and a Git commit reference called HEAD.
In case Git can determine the correct parameters and options automatically the double hyphens can be avoided.
- # seeing the git log for the HEAD file
- git log -- HEAD
- # seeing the git log for the HEAD reference
- git log HEAD --
- # if there is no HEAD file you can use HEAD as commit reference
- git log HEAD
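The effect of the separator can be tried out in a scratch repository that really contains a file named HEAD. This is only a sketch; the second file name and the identity are placeholders, and git rev-list --count is used to count the commits the corresponding git log call would show.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

# First commit adds a file that happens to be named HEAD
echo "just a file" > HEAD
git add -- HEAD
git commit -q -m "Add a file named HEAD"

# Second commit touches a different file
echo "other" > other.txt
git add other.txt
git commit -q -m "Add other.txt"

# History of the file HEAD (path after the separator)
git log --oneline -- HEAD
file_commits=$(git rev-list --count HEAD -- HEAD)

# History of the reference HEAD (nothing after the separator)
git log --oneline HEAD --
ref_commits=$(git rev-list --count HEAD --)
echo "commits touching the file: $file_commits, commits on the reference: $ref_commits"
```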
The Eclipse IDE provides excellent support for working with Git repositories.
See Using Git with the Eclipse IDE for an introduction into the usage of Git with Eclipse.
Other graphical tools for Git
You can also use graphical tools.
See GUI Clients for an overview of other available tools. Graphical tools like Visual Studio Code, NetBeans or IntelliJ also provide integrated Git tooling but are not covered in this description.
Installation of the Git command line tooling
Ubuntu, Debian and derived systems
On Ubuntu and similar systems you can install the Git command line tool via the following command:
sudo apt-get install git
Fedora, Red Hat and derived systems
On Fedora, Red Hat and similar systems you can install the Git command line tool via the following command:
sudo dnf install git
Other Linux systems
To install Git on other Linux distributions please check the documentation of your distribution. The following listing contains the commands for the most popular ones.
- # Arch Linux
- sudo pacman -S git
- # Gentoo
- sudo emerge -av git
- # SUSE
- sudo zypper install git
A Windows version of Git can be found on the Git download page. This website provides native installers for each operating system. The homepage of the Windows Git project is Git for Windows.
The easiest way to install Git on a Mac is via the Git download page and to download and run the installer for Mac OS X.
Git is also installed by default with the Apple Developer Tools on Mac OS X.
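Whichever installation route you choose, a quick way to check that the command line tooling is available is to ask Git for its version. The variable name below is only for illustration.

```shell
# Prints a line such as "git version <number>" if Git is on the PATH
version=$(git --version)
echo "$version"
```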
Git configuration levels
The git config command allows you to configure your Git settings. These settings can be system wide, user or repository specific.
A more specific setting overwrites values in the previous level. A setting for the repository overrides the user setting and a user setting overrides a system wide setting.
Git system-wide configuration
You can provide a system wide configuration for your Git settings. A system wide configuration is not very common. Most settings are user specific or repository specific as described in the next chapters.
On a Unix based system, Git uses the /etc/gitconfig file for this system-wide configuration. To set this up, ensure you have sufficient rights, i.e. root rights, in your OS and use the --system option for the git config command.
Git user configuration
Git allows you to store user settings in the .gitconfig file located in the user home directory. This is also called the global Git configuration.
For example Git stores the committer and author of a change in each commit. This and additional information can be stored in the Git user settings.
User configuration is done if you include the --global option in the git config command. In each Git repository you can also configure settings which apply only to this repository.
Repository specific configuration
You can also store repository specific settings in the .git/config file of a repository. Use the --local option or use no flag at all. If neither the --system nor the --global parameter is used, the setting is specific for the current Git repository.
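The override order can be sketched as follows. To avoid touching your real settings, the example points HOME (and XDG_CONFIG_HOME) at a scratch directory first; this sandboxing and the two names are assumptions made for the example, not something Git requires.

```shell
# Sandbox: a scratch HOME so that the real ~/.gitconfig stays untouched
HOME=$(mktemp -d)
XDG_CONFIG_HOME="$HOME/.config"
export HOME XDG_CONFIG_HOME

repo=$(mktemp -d)
cd "$repo"
git init -q

# User (global) level, stored in ~/.gitconfig
git config --global user.name "Global Name"

# Repository level, stored in .git/config
git config --local user.name "Repository Name"

# Reading a single level versus reading the effective value
global_value=$(git config --global --get user.name)
local_value=$(git config --local --get user.name)
effective=$(git config --get user.name)
echo "global: $global_value, local: $local_value, effective: $effective"
```

The effective value is the repository setting, because it is more specific than the user setting.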
User credential configuration
You have to configure at least your user and email address to be able to commit to a Git repository because this information is stored in each commit.
- # configure the user which will be used by Git
- # this should be your full name, not an acronym
- git config --global user.name "Firstname Lastname"
- # configure the email address
- git config --global user.email "firstname.lastname@example.org"
If you are using Git in a version below 2.0 you should also execute the following command.
- # set default so that only the current branch is pushed
- git config --global push.default simple
This configures Git so that the git push command pushes only the active branch to your Git remote repository. As of Git version 2.0 this is the default, so the setting is only needed for older versions.
You learn about the push command in Push changes to another repository.
Always rebase during pull
By default, Git runs the git fetch followed by the git merge command if you use the git pull command. You can configure git to use git rebase instead of git merge for the pull command via the following setting.
- # use rebase during pull instead of merge
- git config --global pull.rebase true
This setting helps to avoid merge commits during the pull operation which synchronizes your Git repository with a remote repository. The author of this description always uses this setting for his Git repositories.
Allow rebasing with uncommitted changes
If you want Git to automatically save your uncommitted changes before a rebase you can activate autoStash. After the rebase is done your changes will get reapplied. For an explanation of git stash please see Stashing changes in Git.
git config --global rebase.autoStash true
- Before Git v2.6, git pull --rebase did not respect this setting.
The following command enables color highlighting for Git in the console.
git config --global color.ui auto
Setting the default editor
By default Git uses the system default editor which is taken from the VISUAL or EDITOR environment variables if set. You can configure a different one via the following setting.
- # setup vim as default editor for Git (Linux)
- git config --global core.editor vim
Setting the default merge tool
File conflicts might occur in Git during an operation which combines different versions of the same files. In this case the user can directly edit the file to resolve the conflict.
Git also allows you to configure a merge tool for solving these conflicts. You have to use third party visual merge tools like tortoisemerge, p4merge, kdiff3 etc. A web search for these tools helps you to install them on your platform. Keep in mind that such tools are not required; you can always edit the files directly in a text editor.
Once you have installed them you can set your selected tool as default merge tool with the following command.
- # setup kdiff3 as default merge tool (Linux)
- git config --global merge.tool kdiff3
- # to install it under Ubuntu use
- sudo apt-get install kdiff3
All possible Git settings are described under the following link: git-config manual page
Query Git settings
To query your Git settings, execute the following command:
- git config --list
If you want to query the global settings you can use the following command.
- git config --global --list
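Besides listing everything, you can also query a single setting with the --get option. The following sketch sets a value in a scratch repository and reads it back; the chosen key and value are only an example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store a repository specific setting
git config --local core.editor vim

# Query one setting
editor=$(git config --get core.editor)
echo "core.editor = $editor"

# List only the settings of this repository
git config --local --list
```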