Version control systems like Git help manage changes to files. Sometimes you’ll want (or need) to add a ‘feature’ or a ‘patch’ to a collaborative research project. Or maybe you want to make some experimental changes to your code without touching your main code. This, and more, is where branches come into play.
In this code-along we’ll go over what branches are, and how and why you would use them.
In very simple terms, git branches are individual projects within a git repository. Different branches within a repository can have completely different files and folders, or they can be identical except for a few lines of code in one file.
Let’s use a few real world examples (at least ones that I’ve used before; others may have used branches differently):
revision branch in your manuscript git repository. In that branch you make the changes to your manuscript in response to the reviewers. Once you are satisfied, you merge the branch into the master branch and resubmit the article.
fixing to fix the problems without messing with the master dataset. After you confirm the problem is real and that you have the solution, you submit a pull request of the fixing branch to be merged with the
feature branch is created to address these problems, which will eventually get merged in with the master branch for the next version of the software. This ensures that other people’s code isn’t broken every time a bug gets fixed.
There are many uses of branches in Git. The nice (and very powerful) thing about Git is that branches are very cheap compared to other version control systems. By cheap, I mean they don’t take up much disk space, it’s computationally easy to move between branches, and it’s (relatively) easy to merge branches together. This is because of how Git represents branches: they are simply pointers to an individual commit. That’s it. Just a pointer… Git commit history is a directed acyclic graph, which means that every commit except the very first has a ‘parent’ commit (the previous commit in the history, or multiple parents when a merge happens), and any individual commit can have multiple ‘children’. This history can be traced back through the ‘lineage’ or ‘ancestry’. A branch just gives a name to each ‘lineage’ when a commit has multiple children.
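To see the "just a pointer" idea in practice, here is a minimal sketch. It assumes Git is installed, works in a throwaway directory, and uses a placeholder name and email that are stored only in that practice repository; the file and branch names are just illustrations.

```shell
# Work in a throwaway directory so nothing else on your computer is touched.
repo_dir=$(mktemp -d)
cd "$repo_dir"
git init --quiet

# Placeholder identity (an assumption), saved only in this practice repo.
git config user.name "Practice User"
git config user.email "practice@example.com"

# Make one commit so there is something for a branch to point at.
echo "Some text" > file.txt
git add file.txt
git commit --quiet -m "First commit"

# Creating a branch only writes a new pointer; no files are copied.
git branch testBranch

# Both names resolve to the same commit hash.
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse testBranch)
echo "HEAD:       $head_commit"
echo "testBranch: $branch_commit"
```

The two printed hashes should be identical, because the new branch is nothing more than a second name for the commit you were already on.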
When you merge two branches together, the commit histories get merged as well, which means that all the changes you made in each branch get combined back into a single lineage rather than two. This makes it easier to work collaboratively on a project, since each person can work on their own branch without the messiness that could come from everyone working on one branch.
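As a rough illustration of two lineages being joined, the sketch below builds a tiny history that forks and then merges it, and finally counts the parents of the merge commit. It assumes Git is installed and runs in a throwaway directory; the identity, file names, and variable names are placeholders chosen for this example.

```shell
repo_dir=$(mktemp -d)
cd "$repo_dir"
git init --quiet
git config user.name "Practice User"          # placeholder identity
git config user.email "practice@example.com"

# Remember the default branch name (often master or main).
main_branch=$(git symbolic-ref --short HEAD)

echo "Shared start" > start.txt
git add start.txt
git commit --quiet -m "First commit"

# One commit on a side branch...
git checkout --quiet -b testBranch
echo "Some text" > file.txt
git add file.txt
git commit --quiet -m "Added a file with text"

# ...and a different commit on the default branch, so the history forks.
git checkout --quiet "$main_branch"
echo "Text in another file" > new-file.txt
git add new-file.txt
git commit --quiet -m "Added another file"

# Merge; --no-edit accepts the default merge message without opening an editor.
git merge --quiet --no-edit testBranch

# The merge commit lists two parents, one from each lineage.
parent_count=$(git log -1 --pretty=%P | wc -w | tr -d ' ')
echo "Parents of the merge commit: $parent_count"

# Files from both branches are now present in the working directory.
ls
```

After the merge, both file.txt and new-file.txt sit side by side on the default branch, even though each was committed on a different branch.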
Branches are best understood visually, so let’s start by using this website to see what the merge commands are doing.
After we’ve tried that, let’s do it locally (on your own computer). Here is a sequence of commands to try out:
    cd ~/Desktop
    mkdir git-branches
    cd git-branches
    git init # start a repo
    echo "Practice repo" > README.txt # a first file, so there is something to commit
    git add .
    git commit -m "First commit" # make the first commit
    git branch testBranch # create branch
    git checkout testBranch # move to branch
    ## can also do git checkout -b testBranch
    echo "Some text" > file.txt
    git add file.txt
    git commit -m "Added a file with text"
    git checkout master # use main here if that is your default branch name
    echo "Text in another file" > new-file.txt
    git add new-file.txt
    git commit -m "Added another file"
    git log --graph --oneline --decorate --all
    # This command is long, so shorten it using aliases
    git config --global alias.lg 'log --graph --oneline --decorate --all'
    git merge testBranch
    git lg
    git branch -d testBranch # delete the branch
I mentioned this already, but branches are best used when doing a pull request (unless the pull request is very small or few people work on the repository).
The steps to take would be:
git checkout -b fixingBranch
git push origin fixingBranch
fixingBranch on your forked repo on GitHub and on your computer (git checkout master && git pull upstream master && git branch -d fixingBranch)
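To practice these steps without touching a real project, one option is to rehearse them locally. The sketch below is only an approximation: two bare repositories (upstream.git and fork.git, both hypothetical names) stand in for the original project and your fork on GitHub, a plain push stands in for the maintainer accepting the pull request, and `git push origin --delete` is my assumption for how you might remove the branch from the fork from the command line.

```shell
# Everything happens in a throwaway directory.
work_dir=$(mktemp -d)
cd "$work_dir"

# Hypothetical stand-ins for GitHub: the original project and your fork.
git init --quiet --bare upstream.git
git init --quiet --bare fork.git
# Make both use master as their default branch, whatever Git's own default is.
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git --git-dir=fork.git symbolic-ref HEAD refs/heads/master

# Seed both with one commit so there is something to fork.
git init --quiet seed
cd seed
git config user.name "Practice User"          # placeholder identity
git config user.email "practice@example.com"
echo "data" > data.txt
git add data.txt
git commit --quiet -m "First commit"
git push --quiet "$work_dir/upstream.git" HEAD:refs/heads/master
git push --quiet "$work_dir/fork.git" HEAD:refs/heads/master
cd "$work_dir"

# Your computer: clone the fork (origin) and register the original as upstream.
git clone --quiet fork.git my-copy
cd my-copy
git config user.name "Practice User"
git config user.email "practice@example.com"
git remote add upstream "$work_dir/upstream.git"

# Create the branch and commit the fix on it.
git checkout --quiet -b fixingBranch
echo "fixed data" > data.txt
git commit --quiet -am "Fix the data problem"

# Push the branch to your fork; on GitHub you would open the pull request from it.
git push --quiet origin fixingBranch

# Stand-in for the pull request being accepted into the original project.
git push --quiet upstream fixingBranch:master

# Update your local master, then delete the branch locally and on the fork.
git checkout --quiet master
git pull --quiet upstream master
git branch -d fixingBranch
git push --quiet origin --delete fixingBranch
```

On a real project the middle step happens on GitHub rather than through a push, but the commands you type on your own computer are the same ones listed in the steps above.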
If you have any questions, one of the best places to start is often either StackOverflow or Google (which more likely than not links to StackOverflow).
cd - change directory
mkdir - make a directory
echo - print a message to the screen, or to a file if > (redirect) is present
git init - start or initialize a git repository
git add - put a file into the staging area, so that git starts tracking it
git commit - send files in the staging/index area into the history (the git repository)
git log --graph --oneline --decorate --all - view the commit history in the git repository and the branches, with each commit as one line
git branch - create, list, or delete branches; a branch is an individual line of commit history that contains files that may differ from other branches
git checkout - a way to move across the git commits and branches
git merge - combine a branch into the currently checked out branch (i.e. the branch you are on)