Introduction to git
06 Apr 2023
Reading
For an overview of how Git can be used in the biological sciences, please read this excellent article by Ram.
For a practical introduction, please read Chapter 5 in Bioinformatics Data Skills, available from the library here.
The Github Handbook is also nice.
After you have some experience with git, this cheat sheet may be helpful (but right now it will probably just be confusing.)
Git: reproducibility and collaboration
This document will introduce you to Git, a version control system that is a great aid in writing software, maintaining documentation, and maintaining reproducibility.
What does Git do? Git keeps track of changes that you (and your collaborators) make in your documents. By maintaining a record of all the changes that have been made, you can restore your project to an earlier state if needed (i.e. if you screw up). Git also allows you to maintain different versions (known as branches in Git) simultaneously, an incredibly useful feature. For example, you can maintain a “main” branch that works correctly and try out changes in a “develop” branch without breaking the working “main” version. Once you know that your changes in “develop” are functioning as intended, you can merge them into “main”.
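The “main” and “develop” idea described above can be sketched as follows. This is only an illustration in a throwaway directory; the file name analysis.txt, the commit messages, and the example identity are made up for the sketch.

```shell
# Work in a throwaway directory so nothing real is touched
cd "$(mktemp -d)"
git init --quiet demo
cd demo
# Identity for this scratch repository only (skip if already configured)
git config user.name "Example Student"
git config user.email "student@example.com"

# A first commit holding the working version, then name the branch "main"
echo "version 1" > analysis.txt
git add analysis.txt
git commit --quiet -m "Working version"
git branch -M main

# Try out a change on a separate "develop" branch
git checkout --quiet -b develop
echo "experimental change" >> analysis.txt
git add analysis.txt
git commit --quiet -m "Try an experimental change"

# "main" is untouched until the change is merged
git checkout --quiet main
cat analysis.txt            # still only "version 1"
git merge --quiet develop   # bring the tested change into main
cat analysis.txt            # now also contains "experimental change"
```

Until the merge, anything that goes wrong on “develop” stays on “develop”, which is the point of keeping the two branches separate.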
A few key concepts
A project that is tracked by Git is called a repository (repo for short).
To start a new repository you use the git init command.
To add files for Git to track you use git add [FILE] (where [FILE] is replaced by the actual filename).
When you have made some changes to your project and you want to commit those changes to the repository, it is a two-step process. First add the changes with git add [FILE] and then use git commit, typically with the option -m to include a brief message about the changes made.
If you are collaborating with others, or just want to share your project, you will want to set up a remote repository. One common (and free!) hosting site is GitHub. When you want to add your changes to the remote repository you push to the repository using git push. When you want to download changes that others have made you pull them using git pull.
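As a minimal sketch of the local part of this workflow (init, add, commit), the commands might look like this. The directory name my_project, the file notes.md, and the example identity are placeholders chosen for illustration; git push and git pull are left out because they need a remote repository.

```shell
cd "$(mktemp -d)"                       # throwaway location for the sketch
mkdir my_project
cd my_project
git init --quiet                        # start a new repository here
# Identity for this scratch repository only (skip if already configured)
git config user.name "Example Student"
git config user.email "student@example.com"

echo "My notes" > notes.md
git status --short                      # notes.md is listed as untracked (??)
git add notes.md                        # step 1: stage the file
git commit --quiet -m "Add notes file"  # step 2: record it with a message

echo "A second line" >> notes.md
git add notes.md                        # changes must be added again
git commit --quiet -m "Expand notes"
git log --oneline                       # one line per commit, newest first
```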
Learn about git using a tutorial
Now let’s see some of this in action.
Exercise: Keep track of what each command you learn does by making notes for yourself in a markdown document named gitNotes.md. Save this file, to be turned in later.
We will next do a tutorial, Git-it.
IMPORTANT: WHEN THE TUTORIAL ASKS YOU TO CLONE YOUR REPOSITORY USING HTTPS, DON’T DO IT. USE SSH INSTEAD
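If a repository does end up pointing at an HTTPS URL, one possible way to switch it to SSH afterwards is sketched below. USER and REPO are placeholders, not a real account or repository, and the sketch only edits the stored URL without contacting GitHub.

```shell
cd "$(mktemp -d)"
git init --quiet demo
cd demo
# USER and REPO are placeholders for your account and repository names
git remote add origin https://github.com/USER/REPO.git   # HTTPS form
git remote -v                                            # show the stored URL
git remote set-url origin git@github.com:USER/REPO.git   # switch to the SSH form
git remote get-url origin                                # confirm the change
```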
The tutorial is already installed on your instance. To start the tutorial:
- Open the “Git-it” folder on your desktop. Be patient: the first time it runs it may take ~1 minute to start. If it still hasn’t started after a minute, double-click again. If you do not see the folder, click on the “Files” icon at the bottom first.
- In the “Git-it” folder, double-click the “Git-it” icon.
Proceed through the Git-it exercises. It may take a few seconds for the window to open when you click on the “SELECT DIRECTORY” button. Skip the section “Step Install Git” on the first page; Git is already installed. But do configure Git on your instance as described in “Step Configure Git”, also on the first page. You will also need to create a GitHub account (unless you already have one) as instructed on the fourth page of the Git-it tutorial.
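Configuring Git usually comes down to telling it your name and email address. A sketch of what that might look like is below; the name and email are placeholders, and the first two lines point HOME at a throwaway directory purely so the sketch does not overwrite real settings.

```shell
# Throwaway HOME so this sketch leaves real settings alone;
# omit these two lines when configuring your instance for real.
HOME="$(mktemp -d)"
export HOME

# Placeholders: use your own name and the email tied to your GitHub account
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
git config --global --list        # show what was stored
```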
When asked to edit files in the tutorial you can use nano or RStudio.
If you want an alternative (or additional) tutorial, you can try the one at katacoda (not required)
Now let’s try it in real life.
Make a repository and collaborate
Work with one or two lab partners. Each partner should follow along with what the others are doing so you are versed in all steps.
Designate one of you to create a new repository. This is Partner 1.
There are two ways to make a new repository and get the local and remote versions linked. Either you create it on Github first and clone it down to your computer or you init it on your computer and link it to a Github repository.
Partner 1 (only) should create a new repository, using one of the two options below:
1) Create the repository on Github first.
Do NOT type git init. In this case, since you already initialized a repository on GitHub, it is not needed.
- From your github.com home page click on the green “+ New Repository” button
- On the resulting page give it a name, check the “Initialize this repository with a README” box and press the “Create Repository” button.
- Click on “SSH” and then on the clipboard icon to copy the URL.
- Open the terminal on your computer, cd to the parent directory of wherever you want the repository to reside and then git clone URL where URL is the URL that you copied from GitHub.
- Next, cd to your repository and begin working on it.
OR
2) Create the repository on your computer first (this is what you did in the tutorial).
- cd to the parent directory of where you want the repository to reside.
- mkdir NAME where NAME is the name you want for your repository.
- Very Important: cd NAME to move into the repository
- git init to initialize a repository in the current directory
- Add a file to the repository. For example:
touch README.md
git add README.md
git commit -m "Added README.md"
- Go to Github.com
- From your github.com home page click on the green “+ New Repository” button (right hand side of screen)
- On the resulting page give it a name and press the “Create Repository” button. DO NOT check the “Initialize this repository with a README” box.
- Click on the clipboard icon to copy the URL next to the heading “…or push an existing repository from the command line”
- Paste that into the terminal while in the directory of your repository, e.g.
git remote add origin git@github.com:jnmaloof/test2.git
git branch -M main
git push -u origin main
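Option 2 can be rehearsed end to end without a network connection. The sketch below assumes a local bare repository (fake_github.git, a made-up name) standing in for the empty GitHub repository; with the real thing, the remote URL would be the SSH URL copied from GitHub.

```shell
cd "$(mktemp -d)"
# Stand-in for the empty GitHub repository (assumption: a local bare
# repository is used instead of a git@github.com:... URL)
git init --quiet --bare fake_github.git
remote_url="$PWD/fake_github.git"

mkdir NAME
cd NAME                                # very important: move into the directory
git init --quiet
# Identity for this scratch repository only (skip if already configured)
git config user.name "Example Student"
git config user.email "student@example.com"

touch README.md
git add README.md
git commit --quiet -m "Added README.md"
git branch -M main                     # make sure the branch is called "main"
git remote add origin "$remote_url"    # on GitHub this would be the SSH URL
git push --quiet -u origin main        # -u links local main to origin/main
git status -sb                         # first line shows main...origin/main
```

Because of -u, later pushes and pulls from this repository can be written simply as git push and git pull.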
Now let’s collaborate!
- Partner 1:
- Add a file to the repository with a bit of text (e.g. what are your plans for the weekend?).
- Commit your change
- Push the repository to github
- Go to the github website for this repository.
- Add Partner 2 (and 3) as collaborators.
- Partner 2 (and 3):
- Check your email for an email from github. Click on the link to the repo
- Clone the repository to your computer. You do NOT need to fork it
- Add your information (e.g. what are your plans for the weekend?)
- Commit your change
- Push the changes to the repository
- Run git log and save the output to a file.
- Partner 1:
- Pull the changes back to your computer
- Run git log and save the output to a file.
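The whole round trip can be rehearsed on one computer. In this sketch a local bare repository (shared.git) stands in for the GitHub repository, and two directories stand in for the two partners' computers; the file name weekend.md, the log file names, and the identities are invented for illustration. Adding a collaborator happens on the GitHub website and has no equivalent here.

```shell
top="$(mktemp -d)"
cd "$top"
git init --quiet --bare shared.git       # stand-in for the GitHub repository

# Partner 1: create the repository, add a file, commit, push
git init --quiet partner1
cd partner1
git config user.name "Partner One"
git config user.email "partner1@example.com"
echo "Partner 1: hiking" > weekend.md
git add weekend.md
git commit --quiet -m "Add Partner 1 weekend plans"
git branch -M main
git remote add origin "$top/shared.git"
git push --quiet -u origin main

# Partner 2: clone (no fork needed), add information, commit, push
cd "$top"
# --branch main is only needed because of the local stand-in remote
git clone --quiet --branch main "$top/shared.git" partner2
cd partner2
git config user.name "Partner Two"
git config user.email "partner2@example.com"
echo "Partner 2: reading" >> weekend.md
git add weekend.md
git commit --quiet -m "Add Partner 2 weekend plans"
git push --quiet origin main
git log > ../partner2_log.txt            # save the log to a file

# Partner 1: pull the changes back and save the log
cd "$top/partner1"
git pull --quiet
git log > ../partner1_log.txt
cat weekend.md                           # now holds both partners' lines
```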
Use github in RStudio
- Tired of using the command line to commit and push your changes?
- You can also use the git module in RStudio as shown in lecture.
- If you already have a git repository cloned onto your computer:
- In RStudio go to File > New Project...
- Choose Existing Directory. Then select the folder that corresponds to your git repository.
- If you want to clone a repository from GitHub to your computer:
- In RStudio go to File > New Project...
- Choose Version Control > Git.
- Paste in the SSH clone URL from GitHub
- Optionally check the parent directory
- You can stage, commit, and push using the tools in the upper right hand pane using the git tab.
- Each partner should try this out.
- If you close RStudio you will need to choose Open Project either from the file menu or the right hand corner to get the git menu to come up again.
Fork a project
The above exercise illustrates one way to collaborate: each collaborator is added as a contributor to the repository. A second (and perhaps more common) method is to fork a repository. When you fork a repository you are creating your own copy of the repository. You then make changes to your fork. If you think the original creator might want to incorporate your changes, you can create a pull request to request that they pull your changes back into their repository. This is safer for the original creator because nothing changes in their repository unless they choose to include your changes.
Let’s try it. I need to collect everyone’s GitHub usernames. To do this we will each add a file to a shared repository with our details. I have created a repository https://github.com/UCDBIS180L/gh-usernames for this purpose.
- Go to the home page for that repository in your web browser.
- Fork it using the button on the upper right hand side.
- Clone your forked repository to your computer (NOT my original repository)
- Open the file “2023_Roster.csv” in nano or RStudio
- Find your name in the file and add your GitHub user name after the last comma (look at my entry for an example).
- If your name is not on the list, add a blank line and then add your name to the end of the list
- Please add your username even if you are auditing
- Save your changes to the file.
- Add and commit your changes.
- Push your change back up to your repository.
- Use the Github website to send a pull request.
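The edit, add, and commit steps might look roughly like the sketch below. Everything specific in it is an assumption: the roster rows and column layout are invented (the real file may differ), janedoe is a made-up username, and a fresh local repository stands in for the clone of your fork. The edit is made with sed only so that the sketch runs non-interactively; in the lab you would use nano or RStudio.

```shell
cd "$(mktemp -d)"
git init --quiet gh-usernames      # stand-in for the clone of your fork
cd gh-usernames
git config user.name "Example Student"
git config user.email "student@example.com"

# Hypothetical roster rows; the real file's columns may differ
printf 'Name,Section,GitHub\nDoe Jane,A01,\nRoe Richard,A02,\n' > 2023_Roster.csv
git add 2023_Roster.csv
git commit --quiet -m "Starting roster"

# Add a (made-up) username after the last comma on one row
sed 's/^Doe Jane,A01,$/Doe Jane,A01,janedoe/' 2023_Roster.csv > roster.tmp
mv roster.tmp 2023_Roster.csv

git --no-pager diff                # review exactly what changed
git add 2023_Roster.csv
git commit --quiet -m "Add GitHub username for Jane Doe"
# In a real fork, git push would follow, then the pull request on the website
```

Checking git diff before committing is a cheap way to confirm that only your own row changed, which keeps the pull request easy to review.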
More resources
Still confused? Or want to go further? Here are some additional resources
Tutorial
An alternative tutorial
GitHub for beginners Part 1
GitHub for beginners Part 2
Videos
The four part git basics series (the first two were shown in class)
- https://www.youtube.com/watch?v=8oRjP8yj2Wo
- https://www.youtube.com/watch?v=uhtzxPU7Bz0
- https://www.youtube.com/watch?v=wmnSyrRBKTw
- https://www.youtube.com/watch?v=7w5Z7LmyLgI
A longer video (50 minutes)
Online book
The official online git manual