Git Tutorial

Getting Started with Git & GitHub for Project Collaboration

The total length of the videos in this section is approximately 30 minutes.

You can also view all the videos in this section at the YouTube playlist linked here.

This is our newest module, and our goal is to find out how effective this tutorial is and how to improve it. We would love to hear your feedback.

This lecture was created by Donna Gan '20.

If you are completing this module as part of Wellesley's QAI Summer Program or STAT 260, note that you do not need to memorize or even practice every command that's included here. Depending on your previous experience, Git might seem familiar/logical or completely different from anything you've seen before. Take a look at the posted assignment to see that we will ask you to practice the commands from the first few videos but not the last few. If you are new to this platform, you should watch all of the videos, but don't spend hours trying to internalize every detail. Like anything, Git is best learned with practice.

Part 1: Introduction to Git

Most projects involve collaborating with others and repeatedly updating code files and other documents. Git is an open-source, distributed version control system. In other words, Git is a way to manage a project when multiple people are making repeated changes to the same files.

Why learn Git?

Downloading Git

On a Mac, open the Terminal application and type:

git --version

If a version number is printed to the screen, you already have Git. If not, you need to download Git.

On a PC:

PC users: click here only if you are following this tutorial on the Windows command line or PowerShell and already have Git installed there. Again, we recommend downloading Git Bash instead.

The commands described in the videos and text in this module assume that you are using Mac/Linux or Git Bash. If you are used to the built-in Windows command line or PowerShell and prefer not to download Git Bash, you will have to make some substitutions. In the list below, the command to the left of the equals sign is the Linux/Mac command, and the text to the right is the Windows command line or PowerShell substitute.

pwd = cd

touch = echo or, in PowerShell, echo $null >> filename

open = just type the file name without any command and it'll open up, or, in PowerShell, substitute ii for open

cat = type


References

You might find it helpful to explore these links if you have questions as you go through the material in this module.

Learning new terminology

Throughout this tutorial, we will primarily be using Git to work with "Git repositories." A Git repository is a space to store files for a project. You may find that learning Git involves becoming familiar with new terminology, such as "repository". That's okay and is part of the learning process. Each of the videos below is followed by notes that are intended to help spell out some of the terminology and ideas that are introduced in the video.

A change in terminology

GitHub made an important change to its terminology just after these videos were recorded. The primary version of a repository is now called "main" instead of "master", as part of an effort to avoid language related to slavery. You can find discussion in various articles, like this one. You will still see "master" in this module's videos (and one screenshot from a video) because we have not yet rerecorded. However, we have substituted "main" for "master" in the text on this page. When you are experimenting with the commands you learn from the videos, be sure to use "main" instead of "master", because GitHub will not recognize "master" and will think that you are trying to create and name something new.

Question 1: What is Git, and why is it useful?

Git is a distributed version control system that tracks changes to files over time. We can use Git to go back to a specific version of the tracked files and, in a collaborative setting, to coordinate parallel work among team members. Git is particularly useful for managing large-scale projects.

Part 2: Getting started with GitHub





git config --global user.name "Your Name"

git config --global user.email "Your Email"



Git.1.GettingStarted.mp4

First, create a new repository on GitHub and copy its URL. Then, on your own computer, run the following from the command line:

git init <repository name>

cd <repository name>

echo "# repository name" >> README.md

git add README.md

git commit -m "first commit"

git remote add origin <url>

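git branch -M main (renames the current branch to "main", in case your version of Git still names the first branch "master")

git push -u origin main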
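If you would like to rehearse this sequence without touching GitHub, the sketch below may help. It assumes a local "bare" repository as a stand-in for the GitHub repository; the folder names (remote.git, DemoRepo) and the identity values are placeholders chosen for illustration, not part of the videos.

```shell
set -e
work="$(mktemp -d)"                       # scratch folder, separate from your real projects
git init -q --bare "$work/remote.git"     # assumption: local stand-in for the GitHub repository

git init -q "$work/DemoRepo"              # create the local repository
cd "$work/DemoRepo"
git config user.name "Your Name"          # placeholder identity, stored for this repository only
git config user.email "you@example.com"

echo "# DemoRepo" >> README.md            # create a README with a title line
git add README.md                         # stage the file
git commit -q -m "first commit"           # record the first snapshot
git branch -M main                        # make sure the branch is named "main"
git remote add origin "$work/remote.git"  # on GitHub, this would be the repository URL
git push -q -u origin main                # upload and remember origin/main as the upstream

git remote -v                             # lists "origin" for fetch and for push
git log --oneline                         # shows the single "first commit"
```

With a real GitHub repository, only the argument to git remote add changes: it becomes the URL that you copied from GitHub.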


https://github.com/WellesleyQAI/DemoProject

See Part 2 video notes

cd Desktop

git clone <url>

cd <repository name> 

git remote -v (lists the remote repositories that your local copy is connected to; by default, the repository you cloned from is named "origin")

cd Desktop

git init <repository name>

OR turn an existing directory into a Git repository:

cd <directory name> 

git init 

Question 2: What are the two common workflows mentioned in the video for setting up the remote and local versions of a repository?

You can either clone a repository from GitHub or take a local directory that is currently not under version control, turn it into a Git repository, and upload it to GitHub.

Part 3: Tracking edits with Git


Git.2.Tracking Edits.mp4

Note: It’s generally a good idea to commit incremental changes (although perhaps not too frequently).

Part 3 video notes - make sure to read, as the key steps are reviewed here ("add" and "commit")

cd DemoRepo

touch demo.R 

open demo.R

git add <filename> (specific file)

OR 

git add . (everything that’s changed in the current directory)
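The two key steps, "add" and "commit", can be tried out in a throwaway repository. This is a minimal sketch: the scratch folder, the identity values, and the commit message are placeholders, and git status --short is used only to show how the state of demo.R changes at each step.

```shell
set -e
work="$(mktemp -d)"                   # scratch folder for this demo
git init -q "$work/DemoRepo"
cd "$work/DemoRepo"
git config user.name "Your Name"      # placeholder identity for this repository only
git config user.email "you@example.com"

touch demo.R                          # create an empty file
git status --short                    # prints "?? demo.R": the file is untracked

git add demo.R                        # step 1: add the file to the staging area
git status --short                    # prints "A  demo.R": the file is staged

git commit -q -m "add demo.R"         # step 2: record a snapshot of what was staged
git status --short                    # prints nothing: there is nothing left to commit
git log --oneline                     # one commit, with the message "add demo.R"
```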

Question 3: How do we use Git to track edits for version control?

Git uses a so-called two-step commit process: you would first use the git add command to add a file that you want to track to the staging area, and then from the staging area, use the git commit command to record a snapshot of the repository. 

Part 4: Communicating with the remote repository


Git.3.Communicating with the remote repository.mp4

See Part 4 video notes

Question 4: What are the two Git network commands that are mentioned in the video and what are they used for?

We use the git push command to upload local repository content to the remote repository and the git pull command to download content from the remote repository and update the local repository to match the remote copy. 
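To see push and pull working together, here is a small sketch with two collaborators sharing one remote. It assumes a local bare repository in place of GitHub, and the collaborator names (Alice, Bob), folder names, and file contents are made up for illustration.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"        # assumption: local stand-in for GitHub

# Alice creates the project and pushes it to the remote.
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice"
git config user.email "alice@example.com"
echo "x <- 1" > demo.R
git add demo.R
git commit -q -m "add demo.R"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# Bob clones the project, makes an edit, and pushes it.
git clone -q -b main "$work/remote.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob"
git config user.email "bob@example.com"
echo "y <- 2" >> demo.R
git commit -q -am "add y"
git push -q                                  # upload Bob's commit to the remote

# Alice pulls to bring her local copy up to date.
cd "$work/alice"
git pull -q                                  # download and merge Bob's commit
cat demo.R                                   # now contains both lines
```

Because Alice made no changes of her own in the meantime, her pull simply moves her copy forward to match the remote.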

Part 5: Collaboration -- Managing merge conflicts


Git.4.Merge conflicts.mp4

git pull (creates a merge conflict)

git checkout --theirs <filename> (keeps the remote version of the file)

git commit -am "resolve merge conflict"

git push

OR

git fetch --all (similar to git pull but without the merging)

git reset --hard origin/main (using hard reset to discard local changes to match origin/main)

git pull origin main


git pull (creates a merge conflict)

git checkout --ours <filename> (keeps your local version of the file)

git commit -am "resolve merge conflict"

git push

OR

git push -f (force push local to overwrite the remote)

See Part 5 video notes

git commit -am "resolve merge conflict"

git push

Question 5: Why did a merge conflict occur in this scenario?

In the video, edits to the same line of the same file were made both in the local and the remote copy. To synchronize the local file with the remote file, we need to decide which edits to keep and resolve the merge conflict. 
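The scenario can be reproduced on your own machine. The sketch below assumes a local bare repository in place of GitHub and invented folder names and file contents. It edits the same line in two copies, triggers the conflict, and then keeps the remote version with --theirs. One more assumption: git pull is given the --no-rebase option, because recent versions of Git may refuse to pull diverged branches until they are told how to combine them.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"        # assumption: local stand-in for GitHub

# Set up the local copy and push the first version.
git init -q "$work/local"
cd "$work/local"
git config user.name "Local User"
git config user.email "local@example.com"
echo "x <- 1" > demo.R
git add demo.R
git commit -q -m "add demo.R"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# A collaborator edits line 1 and pushes.
git clone -q -b main "$work/remote.git" "$work/collab"
cd "$work/collab"
git config user.name "Collaborator"
git config user.email "collab@example.com"
echo "x <- 2" > demo.R
git commit -q -am "set x to 2"
git push -q

# Meanwhile, the same line is edited and committed locally.
cd "$work/local"
echo "x <- 3" > demo.R
git commit -q -am "set x to 3"

# Pulling now reports a conflict and exits with an error code, hence "|| true".
git pull --no-rebase || true
git status --short                           # prints "UU demo.R": both sides modified it

git checkout --theirs demo.R                 # keep the remote version of the file
git commit -q -am "resolve merge conflict"   # conclude the merge
git push -q
cat demo.R                                   # prints "x <- 2"
```

Replacing --theirs with --ours in this sketch would keep the local line, "x <- 3", instead.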

Part 6: Collaboration -- Branching

Branching lets you experiment with changes to your files without affecting the current version that your collaborators can access.

Git.5.Branching.mp4

See Part 6 video notes

Question 6: When would we want to use the branching features in Git?

If we want to experiment with a new feature without potentially messing up the current working version of our project, we can create a new branch, which allows us to work on a separate copy of our code. Once the new feature is implemented, we can merge it back into the main branch.
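As a rough illustration of that workflow, the sketch below creates a branch, commits an experiment on it, and merges it back. The branch name "experiment", the file contents, and the exact commands are our own choices for illustration and may differ from those used in the video.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/DemoRepo"
cd "$work/DemoRepo"
git config user.name "Your Name"      # placeholder identity for this repository only
git config user.email "you@example.com"

echo "x <- 1" > demo.R
git add demo.R
git commit -q -m "add demo.R"
git branch -M main

git checkout -q -b experiment         # create a new branch and switch to it
echo "y <- 2" >> demo.R
git commit -q -am "try adding y"      # this commit exists only on "experiment"

git checkout -q main                  # switch back to the main branch
cat demo.R                            # prints only "x <- 1": main is unaffected

git merge -q experiment               # bring the experiment into main
git branch -d experiment              # delete the branch once it has been merged
cat demo.R                            # now prints both lines
```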

Part 7: More on version control (these videos are optional)

Git.6.More on version control (Part 1_3).mp4

Question 7: If you go back to an earlier version, can you return to the current version again?

Yes, using git checkout main.

Git.7.More on version control (Part 2_3).mp4

Question 8: Can you go back to an earlier version if you have saved but not committed your most recent work?

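Yes: git stash temporarily sets aside your uncommitted work so that you can check out an earlier version, and git checkout main brings you back to the most recent committed version. Running git stash pop then restores the work that you set aside.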
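Here is one way this could look in practice. The file name, its contents, and the use of HEAD~1 to mean "the commit before the latest one" are assumptions made for this sketch.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/DemoRepo"
cd "$work/DemoRepo"
git config user.name "Your Name"      # placeholder identity for this repository only
git config user.email "you@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "version 1"
git branch -M main
echo "v2" > notes.txt
git commit -q -am "version 2"

echo "unfinished idea" >> notes.txt   # saved to disk, but not committed
git stash -q                          # set the uncommitted edit aside
git checkout -q HEAD~1                # visit the previous commit
cat notes.txt                         # prints "v1"

git checkout -q main                  # return to the latest commit
git stash pop -q                      # bring the uncommitted edit back
cat notes.txt                         # prints "v2" and "unfinished idea"
```

Without the git stash step, Git would refuse the checkout here, because switching versions would overwrite the uncommitted edit to notes.txt.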

Git.8.More on version control (Part 3_3).mp4

c527def (HEAD -> main) Revert "comment out swiss data"


See Part 7 video notes

git diff <branch-name> <branch-name>

OR

git diff <commit-hash> <commit-hash>

git reset --hard <commit-hash>

Question 9: What are some of the Git commands that you can use to restore or roll back to a previous state of the repository?

1) git checkout <commit-hash> allows you to temporarily switch to and inspect an older version of your project without modifying the commit history.

2) git revert <commit-hash> creates a new commit that undoes the effect of a previous commit. We recommend git revert when the commit that you wish to undo is public: your collaborators can clearly see the changes introduced, and all old commits that they may depend on are preserved.

3) git reset --hard <commit-hash> moves the current branch back to the given commit, removing all later commits from the branch's history and discarding any uncommitted changes. Proceed with caution, since this command rewrites commit history.
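The three commands can be compared side by side in a throwaway repository. In this sketch the file name, its contents, and the variable name "first" are placeholders, and git rev-list --max-parents=0 HEAD is used only as a convenient way to look up the hash of the very first commit.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/DemoRepo"
cd "$work/DemoRepo"
git config user.name "Your Name"      # placeholder identity for this repository only
git config user.email "you@example.com"

# Build a short history of three versions.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "version 1"
git branch -M main
echo "v2" > notes.txt
git commit -q -am "version 2"
echo "v3" > notes.txt
git commit -q -am "version 3"

first="$(git rev-list --max-parents=0 HEAD)"   # hash of the first commit

# 1) checkout: look at an old version, then come back. History is unchanged.
git checkout -q "$first"
cat notes.txt                         # prints "v1"
git checkout -q main
cat notes.txt                         # prints "v3"

# 2) revert: add a new commit that undoes the latest one.
git revert --no-edit HEAD
cat notes.txt                         # prints "v2"
git log --oneline                     # four commits: nothing was erased

# 3) reset --hard: move main back to the first commit, dropping everything after it.
git reset -q --hard "$first"
cat notes.txt                         # prints "v1"
git log --oneline                     # only "version 1" remains
```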

Extensions:

GitHub Desktop (offers the option to skip the command line and interact with Git/GitHub through a user interface): https://desktop.github.com/

Build your own website with GitHub Pages: https://pages.github.com/

Thanks for working through this tutorial!