Git and GitHub Basics - ECE-180D-WS-2024/Wiki-Knowledge-Base GitHub Wiki

Git and GitHub Basics

Introduction

Have you ever wished that you could refer back to every past version of a file you are working on? Or have you wished that you and your collaborators had an easy way to merge your respective edits into one file? If so, what you are wishing for is a version control system. A version control system tracks and manages changes to files over time, and it is commonly used by software developers to refer back to past versions of the same files and to collaborate on projects. In this article, we will introduce the concept of version control, the version control system Git, and a major hosting platform of Git repositories, GitHub. This article will also explain some useful commands to help you get started with using Git and GitHub.

Idea Behind Git: Version Control

Version control is the practice of logging and managing changes made to a file or a set of files, either by one person or by multiple people. In software development, it is often useful to look at previous versions, keep track of the reasons for changes, roll back to a previous version if needed, and much more [1]. First, we will focus on local version control: managing changes on a single computer.
Used locally, Git has a rather simple model: it maintains a database of every version your files have had [2]. Each version in this database is called a "commit", which contains a snapshot of the tracked files at the time the commit was made. Looking back at a commit lets the user see the entire state of the files at that time, as opposed to only seeing what changed between the previous commit and this one.
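As a rough illustration of the snapshot idea, the sketch below makes two commits and then reads the file back as it existed in each one. It relies on git add and git commit, which are introduced in Scenario 1 below. The file name notes.txt, the commit messages, and the placeholder identity are assumptions made up for this example.

```shell
set -e
cd "$(mktemp -d)"                 # work in a throwaway directory
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"

echo "second draft" > notes.txt
git add notes.txt
git commit -q -m "Revise notes"

# HEAD names the latest commit; HEAD~1 names the commit before it.
old_version="$(git show HEAD~1:notes.txt)"
new_version="$(git show HEAD:notes.txt)"
echo "previous commit: $old_version"
echo "latest commit:   $new_version"
```

Each git show call returns the complete file as stored in that commit, not a list of differences.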

Figure 1. Local version control

Creating a commit in Git takes several steps. When a file is newly created or modified, the change lives in the working directory, also called the working tree. Git does not record that change until the user explicitly tells it to. To do this, we add the file to the staging area, which contains only the changes that we want to have in our next commit. The changes in the staging area can then be committed and permanently stored as a snapshot in the Git repository. That snapshot becomes another entry in the version history that we can visit later.

Figure 2. The three areas that a change can live in

An advantage of this design is the flexibility of controlling what is in each commit. Since a commit only includes what is added to the staging area, the user can group their changes into many different commits, allowing more freedom to organize commit history in a logical way [3].
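The grouping described above might look like the following sketch, in which two unrelated edits made at the same time end up in two separate commits. The file names README.txt and todo.txt, the commit messages, and the placeholder identity are hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# Start from one commit that contains two files.
echo "My project" > README.txt
echo "buy milk" > todo.txt
git add README.txt todo.txt
git commit -q -m "Initial commit"

# Edit both files before committing anything.
echo "Run it with ./start" >> README.txt
echo "call the bank" >> todo.txt

git add README.txt                 # stage only the README change
git commit -q -m "Document how to run the project"

git add todo.txt                   # now stage the other change
git commit -q -m "Add a reminder to the to-do list"

# Show which file each of the last two commits touched.
git log --oneline --name-only -2
```

Because only staged changes go into a commit, each commit here touches exactly one file even though both files were edited together.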

Scenario 1: Using Git Locally

Suppose that after hearing about the amazing capabilities of Git, you decide to create a Git repository on your computer to track the changes you make to a project. Here are a few steps that would help you get started.
First, make sure you have Git installed on your computer. The installation process varies depending on your computer's operating system, so we recommend following this installation guide for more details. Additionally, it is a good idea to set up your environment after installation following this guide.
With our system set up, we can get started with creating our first Git repository from scratch. To do so, cd into the directory you would like your Git repository to be in and enter

git init

The git init command creates a Git repository in your current directory. You may notice that the directory now contains a .git directory. That is the directory that keeps all the files necessary for Git to work and is a sign that your repository has been successfully created.
With this repository created, you can now put files in it and make changes to them. When you want to add files to the staging area, use the following command, replacing [file] with the names of the files you want to stage.

git add [file]

If you would like to stage all the changes in the current directory, you can also run git add . to avoid typing out every filename. Running git add does not record the changes in the repository yet. To commit the staged changes, we need to run

git commit

Running git commit on its own opens a text editor where you can write a commit message. You can also use git commit -m "[message]" to supply the message directly on the command line. Commit messages are usually short messages that explain why you made the changes in this commit, so you can refer back to them later.
Congratulations, you have now made your first commit in this repository! If you committed all your changes, your working tree is now clean. You can repeat the git add and git commit steps to commit any subsequent changes you make.
If you want to learn the state of your working tree at any time, you can run

git status

to see what files are modified and what files are staged [4].
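Put together, the steps in this scenario could run roughly as follows. This is a sketch under assumptions: it works in a temporary directory, the file hello.txt and the identity are placeholders, and it uses git status --short, a compact form of the status output.

```shell
set -e
project="$(mktemp -d)"             # stand-in for your project directory
cd "$project"

git init -q                        # creates the .git directory
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "Hello, Git" > hello.txt
git status --short                 # "?? hello.txt": untracked, not yet staged

git add hello.txt
git status --short                 # "A  hello.txt": staged for the next commit

git commit -q -m "Add hello.txt"
git status --short                 # prints nothing: the working tree is clean
```

The three status calls show the same file moving from the working tree to the staging area and finally into the repository.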

GitHub and Git Repository Collaboration

So far, we have covered how to use Git to track changes made on one computer. Git is much more powerful than just local version control. When multiple collaborators are involved in the same project, they can take advantage of Git's distributed version control system. In the distributed version control model, multiple machines have the full repository and its version history locally. This allows many people to work on the same repository on their own machines while having access to the changes everyone made. Under this model, copies of the project still exist if any one machine fails and loses its local copy.

Figure 3. Distributed version control

GitHub is a popular platform that hosts Git repositories. It is a web-based platform where developers can upload their repositories and also view or contribute to other people's repositories.

Scenario 2: Collaborating Using GitHub

Suppose your friend likes the project you were tracking using Git in Scenario 1, and they would like to work on it with you and host the project on GitHub. After you both create GitHub accounts and you create an empty repository on GitHub, you first need to push your repository from Scenario 1 to it. To do this, specify the GitHub address of your remote repository:

git remote add origin https://github.com/YOUR-USERNAME/YOUR-REPOSITORY-NAME.git

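This command registers https://github.com/YOUR-USERNAME/YOUR-REPOSITORY-NAME.git as a remote named origin for your local repository. A remote is a copy of the repository stored at another address, where someone using a different machine can access it. Nothing is uploaded at this point. To push your existing commits to this address, use the command below. Here main is the name of your local branch (run git branch to check; some Git installations name it master instead), and the -u option remembers origin as the default destination, so later you can run plain git push and git pull.

git push -u origin main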

If you or your friend check the website https://github.com/YOUR-USERNAME/YOUR-REPOSITORY-NAME now, you will both be able to see your past commits hosted on this site.
Now, your friend can obtain a clone of your project by running

git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY-NAME.git

Thanks to Git's distributed version control system, your friend now has an identical copy of your repository, including all of its history and past versions. They can now make their own changes and commits on their local version of the repository.
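Once your friend is ready to share their edits with you (and you have added them as a collaborator on the GitHub repository so that they are allowed to push), they can make their changes visible on GitHub by running git push. For you to see your friend's changes on your local machine, run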

git pull

This command updates your local repository with the new changes that are on the remote repository.
It is usually good practice to run git pull before making changes to your local repository. This ensures that you are editing the latest version of your project and reduces conflicts when merging your edits with your friend's.
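The whole round trip can be rehearsed offline. The sketch below is not the GitHub setup itself: it assumes a local "bare" repository (hosted.git) standing in for the GitHub address, and the directory names yours and friend, the file project.txt, and both identities are made up for the example. With a real GitHub repository, the https address would take the place of the local path.

```shell
set -e
base="$(mktemp -d)"
cd "$base"

# Stand-in for the GitHub repository: an empty bare repository on disk.
git init -q --bare hosted.git
# Make main its default branch (GitHub handles this for you).
git -C "$base/hosted.git" symbolic-ref HEAD refs/heads/main

# Your repository, as in Scenario 1.
git init -q yours
cd yours
git config user.name "You"
git config user.email "you@example.com"
echo "v1" > project.txt
git add project.txt
git commit -q -m "Start the project"
git branch -M main                 # make sure the branch is called main

git remote add origin "$base/hosted.git"
git push -q -u origin main         # first push; -u remembers origin/main

# Your friend clones the hosted repository and pushes a change.
cd "$base"
git clone -q "$base/hosted.git" friend
cd friend
git config user.name "Friend"
git config user.email "friend@example.com"
echo "v2 from friend" >> project.txt
git add project.txt
git commit -q -m "Extend the project"
git push -q

# Back on your machine: fetch and merge your friend's commit.
cd "$base/yours"
git pull -q
cat project.txt
```

After the final git pull, both working copies and the hosted repository contain the same two commits.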

Branching

A branch in Git is an independent line of development within a repository; technically, it is a lightweight, movable pointer to a commit. Branches are often used in projects with multiple contributors, allowing them to work on different parts simultaneously. This is useful for developing new features, fixing bugs, or conducting experiments without affecting the main codebase. Each branch can be developed, tested, and reviewed independently before being merged back into the main branch, ensuring that the work on one branch does not interfere with others.

To list the branches in your repository:

git branch

To create a new branch, use the git branch command followed by the branch name:

git branch yourBranchName

To switch to a branch that you have already created, use the git checkout command followed by the branch name:

git checkout yourBranchName

If you are familiar with Git commands, you can also use a shortcut that creates a branch and switches to it in one step: git checkout -b yourBranchName
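The branch commands above can be tried in a scratch repository, as in this sketch. The branch names feature-a and feature-b, the file name, and the identity are placeholders, and the first commit exists only because a branch needs a commit to point to.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch -M main                 # make sure the first branch is called main

git branch feature-a               # create a branch, but stay on main
git checkout -q feature-a          # switch to it
git checkout -q -b feature-b       # create another branch and switch in one step

git branch                         # lists all three; * marks the current branch
current="$(git rev-parse --abbrev-ref HEAD)"
echo "now on: $current"
```

At this point all three branches point to the same commit; they only start to differ once new commits are made on one of them.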

Merging

Merging in Git integrates changes from one branch into another. This is essential for collaborative work: each developer can create their own branch, work on their tasks independently, and then merge their changes back into the main branch that the whole group shares. This facilitates parallel development and ensures that everyone's contributions are integrated into the project. There are two main types of merges: fast-forward and three-way merges.

A fast-forward merge occurs when the branches have not diverged: the target branch has no new commits since the other branch was created, so Git simply moves the target branch pointer forward. This often happens when one person works on the whole project, or when nobody else has merged changes into the main branch in the meantime. To perform a fast-forward merge, use the following commands.

git checkout main
git merge yourBranchName
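A fast-forward can be observed in a scratch repository with the sketch below. The branch name feature, the file notes.txt, and the identity are assumptions. The --ff-only option is added here only to make Git refuse the merge if a fast-forward were not possible; a plain git merge would normally do the same thing in this situation.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch -M main

git checkout -q -b feature
echo "line 2" >> notes.txt
git add notes.txt
git commit -q -m "Add line 2"

# main has not moved since feature was created, so Git can fast-forward.
git checkout -q main
git merge -q --ff-only feature

git log --oneline --graph          # a straight line: no merge commit was created
```

After the merge, main and feature point to the same commit.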

A three-way merge happens when the branches have diverged, meaning that the target branch has received other commits besides the changes you want to merge. This situation is common when multiple people work on the same project. Git combines the changes from both branches and records the result in a new merge commit. If both branches changed the same part of a file, a conflict occurs, and it must be resolved manually.

Here's the workflow for a three-way merge:

git checkout main
git merge yourBranchName

If Git reports conflicts, you need to manually edit the affected files to resolve them. Git marks each conflicting region in the file, and you choose which version to keep (or combine the two). After you have chosen the version you want for the main branch, stage and commit the changes:

git add <resolved-file>    
git commit
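To see a conflict and its resolution end to end, the following sketch changes the same line on two branches and then merges them. The file settings.txt, its contents, the branch name feature, and the identity are hypothetical. The conflict is resolved here by simply overwriting the file with the chosen version; in practice you would edit the conflict markers by hand.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "color = blue" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"
git branch -M main

git checkout -q -b feature
echo "color = green" > settings.txt
git add settings.txt
git commit -q -m "Use green"

git checkout -q main
echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Use red"

# Both branches changed the same line, so Git stops and reports a conflict.
if git merge feature; then
    echo "merged without conflicts"
else
    git status --short             # "UU settings.txt" marks the conflicted file
    # Resolve by writing the version to keep (here: the feature branch's line).
    echo "color = green" > settings.txt
    git add settings.txt
    git commit -q -m "Merge feature into main, keeping green"
fi

git log --oneline --graph          # the merge commit joins the two lines of history
```

The final commit has two parents, one from each branch, which is what distinguishes a three-way merge from a fast-forward.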

Commit History and Comparing Commits

The git log command shows the commit history of a Git repository. It lists commits from newest to oldest, including each commit's hash, author, date, and message. This command is essential for tracking changes: it helps developers understand how the project evolved and debug issues.
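A small sketch of git log in use, assuming a scratch repository with three commits (the file name, messages, and identity are placeholders). The --oneline option is a standard shorthand that prints an abbreviated hash and the message on a single line per commit.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# Create three commits so there is some history to look at.
for step in one two three; do
    echo "$step" >> history.txt
    git add history.txt
    git commit -q -m "Add step $step"
done

git log                            # full entries: hash, author, date, message
git log --oneline                  # one line per commit, newest first
latest="$(git log -1 --format=%s)" # message of the most recent commit only
echo "latest commit: $latest"
```

The hashes will differ on every run because they depend on the commit time and author.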

The git diff command compares different states of a Git repository. With no arguments, it shows changes in the working tree that are not yet staged; git diff --staged shows changes that are staged for the next commit; and passing two commits or branches shows how they differ. The output highlights what has been added, modified, or deleted.
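The different comparisons can be tried out as in the sketch below, which follows one edit from the working tree through the staging area into a commit. The file list.txt, its contents, and the identity are assumptions made for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "apples" > list.txt
git add list.txt
git commit -q -m "Start the list"

echo "bananas" >> list.txt
git diff                           # unstaged change: shows the line "+bananas"

git add list.txt
git diff                           # prints nothing: no unstaged changes remain
git diff --staged                  # the same change, now shown as staged

git commit -q -m "Add bananas"
git diff HEAD~1 HEAD               # differences between the last two commits
```

Lines beginning with + were added and lines beginning with - were removed relative to the older state.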

Conclusion

In this article, we covered the concepts of local and distributed version control, basics of Git and GitHub, and some simple commands to use them. We learned that Git is a distributed version control system that allows developers to track the history of a project made by multiple collaborators. In addition to existing in individual computers, Git repositories can be hosted on platforms like GitHub to allow remote storage and collaboration. We also introduced some basic commands that allow you and your collaborators to start using Git.
Git has many more features, and this article only serves as the beginning of your Git-learning process. To make full use of Git, there remain numerous commands and design ideas to discover. If you are interested in learning more, we recommend checking out the additional resources below.

Additional Resources

  • Pro Git Book: This book details almost everything you can learn about Git, from the commands for using it efficiently to the underlying internals that make Git work.
  • Git Documentation: This is the documentation of Git on the command line. Almost every Git command that exists and the ways to use them can be found on this page.

References

[1] “Version Control (Git),” The Missing Semester of Your CS Education, https://missing.csail.mit.edu/2020/version-control/ (accessed Feb. 9, 2024).
[2] “1.1 Getting Started - About Version Control,” Git, https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control (accessed Mar. 21, 2024).
[3] “About,” Git, https://git-scm.com/about/staging-area (accessed Mar. 21, 2024).
[4] “About git,” GitHub Docs, https://docs.github.com/en/get-started/using-git/about-git (accessed Feb. 9, 2024).