git and GitHub - CMCC-Foundation/CMCC-CM GitHub Wiki

Getting started with git:

  • For a quick, visual introduction to git, try a 5 Minute Overview of Git on YouTube.
  • Need an introduction to version control along with git and GitHub? Try Git and GitHub for Poets, a YouTube series.
  • For a glossary of common terms used in git and GitHub documentation, see the GitHub glossary.
  • GitHub's Git Handbook offers a brief overview of git and GitHub, including example workflows and links to more in-depth resources.
  • Software Carpentry offers a hands-on introduction to git.
  • A highly-regarded, comprehensive git reference is the book Pro Git (available as a book or a free pdf).
    • The first three chapters of Pro Git, along with sections 6.1 and 6.2 on GitHub, are particularly useful if you're getting started with git. [1]
    • The site also contains links to git reference pages along with introductory videos on git.
    • You can skip the last section of chapter 3, on rebasing, until you have more experience with git. Rebasing rewrites history, which can make it impossible to reproduce exactly what you have done before, so it is easy to cause problems for yourself or others. Only rebase in your model repository when following the workflow instructions below.
  • For a clear introduction to git internals, try Dissecting Git Guts by Emily Xie.
  • Git from the inside out is a blog post with an accompanying video from Mary Rose Cook that looks under the covers a bit to show what happens when you execute git commands.
  • Git from the Bottom Up by John Wiegley provides a quick glossary of the main terms describing a git repository, with links to detailed descriptions. It is a good site for reminders about git concepts.
  • Need to really understand the science behind how git works? Try Advanced Git: Graphs, Hashes, and Compression, Oh My! by Matthew McCullough

Git also offers extensive built-in (i.e., command line) help, although it can be hard to understand until you are familiar with basic git concepts:

git help

git help COMMAND

man gittutorial

man giteveryday

How to set up your git development environment

There are several stages in setting up your git and GitHub development environment. Below, they are broken into three sections which represent the different stages:

One time GitHub setup

  1. Set up a personal GitHub account (if you do not already have one):

    https://github.com/

  2. Set up SSH keys on GitHub (optional):

    If you will be pushing changes to GitHub (either to your own fork or to shared forks), or if you will be pulling changes from private repositories on GitHub, then it's worth the time to set up ssh keys for each machine you'll be using. By doing so, you won't have to enter your password when pushing to or pulling from GitHub.

    See https://help.github.com/articles/connecting-to-github-with-ssh for instructions. After doing this, you can use the ssh form of GitHub URLs (e.g., git@github.com:CMCC-Foundation/cmcc-cm.git) in place of the https form.

  3. Configure GitHub notifications (optional): It is important to keep track of activity on GitHub but there are ways to manage how you are notified.

    • Controlling how you receive notifications: Click on your profile picture in the upper-right of any GitHub page, then click on "Settings", and then on "Notifications". You will see several options; here are a few recommended settings:
      • Automatically watch repositories: Yes (check box)
      • Participating: Email
      • Watching: Web
    • To see and manage Web notifications: Click on the bell in the upper right of any GitHub page to see Web notifications and to manage which repositories you are watching. There will be a blue dot on the bell if there are new (unread) notifications.
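
For step 2 above, an existing clone that was created with an https URL can be switched to the ssh form without re-cloning. The following is a minimal sketch that uses a scratch repository so nothing is contacted over the network; the repository URL is the one from the example above, and the final ssh check is left as a comment because it needs your keys to be uploaded to GitHub first.

```shell
# Scratch repository standing in for a real clone (no network access needed)
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git remote add origin https://github.com/CMCC-Foundation/cmcc-cm.git

# Point the existing remote at the ssh form of the same repository
git remote set-url origin git@github.com:CMCC-Foundation/cmcc-cm.git
git remote -v

# Once your public key is registered on GitHub, this should greet you by user name:
#   ssh -T git@github.com
```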

Set up git environment on new machine

git has global settings which apply to all repository clones on your machine. These settings reside in a file called .gitconfig in your home directory. Below are some required and some optional but recommended global git settings to apply to any new machine where you will do model development (e.g., Zeus, personal laptop). Apply the settings below or simply copy a .gitconfig file from a machine that is already configured.

Required git global configuration settings

git config --global user.name "Your Name"
git config --global user.email <GitHub email address>

We recommend that you use your UCAR email address as your GitHub email address but if you use another address, you can add your UCAR email address by clicking on your profile picture in the upper-right of any GitHub page, then clicking on "Settings", and then on "Emails".
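
These two settings matter because git stamps them into every commit you make. A small sketch of how to confirm what a commit will record is shown below; it sets the values locally (without --global) in a scratch repository so your real ~/.gitconfig is not touched, and the name and address are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q

# Repository-local settings (no --global) keep this experiment out of ~/.gitconfig
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Show the author that was recorded in the commit
author="$(git log -1 --format='%an <%ae>')"
echo "$author"
```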

Recommended git global configuration settings

You can set which editor to use for log messages, etc., with:

git config --global core.editor <editor of your choice: emacs, vi, vim, etc>

(See http://swcarpentry.github.io/git-novice/02-setup for specific settings to use for many common editor choices.)

The following setting generates better patches than the default:

git config --global diff.algorithm histogram

The following setting makes it easier to resolve conflicts if you're doing conflict resolution by hand as opposed to with a dedicated conflict resolution tool. Without this setting, you'll just see your version and the version you're merging in delimited by conflict markers. With this setting, you'll also see the common ancestor of the two sides of the merge, which can make it much easier to figure out how to resolve the conflict:

git config --global merge.conflictstyle diff3

Alternatively, look into using a graphical conflict-resolution tool such as kdiff3 or the Emacs built-in M-x vc-resolve-conflicts.
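
To see what the diff3 conflict style adds, you can provoke a small conflict in a scratch repository. This is only an illustrative sketch: the file name and contents are made up, and the setting is applied locally rather than globally so it does not change your own configuration.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config merge.conflictstyle diff3   # local to this scratch repository

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Common ancestor"
git branch -M main

git checkout -q -b feature
echo "color = green" > settings.txt
git commit -q -am "Feature: green"

git checkout -q main
echo "color = blue" > settings.txt
git commit -q -am "Main: blue"

# The merge stops with a conflict, so a non-zero exit status is expected here
git merge feature >/dev/null 2>&1 || true
cat settings.txt
```

In the printed file, the lines between the `|||||||` marker and the `=======` marker are the common ancestor ("color = red"), which is the part you would not see with the default conflict style.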

We recommend that you set git to not push anything by default:

git config --global push.default nothing

This can help prevent you from accidentally pushing work to the wrong repository.
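
The effect of this setting can be tried out safely with a local bare repository standing in for GitHub. In this sketch (paths and branch name are illustrative, and the setting is applied locally instead of globally), a bare `git push` is refused even though the branch has an upstream, while a push that names the remote and branch still works.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for a GitHub repository
git init -q "$work/clone"
cd "$work/clone" || exit 1
git config user.name "Example User"
git config user.email "user@example.com"
git config push.default nothing           # local here; the wiki sets it globally
git remote add origin "$work/remote.git"

echo "hello" > README
git add README
git commit -q -m "Initial commit"
git branch -M main

# An explicit push works and also records origin/main as the upstream
git push -q -u origin main >/dev/null 2>&1

echo "more" >> README
git commit -q -am "Second commit"

# A bare "git push" is refused because no refspec was given
if git push >/dev/null 2>&1; then bare_push=accepted; else bare_push=refused; fi
echo "bare git push: $bare_push"

# Naming the remote and the branch still works
git push -q origin main
```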

Configuring git on shared machines

If you use git on shared resources, such as the login nodes of CISL machines, system administrators may kill your git commands because git spawns too many threads and blocks (or at least slows down) other users. To avoid this situation, you can limit the number of threads git spawns for various activities by setting the following git config variables:

git config --global --add index.threads 8
git config --global --add grep.threads 8
git config --global --add pack.threads 8

Please note that a limit of 8 threads was chosen specifically for CISL machines. If you are using a separate shared system you may find it beneficial to choose a different thread limit.

git tools for Bash

There are two helpful things you can do to streamline your use of git if you use the bash shell: enable tab completion for git commands and show the current branch in your shell prompt. Both are completely optional but improve your git experience, and both are documented in the "Git in Bash" appendix of the excellent Pro Git book.
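
As a rough sketch of what that appendix describes, the fragment below could be added to ~/.bashrc. It assumes you have copied git-completion.bash and git-prompt.sh (both ship in the contrib/completion directory of the git source) to the hypothetical locations shown; adjust the paths to wherever the files live on your machine.

```shell
# Hypothetical locations; change these to where you saved the two scripts
GIT_COMPLETION="$HOME/.git-completion.bash"
GIT_PROMPT="$HOME/.git-prompt.sh"

# Tab completion for git commands, branch names, and so on
if [ -f "$GIT_COMPLETION" ]; then
  . "$GIT_COMPLETION"
fi

# Show the current branch (and a marker for uncommitted changes) in the prompt
if [ -f "$GIT_PROMPT" ]; then
  . "$GIT_PROMPT"
  export GIT_PS1_SHOWDIRTYSTATE=1
  PS1='\w$(__git_ps1 " (%s)")\$ '
fi
```

The file checks make the fragment harmless on machines where the scripts have not been installed yet.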

Working with clones

Here are some commands for creating and working with clones:

Create a new clone

git clone https://github.com/<GitHub userid>/<repo>
cd <repo>

or

git clone git@github.com:<GitHub userid>/<repo>
cd <repo>

where <GitHub userid> is your GitHub account login ID. Some useful options to the clone command are:

  • --origin <origin name>: A clone knows where it came from and, by default, calls that location "origin". You can change that name with the --origin option. This can come in handy when dealing with multiple upstream repositories.
  • Change the clone directory name: By default, the clone is created in a directory with the same name as the repository. You can change this by adding a directory name to the clone command. Use the appropriate example below:
      git clone https://github.com/<GitHub userid>/<repo>  <clone_dir_name>
      cd <clone_dir_name>

or

      git clone git@github.com:<GitHub userid>/<repo> <clone_dir_name>
      cd <clone_dir_name>
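
The two options can be combined. The sketch below tries them on a local stand-in repository so it runs without network access; the names source_repo, my_fork, and sandbox are made up for illustration, and with a real GitHub URL the clone command takes the same form.

```shell
work="$(mktemp -d)"
cd "$work" || exit 1

# Local stand-in for https://github.com/<GitHub userid>/<repo>
git init -q source_repo
git -C source_repo -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Name the remote "my_fork" instead of "origin" and clone into "sandbox"
git clone -q --origin my_fork source_repo sandbox
cd sandbox || exit 1
git remote
```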

Checking out a tag or switching to a new tag

  • To checkout a tag:
  git checkout <tag>

Note that <tag> can also be the name of a branch or a commit hash. If you specify the name of a branch, you will check out the head of the branch. If you name a remote branch (e.g., origin/branch_name), you will create a detached HEAD but you can still use the code. If you plan on changing the code, first create a branch (see Working with branches).
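
Checking out a tag also leaves you with a detached HEAD. One way to check which state you are in, and to move onto a branch before editing, is sketched below in a scratch repository; the tag name v1.0 and branch name my_changes are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Release"
git tag v1.0                       # placeholder tag name
git commit -q --allow-empty -m "Later work"

git checkout -q v1.0               # HEAD is now detached at the tag

# symbolic-ref succeeds only when HEAD points at a branch
if git symbolic-ref -q HEAD >/dev/null; then
  head_state="on a branch"
else
  head_state="detached"
fi
echo "HEAD is $head_state"

# Create and check out a branch at this point before making changes
git checkout -q -b my_changes
git symbolic-ref --short HEAD
```

Plain `git status` reports the same information in its first line if you prefer to check by eye.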

Working with branches

When you create a clone, it contains pointers to all the branches that existed at the clone's origin (e.g., the repository at GitHub). You can check out these branches, but before attempting to make any changes you should first create a local branch (so git can keep track of your local commits).

  • To create a new local branch that starts at a certain point:
  git branch <new branch name> <tag or branch name>

for example

  git branch new_feature cam6_2_024
  • To check out a local branch:
  git checkout <new branch name>
  • If you are working with a repository that uses manage_externals (e.g., CMCC-CM, CAM), always run that tool after checking out a new branch or tag:
  ./manage_externals/checkout_externals
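
The branch commands above can be tried end to end on a scratch clone. In this sketch, a local repository named upstream stands in for GitHub, and the branch names new_physics and my_physics are invented for the example; manage_externals is not run because the scratch repository does not contain it.

```shell
work="$(mktemp -d)"
cd "$work" || exit 1
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for the repository at GitHub, with one extra branch
git init -q upstream
cd upstream || exit 1
git commit -q --allow-empty -m "Initial commit"
git branch -M main
git branch new_physics
cd ..

git clone -q upstream clone
cd clone || exit 1
git branch -r                      # the pointers that came from the origin

# Start a local branch at the remote branch, then check it out
git branch my_physics origin/new_physics
git checkout -q my_physics
git branch                         # local branches; the current one is starred
```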

Working with remotes (upstream repository locations)

While working with clones created using the methods above will be sufficient for most if not all of your development needs, there may be times when you will want to access or compare your code with code from a different repository. git has no problem storing revisions from multiple repositories in a single clone!

To begin, your clone probably has a single remote (also known as an upstream repository). To see the current status of which upstream repositories are configured for your clone, use the git remote command:

git remote

origin

To see the location of the remote repositories in your current directory:

git remote -v

You should see something like:

origin  https://github.com/gituser/<repo> (fetch)
origin  https://github.com/gituser/<repo> (push)

This tells you the "upstream" location from where new code is downloaded (when you run git fetch origin) or where code is uploaded (when you run git push origin <branch>). Note that most git commands are purely local, using only information in the .git directory of your clone.

You can rename an existing remote:

git remote rename origin CMCC-Foundation

You can set the remote name as part of a clone command (the default is 'origin'):

git clone -o CMCC-Foundation https://github.com/CMCC-Foundation/cam

Adding remotes (new upstream repository locations)

To add a new upstream repository, use the remote add command. For example:

git remote add CMCC-Foundation https://github.com/CMCC-Foundation/<repo>
git fetch --tags CMCC-Foundation

You should see messages much like those for a new clone when you execute the git fetch command. Note that you can call the new remote anything; in this example we are calling it CMCC-Foundation.
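
After the fetch, the new remote's branches appear alongside those of origin. The following sketch shows this with two local repositories standing in for your fork and the CMCC-Foundation repository (the directory names fork and foundation are made up, and no network access is needed).

```shell
work="$(mktemp -d)"
cd "$work" || exit 1

# Two local repositories stand in for your fork and the CMCC-Foundation repository
for name in fork foundation; do
  git init -q "$name"
  git -C "$name" -c user.name="Example User" -c user.email="user@example.com" \
      commit -q --allow-empty -m "Initial commit in $name"
  git -C "$name" branch -M main
done

git clone -q fork clone
cd clone || exit 1
git remote add CMCC-Foundation "$work/foundation"
git fetch -q --tags CMCC-Foundation

git remote -v                      # now lists both origin and CMCC-Foundation
git branch -r                      # includes CMCC-Foundation/main
```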

Updating your branch to latest CMCC-CM

Note that while this section explains how to update your local branch to the CMCC-Foundation/cmcc-cm branch, the instructions can easily be generalized for any branch from any upstream remote.

Before starting, you should have either:

  • A fresh clone of your fork with the branch you wish to update checked out (see Create a new clone and Working with branches).
  • An existing clone with the branch you wish to update checked out and in a clean state (i.e., make sure you do a git commit and that git status shows no modified files).

Add the upstream remote, if you have not already done so (see Adding remotes).

Merge the specific remote/branch into your branch. In this example, it is CMCC-Foundation/cmcc-cm:

git fetch CMCC-Foundation
git merge CMCC-Foundation/cmcc-cm
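
The whole update, including the clean-state check, might look like the sketch below. It is a self-contained rehearsal on scratch repositories: a local repository named upstream plays the role of CMCC-Foundation, the clone is made directly from it with -o for brevity (rather than from a fork with an added remote), and the file names and the my_feature branch are invented.

```shell
work="$(mktemp -d)"
cd "$work" || exit 1
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for the CMCC-Foundation repository with a cmcc-cm branch
git init -q upstream
cd upstream || exit 1
echo "v1" > model.txt
git add model.txt
git commit -q -m "Upstream: v1"
git branch -M cmcc-cm
cd ..

# Your clone, with some local work on your own branch
git clone -q -o CMCC-Foundation upstream clone
cd clone || exit 1
git checkout -q -b my_feature
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "Local work"

# Meanwhile, upstream moves on
cd ../upstream || exit 1
echo "v2" > model.txt
git commit -q -am "Upstream: v2"
cd ../clone || exit 1

# Only merge from a clean state
if [ -n "$(git status --porcelain)" ]; then
  echo "Commit your changes before updating" >&2
  exit 1
fi
git fetch -q CMCC-Foundation
git merge -q --no-edit CMCC-Foundation/cmcc-cm
git log --oneline --graph
```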

Comparing differences using git diff

If you have a git clone, you can view differences between commits or tags. As far as git diff is concerned, a commit hash is the same as a tag, so the examples below use <tag>.

  • To see the full difference between two tags (i.e., a changeset):
  git diff <tag1> <tag2>
  • To see the full difference between the current checkout (sandbox) and a tag:
  git diff <tag>
  • To see only the names of files that are different:
  git diff --name-only <tag> [ <tag2> ]
  • To see the difference in one or more specific files:
  git diff <tag> [ <tag2> ] -- <path_to_file1> <path_to_file2>
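
To get a feel for these forms, you can run them on a tiny scratch repository. The sketch below uses made-up file names and the placeholder tags tag1 and tag2.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'a\n' > a.txt
printf 'b\n' > b.txt
git add a.txt b.txt
git commit -q -m "First"
git tag tag1
printf 'b changed\n' > b.txt
git commit -q -am "Second"
git tag tag2

git diff --name-only tag1 tag2     # only b.txt differs between the tags
git diff tag1 tag2 -- b.txt        # full diff restricted to one file

# An uncommitted edit shows up when comparing the sandbox against a tag
printf 'a edited\n' > a.txt
git diff --name-only tag1          # now lists a.txt and b.txt
```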

Configuring and using a graphical difference tool

git has a command, difftool, that can run a graphical tool on each file that is different between two commits.

  • To configure opendiff as the graphical difference tool:
  git config --global diff.tool opendiff
  • To see the available graphical difference tools:
  git difftool --tool-help
  • To run difftool on one or more specific files:
  git difftool <tag> [ <tag2> ] -- <path_to_file1> <path_to_file2>
  • To optionally run difftool on all files in a changeset (answer 'y' or 'n' for each file):
  git difftool <tag> [ <tag2> ]
  • To run difftool on all files in a changeset (i.e., same as 'y' for every file):
  yes | git difftool <tag> [ <tag2> ]
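
git difftool also has a --no-prompt option (short form -y) that has the same effect as piping `yes` into it. The sketch below demonstrates it in a scratch repository; because a graphical tool cannot run in a script, it defines a hypothetical text-only tool called "plain" through the difftool.<tool>.cmd setting, which simply runs diff on the two temporary files git hands it. With opendiff configured as above, you would drop the --tool option.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'one\n' > notes.txt
git add notes.txt
git commit -q -m "First"
git tag tag1
printf 'one\ntwo\n' > notes.txt
git commit -q -am "Second"
git tag tag2

# Hypothetical tool named "plain": git sets LOCAL and REMOTE to the two file versions
git config difftool.plain.cmd 'diff "$LOCAL" "$REMOTE"'

# --no-prompt skips the per-file question
out="$(git difftool --no-prompt --tool=plain tag1 tag2)"
echo "$out"
```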