Working Remotely with Git - aakash14goplani/FullStack GitHub Wiki

Overview In this module, we will look at Working Remotely with Git. To start with, we'll talk about cloning a remote repository to our local machine. We'll then look at how we can list off which remote repositories are associated with our local repository, how we can fetch changes from that remote, as well as merge those changes into our local working copy. We'll look at pulling from a remote, which is a combination of fetching and merging. We can also push changes remotely that we've made to our local repo. And finally, we'll wrap up with how we can work with tags.

Cloning a Remote Repository Let's look at cloning a remote repository to our local machine. For this example, I'll use the jQuery repository on GitHub. I need to get my clone URL, which is the URL where the source is located. I'll be using the http-based one, and I'll copy it to my clipboard. Coming over to Git, I can say git clone and then provide the URL. This is going to download the entire history of the project, all of the commits that have ever been made to the jQuery repository. Now, if you've used other version control systems, you might think this would take quite a while. But even for a repository of a fair size such as jQuery, it only takes about 20 seconds to download all of the commits. And there we have it. Let's change into the jquery directory and take a look at what we've got. I'll do a git log, and you can see that we've got a list of commits from the project. If I want to see a more condensed version, I can say git log and provide the --oneline option, so we have one commit per line.
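As a minimal sketch of the clone-and-log steps, assuming Git is installed: the script below builds a small throwaway repository in a temporary directory to stand in for the jQuery repository, so it runs without network access. The directory names, commit messages, and author identity are made up for illustration; a real clone would pass the http(s) clone URL instead of a local path.

```shell
set -e
export GIT_PAGER=cat
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository: three commits of made-up content.
git init -q remote-repo
cd remote-repo
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
for n in 1 2 3; do
  echo "line $n" >> README.txt
  git add README.txt
  git commit -q -m "Commit $n"
done
cd ..

# Clone it: the whole history comes down with the working copy.
git clone -q remote-repo my-clone
cd my-clone
git log              # full log: hash, author, date, message
git log --oneline    # condensed: one commit per line
newest=$(git log --oneline | head -n 1)
```

Because the clone carries every commit, both log commands run entirely against the local copy.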

Basic Repository Statistics Now, in addition to --oneline, I might want to know how many commits are in this repository. I'll use the word count command, wc, with -l to count line by line. And you can see that there are 4,073 commits that we've actually downloaded from GitHub. We can get a slightly more interesting view of this by adding the --graph option, which provides a graph on the left-hand side showing the different branches and merges that have happened. You can see on that fourth line down that there was a separate branch that was then merged into the third line. So we can see how the history of this project has changed. I can continue on, and you can see some more branches and merges for various requests that have been made. So someone might have branched or forked the repository and then issued a pull request back to the project saying, "Here's a bunch of fixes," and Git makes it very easy to incorporate these fixes back into the master repository, that central repository where all the coordination is done from. Now, we can get a variety of stats on this Git repository because we've got the entire history right on our local machine. In addition to the log command, we've got shortlog, which summarizes the log output. What does shortlog give us? It lists the authors and the commit messages from each of them. It also provides us with the number of commits each has made. Right now, we're listing them in alphabetical order. I can also ask for a shortlog and specify that I want a summary with -s, so I don't get the individual commit messages; the -n option orders them numerically by number of commits, decreasing; and -e includes the users' email addresses. Displaying this, we can see that John Resig has made 1209 commits. He committed another 503 under a different user name, and Jorn Zaefferer has made 308.
That gives you some statistics regarding who has actually made these different commits. Now, if we want to take a deeper look at this, there are a variety of packages that will create Git statistics locally, but GitHub also provides a number of statistics for us. We can look at the Graphs option and see the contributors over time and the commit activity, and we can visualize additions and deletions. There are many, many statistics that can be computed over these Git repositories. So let's look at the contributors over time. GitHub is doing a bit of additional processing there: it has joined together John Resig's and JE Resig's commits by grouping by email address. You can see his commits to the project over time, as well as those of many others who have contributed. So you can see that a variety of statistics can be computed over a Git repository very easily, because the entire repository is local to your machine.
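The statistics commands from this section can be tried on a tiny repository. This is only a sketch: the two authors and their commits are invented, and the repository lives in a temporary directory. Note that shortlog is given HEAD explicitly, because in a script with no terminal attached it would otherwise try to read a log from standard input.

```shell
set -e
export GIT_PAGER=cat
export GIT_AUTHOR_NAME="Alice Example" GIT_AUTHOR_EMAIL="alice@example.com"
export GIT_COMMITTER_NAME="Alice Example" GIT_COMMITTER_EMAIL="alice@example.com"
work=$(mktemp -d)
cd "$work"
git init -q stats-demo
cd stats-demo
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m "First"
echo two >> file.txt
git commit -q -am "Second"
# A third commit attributed to a second, hypothetical author.
echo three >> file.txt
git commit -q -am "Third" --author="Bob Example <bob@example.com>"

git log --oneline | wc -l            # number of commits
git log --oneline --graph            # graph of branches and merges on the left
git shortlog HEAD                    # commit messages grouped by author
summary=$(git shortlog -sne HEAD)    # -s summary, -n by commit count, -e with emails
echo "$summary"
```

With -n the author with the most commits is listed first, which is how the per-author totals in the jQuery example are ordered.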

Viewing Commits Coming back over to jQuery, we can also take a look at any of the commits that have been made. I can, for instance, look at HEAD. So what was the last commit made to jQuery? You can see that there was a copyright change and some other changes to the repository. Interestingly, the major thing that was done here is that we are reverting a previous commit. If I say git show HEAD~1, it is going to show the commit that was just reverted. We can see this too in git log --oneline: the first commit here, 247d, reverts a previous commit for issue 741, and 532b is the commit that has been reverted. I can also do a git show HEAD~10, or I can provide a commit hash to git show. So, if I wanted to look at that very bottom commit, the one ending in 626, there it is right there. I've got the full history of all commits that have been made. I can also go back in history to look at what the entire source repository looked like. I can create a branch and look at the state of the repository at any given point in time. We'll be talking much more about branching in the next module. We can also take a look at git remote. Git remote shows that we've got one remote called origin. Origin is just Git's default name for where this source came from. If I add the -v option, for verbose, it will show the URLs, both the fetch and the push URL, for that particular remote. These can be different. For instance, you might be fetching from an https URL but want to push to an SSH-based one. We'll talk about reasons for doing that.
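Here is a short sketch of the same inspection commands, run against a throwaway clone rather than jQuery. The repository names, file, and commit messages are assumptions made for the example.

```shell
set -e
export GIT_PAGER=cat
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q source-repo
cd source-repo
git symbolic-ref HEAD refs/heads/master
echo v1 > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo v2 > notes.txt
git commit -q -am "Update notes"
cd ..
git clone -q source-repo clone
cd clone

git show HEAD                          # the last commit, with its diff
git show HEAD~1                        # the commit one step before HEAD
older_hash=$(git rev-parse --short HEAD~1)
git show "$older_hash"                 # the same commit, addressed by hash
git remote                             # lists the remote names
git remote -v                          # fetch and push URLs for each remote
parent_subject=$(git log -1 --format=%s HEAD~1)
remote_name=$(git remote)
```

The tilde notation counts back from HEAD, so HEAD~10 would be ten commits before the latest one.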

Git Protocols Git can operate over a variety of different protocols, such as http or https. These use the default ports of 80 and 443, though these can be configured just as with any http URL. These URLs allow both read and write access, and you can demand a password for one or both of reading and writing. So if you have a private repo, you can demand a password for reading as well as writing. More commonly, such as on GitHub, a public repository will allow anonymous read access but require a password for write. These URLs are firewall-friendly and don't require configuration by your corporate IT infrastructure. There's also the git protocol. It operates on port 9418, and its URLs start with git://. This is a read-only URL and only allows anonymous access. It's also commonly used on GitHub. Its main disadvantage is that it isn't firewall-friendly. 9418 is not a well-known port that is commonly open, so you need to talk to your corporate IT department to open up that port if you want to be pulling repos down using that protocol. You can also use the SSH protocol on port 22, which is the standard SSH port. This is the standard secure shell that's very common in UNIX environments, and you can see the URL starts with git@. Git is the username used to log in to the remote system. It is both read and write, so if you have permissions, you can both read from and write to this URL, and it uses SSH keys for authentication. So if you have given the Git host your public SSH key, the SSH protocol will use your private key for authenticating with it. You don't need to provide a username and password. It's all done by the SSH infrastructure. The last protocol that Git uses is the file protocol. There's no port associated with it. It is only useful for local operations, but it is both read and write.
So if you need to play around with cloning repos and pushing and pulling changes, and you want to do that locally, it's very easy to set up by just pointing Git to the fully qualified path name for that repo on your local system. So let's list off the entire directory contents. You can see there's a .git directory here. If I look at .git/config, and I'm just going to display the file, you can see there is that remote origin as well as the branch that we're working on, master. There's some additional data here that says where to fetch from and also, when we're merging, what we're going to merge into. We'll talk about merging very shortly.
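To make the protocol list concrete, the comments below show the general URL shape for each one (the host and paths are placeholders, not real repositories), and the commands then clone over the file protocol and display .git/config. This is a sketch that assumes Git is installed and uses a temporary directory.

```shell
set -e
# URL shapes (example.com and the paths are placeholders):
#   https://example.com/team/project.git     read/write, port 443 (80 for http)
#   git://example.com/team/project.git       anonymous read-only, port 9418
#   git@example.com:team/project.git         SSH, port 22, key-based authentication
#   file:///full/path/to/project             local only, no port
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q central
cd central
git symbolic-ref HEAD refs/heads/master
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

# Clone using the file protocol and a fully qualified path.
git clone -q "file://$work/central" local-copy
cd local-copy
ls -a                    # the .git directory sits next to the working files
cat .git/config          # shows the origin remote and the master branch settings
origin_url=$(git config --get remote.origin.url)
tracked_remote=$(git config --get branch.master.remote)
```

The remote section of the config records where to fetch from, and the branch section records which remote branch master merges from.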

Viewing Branches and Tags I can display all the branches in this repository with git branch. We only have our one local branch, called master. I can add the -r option, which will display remote branches. You can see that there are a number of branches that have been shared remotely by the jQuery team. Branches are often used for temporary working copies or to separate mainline development from bug fixes. We can also look at the tags with git tag. These are stable, known points in your code base, and they are often used to tag versions. So there are all the different versions of jQuery that have been tagged by the jQuery team.
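A small sketch of listing branches and tags, using a made-up repository with one extra branch and one tag in place of jQuery. The names bugfix and 1.0 are assumptions for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q team-repo
cd team-repo
git symbolic-ref HEAD refs/heads/master
echo a > a.txt
git add a.txt
git commit -q -m "Initial commit"
git tag 1.0            # a tagged version
git branch bugfix      # an extra branch shared by the team
cd ..
git clone -q team-repo clone
cd clone

git branch             # local branches: only master
git branch -r          # remote branches, including origin/bugfix and origin/master
git tag                # tags that came down with the clone
local_branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
```

Only the branch that was checked out becomes a local branch; the others stay visible as remote branches until you create local ones from them.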

Fetching from a Remote Let's switch over to our GitFundamentals repository. It is as it was before: if I do git log, you can see the commit that added .gitignore. If I ask for the remotes, we don't have any. This is a local repository and does not communicate with any other repositories. The remote was automatically added when I cloned the jQuery repository. But if I have a local repository and I want to add a remote to it, I can use git remote add. I'm going to add origin, but I could call this anything I wanted; origin is an arbitrary name. And you can have more than one, so you can pull from multiple repositories. If someone sends you a pull request, you could add their public repository, which might be a fork of yours, and then pull their changes into your local working copy to examine. So you can have multiple remotes, and this is commonly done in Git in order to evaluate patches or pull requests that have been made to your project. Now that the remote has been added, I can run a git fetch. Git fetch will pull down any changes from that remote repository; you can run it as many times as you want. If I have multiple remotes, I can specify the remote to fetch from. Now, if I look at git log, no changes have been incorporated into my local repository. If I do a git log on origin/master, which is the name of that remote branch, you can see that I've got this new "Updated README from another location" commit. That is a new commit that is in the remote repository but not in my local working copy. So how do I get it into my working copy? I can do a git merge origin/master, which merges origin/master into my current branch, master. Often you have this correspondence between the local branch name and the remote branch name, though not always. Now that I've run that merge, I can do a git log to see that it's there.
One thing I do want to point out is that this is a fast-forward. That means the local branch had everything up to the CACC commit but didn't have this 9523 commit. So Git was able to simply apply that new commit on top, and didn't have to modify the code, merge changes from multiple streams, or create a new commit. It is able to fast-forward, which basically means moving the branch pointer to the new location. So I now have this commit from the remote location. We'll be talking much more about merging and branching in the next module, where things can get more complicated.
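The add, fetch, and merge sequence can be sketched with local repositories standing in for GitHub. Everything here is an assumption made for illustration: a bare repository named central.git plays the remote, and a second clone named other plays "another location" that pushes a new commit.

```shell
set -e
export GIT_PAGER=cat
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "Git Fundamentals" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare local central.git        # the stand-in remote
git clone -q central.git other
cd other
echo "edited elsewhere" >> README.txt
git commit -q -am "Updated README from another location"
git push -q origin master
cd ../local

git remote                                   # prints nothing: no remotes yet
git remote add origin "$work/central.git"    # origin is just a conventional name
git fetch -q origin                          # download, but do not merge
git log --oneline master                     # still only the local commit
git log --oneline origin/master              # includes the new remote commit
git merge origin/master                      # fast-forward: master just moves ahead
```

Since master had no commits of its own beyond the shared history, the merge creates no merge commit.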

Pulling from a Remote If I do a git branch -r, I can see that remote branch, origin/master, that I just merged from. Now, this act of doing a git fetch followed by a git merge origin/master is so commonly done that Git has a shortcut for it. That shortcut is git pull. Git pull does exactly what we just did. Now, I've done a git pull, but there has been no correspondence set up between my master branch and origin/master, the remote one. So Git is saying, "I don't know what to do." We can set this up by modifying that .git/config file, but Git 1.7 and above provides an easier way of doing this. I can say git branch --set-upstream, which sets what's called an upstream tracking branch. The upstream tracking branch is basically the answer to the question: which remote branch does my local branch mirror? I'm going to establish a correspondence between my master branch, the local one, and the one coming from origin. Once I've set that remote tracking branch, I can do a git pull and pull any changes down. If I didn't want to establish this upstream tracking, I can always do a git pull origin master, specifying the remote name and the remote branch that I want to pull in from. But it is very common to set this upstream branch and then just perform git pulls very simply. When I cloned the jQuery repository, the act of cloning set up these upstream tracking branches automatically for me.
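A sketch of pulling with and without an upstream tracking branch, again using local stand-in repositories whose names are made up. One assumption to flag: the script uses git branch --set-upstream-to, the spelling used by Git 1.8 and later, because the older --set-upstream form from the narration has since been removed from Git.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "Git Fundamentals" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare local central.git
git clone -q central.git other
cd other
echo "more text" >> README.txt
git commit -q -am "Change made elsewhere"
git push -q origin master
cd ../local
git remote add origin "$work/central.git"
git fetch -q origin

# No upstream yet, so a bare "git pull" does not know what to merge.
git pull || echo "git pull failed: no upstream tracking branch configured"
git branch --set-upstream-to=origin/master master
git pull -q                       # fetch + merge from the tracked branch
git pull -q origin master         # explicit form; works without any tracking setup
upstream=$(git rev-parse --abbrev-ref "master@{upstream}")
```

The tracking setting is stored in .git/config as the branch's remote and merge entries, which is the file the narration mentions editing by hand.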

Pushing to a Remote Now, let's take a look at what it takes to push changes back up to a remote repository. I'm going to edit my README file and say, "Sharing remotely is fun and easy." If we look at git status, we've got one modified file. I'm going to do a git commit -am, so I'm going to add any modified files that Git knows about. I don't have any new files, so I don't have to perform a separate git add on those. I can just do a -am and provide a message, "Sharing is easy." So I've committed that change, and if I run a git status, it notes that there's nothing to commit, and it also notes that my branch is ahead of origin/master by one commit. There are pending changes that I need to push. So I'm going to do a git push, and I'm going to be prompted for a GitHub username and password. Now, I could type these in here, but having to manage usernames and passwords isn't exactly ideal. So I'm going to do something slightly different. I'm going to remove the origin with git remote rm, so if I do a git remote -v, you'll see I don't have it anymore. And I'm going to re-add it, but as the SSH version of the URL. The advantage of the SSH version is that it's going to use my SSH key to authenticate with GitHub. So now when I do a git push, it's not going to prompt me for my password; it's going to simply push that change up to GitHub. Let's come over to our browser. I'm going to go to GitFundamentals on GitHub.com, and you can see, coming down here, that README.txt has been updated and has that new content in there. We've authenticated using the SSH key that I've configured for my user and that has been authorized by GitHub. So when you are pushing changes back up to a Git repository, it's easier to use the SSH URL rather than the http URL. Http requires a username and password, whereas the SSH version can use your SSH key to do the authentication for you.
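The commit-and-push steps can be sketched against a local bare repository instead of GitHub, so no credentials are involved. The repository names are made up, and the SSH URL at the end is a placeholder that is only recorded in the configuration, never contacted.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "Git Fundamentals" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed central.git
git clone -q central.git local
cd local

echo "Sharing remotely is fun and easy." >> README.txt
git status --short                        # README.txt shows as modified
git commit -q -am "Sharing is easy"       # -a stages modified tracked files
git status                                # reports the branch is ahead by 1 commit
ahead_before=$(git rev-list --count origin/master..master)
git push -q origin master
ahead_after=$(git rev-list --count origin/master..master)

# Swap the remote for an SSH-style URL (placeholder host and path).
git remote rm origin
git remote add origin git@example.com:someuser/GitFundamentals.git
git remote -v
```

With a real SSH remote, the next git push would authenticate with your SSH key rather than prompting for a username and password.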

Creating and Verifying Tags Now that those changes are up there, let's look at noting something of interest. I want to tag my repository, so I say git tag and provide a name. I'm going to release this master branch right now as version 1, so I'm going to tag this as v1.0. And now if I do a git tag, we'll see that I've got a v1.0 tag. I'm able to branch from that point; it's basically a stable point that points to the 2232 commit at the very top. Regardless of what happens, that v1.0 tag is always going to point to 2232. That was an unsigned tag. I can also add an annotation, or a message, to associate with the tag, so v1.0 with message. I can provide a -m option to supply the message, or it will bring up my default editor: "This is v1.0." And if I do a git tag, we now have two tags. A third option is to provide a signed tag, which is done with the -s option, and I'll say v1.0 signed. If you're signing a tag, it automatically requires a message: "Signing v1.0." Now, what I'm asked for is my passphrase to unlock my signing key. So I'll type that in, and if I do a git tag, you'll see I now have three tags. I can also say git tag and use the -v option, which in this case means verify. Let's try verifying an unsigned tag, so I'll verify the v1.0 with message. It will display the actual tag and who tagged it. Me. But it will note that no signature was found, so it couldn't verify that this tag was actually created by James Kovacs. Now, if I try to verify the v1.0 signed, you'll see that not only does it have the tagger, the tag name, and the message, but it also notes that it was signed by me and that the signature is valid. It's using public-key cryptography to ensure that no one has come in and modified it.
So if you're exposing a public project and you want to ensure that certain commits can be verified as official, you can use signing to identify these commits, essentially signing them to say, "I, James Kovacs, have said that this is the official v1.0 release."
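A sketch of the unsigned and annotated tags on a throwaway repository. The tag name v1.0_with_message is an assumed spelling, since tag names cannot contain spaces. The signed tag is left as a comment because it needs a configured GPG signing key, which this example does not assume you have.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q tag-demo
cd tag-demo
git symbolic-ref HEAD refs/heads/master
echo "release" > README.txt
git add README.txt
git commit -q -m "Initial commit"

git tag v1.0                                      # plain tag: just a name for a commit
git tag -a v1.0_with_message -m "This is v1.0"    # annotated tag with a message
git tag                                           # lists both tags
git cat-file -t v1.0                              # commit
git cat-file -t v1.0_with_message                 # tag (a separate tag object)

# Verifying a tag that carries no signature fails.
git tag -v v1.0_with_message || echo "no signature to verify on this tag"

# Signed tag (requires a GPG key; not run here):
#   git tag -s v1.0_signed -m "Signing v1.0"
#   git tag -v v1.0_signed
```

Both tags resolve to the same commit; the annotated one additionally records the tagger and the message.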

Pushing Tags to a Remote Now, if I do a git push, it says everything is up-to-date. If I go to the website and look at the code, you will notice that I don't have any tags. Let's go to the tags over here. There are no tags. Now why is that? By default, Git will not push tags. So I need to do a git push and provide the --tags option. When I perform that, it will create new tags on GitHub, in that remote Git repository. Let's come back over. I'll refresh the browser, and taking a look at the tags, we can see the different tags. If I choose one of these tags, it will show the state of the code base at the time that tag was made. You can see this more easily by switching over to the jQuery code base, where they have tagged releases. So you can see here, this is the master branch, so it is the latest and greatest version of jQuery; it hasn't been officially endorsed. I can come over here and say I want to see the tags. What was the state of the code base at version 1.7? Oh, let's go with 1.6.4. And this is the version of all those files for the 1.6.4 release. If we look at the version.txt file, you see it says 1.6.4. Coming back over here, I'm going to switch back to my master branch. So now I'm back on master, and if I look at version.txt, you can see it's 1.7.3 pre. So tagging gives you stable points in your source code, for instance when you have an official release, a beta, or an RC, or maybe you want to tag each of the individual builds that succeed on your build server. That can all be done with Git and then shared widely with whoever is interested. So, I hope you've seen that sharing with Git is easy and straightforward. Much of what you'll be doing is pulling down changes from a remote repository or from your collaborators, which will then get merged into your own development.
You will make your own changes that can be committed locally and when you're ready to share with the world, you can do a git push to share those changes back out again and make them available for all to see.
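To see that tags stay local until they are pushed explicitly, here is a sketch using a local bare repository as the remote. The repository names and the tag are made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "Git Fundamentals" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed central.git
git clone -q central.git local
cd local

git tag v1.0
git push -q origin master                    # branch is up to date; the tag is not sent
tags_before=$(git ls-remote --tags origin)   # empty: the remote has no tags
git push -q origin --tags                    # now the tags are created on the remote
tags_after=$(git ls-remote --tags origin)
echo "$tags_after"
```

Here ls-remote plays the role of refreshing the tags page in the browser: it asks the remote which tags it currently has.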

Summary In this module, we've looked at how we can clone a remote repository to our local machine, how we can fetch and pull from that remote repository to get the changes down into our local working copy, how we can push our changes back up to the remote, and finally, how we can work with tags, both creating tags and sharing them remotely.