Lesson 1: Intro to UNIX - joslynnlee/CHEM-454 GitHub Wiki
Let's get started:
Today we will get acquainted with the shell, using the Mac's Terminal program to learn new UNIX commands! In class today, we went over an overview of the shell and how it communicates with the kernel. In the practice below, you will work through a few of the UNIX commands covered in today's lecture (09/10).
Introducing the Shell
Lesson adapted from "The Carpentries"
Objectives
- What is a command shell and why should we use one?
- Learn how to explain how the shell relates to the keyboard, the screen, the operating system, and programs.
- Learn why command-line interfaces are used versus graphical interfaces.
Background
Computers do four things:
- run programs
- store data
- communicate with each other, and
- interact with us
Computers "interact with us" in many different ways. Can you name a few?
We use hardware interfaces like the keyboard, mouse, touch screens, or speech recognition systems. Think of how these interfaces let us click menu selections and drag and drop.
Although most modern desktop operating systems (OS) communicate with their human users by means of windows, icons and pointers, these software technologies didn’t become widespread until the 1980s. What did we do beforehand?
The Command-Line Interface
Before then, people interacted with computers by typing text commands and reading text output. This kind of interface is called a command-line interface, or CLI, to distinguish it from a graphical user interface, or GUI (pronounced: goo-ey), which most people now use.
The heart of a CLI is a read-evaluate-print loop, or REPL. When the user types a command and then presses the Enter (or Return) key, the computer reads it, executes it, and prints its output. The user then types another command, and so on until the user logs off.
The Shell
The REPL description makes it sound as though the user sends commands directly to the computer, and the computer sends output directly to the user.
In fact, there is usually a program in between called a command shell. What the user types goes into the shell, which then figures out what commands to run and orders the computer to execute them.
Note that the command shell is called “the shell” because it encloses the operating system in order to hide some of its complexity and make it simpler to interact with.
A shell is a computer program like any other. What’s special about it is that its job is to run other programs rather than to do calculations itself. The most popular shell is Bash, the Bourne Again SHell (so-called because it’s derived from a shell written by Stephen Bourne). Bash is the default shell on most modern implementations of Unix and in most packages that provide Unix-like tools for Windows.
Why should you learn to use the shell?
- Many bioinformatics tools can only be used through a command-line interface, or have extra capabilities in the command-line version that are not available in the GUI.
- In bioinformatics, you often need to do the same set of tasks with a large number of files. Learning the shell lets you automate those repetitive tasks, which is also less error-prone. When humans do the same thing a hundred different times (or even ten times), they’re likely to make a mistake. Your computer can do the same thing a thousand times with no mistakes.
- When you carry out your work on the command line (rather than in a GUI), your computer keeps a record of every step that you’ve carried out, which you can use to re-do your work when you need to.
- Many bioinformatics tasks require large amounts of computing power and can’t realistically be run on your own machine. These tasks are best performed using remote computers or cloud computing, which can only be accessed through a shell.
The shell is a nice place to start before moving on to other languages like Python or R.
Navigating Files and Directories
Objectives
- Explain the similarities and differences between a file and a directory.
- Explain the steps in the shell’s read-run-print cycle.
- Identify the actual command, flags, and filenames in a command-line call.
- Demonstrate the use of tab completion, and explain its advantages.
The part of the operating system responsible for managing files and directories is called the file system. It organizes our data into files, which hold information, and directories (also called “folders”), which hold files or other directories.
Getting Started
Type the command whoami, then press Enter (Return) to send the command to the shell. The command’s output is the ID of the current user, i.e., it shows us who the shell thinks we are:
More specifically, when we type whoami the shell:
- finds a program called whoami,
- runs that program,
- displays that program’s output, then
- displays a new prompt to tell us that it’s ready for more commands.
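The steps above can be tried with a minimal sketch like the following. The name printed will be your own login name, so no output is shown here, and the variable name current_user is only an illustrative choice.

```shell
# The shell finds the whoami program, runs it, and displays its output.
whoami

# The same output can be captured in a variable and reused.
current_user=$(whoami)
echo "The shell thinks we are: $current_user"
```

Once the output has been displayed, the shell shows a fresh prompt, which is the last step in the list.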
Next, let’s find out where we are by running a command called pwd (which stands for print working directory):
At any moment, our current working directory is our current default directory, i.e., the directory that the computer assumes we want to run commands in unless we explicitly specify something else.
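As a small sketch, pwd can be run on its own. The path it prints depends on your machine, so none is shown here. The example also assumes a shell such as Bash, which keeps the same path in the PWD variable.

```shell
# Print the absolute path of the directory the shell is currently working in.
pwd

# Bash also keeps this path in the PWD variable.
echo "$PWD"
```

Both lines print the same path, and that path always starts with a slash.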
To understand what a “home directory” is, let’s have a look at how the file system as a whole is organized.
For the sake of this example, we’ll be illustrating the filesystem on Nelle's computer.
After this illustration, you’ll be learning commands to explore your own filesystem, which will be constructed in a similar way but will not be exactly identical.
On Nelle's computer, the filesystem looks like this:
At the top is the root directory that holds everything else. We refer to it using a slash character / on its own; this is the leading slash in /Users/nelle.
Inside that directory are several other directories:
- bin (which is where some built-in programs are stored)
- data (for miscellaneous data files)
- Users (where users’ personal directories are located)
- tmp (for temporary files that don’t need to be stored long-term)
nelle has an account on her machine. Her current working directory, /Users/nelle, is stored inside /Users; we know this because /Users is the first part of its name. Similarly, we know that /Users is stored inside the root directory / because its name begins with /.
In the example directory structure below, underneath /Users, we find one directory for each user with an account on Nelle's machine.
PRACTICE: Who are the two other users?
Chat with the person next to you and explain your result. Use the file system to explain the result.
**ANSWER:**
Nelle's colleagues have files stored in /Users/inhotep and /Users/larry. Typically, when you open a new command prompt, you start in your home directory.
Commands
Now let’s learn the command that will let us see the contents of our own filesystem. We can see what’s in our home directory by running ls, which stands for “listing”:
ls prints the names of the files and directories in the current directory in alphabetical order, arranged neatly into columns.
We can make its output more comprehensible by using the flag -F (also known as a switch or an option), which tells ls to add a trailing / to the names of directories:
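Your own home directory will hold different things, so here is a sketch that builds a small scratch directory first, which makes the difference between the two listings easy to see. The names demo_dir, notes.txt and results are made up for this illustration.

```shell
# Build a small scratch directory so the listing is predictable.
# (notes.txt and results are hypothetical names.)
demo_dir=$(mktemp -d)
touch "$demo_dir/notes.txt"
mkdir "$demo_dir/results"

cd "$demo_dir"

# Plain listing: names only, so files and directories look alike.
ls

# With -F, the directory is marked with a trailing slash.
ls -F
```

In the second listing, results appears as results/ while notes.txt is unchanged.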
ACTION: Put your GREEN STICKY UP WHEN DONE.
A command can be run on its own. When you use flags, they come after the command name and before the input (the files or directories the command works on).
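To make that ordering concrete, here is a short sketch that labels each part of one call. It lists the root directory / only because that directory exists on every UNIX system; the variable name listing is an illustrative choice.

```shell
# Anatomy of a command line:
#   ls   the command (the program to run)
#   -F   a flag that changes how ls behaves
#   /    the input: the directory to list (here, the root directory)
ls -F /

# The same call, with its output captured in a variable.
listing=$(ls -F /)
echo "$listing"
```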
ls has lots of other flags. To find out what they are, we can use the --help flag.
Getting help
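Many commands and programs that people have written (that can be run from within bash) support the --help flag to display more information on how to use the command or program. Not every command does: the ls that ships with macOS, for example, does not accept --help, and its documentation is read with the man command instead.
Try it by adding the --help flag to the command ls below: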
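A hedged sketch of that step is shown here. It assumes nothing about which version of ls you have: if --help is accepted (as with GNU ls on most Linux systems), it prints the first few lines of the help text, and otherwise it prints a reminder to use the manual. The variable name help_text is an illustrative choice.

```shell
# Try the --help flag; keep the help text only if ls accepts the flag.
if help_text=$(ls --help 2>/dev/null); then
    # Show just the first few lines of the help text.
    printf '%s\n' "$help_text" | head -n 5
else
    echo "This ls does not support --help; use the manual instead."
fi
```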
QUESTION: We can read the manual of the command ls by using the man command. Does ls come first or does man come first?
Note: Discuss with partner and try both! Was there an error?
QUESTION: What is ls msn doing?
HINT - look up the usage of ls.
Exploring Flags
Listing Recursively and By Time
Type ls -R in the code cell below.
The ls command with the flag -R lists the contents of directories recursively, i.e., lists their sub-directories, sub-sub-directories, and so on in alphabetical order at each level.
PRACTICE: Draw the directory structure (ignore files) that explains this output. Chat with your neighbor!
Type ls -t in the code cell below.
The ls command with the flag -t lists things by time of last change, with most recently changed files or directories first.
Compare this output to that of ls -R.
Now type ls -R -t -l. Here the ls command combines the flags -R, -t and -l. This combination lists the contents of the directories recursively (-R), sorted by the time of the last change (-t) with the most recently changed files first, in long-listing format (-l), which shows details such as the size and timestamp of each entry. Adding the -h flag as well prints file sizes in a human-readable format.
These ways of listing contents are helpful for checking output files without clicking open lots of windows, and the long listing also lets you view the size of the files. That is a lot of information for a few keystrokes.
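The flags above can be compared side by side with a sketch like this one. It works in a scratch directory so the result does not depend on your own files. All of the file and directory names are hypothetical, and one file is deliberately given an old timestamp so that sorting by time has a visible effect.

```shell
# Build a small scratch tree (all names are hypothetical).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/project/results"
echo "first draft" > "$demo_dir/project/old_notes.txt"
echo "latest draft" > "$demo_dir/project/new_notes.txt"
echo "summary" > "$demo_dir/project/results/summary.txt"

# Give one file an old timestamp (1 January 2020, 00:00).
touch -t 202001010000 "$demo_dir/project/old_notes.txt"

cd "$demo_dir"

# -R: list every directory and its sub-directories.
ls -R

# -t: most recently changed first, so old_notes.txt comes last.
ls -t project

# All together, plus -h for human-readable file sizes.
ls -R -t -l -h
```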
The output above shows that the home directory contains sub-directories. Let's try to find the directory called shell-lesson-data. It may be in your Downloads folder, so let's put it somewhere on the Desktop.
Earlier we used the ls command with the -F flag to view directories. Below, type ls shell-lesson-data to list the contents of the shell-lesson-data directory.
Using ls to look inside other directories is helpful. We can use the same strategy to change our location to a different directory, so that we move out of home.
Changing Locations
As you may now see, using a bash shell is strongly dependent on the idea that your files are organized in a hierarchical file system. Organizing things hierarchically in this way helps us keep track of our work: it’s possible to put hundreds of files in our home directory, just as it’s possible to pile hundreds of printed papers on our desk, but it’s a self-defeating strategy.
We learned that we can look at a directory's contents with ls.
ACTION: First let's look at our current working directory. Type pwd:
Here we will play with the command for changing locations: cd followed by a directory name changes our working directory. cd stands for “change directory”, which is a bit misleading: the command doesn’t change the directory, it changes the shell’s idea of what directory we are in.
We’ll start with the simplest.
There is a shortcut in the shell to move up one directory level: cd ..
'..' is a special directory name meaning “the directory containing this one”, or more succinctly, the parent of the current directory. Sure enough, if we run pwd after running cd .., the output is the parent of the directory we started in.
Hint - look at the drawing you made previously. It will help you visualize and navigate your directory structure on the virtual machine.
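Moving up one level can be sketched as follows. The example recreates an empty stand-in for shell-lesson-data/data inside a scratch directory, so it is an assumption about the layout and not your actual lesson data.

```shell
# Scratch stand-in for the lesson directory (hypothetical layout).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/shell-lesson-data/data"

# Start two levels down.
cd "$demo_dir/shell-lesson-data/data"
pwd    # path ends in shell-lesson-data/data

# Move up to the parent directory.
cd ..
pwd    # path now ends in shell-lesson-data
```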
You've learned the basic commands for navigating the filesystem on your computer: pwd, ls and cd.
Let’s explore some variations on those commands. What happens if you type cd on its own, with no flags and without giving a directory?
Then type the command that prints the current working directory below:
It turns out that cd without an argument will return you to your home directory, which is great if you’ve gotten lost in your own filesystem.
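Here is a minimal sketch of that behaviour. It assumes /tmp exists (it does on macOS and Linux) and uses it only as a place to get "lost" first; the home directory printed will be your own.

```shell
# Move somewhere other than home first.
cd /tmp
pwd

# cd with no argument returns to the home directory.
cd
pwd

# The shell stores the location of the home directory in HOME.
echo "$HOME"
```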
Last time, we used three commands, but we can actually string together the list of directories to move to shell-lesson-data/data in one step:
Check that we’ve moved to the right place by running pwd and ls -F:
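The two approaches can be compared in a sketch like this. It assumes a layout of Desktop/shell-lesson-data/data and recreates it, empty, inside a scratch directory, so the paths are illustrative and not your real Desktop.

```shell
# Scratch stand-in for the lesson layout (hypothetical).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/Desktop/shell-lesson-data/data"

# Three separate steps, one directory at a time.
cd "$demo_dir"
cd Desktop
cd shell-lesson-data
cd data
pwd

# The same move in one step: join the directory names with slashes.
cd "$demo_dir"
cd Desktop/shell-lesson-data/data
pwd
ls -F
```

Both calls to pwd print the same path. The final ls -F prints nothing here because the stand-in data directory is empty.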
Another shortcut is the - (dash) character. cd will translate - into "the previous directory I was in", which is faster than having to remember, then type, the full path. This is a very efficient way of moving back and forth between directories. The difference between cd .. (two periods) and cd - is that the former brings you up, while the latter brings you back. You can think of it as the Last Channel button on a TV remote.
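The "Last Channel" behaviour can be sketched like this, using a scratch directory and /tmp as two unrelated locations. The names demo_dir, project and results are hypothetical.

```shell
# Two unrelated locations to hop between.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/project/results"

cd "$demo_dir/project/results"
cd /tmp

# cd - returns to the previous directory and prints where it landed.
cd -
pwd

# Running it again swaps back to /tmp.
cd -
pwd
```

Note that cd .. from the results directory would have gone up to project instead, no matter where we had been before.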
PRACTICE: Starting from /Users/amanda/data/, which of the following commands could Amanda use to navigate to her home directory, which is /Users/amanda?
cd /home/amanda
cd ../..
cd ~
cd home
cd
cd ..
Note: Use a piece of paper to draw this out. Try not to scroll down to the answers. Discuss with your peers which commands work and which don't.
ANSWERS Be careful scrolling down for answers.
- No: Amanda’s home directory is /Users/amanda.
- No: this goes up two levels, i.e. ends in /Users.
- Yes: ~ stands for the user’s home directory, in this case /Users/amanda.
- No: this would navigate into a directory home in the current directory if it exists.
- Yes: shortcut to go back to the user’s home directory.
- Yes: goes up one level.
Now, let's go back home by typing cd and get in the shell-lesson-data/