July / August, 2024

Developed by:
Egon A. Ozer, MD PhD ([email protected])
Ramon Lorenzo Redondo, PhD ([email protected])

Terminal

This is the application that is used to access programs that are primarily text-based (as opposed to graphical or point-and-click). This is required to install and run much of the software we will be using in this workshop.

To access the Terminal in Lubuntu, click the bird in the bottom left corner of the screen, then click System Tools --> QTerminal.

1 - Basic command line functions

This will introduce you to basic operations in the command line.

1.1 `ls`

The ls command is used to list the contents of a directory. If you use the ls -l command, you'll get more information about the contents such as whether an item is a file or directory and how big the file is, etc.
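As a minimal sketch, the commands below build a small scratch directory and list it both ways. The names `data` and `notes.txt` are made up for illustration, and `mktemp -d` is only used here to get an empty throwaway directory to work in.

```shell
# Create a throwaway directory to experiment in (all names are hypothetical)
demo_dir=$(mktemp -d)
cd "$demo_dir"
mkdir data
echo "hello" > notes.txt

# Short listing: just the names
ls

# Long listing: permissions, owner, size in bytes, modification time, and name.
# Lines starting with "d" are directories; lines starting with "-" are regular files.
ls -l
```

In the long listing, the `data` line should start with `d` and the `notes.txt` line with `-`.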

1.2 `cd`

The cd command is used to change directories.

If you type cd <directory name> you'll move into that directory
You can use cd .. (two periods) to move back up one directory
You can move down multiple directories by listing the path you'd like to take. For example: cd directory1/directory2/directory3
Most terminal programs will allow you to drag a file or folder into the terminal. So if you'd like to cd into a folder in your Finder window, for example, you can type cd (followed by a space) in your terminal, then use your mouse to drag the folder into the Terminal window and hit Enter.
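The moves described above can be tried out with a short sketch like the one below. The nested directories are created only for practice, and `mkdir -p` (which creates any missing parent directories along the way) is an extra option not covered in this section.

```shell
# Build a nested set of directories to practice on (names are hypothetical)
start_dir=$(mktemp -d)
mkdir -p "$start_dir/directory1/directory2/directory3"

cd "$start_dir"                        # move into the starting directory
cd directory1/directory2/directory3    # move down three levels at once
pwd                                    # path now ends in directory1/directory2/directory3
cd ..                                  # back up one level, into directory2
cd ../..                               # back up two more levels, into the starting directory
pwd
```

Running `pwd` after each move is an easy way to confirm you ended up where you expected.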

1.3 `pwd`

The pwd command prints the full path of your present working directory. Checking it now and then helps keep you from getting lost.

1.4 `mkdir`

The mkdir command will create a new directory. For example, if you want to create a directory called "output" in your current directory, just type mkdir output

1.5 `cat`

The cat command is short for "concatenate" and can be used to join files together, but it can also be used to display a text-based file to your terminal window. Careful, this command dumps the entire contents of the file to your Terminal so if it's a large file (like a genome sequence file, for example), it could take a while to print everything. Unless the file you are looking at is very short, you may be better off using less (described below).

1.6 `cp`

This is the copy function. It allows you to make a copy of a file with a new name and/or in a new location. The usage is:

cp <original file name> <new file name>
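A short sketch of the three common cases, assuming a scratch directory and made-up file names (`original.txt`, `backup`):

```shell
# Set up a scratch directory with one file and one subdirectory (names are hypothetical)
work_dir=$(mktemp -d)
cd "$work_dir"
echo "ACGT" > original.txt
mkdir backup

cp original.txt copy.txt              # same directory, new name
cp original.txt backup/               # new directory, same name
cp original.txt backup/renamed.txt    # new directory and new name

ls . backup                           # list both directories to see the copies
```

The original file is left untouched in every case.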

1.7 `mv`

The mv function can be used either for moving a file or directory to a new location or to rename a file or directory. The usage for moving the file is:

mv <current file location> <new file location>

Usage for renaming the file is:

mv <file> <new_file_name>
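For illustration, here is a rename followed by a move, using hypothetical file and directory names in a scratch directory:

```shell
# Set up a scratch directory (names are hypothetical)
work_dir=$(mktemp -d)
cd "$work_dir"
echo "sample data" > draft.txt
mkdir results

mv draft.txt final.txt     # rename: draft.txt no longer exists
mv final.txt results/      # move: the file now lives in results/
ls results                 # shows final.txt
```

Unlike cp, mv does not leave a copy behind at the old name or location.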

1.8 `ln`

The ln function, when used with the -s option, creates a symbolic link or "symlink" of a file or directory. This creates a pointer to a file's location, like an alias. This allows you to make "copies" of a file or directory in a new place without taking up additional disk space.

Usage is:

ln -s <original file> <name and location of new file>

Be careful with symbolic links. If the original file that the symlink is pointing to moves or is deleted, the symlink will become non-functional. Similarly, moving the symlink to another directory may result in it becoming non-functional.
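The sketch below shows both a working symlink and what happens when the original disappears. All names are hypothetical. It uses the full path to the original file, which is one way to keep the link valid no matter where the link itself is placed.

```shell
# Set up a scratch directory with an "original" file (names are hypothetical)
work_dir=$(mktemp -d)
cd "$work_dir"
mkdir data analysis
echo ">seq1" > data/genome.fasta

# Link to the original by its full path
ln -s "$work_dir/data/genome.fasta" analysis/genome_link.fasta
cat analysis/genome_link.fasta      # reads through the link to the original

# Deleting the original leaves a dangling link behind
rm data/genome.fasta
ls -l analysis                      # the link is still listed, pointing at a missing file
cat analysis/genome_link.fasta 2> /dev/null || echo "link is broken"
```

In the `ls -l` output, a symlink is shown with an arrow (`->`) pointing to its target, which makes it easy to check where a link leads.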

1.9 `rm`

The rm command deletes files or folders.

Be very careful with the rm function. There is no recycle bin or trash can that files go into before deletion. There is no way to recover a file or folder that has been deleted with rm. When it's gone, it's gone.

Usage:

rm <file to be deleted>

rm -r <directory to be deleted>
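A cautious way to practice is in a throwaway directory, listing the contents before and after. The names below are hypothetical.

```shell
# Set up a scratch directory with one file and one non-empty directory
work_dir=$(mktemp -d)
cd "$work_dir"
mkdir old_results
echo "x" > old_results/a.txt
echo "y" > scratch.txt

ls                   # check what is there before deleting anything
rm scratch.txt       # delete a single file
rm -r old_results    # delete a directory and everything inside it
ls                   # prints nothing: the directory is now empty
```

If you want a safety net, the -i option (for example `rm -i scratch.txt`) makes rm ask for confirmation before each deletion.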

1.10 `less`

The less command allows you to look at the contents of a file in your Terminal without printing the whole thing at once.

You can either use less and the name of the file (e.g. less genome.fasta) or you can pipe the output of another command to less using the pipe character "|" to explore it in a controlled way (e.g. cat genome.fasta | less)
Use the arrow keys to scroll up or down one line at a time or use the space bar to scroll down several lines at a time
To exit the less screen, hit the q key (without shift)

1.11 `head` and `tail`

The head command will print the first few lines of a file and the tail command will print the last few lines of a file.

Usage:

head file.txt

The default is to output the first or last 10 lines. You can control the number of lines output by head or tail with the -n option.

For example, to output the last 5 lines of a file:

tail -n 5 file.txt
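To see exactly which lines each command returns, it helps to use a file whose lines are numbered. This sketch assumes the `seq` command is available (it is on most Linux and Mac systems) and uses a made-up file name.

```shell
# Make a 20-line practice file: the numbers 1 to 20, one per line
work_dir=$(mktemp -d)
cd "$work_dir"
seq 1 20 > numbers.txt

head numbers.txt         # first 10 lines: 1 through 10
tail numbers.txt         # last 10 lines: 11 through 20
head -n 3 numbers.txt    # first 3 lines: 1, 2, 3
tail -n 5 numbers.txt    # last 5 lines: 16 through 20
```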

1.12 `wc`

The wc command is used to count lines, words, bytes, or characters in a file.

To output the number of lines in a file, use the -l option:

Example:

wc -l file.txt

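Byte counts can be output with -c, character counts with -m, and word counts with -w. For plain English text files the byte and character counts are usually the same.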
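A small worked example, using a made-up two-line file so the counts are easy to verify by eye:

```shell
# Create a two-line practice file (printf turns each \n into a line break)
work_dir=$(mktemp -d)
cd "$work_dir"
printf 'one two\nthree\n' > file.txt

wc -l file.txt      # 2 lines
wc -w file.txt      # 3 words
wc -c file.txt      # 14 bytes (the two line breaks are counted too)
wc -l < file.txt    # reading from STDIN prints the count without the file name
```

The last form is handy when you only want the number, for example to use it in another command.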

1.13 Redirect `>`

If you want to send the output of a command that usually prints to the screen to a file instead, add the redirect character > at the end of the command, followed by the name of the file to write the output to.

Example: cat genome.fasta > new_file.fasta
WARNING: If you redirect to an existing file, it will be overwritten by the new output. The overwritten data cannot be recovered.

To append to a file, i.e. to add new data to the end of the file without overwriting the old data, you can use the >> redirect instead.

Programs will sometimes output data to two separate streams: standard output (STDOUT) and standard error (STDERR). Usually the results of the program are output to the STDOUT stream whereas error messages or other information such as progress reports will be output to the STDERR stream.

To redirect the STDOUT stream to a file, use >, whereas if you needed to redirect the STDERR stream to a file you would use 2>.
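The sketch below puts the three redirects side by side. File names are hypothetical, and asking `ls` for a file that does not exist is just a convenient way to make a command write to both streams at once.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

echo "first line" > log.txt      # creates log.txt (or overwrites it if it exists)
echo "second line" >> log.txt    # appends to the end instead of overwriting
cat log.txt                      # shows both lines

# ls writes the name of the file it finds to STDOUT and a complaint about
# the missing file to STDERR; each stream goes to its own file.
# "|| true" only keeps the expected error status from stopping a script.
ls log.txt no_such_file > found.txt 2> errors.txt || true

cat found.txt     # log.txt
cat errors.txt    # the error message about no_such_file
```

Nothing from the `ls` command appears on the screen because both streams were sent to files.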

1.14 `|`

The | is the "pipe" character. It is used to string commands together such that the output of one command becomes the input to the next command.

Here is a simple example of three commands piped together:

echo "B C A" | tr " " "\n" | sort

The echo command prints the letters "B C A", tr translates the spaces (" ") into new lines or returns ("\n") and then sort sorts the resulting list alphabetically such that the output of this command is:

A
B
C

Piping is a good way to save time or easily execute complex commands. In later exercises we'll also show how this approach can save storage space by skipping the creation of large intermediate files.
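As a slightly longer sketch, the pipeline below extends the example above to count how often each letter appears. It relies on two options not covered in this section: `uniq -c`, which collapses adjacent identical lines and prefixes each with its count, and `sort -rn`, which sorts numerically in reverse order.

```shell
# Split into one letter per line, group identical letters, count them,
# then put the most frequent first. The result is saved in a variable
# so it can be printed (or reused) afterwards.
counts=$(echo "B C A B A B" | tr " " "\n" | sort | uniq -c | sort -rn)
echo "$counts"
```

The first `sort` is needed because `uniq` only merges lines that are next to each other. The output lists B with a count of 3, A with 2, and C with 1.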

1.15 `tee`

The tee command, when used with the | pipe, allows you to write STDOUT to the terminal and to a file simultaneously. This is useful if you want to see the output of a command on the screen but also capture it in a file in just one step.

Usage:

ls -l | tee file.txt
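By default tee overwrites its output file, just like the > redirect. A minimal sketch of keeping a running record instead, using the -a (append) option and a made-up file name:

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

echo "step 1 done" | tee progress.txt       # shown on screen and written to progress.txt
echo "step 2 done" | tee -a progress.txt    # -a appends instead of overwriting
cat progress.txt                            # both lines are in the file
```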

1.16 `echo`

The echo command we mentioned briefly above does just what it sounds like: it "echoes" whatever text you give it to the terminal. This can be useful when you either need to quickly write text to a file using the > redirect without opening a text editor like nano (more on this below) or you want to send text to another command using the | pipe.

For example:

echo "The quick brown fox"

will print The quick brown fox to the terminal.

Be aware that if you want the text printed by echo to contain line breaks (\n) or tab (\t) characters, you will need to use the -e option:

Example:

echo -e "The quick\nbrown fox"

will print:

The quick
brown fox
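The two uses mentioned above, writing to a file and feeding another command, might look like the sketch below. File names are hypothetical. The last line uses `printf`, a related command that interprets \n and \t without needing any option; it is shown only as an alternative, since the behavior of echo -e can vary between shells.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

echo "sample_1" > samples.txt          # create a small file without opening an editor
echo "sample_2" >> samples.txt         # add a second line
echo "The quick brown fox" | wc -w     # send text to another command: counts 4 words

# Write a two-line, tab-separated table
printf 'name\tlength\nseq1\t100\n' > table.tsv
cat table.tsv
```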

2 - Useful command line applications

2.1 `gzip`

Gzip compression software is not necessarily a standard program, but is so widely distributed that it's more likely than not that it will be part of any distribution you might use. It comes standard on Mac and almost all flavors of Linux. Gzip is used to compress files to save space and decompress them when needed.

Gzipped files usually end with the .gz suffix
Sequence read files (ending with .fastq.gz) are almost always gzipped and it's best to leave them this way. Most software that uses sequencing reads can automatically decompress them as needed.
To gzip a file, just use the command gzip <file to be compressed>. The output file will be named the same as the input file, but with a .gz suffix
To take a quick peek at a gzipped file, you can string a few commands together. For a read file named "reads_1.fastq.gz" you use this command: gzip -cd reads_1.fastq.gz | less. The -cd setting is actually two settings together, -d to decompress (rather than the default compress) and -c tells the program to output the decompressed file to the Terminal. You can then pipe that output to less.
An alternative to gzip -cd is the zcat command which works the same as cat, but for gzipped files. (zcat comes standard on Lubuntu, but is not found on all other systems)
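A round trip on a tiny made-up read file shows what each form of the command does to the files on disk. The single read below is invented for illustration only.

```shell
# Create a one-read practice file (contents are hypothetical)
work_dir=$(mktemp -d)
cd "$work_dir"
printf '@read1\nACGT\n+\nIIII\n' > reads_1.fastq

gzip reads_1.fastq                       # replaces reads_1.fastq with reads_1.fastq.gz
ls                                       # only reads_1.fastq.gz is listed
gzip -cd reads_1.fastq.gz | head -n 2    # peek at the first two lines; the .gz file is untouched
gzip -d reads_1.fastq.gz                 # decompress: replaces the .gz file with reads_1.fastq
ls                                       # only reads_1.fastq is listed
```

Note that plain `gzip` and `gzip -d` both replace the input file, whereas `gzip -cd` leaves it in place and only prints the decompressed contents.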

2.2 `top` or `htop`

These commands provide information on processor and memory usage on your system. They can be useful for getting a sense of how many resources are being used, especially if you would like to run parallel processes in separate tabs or in screen.

top: present on nearly all Linux or Mac systems. Very basic information

htop: More colorful and easier to read. Can also filter and sort the results. On some systems needs to be installed. More info here.

2.3 `nano`

Most Linux distributions and Macs have one or more built-in file editors for the command line. One example is VIM. But we're going to talk about another one called nano. This is useful for quickly creating or editing small text-based files.

Usage:

nano <file>

If the file exists, nano will open it. If it doesn't exist, it will open a blank file that will save to the given file name when you close nano.

Use the arrow keys to move around.

Some useful commands in nano

Ctrl-k: Delete a line.
Ctrl-x: Exit nano


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

1A Terminal and Command Line - NU-CPGME/sl_workshop_2024 GitHub Wiki
