1A Terminal and Command Line - NU-CPGME/sl_workshop_2024 GitHub Wiki
July / August, 2024
Developed by:
Egon A. Ozer, MD PhD ([email protected])
Ramon Lorenzo Redondo, PhD ([email protected])
The Terminal is the application used to access programs that are primarily text-based (as opposed to graphical or point-and-click). It is required to install and run much of the software we will be using in this workshop.
To access the Terminal in Lubuntu, click the bird in the bottom left corner of the screen, then click System Tools --> QTerminal.
This section will introduce you to basic operations on the command line.
The ls command is used to list the contents of a directory. If you use the ls -l command, you'll get more information about the contents such as whether an item is a file or directory and how big the file is, etc.
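As a minimal sketch of the difference between ls and ls -l, the following can be run as-is. The scratch directory and the names notes.txt and results are hypothetical and are created here only for illustration.

```shell
# Work in a throwaway directory so none of your real files are touched
scratch=$(mktemp -d)
cd "$scratch"

# Create one empty file and one directory to look at (hypothetical names)
touch notes.txt
mkdir results

# Names only
ls

# Long listing: each entry line starts with "d" for a directory or "-" for a
# regular file, and the size in bytes is shown just before the date
ls -l
```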
The cd command is used to change directories.
- If you type cd <directory name> you'll move into that directory
- You can use cd .. (two periods) to move back up one directory
- You can move down multiple directories by listing the path you'd like to take. For example: cd directory1/directory2/directory3
- Most terminal programs will allow you to drag a file or folder into the terminal. So if you'd like to cd into a folder in your Finder window, for example, you can type cd (followed by a space) in your terminal, then use your mouse to drag the folder into the Terminal window and hit Enter.
The pwd command will tell you your present working directory. It helps keep you from getting lost.
The mkdir command will create a new directory. For example, if you want to create a directory called "output" in your current directory, just type mkdir output
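The cd, pwd, and mkdir commands described above are often used together. Here is a short sketch, assuming a throwaway starting directory created with mktemp. The -p option to mkdir is not covered above: it creates nested directories in one step.

```shell
# Start from a throwaway directory (hypothetical; any directory works)
start=$(mktemp -d)
cd "$start"

# Make a directory called "output" and move into it
mkdir output
cd output
pwd    # prints a path ending in /output

# Move back up one level
cd ..
pwd    # prints the starting directory again

# Create three nested directories at once, then move down the whole path
mkdir -p directory1/directory2/directory3
cd directory1/directory2/directory3
pwd    # prints a path ending in /directory1/directory2/directory3
```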
The cat command is short for "concatenate" and can be used to join files together, but it can also be used to display a text-based file to your terminal window. Careful, this command dumps the entire contents of the file to your Terminal so if it's a large file (like a genome sequence file, for example), it could take a while to print everything. Unless the file you are looking at is very short, you may be better off using less (described below).
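To illustrate both uses of cat, here is a small sketch with two hypothetical one-line files. The files are created with the > redirect, which is explained later on this page.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create two small text files (hypothetical names and contents)
echo "line from file A" > a.txt
echo "line from file B" > b.txt

# Display a single file
cat a.txt

# Join both files, in the order given, and print the result
cat a.txt b.txt
```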
The cp command is the copy function. It allows you to make a copy of a file with a new name and/or in a new location. The usage is:
cp <original file name> <new file name>
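A brief sketch of the two common cases: copying to a new name, and copying into another directory. The file and directory names are hypothetical.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create a small file to copy (hypothetical name)
echo "original contents" > data.txt

# Copy to a new name in the same directory
cp data.txt data_copy.txt

# Copy into another directory, keeping the same name
mkdir backup
cp data.txt backup/

# Both the original and the copies now exist
ls
ls backup
```

To copy a whole directory and its contents, cp needs the -r option, for example cp -r backup backup2.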
The mv function can be used either for moving a file or directory to a new location or to rename a file or directory. The usage for moving the file is:
mv <current file location> <new file location>
Usage for renaming the file is:
mv <file> <new_file_name>
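The following sketch shows a rename followed by a move, using hypothetical file and directory names. Unlike cp, mv leaves nothing behind at the old name or location.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create a file and a destination directory (hypothetical names)
echo "some text" > draft.txt
mkdir archive

# Rename the file
mv draft.txt final.txt

# Move the renamed file into the archive directory
mv final.txt archive/

ls           # only "archive" remains here
ls archive   # final.txt is now inside
```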
The ln function creates a symbolic link or "symlink" of a file or directory. This creates a pointer to a file's location, like an alias. This allows you to make "copies" of a file or directory in a new place without taking up disk space.
Usage is:
ln -s <original file> <name and location of new file>
Be careful with symbolic links. If the original file that the symlink is pointing to moves or is deleted, the symlink will become non-functional. Similarly, moving the symlink to another directory may result in it becoming non-functional.
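To see both the convenience and the fragility of symlinks, here is a minimal sketch with hypothetical file names. It creates a link, reads through it, then moves the original to show how the link breaks.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create a file and a symlink that points to it (hypothetical names)
echo "ACGT" > original.txt
ln -s original.txt link.txt

# Reading the link reads the original file
cat link.txt

# The long listing shows where the link points: link.txt -> original.txt
ls -l link.txt

# Move the original: the link still exists but now points to nothing
mv original.txt moved.txt
cat link.txt 2>/dev/null || echo "link.txt is broken"
```

Giving ln -s the full (absolute) path of the original file, rather than a relative one, lets the symlink itself be moved to another directory without breaking.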
The rm command deletes files or folders.
Be very careful with the rm function. There is no recycle bin or trash can that files go into before deletion. There is no way to recover a file or folder that has been deleted with rm. When it's gone, it's gone.
Usage:
rm <file to be deleted>
rm -r <directory to be deleted>
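A cautious way to practice rm is inside a throwaway directory, as in this sketch. All names are hypothetical.

```shell
# Work in a throwaway directory so nothing important can be deleted
tmp=$(mktemp -d)
cd "$tmp"

# Create a file and a directory containing a file (hypothetical names)
touch unwanted.txt
mkdir old_results
touch old_results/run1.txt

# Check what is there before deleting anything
ls

# Delete the file, then the directory and everything inside it
rm unwanted.txt
rm -r old_results

# The directory is now empty, so this prints nothing
ls
```

When working with real data, adding the -i option (rm -i <file>) makes rm ask for confirmation before each deletion.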
The less command allows you to look at the contents of a file in your Terminal without printing the whole thing at once.
- You can either use less and the name of the file (e.g. less genome.fasta) or you can pipe the output of another command to less using the pipe character "|" to explore it in a controlled way (e.g. cat genome.fasta | less)
- Use the arrow keys to scroll up or down one line at a time or use the space bar to scroll down one screen at a time
- To exit the less screen, hit the q key (without shift)
The head command will print the first few lines of a file and the tail command will print the last few lines of a file.
Usage:
head file.txt
The default is to output the first or last 10 lines. You can control the number of lines output by head or tail with the -n option.
For example, to output the last 5 lines of a file:
tail -n 5 file.txt
The wc command is used to count lines, words, bytes, or characters in a file.
To output the number of lines in a file, use the -l option:
Example:
wc -l file.txt
Byte counts can be output with -c, character counts with -m, and word counts with -w.
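The head, tail, and wc commands are easiest to understand on a file whose contents are predictable. This sketch assumes the seq command is available (it is on most Linux and Mac systems) and uses it to write the numbers 1 through 20, one per line, to a hypothetical file.txt.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Write the numbers 1 to 20, one per line, into file.txt
seq 1 20 > file.txt

# First 10 lines (the default): 1 through 10
head file.txt

# First 3 lines: 1, 2, 3
head -n 3 file.txt

# Last 5 lines: 16 through 20
tail -n 5 file.txt

# Number of lines: prints the count (20) followed by the file name
wc -l file.txt
```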
If you want to direct the output of a command that usually prints to the screen to a file instead, just use the redirect character > at the end of the command, followed by the name of the file to write to.
- Example: cat genome.fasta > new_file.fasta
- WARNING: If you redirect to an existing file, it will be overwritten by the new output. The overwritten data cannot be recovered.
To append to a file, i.e. to add new data to the end of the file without overwriting the old data, you can use the >> redirect instead.
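The difference between the two redirects can be seen in a few lines. This is a minimal sketch using a hypothetical file named log.txt.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# > creates log.txt (or overwrites it if it already exists)
echo "first" > log.txt

# >> adds to the end of the file without touching what is already there
echo "second" >> log.txt
cat log.txt    # prints "first" then "second"

# Using > again replaces the whole file; the earlier lines are gone
echo "third" > log.txt
cat log.txt    # prints only "third"
```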
Programs will sometimes output data to two separate streams: standard output (STDOUT) and standard error (STDERR). Usually the results of the program are output to the STDOUT stream whereas error messages or other information such as progress reports will be output to the STDERR stream.
To redirect the STDOUT stream to a file, use >, whereas if you needed to redirect the STDERR stream to a file you would use 2>.
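One way to see the two streams separately is to run a command that produces both normal output and an error message. In this sketch, ls is asked to list one file that exists and one that does not. The file names are hypothetical.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create one file; missing.txt is deliberately never created
touch exists.txt

# The listing goes to STDOUT (captured in out.txt) and the complaint about
# the missing file goes to STDERR (captured in err.txt).
# "|| true" only keeps a script from stopping on the non-zero exit status of ls.
ls exists.txt missing.txt > out.txt 2> err.txt || true

cat out.txt    # prints: exists.txt
cat err.txt    # prints the error message about missing.txt
```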
The | is the "pipe" character. It is used to string commands together such that the output of one command becomes the input to the next command.
Here is a simple example of three commands piped together:
echo "B C A" | tr " " "\n" | sort
The echo command prints the letters "B C A", tr translates the spaces (" ") into new lines or returns ("\n") and then sort sorts the resulting list alphabetically such that the output of this command is:
A
B
C
Piping is a good way to save time or easily execute complex commands. In later exercises we'll also show how this approach can save storage space by skipping the creation of large intermediate files.
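Pipelines can be extended one step at a time. As a sketch that builds on the example above, the following counts how often each letter appears. It uses uniq -c, a standard command not otherwise covered on this page, which collapses adjacent identical lines and prefixes each with its count.

```shell
# Split the letters onto separate lines, sort them so identical letters are
# adjacent, count each group, then sort by count from highest to lowest
echo "B C A B A B" | tr " " "\n" | sort | uniq -c | sort -rn
```

This lists B with a count of 3, then A with 2, then C with 1. No intermediate files are created at any step.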
The tee command, when used with the | pipe, allows you to write STDOUT to the terminal and to a file simultaneously. This is useful if you want to see the output of a command on the screen but also capture it in a file in just one step.
Usage:
ls -l | tee file.txt
The echo command we mentioned briefly above does just what it sounds like: it "echoes" whatever text you give it to the terminal. This can be useful when you either need to quickly write text to a file using the > redirect without opening a text editor like nano (more on this below) or you want to send text to another command using the | pipe.
For example:
echo "The quick brown fox"
will print The quick brown fox to the terminal.
Be aware that if you want the text printed by echo to contain line breaks (\n) or tab (\t) characters, you will need to use the -e option:
Example:
echo -e "The quick\nbrown fox"
will print:
The quick
brown fox
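The echo -e and tee commands described above combine well for quickly creating small tab-separated files while checking the result on screen. This is a sketch only: the file name counts.tsv and its contents are hypothetical.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# \t becomes a tab and \n a line break; tee shows the text on screen
# and writes the same text to counts.tsv
echo -e "sample\treads\nA\t100\nB\t250" | tee counts.tsv

# The file now holds the same three lines that were printed
cat counts.tsv
```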
Gzip compression software is not necessarily a standard program, but is so widely distributed that it's more likely than not that it will be part of any distribution you might use. It comes standard on Mac and almost all flavors of Linux. Gzip is used to compress files to save space and decompress them when needed.
- Gzipped files usually end with the .gz suffix
- Sequence read files (ending with .fastq.gz) are almost always gzipped and it's best to leave them this way. Most software that uses sequencing reads can automatically decompress them as needed.
- To gzip a file, just use the command gzip <file to be compressed>. The output file will be named the same as the input file, but with a .gz suffix
- To take a quick peek at a gzipped file, you can string a few commands together. For a read file named "reads_1.fastq.gz" you use this command: gzip -cd reads_1.fastq.gz | less. The -cd setting is actually two settings together, -d to decompress (rather than the default compress) and -c tells the program to output the decompressed file to the Terminal. You can then pipe that output to less.
- An alternative to gzip -cd is the zcat command which works the same as cat, but for gzipped files. (zcat comes standard on Lubuntu, but is not found on all other systems)
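The gzip steps above can be tried on a tiny made-up read file. In this sketch the file reads_1.fastq and its single read are invented for illustration, and head is used in place of less so that the example runs without any keyboard interaction.

```shell
# Work in a throwaway directory
tmp=$(mktemp -d)
cd "$tmp"

# Create a tiny FASTQ-style file containing one made-up read
printf '@read1\nACGT\n+\nIIII\n' > reads_1.fastq

# Compress it: reads_1.fastq is replaced by reads_1.fastq.gz
gzip reads_1.fastq
ls

# Peek at the first two lines without decompressing the file on disk
gzip -cd reads_1.fastq.gz | head -n 2
```

To restore the uncompressed file, gzip -d reads_1.fastq.gz writes reads_1.fastq back and removes the .gz file.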
These commands provide information on processor and memory usage on your system. They can be useful to get a sense of how many resources are being used, especially if you would like to run parallel processes in separate tabs or in screen.
top: present on nearly all Linux or Mac systems. Very basic information
htop: More colorful and easier to read. Can also filter and sort the results. On some systems needs to be installed. More info here.
Most Linux distributions and Macs have one or more built-in file editors for the command line. One example is VIM. But we're going to talk about another one called nano. This is useful for quickly creating or editing small text-based files.
Usage:
nano <file>
If the file exists, nano will open it. If it doesn't exist, it will open a blank file that will save to the given file name when you close nano.
Use the arrow keys to move around.
Some useful commands in nano:
- Ctrl-k: Delete a line.
- Ctrl-x: Exit nano
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
