Basic introduction to linux - Pas-Kapli/bpp-tutorial GitHub Wiki
Tutorial
This tutorial covers the basic Unix commands needed to run species delimitation methods from the command line. A more detailed introduction written by Tim Massingham is available here.
To open a UNIX terminal window, click on the "Terminal" icon. A terminal window will appear with a "$" prompt, waiting for you to enter commands:
lkaplipa:~$
pwd
ls
cd
To find out where you are, execute the `pwd` command (stands for "print working directory"):
lkaplipa:~$ pwd
/home/lkaplipa
To find out what files and folders are in your working directory, execute the `ls` command (short for "list", as in "list directory contents"):
lkaplipa:~$ ls
Analyses/ Documents/ Music/ list.txt
To create a new directory, execute the `mkdir` command (stands for "make directory"):
lkaplipa:~$ mkdir linux_tutorial
lkaplipa:~$ ls
Analyses/ Documents/ linux_tutorial/ Music/ list.txt
To change directory, execute the `cd` command (stands for "change directory"):
lkaplipa:~$ pwd
/home/lkaplipa
lkaplipa:~$ cd linux_tutorial
lkaplipa:~/linux_tutorial$ pwd
/home/lkaplipa/linux_tutorial
lkaplipa:~/linux_tutorial$ ls
lkaplipa:~/linux_tutorial$ cd .. [`cd ..` takes you to the parent directory (i.e. one directory up)]
lkaplipa:~$ pwd
/home/lkaplipa
lkaplipa:~$ cd linux_tutorial/
lkaplipa:~/linux_tutorial$ cd [`cd` with no argument takes you to the home directory]
lkaplipa:~$ pwd
/home/lkaplipa
The `~` sign stands for your home directory, so "~/linux_tutorial" is equivalent to "/home/lkaplipa/linux_tutorial".
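The navigation commands above can be tried safely in a throwaway directory. The following is a minimal sketch, assuming `mktemp` is available on your system; the variable names `workdir`, `inside` and `parent` are only illustrative.

```shell
# Work in a temporary directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir linux_tutorial      # create a new directory
cd linux_tutorial         # enter it
inside=$(pwd)             # remember the full path of where we are
echo "Now in:  $inside"

cd ..                     # go back up to the parent directory
parent=$(pwd)
echo "Back in: $parent"

# The shell expands ~ to your home directory before running the command.
echo ~/linux_tutorial
```

Running it prints the same directory twice, once with `/linux_tutorial` appended, followed by the expanded form of `~/linux_tutorial`.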
cp
mv
rm
rmdir
Download the file BR_cob_57ind.fasta to your home directory, and then copy it with `cp` to your linux_tutorial directory:
lkaplipa:~/linux_tutorial$ cp ~/BR_cob_57ind.fasta ~/linux_tutorial
or
lkaplipa:~/linux_tutorial$ cp ../BR_cob_57ind.fasta .
lkaplipa:~/linux_tutorial$ ls
BR_cob_57ind.fasta
Rename the file to "Branchiomma.fasta" with `mv` (stands for "move"):
lkaplipa:~/linux_tutorial$ mv BR_cob_57ind.fasta Branchiomma.fasta
lkaplipa:~/linux_tutorial$ ls
Branchiomma.fasta
Create a copy of the "Branchiomma.fasta" file called "Branchiomma2.fasta":
lkaplipa:~/linux_tutorial$ cp Branchiomma.fasta Branchiomma2.fasta
lkaplipa:~/linux_tutorial$ ls
Branchiomma.fasta Branchiomma2.fasta
Delete one of the files with `rm` (stands for "remove"). Files deleted with `rm` do not go to a recycle bin:
lkaplipa:~/linux_tutorial$ rm Branchiomma2.fasta
lkaplipa:~/linux_tutorial$ ls
Branchiomma.fasta
Task1: Create a new directory called "test", enter the "test" directory, copy the "Branchiomma.fasta" file into "test", and return to the "linux_tutorial" directory. Try to delete the "test" directory with `rm` or `rmdir`.
What is the problem? Try the following command to find the solution:
lkaplipa:~/linux_tutorial$ man rm
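The copy, rename and delete steps above can be rehearsed on a stand-in file before touching real data. This is a small sketch, assuming `mktemp` is available; the file names and the two-line FASTA content are made up for illustration.

```shell
# Work in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny stand-in FASTA file (hypothetical content).
printf '>seq1\nACGT\n' > original.fasta

cp original.fasta copy.fasta       # two files with identical content
mv original.fasta renamed.fasta    # rename: original.fasta no longer exists
rm copy.fasta                      # delete the copy
ls                                 # only renamed.fasta is left
```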
cat
less
head
tail
sed
grep
Display a text file in the terminal with `cat`:
lkaplipa:~/linux_tutorial$ cat Branchiomma.fasta
>BR_001
-------------CTTGGGGTCAAATAAGATTTTGGGGTGCCACAGTAATTACTAACCTACTATCAGCTATTCCTTATATTGGCAATTCAATTGTAGCCTGACTATGAGGCGGTTTCGCAGTAGATAACGCCACTCTTAATCGATTTTTCGTGTTCCACTTTATTTTACCATTTATTATTCTTCTCTTTACCCTAATTCACCTAATATTTTTACATAAAACAGGATCAAGAAACCCCCTTGGCCTCTCCTCTTATAATGCAACTATCCCCTTCCATCCTTATTACACTATAAAAGATCTTACAGGTGCTCTCATTAGTATCACCTTACTCTTAGTTCTAACACTAAATATCCCTAATATATTCCTAGAGCCTGACAATTTCATTCAAGCTAACCCACTAAGAACTCCCGCCCACATCAAACCA------------
>BR_002
-------------CTTGGGGTCAAATAAGATTTTGGGGTGCCACAGTAATTACTAACCTACTATCAGCTATTCCTTATATTGGCAATTCAATTGTAGCCTGACTATGAGGCGGTTTCGCAGTAGATAACGCCACTCTTAATCGATTTTTCGTGTTCCACTTTATTTTACCATTTATTATTCTTCTCTTTACCCTAATTCACCTAATATTTTTACATAAAACAGGATCAAGAAACCCCCTTGGCCTCTCCTCTTATAATGCAACTATCCCCTTCCATCCTTATTACACTATAAAAGATCTTACAGGTGCTCTCATTAGTATCACCTTACTCTTAGTTCTAACACTAAATATCCCTAATATATTCCTAGAGCCTGACAATTTCATTCAAGCTAACCCACTAAGAACTCCCGCCCACATCAAACCA------------
..............................
lkaplipa:~/linux_tutorial$
Use `less` to read the file, moving back and forth with the up and down arrow keys. Press q to exit:
lkaplipa:~/linux_tutorial$ less Branchiomma.fasta
To see only specific parts of the file, use the `head`, `tail` and `sed` commands. For example, to see the first 2 lines, execute:
lkaplipa:~/linux_tutorial$ head -n 2 Branchiomma.fasta
>BR_001
-------------CTTGGGGTCAAATAAGATTTTGGGGTGCCACAGTAATTACTAACCTACTATCAGCTATTCCTTATATTGGCAATTCAATTGTAGCCTGACTATGAGGCGGTTTCGCAGTAGATAACGCCACTCTTAATCGATTTTTCGTGTTCCACTTTATTTTACCATTTATTATTCTTCTCTTTACCCTAATTCACCTAATATTTTTACATAAAACAGGATCAAGAAACCCCCTTGGCCTCTCCTCTTATAATGCAACTATCCCCTTCCATCCTTATTACACTATAAAAGATCTTACAGGTGCTCTCATTAGTATCACCTTACTCTTAGTTCTAACACTAAATATCCCTAATATATTCCTAGAGCCTGACAATTTCATTCAAGCTAACCCACTAAGAACTCCCGCCCACATCAAACCA------------
To see the last 2 lines, execute:
lkaplipa:~/linux_tutorial$ tail -n 2 Branchiomma.fasta
----------------GAGGTCARATAAGATTTTGAGGTGCAACTGTTATTACTAATCTCCTTTCTGCCATCCCTTATATCGGCCAATCAATCGTAACTTGATTATGGGGGGGATTCGCAGTAGACAACGCTACCCTAAACCGATTTTTTATATTTCACTTCCTTCTTCCATTTATCCTAGCCTTCATATCCGGCCTACATCTTCTATTTCTTCATCAAACAGGCTCCAACAACCCATTAGGATTAAAGTCTACCTCCCTTATAATTCCCTTCCACCCCTACTACACAACCAAAGACCTTGTGGGAGCCCTCTTATTGATTTTCCTCCTCCTATTCCTTGCGCTCGCCTCCCCCTCGCTATTTCTTGACCCGGAAAATTTTATCCAGGCTAACCCCCTAGCTACCCCCACCCACATCAAAC--------------
To see the third line, execute:
lkaplipa:~/linux_tutorial$ sed -n "3p" Branchiomma.fasta
>BR_002
To search a file for a specific word, phrase or symbol, use the `grep` command:
lkaplipa:~/linux_tutorial$ grep ">BR_102" Branchiomma.fasta
>BR_102
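Because the real alignment has very long lines, the effect of `head`, `tail`, `sed` and `grep` is easier to see on a tiny file. The sketch below uses a made-up three-sequence file called `toy.fasta` (an assumption for illustration, not part of the tutorial data).

```shell
# Work in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"

# Three made-up records, two lines each (header + sequence).
printf '>s1\nAAAA\n>s2\nCCCC\n>s3\nGGGG\n' > toy.fasta

first_two=$(head -n 2 toy.fasta)     # lines 1-2: >s1 and AAAA
last_two=$(tail -n 2 toy.fasta)      # lines 5-6: >s3 and GGGG
third=$(sed -n '3p' toy.fasta)       # line 3 only: >s2
middle=$(sed -n '3,4p' toy.fasta)    # lines 3 to 4: >s2 and CCCC
headers=$(grep '>' toy.fasta)        # every line containing ">"

echo "$third"
echo "$headers"
```

The `sed -n 'M,Np'` form prints a range of lines, which is handy for pulling a single record out of a file where every record spans the same number of lines.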
>
>>
|
To direct the output text of a program to a file, use the `>` symbol (creates or overwrites the file) or the `>>` symbol (appends to the end of the file).
lkaplipa:~/linux_tutorial$ echo "Hello World"
Hello World
lkaplipa:~/linux_tutorial$ echo "Hello World" > test.txt
lkaplipa:~/linux_tutorial$ ls
Branchiomma.fasta test.txt
lkaplipa:~/linux_tutorial$ cat test.txt
Hello World
lkaplipa:~/linux_tutorial$ echo "Goodbye World" >> test.txt
lkaplipa:~/linux_tutorial$ cat test.txt
Hello World
Goodbye World
lkaplipa:~/linux_tutorial$ echo "Hello New World" > test.txt
lkaplipa:~/linux_tutorial$ cat test.txt
Hello New World
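The difference between overwriting and appending is easy to get wrong, so here is a short sketch that records the line count after each step. It assumes `mktemp` is available; `notes.txt` is a hypothetical file name.

```shell
# Work in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"

echo "first line"  >  notes.txt          # > creates the file (or overwrites it)
echo "second line" >> notes.txt          # >> appends to the end
lines_after_append=$(wc -l < notes.txt)  # 2 lines at this point

echo "fresh start" >  notes.txt          # > again: the previous two lines are gone
lines_after_overwrite=$(wc -l < notes.txt)

cat notes.txt
```

A single `>` on an existing results file silently discards its old content, so use `>>` when you mean to keep what is already there.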
To direct the output of one command to another, use the pipe symbol `|`:
lkaplipa:~/linux_tutorial$ grep ">" Branchiomma.fasta | wc -l
57
`wc` is a command that returns the number of lines (with `-l`), the number of characters (with `-m`), the number of words (with `-w`), etc. What does the output of the above command represent?
sort
uniq
Download the file carabus_species.txt to your linux_tutorial directory.
Check the first 10 lines:
lkaplipa:~/linux_tutorial$ head -n 10 carabus_species.txt
Carabus jankowskii
Carabus jankowskii
Carabus smaragdinus
Carabus koreanus
Carabus seishinensis
Carabus semiopacus
Carabus arboreus
Carabus auronitens
Carabus taedatus
Carabus arboreus
Sort the names in the file alphabetically:
lkaplipa:~/linux_tutorial$ sort carabus_species.txt | head -n 10
Carabus abbreviatus
Carabus abbreviatus
Carabus albrechti
Carabus albrechti
Carabus albrechti
Carabus albrechti
Carabus albrechti
Carabus albrechti
Carabus albrechti
Carabus albrechti
Find the unique names:
lkaplipa:~/linux_tutorial$ uniq carabus_species.txt | head -n 10
Carabus jankowskii
Carabus smaragdinus
Carabus koreanus
Carabus seishinensis
Carabus semiopacus
Carabus arboreus
Carabus auronitens
Carabus taedatus
Carabus arboreus
Carabus kyushuensis
Is the output what we expect it to be?
Task2: Sort the species names in carabus_species.txt, find the unique ones and write them to a new file called carabus_species_sorted_uniq.txt
Hint: What is the difference between `sort -u`, `sort | uniq` and `uniq`?
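One way to explore the hint without touching the real data is to run the three variants on a toy list. The sketch below uses a made-up file `toy_names.txt` with five entries; the names are placeholders, not species from the tutorial file.

```shell
# Work in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'beta\nalpha\nbeta\nbeta\nalpha\n' > toy_names.txt

# uniq alone only collapses repeats that sit on adjacent lines.
adjacent_only=$(uniq toy_names.txt)
echo "$adjacent_only"        # beta, alpha, beta, alpha (4 lines)

# Sorting first puts identical names next to each other.
sorted_then_uniq=$(sort toy_names.txt | uniq)
echo "$sorted_then_uniq"     # alpha, beta (2 lines)

# sort -u does the sorting and the de-duplication in one step.
sort_u=$(sort -u toy_names.txt)
echo "$sort_u"               # alpha, beta (2 lines)
```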
Download and compile software
It is very common to need to download the source code of a program from a git repository and compile it on your computer. Most programs provide specific instructions for doing that. As an example, we will download and compile "Newick tools", a program written in C that performs a multitude of operations on newick files and visualizes phylogenetic trees. Many of the operations were inspired by small tasks that are essential in species delimitation analyses.
Installation:
lkaplipa:~/linux_tutorial$ pwd
/home/lkaplipa/linux_tutorial
lkaplipa:~/linux_tutorial$ git clone https://github.com/xflouris/newick-tools.git
lkaplipa:~/linux_tutorial$ cd newick-tools
lkaplipa:~/linux_tutorial/newick-tools$ cd src
lkaplipa:~/linux_tutorial/newick-tools/src$ make
Usage: Return to your "linux_tutorial" directory (`cd ../..`) and download this newick tree into it. Display information about the tree:
$ newick-tools/src/newick-tools --tree_file RAxML_bestTree.Branchiomma --info
Extract all the tip names of the phylogeny:
$ newick-tools/src/newick-tools --tree_file RAxML_bestTree.Branchiomma --extract_tips
Root the tree:
$ newick-tools/src/newick-tools --tree_file RAxML_bestTree.Branchiomma --output_file RAxML_bestTree.Branchiomma.rooted --root BR_076,BR_018
Make the tree binary (fully bifurcating):
$ newick-tools/src/newick-tools --tree_file RAxML_bestTree.Branchiomma --output_file RAxML_bestTree.Branchiomma.binary --make_binary