Unix I: Command line first steps - BDC-training/VT25 GitHub Wiki
Course: VT25 Unix applied to genomic data (SC00036)
Most of these exercises are based on: Unix Tutorial and Learn Unix with applications to NGS data
Connect to the server
Open a Terminal (Mac users) or MobaXterm (Windows users). Connect to the server using the account name and password handed out during the practical. Use the program ssh:
ssh -Y your_account@server_address
Now let's practice some commands.
Changing your password
Change your password using passwd. Follow the instructions on the screen.
Manual pages
Try the man command with the ls command as input:
man ls
Q1. Which flag would you use to colorize the output?
To exit, press q.
Graphical interfaces
It is possible to run programs with graphical interfaces on the server if you used the -Y (or -X) option while connecting. Try the following to get the graphical interface of a clock:
xclock
Close the clock. At the prompt type Ctrl + C to terminate the clock program.
There is a wide range of text editors, some with a graphical user interface and some with a command line interface. Here we will try nedit for its simplicity. Type:
nedit &
A window will open with the editor. Here you can type as in any text editor, such as Notepad. Save what you have typed (File -> Save or Ctrl+S) and close the window.
Q2. What does the & mean?
Listing files and directories
When you first log in, your current working directory is your home directory. Check which files you have; you should see the file you just saved. Type:
ls
Q3. What is the difference between "ls" and "ls -a"?
Making directories
Create a subdirectory in your home directory, type:
mkdir unix
ls
Q4. What are the permissions of the unix directory? Hint: use ls -l
Changing to a different directory
To change to the directory you have just made, type:
cd unix
Check the content of the directory; it should be empty. Hint: use ls
Now, create another directory within the unix directory called backups. Check the content of the directory with ls -a. You should see two special directories called the current directory (.) and the parent directory (..).
Typing:
cd .
means stay where you are, in the unix directory. This may not seem very useful now, but using (.) as the name of the current directory saves a lot of typing, as we will see later on.
If you now type:
cd ..
this will take you one directory up in the hierarchy, so you will be back in your home directory. Check with ls. NOTE: typing cd with no argument always returns you to your home directory. Useful when you are lost in the system!
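The behaviour of . and .. described above can be tried without touching your real directories. The following is only a sketch: it assumes mktemp is available on the server, and the scratch directory it creates is a throwaway stand-in for your home directory.

```shell
start=$PWD                          # remember where we began
scratch=$(mktemp -d)                # throwaway directory (assumption: mktemp exists)
mkdir -p "$scratch/unix/backups"

cd "$scratch/unix"
cd .                                # "." is the current directory: nothing changes
after_dot=$(pwd)

cd ..                               # ".." is the parent directory: one level up
after_dotdot=$(pwd)

echo "after 'cd .' : $after_dot"
echo "after 'cd ..': $after_dotdot"

cd "$start"                         # go back to where we began
rm -r "$scratch"                    # remove the throwaway directory
```

The first path printed should end in unix, and the second should be its parent.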
Pathnames
Pathnames enable you to work out where you are in relation to the whole file system. For example, to find out the absolute pathname of your home directory, type cd to get back to your home directory and then type:
pwd
Now type:
ls backups
You will get a message like this:
ls: cannot access backups: No such file or directory
Why? backups is not in your current working directory. To use a command on a file (or directory) that is not in the current working directory (the directory you are currently in), you must either cd to the correct directory or specify its pathname. To list the contents of your backups directory, you must type:
ls unix/backups
Home directories can also be referred to by the tilde ~ character. It can be used to specify paths starting at your home directory. So typing:
ls ~/unix
will list the contents of your unix directory, no matter where you currently are in the file system.
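To see the difference between a path that depends on where you are and one that does not, here is a small sketch. It builds its own tiny directory tree under a scratch directory (an assumption made for illustration, so that it runs anywhere); the file name inside is a placeholder.

```shell
start=$PWD
scratch=$(mktemp -d)                     # throwaway stand-in for a home directory
mkdir -p "$scratch/unix/backups"
touch "$scratch/unix/backups/science.bak"

cd "$scratch"
relative=$(ls unix/backups)              # relative: resolved from the current directory

cd /
# From here "ls unix/backups" would fail, but the absolute path still works.
absolute=$(ls "$scratch/unix/backups")

echo "relative path, from the scratch directory: $relative"
echo "absolute path, from / : $absolute"

cd "$start"
rm -r "$scratch"
```

Both lines should report the same file, because the two paths name the same directory.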
Q5. What do you think the following commands will list?
ls ~
ls ~/..
Copying files
cp file1 file2 is the command which makes a copy of file1 in the current working directory and calls it file2.
Let's copy a file to your unix directory. First, cd to your unix directory:
cd ~/unix
Then at the UNIX prompt, type:
cp /home/courses/Unix/intro/science.txt .
ls
Note: Don't forget the dot at the end. Remember, in UNIX, the dot means the current directory.
The above command means copy the file science.txt to the current directory, keeping the name the same.
Now create a backup of your science.txt file by copying it to a file called science.bak and check it is there by using ls.
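The two forms of cp used in this section (copying into the current directory with a dot, and copying to a new name) can be sketched on a placeholder file. The names notes.txt and notes.bak are hypothetical and chosen only for this illustration.

```shell
start=$PWD
scratch=$(mktemp -d)                       # throwaway directory
mkdir "$scratch/source" "$scratch/unix"
echo "some text" > "$scratch/source/notes.txt"

cd "$scratch/unix"
cp ../source/notes.txt .                   # "." = copy here, keep the name
cp notes.txt notes.bak                     # same directory, new name

listing=$(ls)
echo "$listing"

cd "$start"
rm -r "$scratch"
```

The listing should contain both notes.txt and notes.bak, and the original in source is left untouched.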
Moving files
mv file1 file2 moves (or renames) file1 to file2.
This can also be used to rename a file, by moving the file to the same directory, but giving it a different name.
We are now going to move the file science.bak to your backups directory.
First, change directories to your unix directory. Then, inside the unix directory, type:
mv science.bak backups/.
Type ls and ls backups to see if it has worked.
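Renaming and moving are the same operation with different targets. A minimal sketch, using hypothetical file names in a scratch directory:

```shell
start=$PWD
scratch=$(mktemp -d)          # throwaway directory
cd "$scratch"
mkdir backups
touch draft.txt

mv draft.txt final.txt        # target is a new name: the file is renamed
mv final.txt backups/         # target is an existing directory: the file is moved

here=$(ls)
inside=$(ls backups)
echo "current directory: $here"
echo "backups          : $inside"

cd "$start"
rm -r "$scratch"
```

After both commands the current directory should contain only backups, and backups should contain final.txt.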
Removing files and directories
rm (remove), rmdir (remove directory)
Inside your unix directory, type:
cp science.txt tempfile.txt
ls
rm tempfile.txt
ls
Try to remove the backups directory with rmdir. You will not be able to, since rmdir will not let you remove a non-empty directory.
Create a directory called temp using mkdir, then remove it using the rmdir command.
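The difference between an empty and a non-empty directory can be checked in a scratch directory first. This is only a sketch; the directory and file names mirror the ones used above but live in a throwaway location.

```shell
start=$PWD
scratch=$(mktemp -d)                  # throwaway directory
cd "$scratch"
mkdir backups temp
touch backups/science.bak

rmdir temp                            # empty directory: removed
if rmdir backups 2>/dev/null; then    # non-empty directory: rmdir refuses
    outcome="removed"
else
    outcome="refused"
fi
echo "rmdir backups was $outcome"

rm backups/science.bak                # empty the directory first...
rmdir backups                         # ...and now rmdir succeeds
remaining=$(ls)

cd "$start"
rm -r "$scratch"
```

The error message that rmdir prints is hidden here with 2>/dev/null; run rmdir backups by hand to read it.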
Displaying the contents of a file on the screen
clear (clear screen)
This command is quite handy when you have a busy screen. At the prompt, type:
clear
This will clear all text and leave you with the % prompt at the top of the window.
cat (concatenate)
Let's display our science.txt file:
cat science.txt
As you can see, the file is longer than the size of the window, so it scrolls past, making it unreadable.
Try out the less command:
less science.txt
Press the [space-bar] if you want to see another page, and type [q] if you want to quit reading. As you can see, less is used in preference to cat for long files.
It is common that we just want to see the first lines of a file. The head command helps us with this:
head science.txt
Q6. How would you display the first 20 lines? Remember to use the man command to learn more about head
To have a look at the last lines, you can use the tail command. Have a try:
tail science.txt
Searching the contents of a file
grep is one of many standard UNIX utilities. It searches files for specified words or patterns. First clear the screen, then type:
grep science science.txt
As you can see, grep has printed out each line containing the word science, in theory. Try typing:
grep Science science.txt
Remember that the grep command is case sensitive; it distinguishes between Science and science.
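Case sensitivity is easy to see on a three-line file. The file sample.txt and its contents are made up for this sketch and are not part of the course material.

```shell
start=$PWD
scratch=$(mktemp -d)          # throwaway directory
cd "$scratch"
printf '%s\n' \
    'Science is a way of thinking.' \
    'Good science needs good data.' \
    'Nothing relevant here.' > sample.txt

lower=$(grep science sample.txt)    # matches only the lowercase spelling
upper=$(grep Science sample.txt)    # matches only the capitalised spelling

echo "grep science -> $lower"
echo "grep Science -> $upper"

cd "$start"
rm -r "$scratch"
```

Each search returns a different line, and neither returns the third one.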
Q7. How many instances of science and Science do you find in the file? Remember to use the man command if you don't remember the flags that can be used with grep
Run the following command:
grep -ivc science science.txt
Q8. What is this command doing?
File permissions
Let's check how the permissions are set in /etc/passwd:
ls -l /etc/passwd
Q9. Who has access to read the file? Who can modify the file?
Check the permissions of /usr/bin/python3
Q10. Is everyone allowed to run the program?
Using the command touch, create an empty file called test.sh. Then type the following:
./test.sh
This command is calling our file to be executed. Note that using ./ lets us point to the file in this directory. You should get a Permission denied error. If you check the permissions of test.sh you'll see that nobody can execute the file. Let's make it executable and run it:
chmod +x test.sh
./test.sh
Now you should not get any error! However, nothing happens, since our file is empty.
Open test.sh with a text editor. Remember to add & at the end of the command line, so we can use the server in parallel. Write the following text:
echo "Hi there! testing to run a program"
Save the changes and, at the command line, run the program again:
./test.sh
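The whole sequence (create the script, fail to run it, make it executable, run it) can be replayed without an editor. This sketch writes the script with echo and a redirect instead of nedit, and works in a scratch directory; it assumes the temporary location allows programs to be executed.

```shell
start=$PWD
scratch=$(mktemp -d)          # throwaway directory
cd "$scratch"

# Write the one-line script without opening an editor.
echo 'echo "Hi there! testing to run a program"' > test.sh

if ./test.sh 2>/dev/null; then        # not executable yet, so this fails
    first_try="ran"
else
    first_try="permission denied"
fi

chmod +x test.sh                      # add execute permission
second_try=$(./test.sh)               # now the script runs

echo "before chmod: $first_try"
echo "after chmod : $second_try"

cd "$start"
rm -r "$scratch"
```

Comparing ls -l test.sh before and after the chmod shows which permission bits changed.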
Now apply what you have learned in the following exercises
Selecting top genes from a TSV file
- Create a directory called Some_exercises
- Create the following subdirectories within Some_exercises: Raw and Results
- Copy the VL_vs_Ctrl_DE.txt file from /home/courses/Unix/intro in the newly created directory
- Make a copy of the file under the Raw directory. Add original in the name of the file
- Inspect the file and describe the format of the file: number of columns and rows, type of data
- Extract the following genes: ENSG00000133742, ENSG00000196565, ENSG00000237568, ENSG00000119630, ENSG00000233705, ENSG00000206178, ENSG00000166450, ENSG00000124749, ENSG00000251095, ENSG00000084734, ENSG00000268460, ENSG00000178752, ENSG00000204644, ENSG00000174358
- Save them in a file called highly_expressed.tsv. Hint: this can be done one by one or all at the same time, which is faster. Don't forget to check the man pages of grep and give it a try
- Extract the following genes: ENSG00000142408, ENSG00000105366, ENSG00000234449, ENSG00000237647, ENSG00000138395, ENSG00000170558, ENSG00000114744
- Save them in a file called lowly_expressed.tsv.
- Merge both files into a file called top_list_genes.tsv.
- Move this file under Results
Q11. How many genes do you have in this file?
Selecting sequences from a FASTA file
- Copy the rt.fa file from /home/courses/Unix/intro
- Inspect the file and describe the format
- How many sequences are there in the file? Hint: you can use grep, just find a suitable pattern
- Retrieve the sequence description and the nucleotide sequence of the gene annotated as LA17.RT. Hint: check the man pages of grep, there is a flag that can display X amount of lines after the matching lines
- Save it in a file called LA17.RT.fasta under the Results directory.
- Save the fasta sequences from all V clones to a file called V_clones.fasta, also under Results.
- Copy split_fasta.awk from /home/courses/Unix/intro. This is a tiny script that will save each sequence of a fasta-formatted file into its own file.
- Open the script in a text editor and replace * with rt.fa, so we process only this file and not other .fa files we might have in this location.
- At the end of the script, add echo done!, or some similar message so we know the program finished.
- Save the file and run it. Note: Remember to check the permissions! It has to be executable
- How many files were created? Hint: you can use wildcard patterns with ls, for instance ls *.txt will list all the files that end in .txt, like VL_vs_Ctrl_DE.txt
- Extract this motif TTCTGGGAAGT from all the newly created files.
Q12. Is the motif found in all sequences (in this case files)?
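The pattern hint for ls in the list above can be tried on a handful of empty files first. The file names below are hypothetical and exist only in a scratch directory; * matches any run of characters and ? matches exactly one.

```shell
start=$PWD
scratch=$(mktemp -d)                          # throwaway directory
cd "$scratch"
touch a.txt b.txt seq1.fa seq2.fa notes.md    # empty placeholder files

txt_files=$(ls *.txt)        # every name ending in .txt
fa_files=$(ls seq?.fa)       # seq, then exactly one character, then .fa

echo "$txt_files"
echo "$fa_files"

cd "$start"
rm -r "$scratch"
```

The shell expands the pattern before ls runs, so the same patterns work with other commands that take file names.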
A little more info
- Make a list of all your files and save them in a file called Files.txt
  Q13. How many files are there in your directory?
- Display all the commands you have been using and save them in a file called [Todays_date].history. Hint: use the history command. If you want to check the date, just type date. You can also format it in different ways, try date +%m%d%Y
  Q14. How many times did you use the ls command?
- Check who is logged in to the server. Hint: use the who command
  Q15. List up to 5 users
- What are they running? Hint: use the top command
  Q16. List up to 5 commands
Modified by Marcela Dávila, 2018, 2021