Lab 04: Genome browsers and General Feature Format - jfgout/AppliedGenomics GitHub Wiki
The goal of this lab is to familiarize yourself with Genome Browsers and learn how genome annotation information is encoded into General Feature Format (GFF) files.
As usual, answer all the questions asked during this lab and submit your answers to Canvas as a PDF file.
Genome Browsers.
There are many different flavors of Genome Browser. They all aim to provide a user-friendly, interactive representation of a genome. For this lab, we will be using the Genome Browser available at http://www.ensembl.org
Navigating the genome browser.
- Use your favorite Internet browser to go to http://www.ensembl.org and find the page for the nematode Caenorhabditis elegans.
- From the main C. elegans page, click on the "Example region" link to display the genome browser's example region (Chromosome X:937766-957832).
- Explore the "Example region". Start by locating the 3 panels showing the region at 3 different zoom levels. Explore the bottom panel (the one with the maximum zoom level) in greater detail. Click on the different features and familiarize yourself with how genes are represented.
- Navigate to a new region of the genome: Chromosome V, between positions 156,000 and 186,200.
Questions:
Q1. Is this region toward the beginning, middle or end of chromosome V?
Q2. How many protein-coding genes are present in this region?
Q3. How many transcripts (from protein-coding genes only) can you identify in this region?
Q4. The gene WBGene00006648 (Transcription initiation factor IIB) (left-most in this region) has 3 annotated transcripts: W03F9.5.1, W03F9.5.2 and W03F9.5.3. What are the differences between these 3 transcripts?
Q5. Identify and describe a case of alternative splicing in this region.
General Feature Format
All the information displayed in the genome browser can be stored in a tab-delimited text file that follows a specific syntax: the General Feature Format.
Read the general description of the GFF format here: http://ensembl.org/info/website/upload/gff3.html
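To make the nine-column layout concrete, here is a minimal Python sketch that splits one GFF3 line into named fields and turns the last column into a dictionary. The function name parse_gff3_line and the column labels are illustrative choices, not part of any library. The sketch assumes the columns are separated by tab characters, as the format requires, and that attributes are simple key=value pairs separated by semicolons.

```python
# Column names follow the order described in the GFF3 documentation.
GFF3_COLUMNS = ["seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes"]


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, found {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive on both ends.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # The ninth column holds key=value pairs separated by semicolons.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:  # tolerate a trailing semicolon
            key, _, value = pair.partition("=")
            attributes[key] = value
    record["attributes"] = attributes
    return record


# A gene line rebuilt with explicit tabs for the demonstration.
example = "\t".join([
    "V", "WormBase", "gene", "156825", "166866", ".", "+", ".",
    "ID=gene:WBGene00006648;Name=ttb-1;biotype=protein_coding",
])
gene = parse_gff3_line(example)
print(gene["type"], gene["start"], gene["end"], gene["attributes"]["Name"])
```

Running the sketch prints the feature type, its coordinates and the gene name taken from the attributes column.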
Here is the part of the GFF file describing the first transcript W03F9.5.1 of gene WBGene00006648 (the gene mentioned in Q4).
V WormBase gene 156825 166866 . + . ID=gene:WBGene00006648;Name=ttb-1;biotype=protein_coding;description=Transcription initiation factor IIB;gene_id=WBGene00006648
V WormBase mRNA 156825 166866 . + . ID=transcript:W03F9.5.1;Parent=gene:WBGene00006648;Name=W03F9.5.1;biotype=protein_coding;transcript_id=W03F9.5.1
V WormBase five_prime_UTR 156825 156830 . + . Parent=transcript:W03F9.5.1
V WormBase exon 156825 156937 . + . Parent=transcript:W03F9.5.1;Name=W03F9.5.1.e1;
V WormBase CDS 156831 156937 . + 0 ID=CDS:W03F9.5.1;Parent=transcript:W03F9.5.1;protein_id=W03F9.5.1
V WormBase exon 157019 157286 . + . Parent=transcript:W03F9.5.1;Name=W03F9.5.1.e2
V WormBase CDS 157019 157286 . + 1 ID=CDS:W03F9.5.1;Parent=transcript:W03F9.5.1;protein_id=W03F9.5.1
V WormBase exon 160308 160642 . + . Parent=transcript:W03F9.5.1;Name=W03F9.5.1.e3
V WormBase CDS 160308 160642 . + 0 ID=CDS:W03F9.5.1;Parent=transcript:W03F9.5.1;protein_id=W03F9.5.1
V WormBase CDS 163140 163350 . + 1 ID=CDS:W03F9.5.1;Parent=transcript:W03F9.5.1;protein_id=W03F9.5.1
V WormBase exon 163140 163364 . + . Parent=transcript:W03F9.5.1;Name=W03F9.5.1.e4
V WormBase three_prime_UTR 163351 163364 . + . Parent=transcript:W03F9.5.1
V WormBase exon 166568 166866 . + . Parent=transcript:W03F9.5.1;Name=W03F9.5.1.e5
V WormBase three_prime_UTR 166568 166866 . + . Parent=transcript:W03F9.5.1
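The exon and CDS lines above implicitly define the introns and the reading frame. As a rough illustration, the Python sketch below copies the exon and CDS coordinates of W03F9.5.1 from the listing, derives the intron positions as the gaps between consecutive exons, and recomputes the phase column of each CDS segment. The function names are hypothetical, and the phase calculation as written only applies to a transcript on the + strand.

```python
# Coordinates copied from the W03F9.5.1 listing (1-based, inclusive).
exons = [(156825, 156937), (157019, 157286), (160308, 160642),
         (163140, 163364), (166568, 166866)]
cds_segments = [(156831, 156937), (157019, 157286),
                (160308, 160642), (163140, 163350)]
listed_phases = [0, 1, 0, 1]  # phase column of the four CDS lines


def introns_between(exon_list):
    """Return the gaps between consecutive exons as (start, end) pairs."""
    ordered = sorted(exon_list)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def expected_phases(segments):
    """Recompute the phase of each CDS segment (+ strand only).

    The phase is the number of bases to skip at the start of a segment
    to reach the first base of the next complete codon.
    """
    phases = []
    phase = 0  # the first segment starts on a codon boundary
    for start, end in segments:
        phases.append(phase)
        length = end - start + 1
        phase = (3 - (length - phase) % 3) % 3
    return phases


introns = introns_between(exons)
phases = expected_phases(cds_segments)
coding_length = sum(end - start + 1 for start, end in cds_segments)

print("introns:", introns)
print("phases match listing:", phases == listed_phases)
print("coding length:", coding_length, "nt")
```

Note that the last intron lies entirely within the 3' UTR, since the coding sequence ends in the fourth exon.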
Exercise: Create a GFF file (make sure to use a text editor such as TextEdit on Mac or Notepad on Windows, and save the document as plain text!) that contains the following annotation:
A protein-coding gene (pick a name for your gene, your choice) on chromosome V, strand +, positions: 92,450 to 95,200.
This gene is made of 2 exons separated by one intron (intron location: 93,232-93,799).
The first 150 nt of the gene are the 5' UTR.
Pro Tip: use a ".gff3" extension for your file name.
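A common source of trouble with hand-written GFF files is a text editor that inserts spaces instead of tabs. The Python sketch below is one possible sanity check to run before uploading: it reports lines that do not have nine tab-separated columns, coordinates that are not positive integers in increasing order, and unexpected strand symbols. The function name check_gff3_lines and the demo lines (a made-up gene on chromosome I) are purely illustrative, and the checks are far less strict than a real validator.

```python
def check_gff3_lines(lines):
    """Return a list of human-readable problems found in GFF3 lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) != 9:
            problems.append(
                f"line {number}: expected 9 tab-separated columns, "
                f"found {len(fields)}")
            continue
        start, end, strand = fields[3], fields[4], fields[6]
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {number}: start and end must be integers")
        elif int(start) > int(end):
            problems.append(f"line {number}: start is greater than end")
        if strand not in {"+", "-", ".", "?"}:
            problems.append(f"line {number}: unexpected strand '{strand}'")
    return problems


# Made-up demo lines: one valid, one using spaces, one with swapped coordinates.
demo = [
    "##gff-version 3\n",
    "I\texample\tgene\t100\t500\t.\t+\t.\tID=gene:demo1\n",
    "I example gene 100 500 . + . ID=gene:demo2\n",
    "I\texample\texon\t500\t100\t.\t+\t.\tParent=gene:demo1\n",
]
problems = check_gff3_lines(demo)
for problem in problems:
    print(problem)
```

To check your own file, you could pass an open file object instead of the demo list, for example check_gff3_lines(open("my_gene.gff3")), where my_gene.gff3 stands for whatever file name you chose.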
Use Ensembl's Custom tracks functionality to upload your GFF file and display its content. Include in your lab report a screenshot of the genome browser displaying the information from your GFF file.
Include the content of your GFF file in your lab report.
