From ontology annotation results to enrichment computation - wdlingit/cop GitHub Wiki
Based on a whole genome functional annotation table, it would be great to extract ontological information into files in GMT for GSEA. This would be helpful for doing downstream analysis like GSEA or ontology enrichment.
Suppose that we have a table made by using Trinotate that is with many columns of different kinds of annotation.
ubuntu@ubuntu20:~$ head -1 myTrinotate.tsv | perl -ne 'chomp; @t=split(/\t/); for($i=0;$i<@t;$i++){ print "$i\t$t[$i]\n" }'
0 #gene_id
1 transcript_id
2 sprot_Top_BLASTX_hit
3 infernal
4 prot_id
5 prot_coords
6 sprot_Top_BLASTP_hit
7 Pfam
8 SignalP
9 TmHMM
10 eggnog
11 Kegg
12 gene_ontology_BLASTX
13 gene_ontology_BLASTP
14 gene_ontology_Pfam
15 EggNM.seed_ortholog
16 EggNM.evalue
17 EggNM.score
18 EggNM.eggNOG_OGs
19 EggNM.max_annot_lvl
20 EggNM.COG_category
21 EggNM.Description
22 EggNM.Preferred_name
23 EggNM.GOs
24 EggNM.EC
25 EggNM.KEGG_ko
26 EggNM.KEGG_Pathway
27 EggNM.KEGG_Module
28 EggNM.KEGG_Reaction
29 EggNM.KEGG_rclass
30 EggNM.BRITE
31 EggNM.KEGG_TC
32 EggNM.CAZy
33 EggNM.BiGG_Reaction
34 EggNM.PFAMs
35 transcript
36 peptide
Suppose that we would like to extract GO annotations (columns 12,13,14 and 23) and KEGG Pathway/Module annotations (columns 26 & 27) for further use. The first thing to do is to extract them into a list of gene-ontologyID entries. For the GO part, we did the following.
ubuntu@ubuntu20:~$ cat myTrinotate.tsv | perl -ne 'chomp; @t=split(/\t/); print "$t[0]\t".join("\t",@t[12..14,23])."\n"' | perl -ne 'chomp; @t=split(/\t/); $id=shift @t; for $x (@t){ while($x=~/(GO:\d{7})/g){ print "$id\t$1\n" } }' > myGO.txt
ubuntu@ubuntu20:~$ head myGO.txt
Species_18805 GO:0016779
Species_18770 GO:0003677
Species_18770 GO:0006310
Species_18770 GO:0015074
Species_18750 GO:0005524
Species_18750 GO:0003677
Species_18750 GO:0009035
Species_18750 GO:0009307
Species_18750 GO:0005524
Species_18750 GO:0003677
ubuntu@ubuntu20:~$ cat myGO.txt | perl -ne '@t=split; $hash{$t[0]}=1; if(eof){ $cnt=keys %hash; print "$cnt\n" }'
2277
The first command is composed of two piped perl one-liners. The first one-liner perl -ne 'chomp; @t=split(/\t/); print "$t[0]\t".join("\t",@t[12..14,23])."\n"'
is to extract gene IDs and corresponding GO informations into lines. The second one-liner is to take the first token from every line as the gene ID and extract every GOIDs from rest tokens and print gene ID-GOID entries. The last perl one-liner showed that 2277 genes were associated with at least one GO terms.
For the KEGG part, we did the following.
ubuntu@ubuntu20:~$ cat myTrinotate.tsv | perl -ne 'next if $.==1; chomp; @t=split(/\t/); print "$t[0]\t$t[26]\n" if $t[26] ne "."' | perl -ne 'chomp; @t=split(/\t/); @s=split(/,/,$t[1]); for $x (@s){ print "$t[0]\t$x\n" if $x=~/^map/ }' > myKO.txt
ubuntu@ubuntu20:~$ cat myTrinotate.tsv | perl -ne 'next if $.==1; chomp; @t=split(/\t/); print "$t[0]\t$t[27]\n" if $t[27] ne "."' | perl -ne 'chomp; @t=split(/\t/); @s=split(/,/,$t[1]); for $x (@s){ print "$t[0]\t$x\n" }' >> myKO.txt
ubuntu@ubuntu20:~$ head myKO.txt
Species_18635 map00230
Species_18635 map00240
Species_18635 map01100
Species_18635 map03030
Species_18635 map03410
Species_18635 map03420
Species_18635 map03440
Species_18705 map00230
Species_18705 map00240
Species_18705 map01100
ubuntu@ubuntu20:~$ cat myKO.txt | perl -ne '@t=split; $hash{$t[0]}=1; if(eof){ $cnt=keys %hash; print "$cnt\n" }'
1106
The first command is to skip the first line and extract column 26 as avoiding no data (.
). KEGG IDs in this column that started with k
were found to be duplicated with those started with map
so only those entries with map12345 were outputted (if $x=~/^map/
). The second command is to skip the first line and extract column 27 as avoiding no data (.
). Note that the output was appended (>>
) to the previous output file.
(You may consider this part as a self-advertisement)
As we mentioned to transfer information into GMT format, we are not going to do this for the GO part because I would like to honor the true path rule (geneX is associated with GO_A AND GO_A is_a
child of GO_B => geneX is associated with GO_B). To do that, we build a GOBU tree and do corresponding GO computation using GOBU.
Download the software and extract it. Make sure that java 8 or above is available.
ubuntu@ubuntu20:~$ wget https://downloads.sourceforge.net/project/gobu/20230301/gobu20230301.zip
ubuntu@ubuntu20:~$ unzip gobu20230301.zip
The first step to build the GOBU tree is to build a backbone (myTree) and prepare leaves (myGO) for attaching them to the backbone.
ubuntu@ubuntu20:~$ cat myTrinotate.tsv | perl -ne 'next if $.==1; @t=split; print "reference\tRP:$t[0]\n"' > myTree.0.m
ubuntu@ubuntu20:~$ cat myGO.txt | perl -ne 'chomp; @t=split(/\t/); print "RP:$t[0]\t$t[1]\n"' > myGO.m
The GOBU java commands need to be run in the install(extracted) directory. The first command is to transfer the tab-delimited file (myTree.0.m
) into a tree file.
ubuntu@ubuntu20:~$ cd gobu/
ubuntu@ubuntu20:~/gobu$ java -classpath gobu.jar bio301.goutil.gobu.data.TreeFileMaker -I ../myTree.0.m -O ../myTree.0.tree -OBO data/gene_ontology.1_2.obo
program: TreeFileMaker
input filename (-I): ../myTree.0.m
output filename (-O): ../myTree.0.tree
OBO filename (-OBO): data/gene_ontology.1_2.obo
The second command is to attach the GO annotation leaves to the tree.
ubuntu@ubuntu20:~/gobu$ java -classpath gobu.jar bio301.goutil.gobu.data.AddAnnotation -I ../myTree.0.tree -S ../myGO.m -O ../myTree.1.tree -OBO data/gene_ontology.1_2.obo
program: AddAnnotation
input filename (-I): ../myTree.0.tree
subtree filename (-S): ../myGO.m
output filename (-O): ../myTree.1.tree
OBO filename (-OBO): data/gene_ontology.1_2.obo
coverage check (-C): false
This last GOBU command is optional. This is to sort the GO annotation items in user data to follow their actual order in the GO hierarchy. Sorting them would make it rather comfortable when browsing the GOBU tree in the GOBU program. The program would report GOIDs (in numbers) that were not found in the GO OBO file. This would be due to the inconsistency between the annotation database and the GO vocabulary database. As we have 36198 GO associations in this case, the number of missing GOIDs is relatively small and acceptable.
ubuntu@ubuntu20:~/gobu$ java -classpath gobu.jar bio301.goutil.gobu.data.TreeSorter -I ../myTree.1.tree -O ../myTree.sorted.tree -OBO data/gene_ontology.1_2.obo
program: TreeSorter
input filename (-I): ../myTree.1.tree
output filename (-O): ../myTree.sorted.tree
OBO filename (-OBO): data/gene_ontology.1_2.obo
55114 not in GO tree, its parent: Species_00040
5623 not in GO tree, its parent: Species_00310
18065 not in GO tree, its parent: Species_01190
6732 not in GO tree, its parent: Species_01385
5451 not in GO tree, its parent: Species_01470
8144 not in GO tree, its parent: Species_01820
48037 not in GO tree, its parent: Species_02080
51186 not in GO tree, its parent: Species_04510
44421 not in GO tree, its parent: Species_06450
44425 not in GO tree, its parent: Species_07000
70035 not in GO tree, its parent: Species_07020
51181 not in GO tree, its parent: Species_07045
51704 not in GO tree, its parent: Species_09560
453 not in GO tree, its parent: Species_14320
Recall that there is a description field in the GMT format and we got only KEGG IDs in file myKO.txt
. We applied the KEGG REST API for getting names of KEGG IDs.
ubuntu@ubuntu20:~$ cat myKO.txt | perl -ne 'chomp; @t=split; $hash{$t[1]}=1; if(eof){ for $x (sort keys %hash){ print "$x\n"} }' | perl -ne 'chomp; $restURL="https://rest.kegg.jp/get/$_"; $cmd="wget -q -O /dev/stdout $restURL"; $msg=`$cmd`; $name=""; if($msg=~/^NAME\s+(.+)$/m){ $name="$1" } print "$_\t$name\n"; sleep 1' > KO.names
The first piped perl one-liner is to output a non-redundant list of KEGG IDs for query because every calling of the KEGG REST API takes a few seconds. The second piped perl one-liner is to use wget
to access the API and extract NAME
from the output. In this practice, a number of KEGG IDs were found without API output and they were found not in the database. This might be due to the inconsistency between the annotation database and the KEGG database. Note that the KEGG server has some kind of DOS avoiding mechanism so we use sleep 1
to make 1 second intervals between queries.
The next command collects gene IDs with the same KEGG IDs into lists, and then output lines of KEGG ID, KEGG NAME, and gene IDs (the GMT format).
ubuntu@ubuntu20:~$ cat myKO.txt | perl -ne 'if($.==1){ open(FILE,"<KO.names"); while($line=<FILE>){ chomp $line; @s=split(/\t/,$line); $defHash{$s[0]}=$s[1]; } close FILE } chomp; @t=split(/\t/); $hash{$t[1]}{$t[0]}=1; if(eof){ for $set (sort keys %hash){ @arr=keys %{$hash{$set}}; print "$set\t$defHash{$set}\t".join("\t",@arr)."\n" if length($defHash{$set})>0 } }' > KO.gmt
This file can be used as a database file for GSEA. Note that this described procedure can be similarly applied to other annotation categories.
Assume that we have a file DEG.tsv
for differential comparison results and we are not sure which p-value threshold would be appropriate. In this case, generate a few DEG lists based on different thresholds and investigate their enrichment results would be helpful.
ubuntu@ubuntu20:~$ head -1 DEG.tsv | perl -ne 'chomp; @t=split(/\t/); for($i=0;$i<@t;$i++){ print "$i\t$t[$i]\n" }'
0 #geneID
1 Length
2 ctrl_rep1
3 ctrl_rep2
4 trt_rep1
5 trt_rep2
6 weightedMultiRatio
7 T-TEST
8 adjP
9 regulation
10 logFC
ubuntu@ubuntu20:~$ mkdir Lists
ubuntu@ubuntu20:~$ ls DEG.tsv | perl -ne 'chomp; /(.+)\./; print "!!!FILE: $1\n"; system "cat $_"; print "!!!END\n"' | perl -ne 'chomp; if(/^!!!FILE: (\S+)/){ $f=$1; %pHash=(); }elsif(/^#/){}elsif(/^!!!END/){ for $pTh (0.05,0.01,0.005,0.001){ open(FILE,">Lists/$f.$pTh.list"); for $g (sort keys %pHash){ print FILE "$g\n" if $pHash{$g}[0]<=$pTh } close FILE; } }else{ @t=split(/\t/); push @{$pHash{$t[0]}},$t[7] if (length($t[0])>0) && (abs($t[10])>=1) }'
ubuntu@ubuntu20:~$ wc -l Lists/*.list
2 Lists/DEG.0.001.list
11 Lists/DEG.0.005.list
22 Lists/DEG.0.01.list
128 Lists/DEG.0.05.list
163 total
The first piped perl one-liner is to send contents of DEG table(s) together with table file information. This is for processing multiple DEG tables at one time. Given that the coding is correct, it would be better to avoid inputting commands several times to prevent manual mistakes. The second piped perl one-liner is to recognize source filenames of the tables (if(/^!!!FILE: (\S+)/){ $f=$1; %pHash=(); }
), store p-value information into a hash table (push @{$pHash{$t[0]}},$t[7]
) if abs(log2-fold-change)>=1, and save different lists in Lists directory for different p-value threshold.
To compute GO enrichment using GOBU in command line (instead of using GUI), we will have to further append list information in front of gene IDs.
ubuntu@ubuntu20:~$ ls Lists/*.list | perl -ne 'chomp; /.+\/(.+)\.list/; $name=$1; print "!!!FILE: $name\n"; system "cat $_"' | perl -ne 'chomp; if(/^!!!FILE:\s(\S+)/){ $tag=$1; open(FILE,">Lists/$1.grouplist"); }else{ chomp; print FILE "$tag\t$_\n" }'
ubuntu@ubuntu20:~$ wc -l Lists/*
2 Lists/DEG.0.001.grouplist
2 Lists/DEG.0.001.list
11 Lists/DEG.0.005.grouplist
11 Lists/DEG.0.005.list
22 Lists/DEG.0.01.grouplist
22 Lists/DEG.0.01.list
128 Lists/DEG.0.05.grouplist
128 Lists/DEG.0.05.list
326 total
ubuntu@ubuntu20:~$ diff Lists/DEG.0.001.grouplist Lists/DEG.0.001.list
1,2c1,2
< DEG.0.001 Species_16535
< DEG.0.001 Species_16895
---
> Species_16535
> Species_16895
The .grouplist
files are for saving different selections into one single file. For this walkthrough, we save them separately for other purpose. The next command is for doing GO enrichment on the .grouplist
files.
ubuntu@ubuntu20:~$ mkdir Counts
ubuntu@ubuntu20:~$ cd gobu/
ubuntu@ubuntu20:~/gobu$ ls ../Lists/*.grouplist | perl -ne 'chomp; /.+\/(.+)\.grouplist/; $cmd="java -classpath gobu.jar:plugins/multiview.jar bio301.goutil.gobu.data.GOEnrichment -R ../myTree.sorted.tree -M $_ -OBO data/gene_ontology.1_2.obo -O ../Counts/$1.go.out"; print "\nCMD: $cmd\n"; #system $cmd'
CMD: java -classpath gobu.jar:plugins/multiview.jar bio301.goutil.gobu.data.GOEnrichment -R ../myTree.sorted.tree -M ../Lists/DEG.0.001.grouplist -OBO data/gene_ontology.1_2.obo -O ../Counts/DEG.0.001.go.out
CMD: java -classpath gobu.jar:plugins/multiview.jar bio301.goutil.gobu.data.GOEnrichment -R ../myTree.sorted.tree -M ../Lists/DEG.0.005.grouplist -OBO data/gene_ontology.1_2.obo -O ../Counts/DEG.0.005.go.out
CMD: java -classpath gobu.jar:plugins/multiview.jar bio301.goutil.gobu.data.GOEnrichment -R ../myTree.sorted.tree -M ../Lists/DEG.0.01.grouplist -OBO data/gene_ontology.1_2.obo -O ../Counts/DEG.0.01.go.out
CMD: java -classpath gobu.jar:plugins/multiview.jar bio301.goutil.gobu.data.GOEnrichment -R ../myTree.sorted.tree -M ../Lists/DEG.0.05.grouplist -OBO data/gene_ontology.1_2.obo -O ../Counts/DEG.0.05.go.out
(surely the sharp mark was removed for executing the commands)
So we have a number of GO enrichment output tables by inputting different DEG lists. They are all in tab-delimited format and columns mean:
- group: source list name, would be useful when the computed
.grouplist
file contains multiple lists - Term Name for GO term name
- GOID
- Term Type: in (C)omponent, (F)unction, or (P)rocess
- Depth of the GO term in the GO hierarchy
- cnt (the first one): number of whole genome genes associated with this GO term
- cnt (the second one): number of input genes associated with this GO term
- bin: classical enrichment p-values made by applying the binomial model
- fsh: classical enrichment p-values made by applying fisher exact test (special thanks to Dr. Oyvind Langsrud for his fast code for fisher exact test)
- elim: enrichment p-values by applying TopGO elim method (PMID: 16606683)
For the KEGG part, we have a GMT file and DEG lists. We apply GmtIntersectFisher.pl
in this repository for classical enrichment. Running this script requires a java class from rackj for fisher exact test. Note that 3659 is the number of whole genome genes of this case.
ubuntu@ubuntu20:~$ wget https://downloads.sourceforge.net/project/rackj/0.99a/rackJ.tar.gz
ubuntu@ubuntu20:~$ tar -zxvf rackJ.tar.gz
ubuntu@ubuntu20:~$ git clone https://github.com/wdlingit/cop.git
ubuntu@ubuntu20:~$ ./cop/scripts/GmtIntersectFisher.pl
Usage: GmtIntersect4fisher.pl <geneFile> <GMT> <bkgNumber> <rackjJARpath>
ubuntu@ubuntu20:~$ ls Lists/*.list | perl -ne 'chomp; /.+\/(.+)\.list/; $cmd="./cop/scripts/GmtIntersectFisher.pl $_ KO.gmt 3659 rackJ/rackj.jar > Counts/$1.ko.out"; print "$cmd\n"; #system $cmd'
./cop/scripts/GmtIntersectFisher.pl Lists/DEG.0.001.list KO.gmt 3659 rackJ/rackj.jar > Counts/DEG.0.001.ko.out
./cop/scripts/GmtIntersectFisher.pl Lists/DEG.0.005.list KO.gmt 3659 rackJ/rackj.jar > Counts/DEG.0.005.ko.out
./cop/scripts/GmtIntersectFisher.pl Lists/DEG.0.01.list KO.gmt 3659 rackJ/rackj.jar > Counts/DEG.0.01.ko.out
./cop/scripts/GmtIntersectFisher.pl Lists/DEG.0.05.list KO.gmt 3659 rackJ/rackj.jar > Counts/DEG.0.05.ko.out
(again, we surely removed the sharp mark for executing the commands)
The output .ko.out
files have no header lines. Their column meaning:
- KEGG ID
- NAME of the KEGG ID
- number of input genes associated with this KEGG ID
- number of non-input genes (whole genome exclude input) associated with this KEGG ID
- number of input genes not associated with this KEGG ID
- number of non-input genes not associated with this KEGG ID
- fisher exact test p-value, left tailed
- fisher exact test p-value, right tailed (the enrichment p-value)
- fisher exact test p-value, two tailed
With these enrichment tables, the plan is to combine them to show which ontology terms were enriched under which threshold setting. Heatmap is a feasible visualization for this idea. We use genOntoHeatmapData.pl
, which is also from this repository, for generating heatmap data matrix with p-values transformed by -log10(p)
.
ubuntu@ubuntu20:~$ ./cop/scripts/genOntoHeatmapData.pl
Usage: genOntoHeatmapData.pl [-term <string>] [-x <p-value cutoff>] <p-value column> [<input name> <input file>]+
-x <float> : p-value cutoff (default: 0.05)
-term <string> : combination of columns for term definitions (default: "1")
ubuntu@ubuntu20:~$ ./cop/scripts/genOntoHeatmapData.pl -term 3,2 -x 0.01 9 DEG_0.05 Counts/DEG.0.05.go.out DEG_0.01 Counts/DEG.0.01.go.out DEG_0.005 Counts/DEG.0.005.go.out DEG_0.001 Counts/DEG.0.001.go.out > heatmap.go.txt
ubuntu@ubuntu20:~$ ./cop/scripts/genOntoHeatmapData.pl -term 1,2 8 DEG_0.05 Counts/DEG.0.05.ko.out DEG_0.01 Counts/DEG.0.01.ko.out DEG_0.005 Counts/DEG.0.005.ko.out DEG_0.001 Counts/DEG.0.001.ko.out > heatmap.ko.txt
ubuntu@ubuntu20:~$ wc -l heatmap.*.txt
56 heatmap.go.txt
19 heatmap.ko.txt
75 total
The first execution of genOntoHeatmapData.pl
is for generating heatmap matrix data of GO enrichment results, where the 9th column (fsh) is for the enrichment p-values to be visualized in the heatmap. Option -term 3,2
is to specify the combination of GOID_GO Term
as row names in the matrix data. Option -x 0.01
is to constrain that only GO terms with enrichment p-value<=0.01 in some DEG list would be included in the heatmap data. In other words, GO terms without any significant p-values (<=0.01) will not be included. It is advisable to adjust -x
to obtain a heatmap matrix data of an appropriate size.
The second execution of genOntoHeatmapData.pl
is for KEGG enrichment results, where the 8th column (right tailed fisher exact test p-value) is for the p-values to be visualized. Option -term 1,2
is to specify the combination of KEGG ID_KEGG NAME
as row names in the matrix data.
By standardizing the enrichment results from different annotation categories in to the same heatmap matrix data, the R script heatmap.R
can be used to draw the heatmaps. To use it, make sure R is available with R packages pheatmap
and RColorBrewer
installed. It is strongly suggested to modify the command inside heatmap.R
to fit your needs.
ubuntu@ubuntu20:~$ mkdir Heatmap
ubuntu@ubuntu20:~$ ls heatmap.*.txt | perl -ne 'chomp; /(.+)\.txt/; $cmd="Rscript cop/scripts/heatmap.R $_ Heatmap/$1.png"; print "\nCMD: $cmd\n"; #system $cmd'
CMD: Rscript cop/scripts/heatmap.R heatmap.go.txt Heatmap/heatmap.go.png
CMD: Rscript cop/scripts/heatmap.R heatmap.ko.txt Heatmap/heatmap.ko.png
(remove the sharp mark for executing the commands)
The following two pictures are the two heatmaps. It can be observed that some ontology terms are mostly enriched in different DEG lists. This somehow tells us functional enrichment in different scales. Controlling the gene lists by parameters like fold-changes and/or regulation direction would surely give us other perspectives.
The script genOntoDataframe.pl
can be use to generate a dataframe of the enrichment results so that they can be further adopted by other tools. In this example, we applied genOntoDataframe.pl
for making point using the R ggplot2
package.
ubuntu@ubuntu20:~$ ./cop/scripts/genOntoDataframe.pl
Usage: genOntoDataframe.pl [options] [<input name> <input file>]+
-filter <col> <filter> : column to be filtered by the filter (default: none)
-term <string> : combination of columns for term definitions (default: "1")
-data <col> <header> : assign data column and header name (default: no columns)
ubuntu@ubuntu20:~$ ./cop/scripts/genOntoDataframe.pl -term 3,2 -data 9 pVal -data 7 count -filter 7 ">1" -filter 9 "<0.01" DEG_0.05 Counts/DEG.0.05.go.out DEG_0.01 Counts/DEG.0.01.go.out DEG_0.005 Counts/DEG.0.005.go.out DEG_0.001 Counts/DEG.0.001.go.out > dataframe.go.txt
ubuntu@ubuntu20:~$ head dataframe.go.txt
TERM SET pVal count
GO:0004129_cytochrome-c oxidase activity DEG_0.05 0.0035598925574887356 2
GO:0004386_helicase activity DEG_0.05 0.008814439484553419 4
GO:0004386_helicase activity DEG_0.01 0.00879306622578951 2
GO:0004386_helicase activity DEG_0.005 0.06992305632225482 1
GO:0005215_transporter activity DEG_0.05 0.004705390123695814 15
GO:0005215_transporter activity DEG_0.01 0.12205315350291851 3
GO:0005215_transporter activity DEG_0.005 0.12318805016331086 2
GO:0005215_transporter activity DEG_0.001 0.10892799213816683 1
GO:0005886_plasma membrane DEG_0.05 0.008183812329780885 27
Descriptions on the options:
-
-term 3,2
: use the combination of "columns 3+2" as the TERM name. Note that we are extracting data from GO enrichment tables so "columns 3+2" means GOID+GO term. -
-data 9 pVal
: append a column with headerpVal
where the data cells are the fisher exact test enrichment p-values -
-data 7 count
: append a column with headercount
where the data cells are numbers of input genes associated with the GO term -
-filter 7 ">1"
: extract only terms with number of associated input genes>1 in some gene list -
-filter 9 "<0.01"
: extract only terms with a fisher exact test enrichment p-value<0.01 in some gene list -
DEG_0.05 Counts/DEG.0.05.go.out...
: useDEG_0.05
as SET for data from fileCounts/DEG.0.05.go.out
Accordingly, the output file can be read by the R script point.R
(requires ggplot2
) and transformed into a point plot. The point plot would be organized with ontology terms as rows and gene lists as columns. The points would be sized by the intersection gene size and colored by enrichment p-avlues (afater -log10(p)
transform).
ubuntu@ubuntu20:~$ Rscript cop/scripts/point.R dataframe.go.txt point.go.png