Building an NCBI genome - pcingola/SnpEff GitHub Wiki
If your genomic reference is available in NCBI, SnpEff ships a script that can help you build the database.
The script is buildDbNcbi.sh, located in SnpEff's scripts directory.
It takes a single argument: the NCBI accession ID of the genome.
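Because the script expects exactly one argument, a thin wrapper can catch a missing or extra argument before anything is downloaded. The following is only a sketch: `build_ncbi_db` and the `SNPEFF_DIR` variable are hypothetical names introduced here, and the default install location `~/snpEff` is an assumption taken from the example on this page.

```shell
# Hypothetical wrapper around scripts/buildDbNcbi.sh (not part of SnpEff).
# SNPEFF_DIR is an assumed variable; it defaults to ~/snpEff as in the example on this page.
build_ncbi_db() {
  # The script takes exactly one argument: the NCBI accession ID.
  if [ "$#" -ne 1 ]; then
    echo "usage: build_ncbi_db <NCBI accession ID>" >&2
    return 2
  fi
  snpeff_dir="${SNPEFF_DIR:-$HOME/snpEff}"
  # Run from the SnpEff directory, as the example does, in a subshell
  # so the caller's working directory is left unchanged.
  ( cd "$snpeff_dir" && ./scripts/buildDbNcbi.sh "$1" )
}
```

With this defined, `build_ncbi_db CP003278.1` would run the same command as the example, while calling it with no arguments prints the usage line and returns 2.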
Example: Salmonella enterica
In this example, we build the database for "Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12", which has accession ID CP003278.1:
$ cd ~/snpEff
$ ./scripts/buildDbNcbi.sh CP003278.1
Downloading genome CP003278.1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10.2M 0 10.2M 0 0 3627k 0 --:--:-- 0:00:02 --:--:-- 3627k
00:00:00 SnpEff version SnpEff 4.3p (build 2017-07-28 14:02), by Pablo Cingolani
00:00:00 Command: 'build'
00:00:00 Building database for 'CP003278.1'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'CP003278.1'
00:00:00 Reading config file: /home/pcingola/workspace/SnpEff/snpEff.config
00:00:00 done
Chromosome: 'CP003278' length: 4768352
Create exons from CDS (if needed): .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Exons created for 4690 transcripts.
Deleting redundant exons (if needed):
Total transcripts with deleted exons: 0
Collapsing zero length introns (if needed):
Total collapsed transcripts: 0
Adding genomic sequences to exons: Done (4690 sequences added, 0 ignored).
Adjusting transcripts:
Adjusting genes: .
Adjusting chromosomes lengths:
Ranking exons:
Create UTRs from CDS (if needed):
Remove empty chromosomes:
Marking as 'coding' from CDS information:
Done: 0 transcripts marked
00:00:01 Caracterizing exons by splicing (stage 1) :
....
00:00:01 Caracterizing exons by splicing (stage 2) :
....00:00:01 done.
00:00:01 [Optional] Rare amino acid annotations
00:00:01 Warning: Cannot read optional protein sequence file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/protein.fa', nothing done.
00:00:01 Protein check file: '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'
00:00:01 Checking database using protein sequences
00:00:01 Reading proteins from file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'...
00:00:01 done (4690 Proteins).
00:00:01 Comparing Proteins...
Labels:
'+' : OK
'.' : Missing
'*' : Error
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Protein check: CP003278.1 OK: 4690 Not found: 0 Errors: 0 Error percentage: 0.0%
00:00:02 Saving database
00:00:02 [Optional] Reading regulation elements: GFF
00:00:02 Warning: Cannot read optional regulation file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.gff', nothing done.
00:00:02 [Optional] Reading regulation elements: BED
00:00:02 Cannot find optional regulation dir '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.bed/', nothing done.
00:00:02 [Optional] Reading motifs: GFF
00:00:02 Warning: Cannot open PWMs file /home/pcingola/workspace/SnpEff/./data/CP003278.1/pwms.bin. Nothing done
00:00:02 Done
00:00:02 Logging
00:00:03 Checking for updates...
00:00:04 Done.
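The "Protein check" summary line in the output above is a quick way to confirm that the build succeeded: here all 4690 proteins are OK, with 0 not found and 0 errors. As a rough sketch, the error count can be pulled out of a saved log with standard tools. The function name `protein_check_errors` is hypothetical, and the pattern assumes the summary line has the same layout as shown above.

```shell
# Hypothetical helper: print the "Errors:" count from a SnpEff
# "Protein check:" summary line read on standard input.
protein_check_errors() {
  grep 'Protein check:' | sed -n 's/.*Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Summary line copied from the build output above.
summary='Protein check: CP003278.1 OK: 4690 Not found: 0 Errors: 0 Error percentage: 0.0%'
errors=$(printf '%s\n' "$summary" | protein_check_errors)
echo "protein check errors: $errors"
```

In practice the build output could first be saved to a file, for example by appending `2>&1 | tee build.log` to the build command (capturing both streams is an assumption about where SnpEff writes its messages), and then checked with `protein_check_errors < build.log`.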