-
Notifications
You must be signed in to change notification settings - Fork 80
Building an NCBI genome
Pablo Cingolani edited this page Aug 11, 2017
·
1 revision
When building a database with SnpEff if your genomic reference is in NCBI, there is a script that might help you build the database.
The script is buildDbNcbi.sh
and is located in snpEff's scripts directory.
It takes only one argument, which is the NCBI's ID.
In this example, we build the database for "Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12" having accession ID CP003278.1
$ cd ~/snpEff
$ ./scripts/buildDbNcbi.sh CP003278.1
Downloading genome CP003278.1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10.2M 0 10.2M 0 0 3627k 0 --:--:-- 0:00:02 --:--:-- 3627k
00:00:00 SnpEff version SnpEff 4.3p (build 2017-07-28 14:02), by Pablo Cingolani
00:00:00 Command: 'build'
00:00:00 Building database for 'CP003278.1'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'CP003278.1'
00:00:00 Reading config file: /home/pcingola/workspace/SnpEff/snpEff.config
00:00:00 done
Chromosome: 'CP003278' length: 4768352
Create exons from CDS (if needed): ..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Exons created for 4690 transcripts.
Deleting redundant exons (if needed):
Total transcripts with deleted exons: 0
Collapsing zero length introns (if needed):
Total collapsed transcripts: 0
Adding genomic sequences to exons: Done (4690 sequences added, 0 ignored).
Adjusting transcripts:
Adjusting genes: .
Adjusting chromosomes lengths:
Ranking exons:
Create UTRs from CDS (if needed):
Remove empty chromosomes:
Marking as 'coding' from CDS information:
Done: 0 transcripts marked
00:00:01 Caracterizing exons by splicing (stage 1) :
....
00:00:01 Caracterizing exons by splicing (stage 2) :
....00:00:01 done.
00:00:01 [Optional] Rare amino acid annotations
00:00:01 Warning: Cannot read optional protein sequence file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/protein.fa', nothing done.
00:00:01 Protein check file: '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'
00:00:01 Checking database using protein sequences
00:00:01 Reading proteins from file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'...
00:00:01 done (4690 Proteins).
00:00:01 Comparing Proteins...
Labels:
'+' : OK
'.' : Missing
'*' : Error
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Protein check: CP003278.1 OK: 4690 Not found: 0 Errors: 0 Error percentage: 0.0%
00:00:02 Saving database
00:00:02 [Optional] Reading regulation elements: GFF
00:00:02 Warning: Cannot read optional regulation file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.gff', nothing done.
00:00:02 [Optional] Reading regulation elements: BED
00:00:02 Cannot find optional regulation dir '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.bed/', nothing done.
00:00:02 [Optional] Reading motifs: GFF
00:00:02 Warning: Cannot open PWMs file /home/pcingola/workspace/SnpEff/./data/CP003278.1/pwms.bin. Nothing done
00:00:02 Done
00:00:02 Logging
00:00:03 Checking for updates...
00:00:04 Done.