Building an NCBI genome

Pablo Cingolani edited this page Aug 11, 2017 · 1 revision

If your reference genome is available from NCBI, SnpEff includes a script that can help you build the database.

The script, buildDbNcbi.sh, is located in SnpEff's scripts directory. It takes a single argument: the NCBI accession ID of the genome.
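Judging from the paths and messages in the example output on this page, the script appears to download the GenBank record into data/<ID>/genes.gbk, make the genome known to snpEff.config, and then run a SnpEff build. The following is only a rough sketch of doing those steps by hand, not the contents of buildDbNcbi.sh. The function name ncbi_build_plan is hypothetical, the NCBI E-utilities efetch URL is an assumption about where to fetch the record, and the commands are assumed to be run from the snpEff directory. The function only prints the commands so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch only: print the commands for a manual NCBI-based build.
# Assumptions (not taken from buildDbNcbi.sh itself):
#   - the current directory is the snpEff install directory
#   - the GenBank record is fetched through NCBI E-utilities (efetch)
#   - the genome name in snpEff.config is simply the accession ID

ncbi_build_plan() {
    id="$1"
    dir="data/$id"
    url="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=$id&rettype=gbwithparts&retmode=text"

    # 1. Directory where SnpEff looks for this genome's files
    echo "mkdir -p '$dir'"
    # 2. Full GenBank record (annotations plus sequence) saved as genes.gbk
    echo "curl -fsS '$url' -o '$dir/genes.gbk'"
    # 3. Register the genome in the configuration file
    echo "echo '$id.genome : $id' >> snpEff.config"
    # 4. Build the database from the GenBank file
    echo "java -jar snpEff.jar build -genbank -v '$id'"
}

ncbi_build_plan CP003278.1
```

Printing the plan rather than running it keeps the sketch safe to try: nothing is downloaded and snpEff.config is not modified until the printed commands are run deliberately.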

Example: Salmonella enterica

In this example, we build the database for "Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12", which has accession ID CP003278.1.

$ cd ~/snpEff
$ ./scripts/buildDbNcbi.sh CP003278.1
Downloading genome CP003278.1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.2M    0 10.2M    0     0  3627k      0 --:--:--  0:00:02 --:--:-- 3627k
00:00:00	SnpEff version SnpEff 4.3p (build 2017-07-28 14:02), by Pablo Cingolani
00:00:00	Command: 'build'
00:00:00	Building database for 'CP003278.1'
00:00:00	Reading configuration file 'snpEff.config'. Genome: 'CP003278.1'
00:00:00	Reading config file: /home/pcingola/workspace/SnpEff/snpEff.config
00:00:00	done
Chromosome: 'CP003278'	length: 4768352

	Create exons from CDS (if needed): ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
	Exons created for 4690 transcripts.

	Deleting redundant exons (if needed): 
		Total transcripts with deleted exons: 0

	Collapsing zero length introns (if needed): 
		Total collapsed transcripts: 0
		Adding genomic sequences to exons: 	Done (4690 sequences added, 0 ignored).

	Adjusting transcripts: 
	Adjusting genes: .
	Adjusting chromosomes lengths: 
	Ranking exons: 
	Create UTRs from CDS (if needed): 
	Remove empty chromosomes: 

	Marking as 'coding' from CDS information: 
	Done: 0 transcripts marked
00:00:01	Caracterizing exons by splicing (stage 1) : 
	....
00:00:01	Caracterizing exons by splicing (stage 2) : 
	....00:00:01	done.
00:00:01	[Optional] Rare amino acid annotations
00:00:01	Warning: Cannot read optional protein sequence file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/protein.fa', nothing done.
00:00:01	Protein check file: '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'

00:00:01	Checking database using protein sequences
00:00:01	Reading proteins from file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/genes.gbk'...
00:00:01	done (4690 Proteins).
00:00:01	Comparing Proteins...
	Labels:
		'+' : OK
		'.' : Missing
		'*' : Error
	+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

	Protein check:	CP003278.1	OK: 4690	Not found: 0	Errors: 0	Error percentage: 0.0%
00:00:02	Saving database
00:00:02	[Optional] Reading regulation elements: GFF
00:00:02	Warning: Cannot read optional regulation file '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.gff', nothing done.
00:00:02	[Optional] Reading regulation elements: BED 
00:00:02	Cannot find optional regulation dir '/home/pcingola/workspace/SnpEff/./data/CP003278.1/regulation.bed/', nothing done.
00:00:02	[Optional] Reading motifs: GFF
00:00:02	Warning: Cannot open PWMs file /home/pcingola/workspace/SnpEff/./data/CP003278.1/pwms.bin. Nothing done
00:00:02	Done
00:00:02	Logging
00:00:03	Checking for updates...
00:00:04	Done.
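
The "Protein check" summary line in the output above is a quick indicator of whether the build is usable: here all 4690 proteins match and the error count is 0. If the build output has been saved to a file, the error count can be extracted automatically. This is a minimal sketch; the function name protein_check_errors is hypothetical, and the sample line is the one from the output above.

```shell
#!/bin/sh
# Read SnpEff build output on stdin and print the number after "Errors:"
# on the "Protein check:" summary line.
protein_check_errors() {
    grep 'Protein check:' | sed -n 's/.*Errors: \([0-9][0-9]*\).*/\1/p'
}

# Summary line copied from the build output (fields are tab-separated)
line=$(printf '\tProtein check:\tCP003278.1\tOK: 4690\tNot found: 0\tErrors: 0\tError percentage: 0.0%%\n')

errors=$(printf '%s\n' "$line" | protein_check_errors)
if [ "$errors" = "0" ]; then
    echo "protein check passed"
else
    echo "protein check reported $errors errors"
fi
```

With a saved log, the same function can be fed from the file instead of the sample line, for example by redirecting the log into protein_check_errors.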