
Error and Warning messages

Pablo Cingolani edited this page Nov 25, 2020 · 9 revisions

SnpEff defines several messages in roughly 3 categories:

  • INFO: An informative message
  • WARNING: A problem in the reference genome definition that MAY result in an incorrect variant annotation
  • ERROR: A problem in the reference genome definition that WILL ALMOST CERTAINLY result in an incorrect variant annotation
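Because the category is encoded in the prefix of each message name, messages can be grouped with a simple prefix test. The following is a minimal sketch, not part of SnpEff; the function names `message_category` and `tally_categories` are hypothetical, and how the message names are extracted from the annotated output is left out.

```python
from collections import Counter

CATEGORIES = ("INFO", "WARNING", "ERROR")

def message_category(message: str) -> str:
    """Return the category encoded in the prefix of a message name."""
    for category in CATEGORIES:
        if message.startswith(category + "_"):
            return category
    return "UNKNOWN"

def tally_categories(messages):
    """Count how many messages fall into each category."""
    return Counter(message_category(m) for m in messages)

# Example input: message names as they appear on this page.
messages = [
    "INFO_REALIGN_3_PRIME",
    "WARNING_REF_DOES_NOT_MATCH_GENOME",
    "WARNING_TRANSCRIPT_INCOMPLETE",
    "ERROR_CHROMOSOME_NOT_FOUND",
]
print(tally_categories(messages))
```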

INFO_REALIGN_3_PRIME

The variant has been realigned to the most 3-prime position within the transcript.

This is usually done to comply with the HGVS specification, which requires always reporting the most 3-prime position. VCF requires variants to be realigned to the left-most position on the reference genome, whereas HGVS requires realignment to the most 3-prime position. The two specifications contradict each other in some cases, so complying with HGVS sometimes requires a local realignment.

IMPORTANT: This message only indicates that a realignment was performed. When this INFO message is present, the original coordinates from the VCF file are not exactly the same as the coordinates used to calculate the variant annotation.
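To illustrate why the coordinates can change, here is a minimal sketch of shifting a deletion toward the 3-prime end in a repetitive region. It is a generic right-shift, not SnpEff's actual implementation, and it assumes a transcript on the forward strand (so 3-prime means a higher coordinate). The function name `shift_deletion_3_prime` and the 0-based coordinates are assumptions made for this example.

```python
def shift_deletion_3_prime(sequence: str, start: int, length: int) -> int:
    """Return the right-most 0-based start of an equivalent deletion.

    A deletion of `length` bases at `start` can move one base to the right
    whenever the first deleted base equals the base just after the deletion,
    because the resulting sequence is identical.
    """
    while start + length < len(sequence) and sequence[start] == sequence[start + length]:
        start += 1
    return start

# Deleting one 'T' from the run of four T's: the left-most (VCF-style)
# placement is index 2, the most 3-prime placement is index 5.
sequence = "GATTTTC"
left_start = 2
shifted_start = shift_deletion_3_prime(sequence, left_start, 1)

# Both placements yield the same sequence after the deletion.
before = sequence[:left_start] + sequence[left_start + 1:]
after = sequence[:shifted_start] + sequence[shifted_start + 1:]
print(left_start, shifted_start, before == after)
```

For a transcript on the reverse strand the shift would go in the opposite genomic direction, which this sketch does not handle.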

WARNING_SEQUENCE_NOT_AVAILABLE

The exon does not have reference sequence information. The annotation may not be calculated (e.g. incomplete transcripts).

WARNING_REF_DOES_NOT_MATCH_GENOME

The genome reference does not match the variant's reference.

For example, if the VCF file indicates that the reference at a certain location is 'A', while SnpEff's database indicates that the reference should be 'C', this WARNING would be added.
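The comparison described above can be sketched as follows. This is an illustration only: the genome is modeled as a plain dictionary from chromosome name to sequence, and `ref_matches_genome` is a hypothetical helper, not a SnpEff function. Positions are 1-based, as in VCF.

```python
def ref_matches_genome(genome: dict, chrom: str, pos: int, ref: str) -> bool:
    """Check whether the VCF REF allele equals the genome sequence at pos (1-based)."""
    sequence = genome[chrom]
    start = pos - 1  # convert to a 0-based index
    return sequence[start:start + len(ref)].upper() == ref.upper()

# Toy genome with a single chromosome.
genome = {"1": "ACGTACGT"}

# The genome has 'C' at position 2, so a VCF record claiming 'A' does not match.
print(ref_matches_genome(genome, "1", 2, "C"))
print(ref_matches_genome(genome, "1", 2, "A"))
```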

Under normal circumstances, there should be none of these warnings (or at most a handful).

IMPORTANT: If many of these warnings are seen, this indicates a severe problem (a version mismatch between your VCF files and the reference genome). A typical case is annotating with a different genome than the one used for alignment (e.g. reads are aligned to hg19 but variants are annotated using hg38).

WARNING_TRANSCRIPT_INCOMPLETE

The number of coding bases is NOT a multiple of 3, so there is missing information for at least one codon. This indicates an error in the reference genome's gene and/or transcript definition. This can happen in genomes that are not well understood.
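The check itself is simple arithmetic. A minimal sketch, assuming the coding regions are given as 1-based inclusive `(start, end)` intervals; the helper names are hypothetical:

```python
def coding_length(cds_intervals) -> int:
    """Total number of coding bases over 1-based inclusive (start, end) intervals."""
    return sum(end - start + 1 for start, end in cds_intervals)

def is_transcript_complete(cds_intervals) -> bool:
    """A complete coding sequence has a length that is a multiple of 3."""
    return coding_length(cds_intervals) % 3 == 0

# 9 + 6 = 15 coding bases: five full codons.
print(is_transcript_complete([(1, 9), (20, 25)]))
# 10 coding bases: the last codon is missing two bases.
print(is_transcript_complete([(1, 10)]))
```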

WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS

Multiple STOP codons found in a CDS. There should be only one STOP codon at the end of the transcript, but in this case, the transcript has multiple STOP codons, which is unlikely to be real.

This usually indicates an error in the reference genome (or database). It could, for example, indicate frame errors in the reference genome for one or more exons in this transcript.
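As an illustration, in-frame STOP codons can be counted by walking the CDS three bases at a time. This sketch assumes the standard genetic code (TAA, TAG, TGA as STOP codons) and a CDS given as a DNA string starting in frame; it is not SnpEff's implementation.

```python
STANDARD_STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})

def count_stop_codons(cds: str, stop_codons=STANDARD_STOP_CODONS) -> int:
    """Count in-frame STOP codons; trailing bases that do not form a codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return sum(codon in stop_codons for codon in codons)

# One STOP codon at the end: the expected situation.
print(count_stop_codons("ATGGGGTGA"))
# A premature STOP codon (TAA) in addition to the final one.
print(count_stop_codons("ATGTAAGGGTGA"))
```

A count greater than one corresponds to the situation this warning describes.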

WARNING_TRANSCRIPT_NO_START_CODON

Start codon does not match any 'start' codon in the CodonTable.

This usually indicates an error in the reference genome (or database), but it could also be due to a misconfigured codon table for the genome. Check that the codon table is properly set in snpEff.config.

WARNING_TRANSCRIPT_NO_STOP_CODON

Stop codon does not match any 'stop' codon in the CodonTable.

This usually indicates an error in the reference genome (or database), but it could also be due to a misconfigured codon table for the genome. Check that the codon table is properly set in snpEff.config.
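The following sketch shows how the start and stop codon checks depend on the codon table, and why a misconfigured table can trigger these warnings on a correct transcript. The two tables are illustrative: their start and stop codons follow the NCBI standard and vertebrate mitochondrial genetic codes as far as I know, and they may differ from the tables defined in snpEff.config. The dictionary layout and the function name are assumptions for this example.

```python
# Illustrative start/stop codon sets (assumption: NCBI genetic codes 1 and 2).
CODON_TABLES = {
    "Standard": {
        "start": {"ATG", "CTG", "TTG"},
        "stop": {"TAA", "TAG", "TGA"},
    },
    "Vertebrate_Mitochondrial": {
        "start": {"ATT", "ATC", "ATA", "ATG", "GTG"},
        "stop": {"TAA", "TAG", "AGA", "AGG"},
    },
}

def check_transcript_ends(cds: str, table_name: str = "Standard") -> list:
    """Return the warnings raised by the first and last codon of a CDS."""
    table = CODON_TABLES[table_name]
    cds = cds.upper()
    warnings = []
    if cds[:3] not in table["start"]:
        warnings.append("WARNING_TRANSCRIPT_NO_START_CODON")
    if cds[-3:] not in table["stop"]:
        warnings.append("WARNING_TRANSCRIPT_NO_STOP_CODON")
    return warnings

# A mitochondrial-style CDS: both warnings with the wrong table, none with the right one.
mito_cds = "ATAGGGAGA"
print(check_transcript_ends(mito_cds, "Standard"))
print(check_transcript_ends(mito_cds, "Vertebrate_Mitochondrial"))
```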

ERROR_CHROMOSOME_NOT_FOUND

Chromosome name not found. This is typically due to a mismatch in chromosome naming conventions between the variants file and the database, but it can indicate a more severe problem (a different reference genome).

See more details at https://github.com/pcingola/SnpEff/wiki/ERROR_CHROMOSOME_NOT_FOUND
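A quick way to tell a naming-convention mismatch from a genuinely missing chromosome is to compare the two sets of names after normalizing them. This is a hedged sketch, not SnpEff's own logic: it only handles the common "chr" prefix and the "M"/"MT" mitochondrial variants, and the helper names are hypothetical.

```python
def normalize_chrom(name: str) -> str:
    """Strip a leading 'chr' and unify mitochondrial names (assumed conventions)."""
    name = name.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return "MT" if name in ("M", "MT") else name

def missing_chromosomes(vcf_chroms, db_chroms):
    """Chromosome names used in the variants file but absent from the database."""
    return sorted(set(vcf_chroms) - set(db_chroms))

def naming_only_mismatches(vcf_chroms, db_chroms):
    """Missing names that would be found after normalization."""
    db_normalized = {normalize_chrom(c) for c in db_chroms}
    return [c for c in missing_chromosomes(vcf_chroms, db_chroms)
            if normalize_chrom(c) in db_normalized]

vcf_chroms = ["chr1", "chr2", "chrM", "scaffold_9"]
db_chroms = ["1", "2", "MT"]
print(missing_chromosomes(vcf_chroms, db_chroms))
print(naming_only_mismatches(vcf_chroms, db_chroms))
```

Names that remain missing after normalization (here `scaffold_9`) hint at a different reference genome rather than a naming convention issue.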

ERROR_OUT_OF_CHROMOSOME_RANGE

Variant's genomic position is outside chromosome's range.

Simply put, the variant's coordinate is beyond the length of the chromosome in the reference genome.

IMPORTANT: If many of these errors are seen, this indicates a severe problem (a version mismatch between your VCF files and the reference genome). A typical case is annotating with a different genome than the one used for alignment (e.g. reads are aligned to hg19 but variants are annotated using hg38).
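The range check can be sketched as below, assuming 1-based VCF positions and a dictionary of chromosome lengths; `is_out_of_range` is a hypothetical helper and the length used is a toy value.

```python
def is_out_of_range(chrom_lengths: dict, chrom: str, pos: int, ref: str = "N") -> bool:
    """True if any base covered by REF at pos (1-based) lies outside the chromosome."""
    end = pos + len(ref) - 1
    return pos < 1 or end > chrom_lengths[chrom]

# Toy chromosome of 1000 bases.
chrom_lengths = {"1": 1000}
print(is_out_of_range(chrom_lengths, "1", 1000))        # last base, still in range
print(is_out_of_range(chrom_lengths, "1", 1001))        # one past the end
print(is_out_of_range(chrom_lengths, "1", 1000, "AT"))  # REF extends past the end
```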

ERROR_OUT_OF_EXON

An exonic variant falls outside the exon.

ERROR_MISSING_CDS_SEQUENCE

Missing coding sequence information. In this case, the full variant annotation cannot be calculated due to missing CDS information.

This usually indicates an error in the reference genome (or database).