fixup: dataset
Applies fixes to the dataset so far

1. GFF coordinate fixup
2. Adding the example sequences
3. Set defaultCds to the E gene
j23414 committed May 30, 2024
1 parent 3bd63f6 commit baf0263
Showing 15 changed files with 27,928 additions and 13 deletions.
2 changes: 1 addition & 1 deletion nextclade/datasets/all/genome_annotation.gff3
@@ -2,7 +2,7 @@
 ##sequence-region NC_002640.1 1 10649
 NC_002640.1 feature gene 102 440 . + . codon_start=1;gene=C;gene_name=C;
 NC_002640.1 feature gene 441 713 . + . codon_start=1;gene=pr;gene_name=pr;
-NC_002640.1 feature gene 714 938 . + . codon_start=1;gene=M;gene_name=M;
+NC_002640.1 feature gene 441 938 . + . codon_start=1;gene=M;gene_name=M;
 NC_002640.1 feature gene 939 2423 . + . codon_start=1;gene=E;gene_name=E;
 NC_002640.1 feature gene 2424 3479 . + . codon_start=1;gene=NS1;gene_name=NS1;
 NC_002640.1 feature gene 3480 4133 . + . codon_start=1;gene=NS2A;gene_name=NS2A;
3 changes: 2 additions & 1 deletion nextclade/datasets/all/pathogen.json
@@ -62,5 +62,6 @@
 "schemaVersion": "3.0.0",
 "version": {
 "tag": "unreleased"
-}
+},
+"defaultCds": "E"
 }
10,118 changes: 10,118 additions & 0 deletions nextclade/datasets/all/sequences.fasta

4 changes: 2 additions & 2 deletions nextclade/datasets/denv1/genome_annotation.gff3
@@ -1,8 +1,8 @@
 ##gff-version 3
 ##sequence-region NC_001477.1 1 10735
-NC_001477.1 feature gene 95 394 . + . codon_start=1;gene=C;gene_name=C;
+NC_001477.1 feature gene 95 436 . + . codon_start=1;gene=C;gene_name=C;
 NC_001477.1 feature gene 437 709 . + . codon_start=1;gene=pr;gene_name=pr;
-NC_001477.1 feature gene 710 934 . + . codon_start=1;gene=M;gene_name=M;
+NC_001477.1 feature gene 437 934 . + . codon_start=1;gene=M;gene_name=M;
 NC_001477.1 feature gene 935 2419 . + . codon_start=1;gene=E;gene_name=E;
 NC_001477.1 feature gene 2420 3475 . + . codon_start=1;gene=NS1;gene_name=NS1;
 NC_001477.1 feature gene 3476 4129 . + . codon_start=1;gene=NS2A;gene_name=NS2A;
3 changes: 2 additions & 1 deletion nextclade/datasets/denv1/pathogen.json
@@ -61,5 +61,6 @@
 "schemaVersion": "3.0.0",
 "version": {
 "tag": "unreleased"
-}
+},
+"defaultCds": "E"
 }
4,492 changes: 4,492 additions & 0 deletions nextclade/datasets/denv1/sequences.fasta

4 changes: 2 additions & 2 deletions nextclade/datasets/denv2/genome_annotation.gff3
@@ -1,8 +1,8 @@
 ##gff-version 3
 ##sequence-region NC_001474.2 1 10723
-NC_001474.2 feature gene 97 396 . + . codon_start=1;gene=C;gene_name=C;
+NC_001474.2 feature gene 97 438 . + . codon_start=1;gene=C;gene_name=C;
 NC_001474.2 feature gene 439 711 . + . codon_start=1;gene=pr;gene_name=pr;
-NC_001474.2 feature gene 712 936 . + . codon_start=1;gene=M;gene_name=M;
+NC_001474.2 feature gene 439 936 . + . codon_start=1;gene=M;gene_name=M;
 NC_001474.2 feature gene 937 2421 . + . codon_start=1;gene=E;gene_name=E;
 NC_001474.2 feature gene 2422 3477 . + . codon_start=1;gene=NS1;gene_name=NS1;
 NC_001474.2 feature gene 3478 4131 . + . codon_start=1;gene=NS2A;gene_name=NS2A;
3 changes: 2 additions & 1 deletion nextclade/datasets/denv2/pathogen.json
@@ -61,5 +61,6 @@
 "schemaVersion": "3.0.0",
 "version": {
 "tag": "unreleased"
-}
+},
+"defaultCds": "E"
 }
5,040 changes: 5,040 additions & 0 deletions nextclade/datasets/denv2/sequences.fasta

4 changes: 2 additions & 2 deletions nextclade/datasets/denv3/genome_annotation.gff3
@@ -1,8 +1,8 @@
 ##gff-version 3
 ##sequence-region NC_001475.2 1 10707
-NC_001475.2 feature gene 95 394 . + . codon_start=1;gene=C;gene_name=C;
+NC_001475.2 feature gene 95 436 . + . codon_start=1;gene=C;gene_name=C;
 NC_001475.2 feature gene 437 709 . + . codon_start=1;gene=pr;gene_name=pr;
-NC_001475.2 feature gene 710 934 . + . codon_start=1;gene=M;gene_name=M;
+NC_001475.2 feature gene 437 934 . + . codon_start=1;gene=M;gene_name=M;
 NC_001475.2 feature gene 935 2419 . + . codon_start=1;gene=E;gene_name=E;
 NC_001475.2 feature gene 2414 3469 . + . codon_start=1;gene=NS1;gene_name=NS1;
 NC_001475.2 feature gene 3470 4123 . + . codon_start=1;gene=NS2A;gene_name=NS2A;
3 changes: 2 additions & 1 deletion nextclade/datasets/denv3/pathogen.json
@@ -61,5 +61,6 @@
 "schemaVersion": "3.0.0",
 "version": {
 "tag": "unreleased"
-}
+},
+"defaultCds": "E"
 }
5,531 changes: 5,531 additions & 0 deletions nextclade/datasets/denv3/sequences.fasta

2 changes: 1 addition & 1 deletion nextclade/datasets/denv4/genome_annotation.gff3
@@ -2,7 +2,7 @@
 ##sequence-region NC_002640.1 1 10649
 NC_002640.1 feature gene 102 440 . + . codon_start=1;gene=C;gene_name=C;
 NC_002640.1 feature gene 441 713 . + . codon_start=1;gene=pr;gene_name=pr;
-NC_002640.1 feature gene 714 938 . + . codon_start=1;gene=M;gene_name=M;
+NC_002640.1 feature gene 441 938 . + . codon_start=1;gene=M;gene_name=M;
 NC_002640.1 feature gene 939 2423 . + . codon_start=1;gene=E;gene_name=E;
 NC_002640.1 feature gene 2424 3479 . + . codon_start=1;gene=NS1;gene_name=NS1;
 NC_002640.1 feature gene 3480 4133 . + . codon_start=1;gene=NS2A;gene_name=NS2A;
3 changes: 2 additions & 1 deletion nextclade/datasets/denv4/pathogen.json
@@ -61,5 +61,6 @@
 "schemaVersion": "3.0.0",
 "version": {
 "tag": "unreleased"
-}
+},
+"defaultCds": "E"
 }
2,729 changes: 2,729 additions & 0 deletions nextclade/datasets/denv4/sequences.fasta
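
The GFF coordinate fixups in this commit (for example, the denv1 C gene going from 95-394 to 95-436) can be sanity-checked with a small script. The following is only a sketch and is not part of the commit: the helper names `parse_genes`, `find_gaps`, `out_of_frame`, and `has_cds` are hypothetical, and the parser assumes whitespace-separated columns as rendered on this page (real GFF3 files are tab-separated, which `str.split()` also handles as long as the attributes contain no spaces).

```python
from typing import List, Tuple

# (gene_name, start, end); coordinates are 1-based and inclusive, as in GFF3.
Gene = Tuple[str, int, int]

# First four gene rows of denv1/genome_annotation.gff3 before and after this commit.
OLD_DENV1 = """\
NC_001477.1 feature gene 95 394 . + . codon_start=1;gene=C;gene_name=C;
NC_001477.1 feature gene 437 709 . + . codon_start=1;gene=pr;gene_name=pr;
NC_001477.1 feature gene 710 934 . + . codon_start=1;gene=M;gene_name=M;
NC_001477.1 feature gene 935 2419 . + . codon_start=1;gene=E;gene_name=E;
"""
NEW_DENV1 = """\
NC_001477.1 feature gene 95 436 . + . codon_start=1;gene=C;gene_name=C;
NC_001477.1 feature gene 437 709 . + . codon_start=1;gene=pr;gene_name=pr;
NC_001477.1 feature gene 437 934 . + . codon_start=1;gene=M;gene_name=M;
NC_001477.1 feature gene 935 2419 . + . codon_start=1;gene=E;gene_name=E;
"""


def parse_genes(gff_text: str) -> List[Gene]:
    """Collect (name, start, end) for every row whose type column is 'gene'."""
    genes: List[Gene] = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and ## directives
        cols = line.split()
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        genes.append((attrs.get("gene_name", "?"), int(cols[3]), int(cols[4])))
    return genes


def find_gaps(genes: List[Gene]) -> List[Tuple[int, int]]:
    """Return inclusive ranges between the first and last gene that no gene covers."""
    gaps: List[Tuple[int, int]] = []
    covered_to = None
    for _, start, end in sorted(genes, key=lambda g: (g[1], g[2])):
        if covered_to is not None and start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = end if covered_to is None else max(covered_to, end)
    return gaps


def out_of_frame(genes: List[Gene]) -> List[str]:
    """Return names of genes whose length is not a multiple of three."""
    return [name for name, start, end in genes if (end - start + 1) % 3 != 0]


def has_cds(genes: List[Gene], cds_name: str) -> bool:
    """Check that a name (e.g. the defaultCds value) appears in the annotation."""
    return any(name == cds_name for name, _, _ in genes)


if __name__ == "__main__":
    print(find_gaps(parse_genes(OLD_DENV1)))  # [(395, 436)]
    print(find_gaps(parse_genes(NEW_DENV1)))  # []
    print(has_cds(parse_genes(NEW_DENV1), "E"))  # True
```

On the old denv1 rows the gap check reports positions 395-436 as uncovered, which is the stretch the new C gene end closes. Overlapping features such as pr and M sharing a start are tolerated on purpose, since the updated annotation has M span the pr region.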
