More informative error message for filter_gtf_for_genes_in_genome.py
(Python only task!)
#1082
Description of feature
The rnaseq pipeline filters out records of the transcript annotation GTF that refer to scaffolds lacking a sequence record in the FASTA file.
If there is, e.g., a name mismatch between the two files and no overlaps are found, the custom script ./bin/filter_gtf_for_genes_in_genome.py fails ungracefully. The reason is that the script never verifies that seq_name_gtf was actually assigned.

There are further scenarios that cause that assignment step to fail, e.g. if g.readlines() doesn't yield a single line or if the GTF file is not tab-separated.

Thus, I suggest that the script filter_gtf_for_genes_in_genome.py be improved to account for those cases and produce meaningful and actionable error messages. I think this would be a good issue for the next Hackathon?

For more details, see the respective conversation on the #rnaseq Slack channel.
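To make the request more concrete, here is a rough sketch of the kind of checks I have in mind. It is not the current code of filter_gtf_for_genes_in_genome.py: the names GtfFilterError and filter_gtf_lines are hypothetical, and skipping comment and blank lines is my own assumption. It only illustrates turning the three failure scenarios above (empty GTF, GTF that is not tab-separated, no name overlap with the FASTA) into explicit errors.

```python
class GtfFilterError(Exception):
    """Hypothetical exception type for actionable filtering errors."""


def filter_gtf_lines(gtf_lines, genome_seq_names):
    """Keep GTF lines whose sequence name occurs in the genome FASTA.

    Hypothetical helper: gtf_lines is any iterable of lines and
    genome_seq_names is an iterable of FASTA sequence names.
    """
    genome_seq_names = set(genome_seq_names)
    if not genome_seq_names:
        raise GtfFilterError("No sequence names were found in the FASTA file.")

    kept = []
    gtf_seq_names = set()
    for line_number, line in enumerate(gtf_lines, start=1):
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns.
        if len(fields) < 9:
            raise GtfFilterError(
                f"GTF line {line_number} has {len(fields)} tab-separated "
                "field(s) but 9 are expected. Is the file tab-separated?"
            )
        seq_name = fields[0]
        gtf_seq_names.add(seq_name)
        if seq_name in genome_seq_names:
            kept.append(line)

    if not gtf_seq_names:
        raise GtfFilterError("The GTF file contains no annotation records.")
    if not kept:
        # Show a few names from each file so a mismatch is easy to spot.
        raise GtfFilterError(
            "No overlap between sequence names in the GTF and the FASTA. "
            f"GTF examples: {sorted(gtf_seq_names)[:5]}; "
            f"FASTA examples: {sorted(genome_seq_names)[:5]}. "
            "Do the two files use the same naming scheme (e.g. 'chr1' vs '1')?"
        )
    return kept
```

With something along these lines, a naming mismatch would surface as a single message listing example names from both files, rather than as an unrelated exception further down in the script.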