gene without cDNA sequence causes error when running quant.py #102

adeschen · 2023-03-02T23:05:33Z

Hi,

The quantification step is crashing for some of our samples. The error is always the same:

Traceback (most recent call last):
  File "/grid/tuveson/home/software_elzar/bin/miniconda/miniconda3_v4.12.0/miniconda3/envs/arcasHLA-v0.5.0_db3.47.0/share/arcas-hla-0.5.0-0/scripts/quant.py", line 250, in <module>
    allele_results[allele_id[:-1]]['allele' + allele_id[-1]] = allele

There is an easy fix to this problem. The current quant.py core (line 250):

    for allele_id, allele in genotype.items():
        allele_results[allele_id[:-1]]['allele' + allele_id[-1]] = allele

can be modified to check if the key exists and if not, create one:

   for allele_id, allele in genotype.items():
    if not allele_id[:-1] in allele_results:
        allele_results[allele_id[:-1]] = defaultdict(float)
    allele_results[allele_id[:-1]]['allele' + allele_id[-1]] = allele

That modification removes the error but don't remove the cause of the problem. The quant.py script uses the .p file to populate the allele_results dictionary. The .p contains both the genes and the genotype dictionaries. The genes dictionary is used by the quant.py script to create the keys of the allele_results dictionary. However, it appears that the genes dictionary is something missing genes that have alleles in the genotype dictionary.

It seams the cause of this problem is related to the lack of cDNA sequence for some allele as managed in the customize.py script. The script generates the .p file that is going to be used by the quant.py script. Both the genes and genotype dictionaries are saved in the .p file. However, the genes dictionary won't contains the genes that are missing cDNA sequences for both alleles (line 137 of customize.py) as the condition for the loop is not respected:

        for seq in sequences:
            hla_idx[allele_id].append(str(idx))
            genes[gene].append(allele_id)

Can you address this issue.

Best regards
Astrid

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gene without cDNA sequence causes error when running quant.py #102

gene without cDNA sequence causes error when running quant.py #102

adeschen commented Mar 2, 2023 •

edited

Loading

gene without cDNA sequence causes error when running quant.py #102

gene without cDNA sequence causes error when running quant.py #102

Comments

adeschen commented Mar 2, 2023 • edited Loading

adeschen commented Mar 2, 2023 •

edited

Loading