An issue with the installation/environment #16

Open
lauraht opened this issue Jun 11, 2022 · 5 comments

lauraht commented Jun 11, 2022

Hello!

I have an issue with the isONcorrect installation and would appreciate your advice.

Last November, I installed isONcorrect using conda (and created the isoncorrect env) by following your README instructions, and I was able to run isONcorrect successfully. Now, I wanted to run isONcorrect again, so I just used:

conda activate isoncorrect

However, isONcorrect --help now gave me the following error:

Traceback (most recent call last):
  File "~/miniconda3/envs/isoncorrect/bin/isONcorrect", line 20, in <module>
    from modules import create_augmented_reference, help_functions, correct_seqs #,align
ModuleNotFoundError: No module named 'modules'

I found that modules and isONcorrect-0.0.8.dist-info are located under ~/miniconda3/envs/isoncorrect/lib/python3.10/site-packages. But under ~/miniconda3/envs/isoncorrect/lib/python3.7/site-packages, there is no modules and no isONcorrect-0.0.8.dist-info. However, ~/miniconda3/envs/isoncorrect/bin has only python3.7, no python3.10.
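The layout described above (the package sitting under one Python version's site-packages while bin holds a different interpreter) can be checked with a short script. This is only a sketch: the helper names `inspect_env` and `orphaned_versions` are made up for illustration, and it assumes the usual conda layout of `bin/python3.X` and `lib/python3.X/site-packages` inside the environment.

```python
from pathlib import Path


def inspect_env(prefix, package_dir="modules"):
    """List interpreters in bin/ and note which site-packages hold package_dir.

    Assumes the usual conda layout: <prefix>/bin/python3.X and
    <prefix>/lib/python3.X/site-packages.
    """
    prefix = Path(prefix).expanduser()
    interpreters = sorted(p.name for p in (prefix / "bin").glob("python3.*"))
    site_packages = {}
    for sp in sorted(prefix.glob("lib/python3.*/site-packages")):
        # Key is the version directory name, e.g. "python3.10".
        site_packages[sp.parent.name] = (sp / package_dir).is_dir()
    return interpreters, site_packages


def orphaned_versions(interpreters, site_packages):
    """Versions whose site-packages has the package but no matching interpreter."""
    return [v for v, has_pkg in site_packages.items()
            if has_pkg and v not in interpreters]


if __name__ == "__main__":
    found, packages = inspect_env("~/miniconda3/envs/isoncorrect")
    print("interpreters:", found)
    print("site-packages containing 'modules':", packages)
    print("package installed without interpreter:", orphaned_versions(found, packages))
```

If the last line prints a version such as `python3.10` while bin only has `python3.7`, the environment is in the inconsistent state described here.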

What puzzles me is that isONcorrect used to work fine when I first installed it.

I was wondering if you have any ideas about what may be wrong with my installation or environment? I would really appreciate your advice.

Thank you very much in advance!

Owner

ksahlin commented Jun 11, 2022

Hi @lauraht,

Not exactly. I would recommend removing the isoncorrect environment and reinstalling it from scratch:

conda deactivate
conda env remove -n isoncorrect

then reinstall.

If that doesn't work we could take it from there.

Let me know how it goes!

Author

lauraht commented Jun 13, 2022

Hi Kristoffer,

Thank you so much for your advice!

I removed the isoncorrect env and then reinstalled isONcorrect. Now isONcorrect works fine!

I have another question about isONcorrect and would appreciate your advice:
(1) For those singleton reads (belonging to size-1 clusters), isONcorrect does not perform any error correction on them, right?
(2) For those reads belonging to size-2 clusters, it seems that for some of them, isONcorrect performs error correction (i.e. the read sequences are changed compared to the original reads). I was wondering how spoa generates the consensus from only two reads. If read-1 has a base ‘A’ while read-2 has a base ‘T’ at the same position, how would isONcorrect determine whether it is read-1 or read-2 that has the error (since there are only two reads in the cluster)? Similarly, if read-1 misses a base compared to read-2, how would we know whether it is a deletion in read-1 or actually an insertion in read-2?

Thank you very much for your help!

Owner

ksahlin commented Jun 13, 2022

As for Q1 the answer is no.

As for Q2, I thought we set the threshold to a minimum of 3 reads, so I am surprised to hear that you have observed correction in 2-read clusters. I don't want to doubt you (and it was a long time ago that I implemented it), but could you check again whether this is true? This line is supposed to stop any cluster with fewer than 3 reads from being corrected.

Your question can still be answered about what happens to the spoa consensus: In terms of A/T - I don't know. In terms of an indel - spoa will choose the longer path (insertion).

However, isONcorrect will not simply use the spoa consensus as the corrected version of the read. isONcorrect will remap all read-segments to the spoa consensus segment generated from the read-segments. Then it will infer the allele frequency of the particular variant (SNP/indel) and correct the position in the read-segment only if its frequency is lower than a certain frequency threshold (default is lower than 10% frequency, with a hard lower occurrence of 3; see this line).
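
To make the frequency-threshold idea concrete, here is a minimal sketch. It is not isONcorrect's code. It assumes the read-segments are already aligned to the consensus as equal-length strings (with `-` for gaps), and it assumes the two limits combine as "a variant is kept only if at least max(3, 10% of reads) support it". The function name and parameters are hypothetical.

```python
from collections import Counter


def correct_segments(segments, consensus, min_fraction=0.10, min_count=3):
    """Replace low-support symbols with the consensus symbol, column by column.

    segments:  equal-length, already-aligned strings ('-' marks a gap).
    consensus: string of the same length as each segment.
    """
    if any(len(s) != len(consensus) for s in segments):
        raise ValueError("segments must be aligned to the consensus length")
    n = len(segments)
    # Assumed way of combining the two limits: a symbol needs at least this
    # many supporting reads in its column to be left untouched.
    support_needed = max(min_count, min_fraction * n)
    corrected = [list(s) for s in segments]
    for col in range(len(consensus)):
        counts = Counter(s[col] for s in segments)
        for row in corrected:
            symbol = row[col]
            if symbol != consensus[col] and counts[symbol] < support_needed:
                row[col] = consensus[col]
    return ["".join(r) for r in corrected]


if __name__ == "__main__":
    reads = ["ACGT"] * 9 + ["ACTT"]
    # The lone 'T' in column 2 has support 1, below the assumed limit of 3.
    print(correct_segments(reads, "ACGT")[-1])
```

Under these assumptions a variant seen in 4 of 10 reads would be kept, since it looks like a real allele rather than a sequencing error.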

Author

lauraht commented Jun 16, 2022

Hi Kristoffer,

Thank you so much for your explanations!

Just to confirm, for Q1, when you say “the answer is no”, you mean isONcorrect does not perform any error correction on singleton reads, is that right?

About 2-read clusters, I used the diff command between the original fastq file and the corrected fastq file of each cluster. I found that for only a small fraction of 2-read clusters, diff reported a difference on the sequence lines (lines 2 and 6).
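
The same comparison can be done per read instead of per line, which makes it easier to pull out a concrete example. This is a sketch under two assumptions: both files are plain four-line FASTQ, and the corrected file lists the reads in the same order as the original. The function names are made up for illustration.

```python
def read_fastq(path):
    """Yield (header, sequence) pairs, assuming plain four-line FASTQ records."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            yield header, sequence


def changed_reads(original_path, corrected_path):
    """Return (header, original_seq, corrected_seq) for reads whose sequence differs.

    Records are paired by position, so both files must list reads in the same order.
    """
    changed = []
    pairs = zip(read_fastq(original_path), read_fastq(corrected_path))
    for (header, before), (_, after) in pairs:
        if before != after:
            changed.append((header, before, after))
    return changed
```

Calling `changed_reads` on the original and corrected FASTQ of one 2-read cluster returns the reads that were altered, together with their sequences before and after.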

Thank you very much again!

Owner

ksahlin commented Jun 18, 2022

Yes, isONcorrect does not perform any error correction on singleton reads.

I see. I don’t know why, tbh. From my memory and from looking at the code, this should not happen. If you have an example where two sequences go in and come out changed, I could search for the bug.
