-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KeyError #1
Comments
Hi Lars, Would you mind sharing a few thousand lines of your SAM file?, I haven't tried SAMs from Edgardo |
Wow, very fast reply :D I can share the whole file, it is not that big. It is the product of an reference based scaffolding step. So, it contains orientated contigs. |
Thanks, I will take a look and get back in the new few days (pretty busy with work too) |
Hi Edgardo, hi Lars, Processing file locus_accepted_hits.sam: SAM header processed, 0 references found. Traceback (most recent call last): |
Sorry , it took me longer than I expected to check on your files. Edgardo |
Hi Edgardo,
In ran sam2consensus with headers in the sam file. The job runs endless after "Start processing..." and is finally aborted :
Start processing...
Segmentation fault (core dumped)
[locus.sam.txt](https://github.com/edgardomortiz/sam2consensus/files/7512690/locus.sam.txt)
|
Hey Edgardo,
I used Minimap2 for mapping. I switched now to bcftools to pull the consensus. It is only to scaffold the draftgenome out of contigs, after the denovo assembly. I found an workaround to do so…
Thank you for your effort!
Best regards,
Lars
…--
Lars Steinbrück
Medizinische Hochschule Hannover
Institut für Virologie OE 5230
AG Schulz/NGS/Genomics
Geb. I6, Ebene 6, Raum 2060
Carl-Neuberg-Straße 1
30625 Hannover
Tel.: ++49 (0)511 532 4313
Fax.: ++49 (0)511 532 8736
Von: Edgardo M. Ortiz ***@***.***>
Gesendet: Mittwoch, 10. November 2021 09:14
An: edgardomortiz/sam2consensus ***@***.***>
Cc: Steinbrück, Lars ***@***.***>; Mention ***@***.***>
Betreff: Re: [edgardomortiz/sam2consensus] KeyError (#1)
Sorry , it took me longer than I expected to check on your files.
@steinbrl<https://github.com/steinbrl> sam2consensus needs a SAM with full header information, also the SAM you sent doesn't have any CIGAR strings, perhaps you could provide the command you used to create this SAM?
@DenisTEMPE<https://github.com/DenisTEMPE> your case is easier to solve, assuming these are the first lines of the SAM you got, perhaps you are obtaining the file through samtools view which without additional options skips the header, just add -h to your samtools view command to include the header in the output.
Edgardo
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#1 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AOAEBCFMMX22LRFOV2PAFMDULISTLANCNFSM5F7TCRNQ>.
Triage notifications on the go with GitHub Mobile for iOS<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Hi @DenisTEMPE , After a few tests I found the reason for the failure. My script creates empty containers for the reference sequences which get filled with nucleotides sums as the SAM is processed, these take around 10 bytes per nucleotide in the reference. So, given a reference as long as the one you have the script quickly fills up the RAM and get the process killed. My code is impractical for large references, it can only work well with references of at most 10Mbp, when I wrote the program I was working with small organellar genomes. Perhaps I will attempt to rewrite the code for large references, but I can't promise when. Sorry, you will have to look for another solution. @steinbrl thank for the info, that makes perfect sense for your case. If the consensus calculated by bcftools is reference-agnostic it could help @DenisTEMPE too. Edgardo |
Hi,
I tried to pull an consensus from an Minimap2 Alignment. sam2consensus stops with the following error message:
lars@helmut:~/Desktop/VZV/Carina_09_2021/VZV_ORF14_cmv/assembly-test/Scaffold$ python /home/lars/Desktop/NGS/sam2consensus-master/sam2consensus.py -f N -p final -i scaffold_mapping.sam
Processing file scaffold_mapping.sam:
SAM header processed, 1 references found.
0 reads processed.
Traceback (most recent call last):
File "/home/lars/Desktop/NGS/sam2consensus-master/sam2consensus.py", line 430, in
main()
File "/home/lars/Desktop/NGS/sam2consensus-master/sam2consensus.py", line 212, in main
sequences[refname][pos_ref][nuc] += 1
KeyError: '*'
The .sam file seemed to be correct. There are no issues while importing and visualize it in CLC or Geneious.
Best,
Lars
The text was updated successfully, but these errors were encountered: