Polishing with racon using multiple input files? #70

Open
BioLaFu opened this issue Sep 6, 2022 · 2 comments

Comments

@BioLaFu

BioLaFu commented Sep 6, 2022

Hi,
I have a reference genome that I want to polish with more than one sequence file.
The reference genome was generated with PacBio, and I also have 50 whole-genome sequences generated with Illumina.
I want to polish the PacBio reference genome with the Illumina data. I managed to polish the reference once, using a single Illumina data set. But when I try to polish the resulting file with the next Illumina data set, I get the following error:
[racon::Window::add_layer] error: layer begin and end positions are invalid!

Is there a way to do what I want or is it simply not possible using racon?

I also thought about combining all of the Illumina sequences into one file, but that doesn't seem sensible, given that I am working with snail genomes of about 1 Gb each.

Thanks in advance!
Laura

@rvaser
Collaborator

rvaser commented Sep 9, 2022

Hello Laura,
Unfortunately, the current API does not accept multiple files. You would need to combine them into one file and run Racon in a single command. Also, if you have paired-end reads, they need to have unique names up to the first white space (you can preprocess the file before mapping with https://github.com/lbcb-sci/racon/blob/master/scripts/racon_preprocess.py).

Best regards,
Robert
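
For illustration, here is a minimal Python sketch of the "combine and keep names unique" step described above. It is not a reimplementation of `racon_preprocess.py`; the function name `merge_fastq`, the tag scheme, and the file names are assumptions made for this example. It streams 4-line FASTQ records, so the inputs are never held in memory, and it appends a per-file tag to each read name before the first white space.

```python
import gzip
from pathlib import Path


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode)
    return open(path, mode)


def merge_fastq(inputs, output):
    """Concatenate FASTQ files into `output`, renaming every read.

    `inputs` is a list of (path, tag) pairs. The tag (hypothetical scheme,
    e.g. "s1_1" for sample 1, mate 1) is appended to the read name before
    the first white space, so names stay unique across files and mates.
    Assumes strict 4-line FASTQ records. Returns the number of reads written.
    """
    written = 0
    with open_text(output, "wt") as out:
        for path, tag in inputs:
            with open_text(path) as handle:
                while True:
                    header = handle.readline()
                    if not header:
                        break  # end of file
                    if not header.strip():
                        continue  # tolerate blank lines between records
                    seq = handle.readline()
                    plus = handle.readline()
                    qual = handle.readline()
                    if not header.startswith("@") or not plus.startswith("+") or not qual:
                        raise ValueError(f"malformed FASTQ record in {path}")
                    # Split the header into the name and the optional comment.
                    fields = header[1:].rstrip("\n").split(None, 1)
                    new_header = f"@{fields[0]}_{tag}"
                    if len(fields) > 1:
                        new_header += " " + fields[1]
                    out.write(new_header + "\n")
                    out.write(seq.rstrip("\n") + "\n")
                    out.write("+\n")
                    out.write(qual.rstrip("\n") + "\n")
                    written += 1
    return written
```

A call might look like `merge_fastq([("s1_R1.fastq.gz", "s1_1"), ("s1_R2.fastq.gz", "s1_2")], "all_reads.fastq")`, with one pair per input file. Because the tag already makes names unique, a separate preprocessing pass over the merged file may not be needed, but that is worth checking against your own data.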

@BioLaFu
Author

BioLaFu commented Sep 12, 2022

Hello Robert,

Thanks so much for your reply!
I am aware of the preprocessing step that gives paired-end reads unique names.
So I will go ahead and combine several FASTQ files into one, preprocess that, and then run racon as usual.
Thanks again for your help.
Laura
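
For readers following the same plan, the "run racon as usual" step might be sketched in Python as below. The thread does not say which mapper was used; minimap2 with the short-read preset is an assumption here, and the function names, file names, and thread count are hypothetical. Racon takes its positional arguments in the order sequences, overlaps, target sequences, and writes the polished sequences to standard output.

```python
import subprocess


def polishing_commands(reads, draft, overlaps="overlaps.sam", threads=4):
    """Build the mapping and polishing commands as argument lists.

    `reads` is the single combined (and preprocessed) FASTQ file and
    `draft` is the PacBio assembly to polish. All default file names
    are placeholders for this sketch.
    """
    # Assumption: minimap2 with the short-read preset produces the overlaps.
    minimap2_cmd = ["minimap2", "-ax", "sr", "-t", str(threads), draft, reads]
    # Racon positional arguments: <sequences> <overlaps> <target sequences>.
    racon_cmd = ["racon", "-t", str(threads), reads, overlaps, draft]
    return minimap2_cmd, racon_cmd


def run_polishing(reads, draft, overlaps="overlaps.sam",
                  polished="polished.fasta", threads=4):
    """Map the combined reads to the draft, then polish it in one Racon run."""
    minimap2_cmd, racon_cmd = polishing_commands(reads, draft, overlaps, threads)
    # Both tools write their result to stdout, so redirect it into files.
    with open(overlaps, "w") as sam:
        subprocess.run(minimap2_cmd, stdout=sam, check=True)
    with open(polished, "w") as fasta:
        subprocess.run(racon_cmd, stdout=fasta, check=True)
    return polished
```

The point of the sketch is that the overlaps are computed against the same draft file that is passed to Racon as the target, using the one combined reads file.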
