Initial Binning fails #7
Comments
Hi, could you provide the command that you used? |
Sure: For my own datasets I used the same command, but with different input files and output folder, obviously. |
I guess the confusing message you mentioned is due to an installation issue. For example, the conda env should be named nanophase0.2.2, but as far as I can see from the log file, you activated the env with a command like conda activate nanophase, while the nanophase command that was invoked sits under the nanophase0.2.2 env. These are only warning messages, so there is no need to worry about them. Before I can identify the potential issue, could you run the following command (after activating nanophase) to see what exactly happened during metabat binning: |
Thanks for your answer. |
That is odd; I can't reproduce this error with the example dataset. I would suggest removing nanophase 0.2.2 completely and reinstalling it to see whether that resolves the problem. |
I am having this exact same issue, with the exact same results as in this thread. Winterlich, did you ever solve the problem? |
Okay, that is interesting. I reinstalled the package as suggested, but this did not resolve the problem. I haven't been able to dive deeper into this so far, but I am happy to hear any suggestions. |
Hmm, what is the general size of your reads? Mine are admittedly rather small for nanopore, and it's possible that flye is filtering out so many of them that metabat2 does not have enough information to work with. |
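To put a number on "general size of your reads", a small script along these lines could summarise a read file before it goes into the pipeline. This is only a sketch and not part of nanophase: the function names `read_lengths` and `n50` are made up for illustration, and FASTQ input is assumed to use strict four-line records.

```python
import gzip
import sys


def read_lengths(path):
    """Yield the length of each sequence in a FASTA or FASTQ file (plain or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        first = handle.readline()
        if first.startswith("@"):
            # FASTQ: assumed to be strict 4-line records (header, sequence, '+', quality).
            while first.strip():
                sequence = handle.readline().strip()
                handle.readline()  # '+' separator line
                handle.readline()  # quality line
                yield len(sequence)
                first = handle.readline()
        elif first.startswith(">"):
            # FASTA: a sequence may span several lines.
            length = 0
            for line in handle:
                if line.startswith(">"):
                    yield length
                    length = 0
                else:
                    length += len(line.strip())
            yield length


def n50(lengths):
    """Return the N50: the length L such that reads >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    lengths = list(read_lengths(sys.argv[1]))
    print("reads:", len(lengths))
    print("total bases:", sum(lengths))
    print("N50:", n50(lengths))
```

Run with the read file as the only argument (for example the lr.fa.gz from the example dataset) and compare the totals between a dataset that bins and one that does not.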
That's a good idea, my read sets are also rather small. I will try another, larger dataset in the next few days and report back on this... |
Thank you both for your contributions! If only a small long-read dataset is provided, genome binning can be pretty challenging. If you want to try nanophase with a larger long-read dataset, we sequenced a mock community (you can find more details about it in our paper) using nanopore sequencing and uploaded the data to NCBI. The dataset can be downloaded with the following command (you may need to install sra-tools):
Please don't hesitate to let me know if I can help. Best |
So I ran it using the provided practice dataset from your setup page and this was my result: All required packages have been found in the environment. If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :) So I would guess that the issue goes beyond just the data we are providing, though I am not yet sure what it is. I'm not running this on the world's most powerful computer either; is it possible I'm hitting a CPU bottleneck? I'm running it on a laptop with an i7-1360P (12 cores, 5 GHz) and 32 GB of RAM. The RAM is definitely not the bottleneck, but I'm noticing my CPU hitting 100% utilization during this run. |
Did you mean the lr.fa.gz in the Example dataset? |
Yes! Is there a better one I should run? |
I am still unsure what happened; I would expect the command to exit at the semibin stage rather than at metabat2 if you use the provided lr.fa.gz. If you want to try v0.2.3, you can download the long-read dataset SRR17913199, as I mentioned earlier. Is it possible for you to run it on a Linux workstation? |
I actually did run this on Ubuntu under the Windows Subsystem for Linux; I don't have a workstation, though. I originally did this on v0.2.3 and got the same output with my data, but I didn't try it with the practice data. I can try it with the specific long-read dataset as well. |
OK, so I ran the specified dataset on v0.2.3 and this was my output: It is different from our previous outputs. I looked into flye.log.debug and it looks like my system just ran out of memory (oops), so I didn't get much data out of that attempt. Mayhaps I shall try again. The bin log shows contigs being created in my previous attempts with my own data; I think flye might just be set to too high an overlap. |
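Since the assembly here ran out of memory, it may be worth checking how much memory the Linux side actually sees before retrying; under the Windows Subsystem for Linux this can be less than the host's installed RAM. The sketch below only parses the standard `/proc/meminfo` format; the helper names `parse_meminfo` and `available_gib` are hypothetical and nothing here is specific to flye or nanophase.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in KiB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_gib(text):
    """Return the memory available to new processes, in GiB.

    Falls back to MemFree when MemAvailable is missing (older kernels).
    """
    info = parse_meminfo(text)
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / (1024 ** 2)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        text = meminfo.read_text()
        total = parse_meminfo(text).get("MemTotal", 0) / (1024 ** 2)
        print(f"total: {total:.1f} GiB, available: {available_gib(text):.1f} GiB")
```

If the reported total is well below the 32 GB installed in the laptop, the limit is on the subsystem side rather than in the pipeline.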
Hi there,
I just tried Nanophase, both with one of my datasets and with the example dataset.
The assembly using flye --meta works fine, but the pipeline keeps terminating at the initial binning step. The logfile of MetaBat2 shows only this:
MetaBAT 2 (v2.12.1) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200.
I tried versions 0.2.2 and 0.2.3, but neither worked with my datasets or the example dataset.
The NanoPhase check shows this information:
Check software availability and locations
The following packages have been found
#package location
flye /home/xxx/anaconda3/envs/nanophase0.2.2/bin/flye
metabat2 /home/xxx/anaconda3/envs/nanophase0.2.2/bin/metabat2
maxbin2 /home/xxx/anaconda3/envs/nanophase0.2.2/bin/run_MaxBin.pl
metawrap /home/xxx/anaconda3/envs/nanophase0.2.2/bin/metawrap
checkm /home/xxx/anaconda3/envs/nanophase0.2.2/bin/checkm
racon /home/xxx/anaconda3/envs/nanophase0.2.2/bin/racon
medaka /home/xxx/anaconda3/envs/nanophase0.2.2/bin/medaka
polypolish /home/xxx/anaconda3/envs/nanophase0.2.2/bin/polypolish
POLCA /home/xxx/anaconda3/envs/nanophase0.2.2/bin/polca.sh
bwa /home/xxx/anaconda3/envs/nanophase0.2.2/bin/bwa
seqtk /home/xxx/anaconda3/envs/nanophase0.2.2/bin/seqtk
minimap2 /home/xxx/software/ont-guppy/bin/minimap2
BBMap /home/xxx/anaconda3/envs/nanophase0.2.2/bin/BBMap
parallel /home/xxx/anaconda3/envs/nanophase0.2.2/bin/parallel
perl /home/xxx/anaconda3/envs/nanophase0.2.2/bin/perl
samtools /home/xxx/anaconda3/envs/nanophase0.2.2/bin/samtools
gtdbtk /home/xxx/anaconda3/envs/nanophase0.2.2/bin/gtdbtk
fastANI /home/xxx/anaconda3/envs/nanophase0.2.2/bin/fastANI
blastp /home/xxx/anaconda3/envs/nanophase0.2.2/bin/blastp
All required packages have been found in the environment. If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :)
grep: warning: stray \ before /
Warning: [flye metabat2 maxbin2 metawrap checkm racon medaka polypolish POLCA bwa seqtk BBMap parallel perl samtools gtdbtk fastANI blastp minimap2] has not been installed in the [nanophase] env. Strongly recommend intalling all packages in the nanophase env, or it may result in a failure
This message is confusing, since the required packages are installed and found, but the pipeline keeps warning about missing software.
Anyway, I would love to test your pipeline. Please let me know if I can provide any additional information for this issue.
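One way to inspect the package table from the check output above is to compare each reported location against a conda env prefix. The sketch below is an illustration only and not how nanophase performs its own check; the function name `tools_outside_env` is hypothetical, and the sample rows are copied from the table in this post.

```python
from pathlib import PurePosixPath


def tools_outside_env(report, env_prefix):
    """Return {tool: location} for every tool whose path is not inside env_prefix.

    `report` is the two-column "#package location" text printed by the check.
    """
    prefix = PurePosixPath(env_prefix)
    outside = {}
    for line in report.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header row
        parts = line.split()
        if len(parts) != 2 or not parts[1].startswith("/"):
            continue  # not a "name /absolute/path" row
        name, location = parts
        if prefix not in PurePosixPath(location).parents:
            outside[name] = location
    return outside


if __name__ == "__main__":
    report = """#package location
flye /home/xxx/anaconda3/envs/nanophase0.2.2/bin/flye
metabat2 /home/xxx/anaconda3/envs/nanophase0.2.2/bin/metabat2
minimap2 /home/xxx/software/ont-guppy/bin/minimap2
"""
    # Against the env the tools were installed into, only minimap2 is flagged.
    print(tools_outside_env(report, "/home/xxx/anaconda3/envs/nanophase0.2.2"))
    # Against an env named plain "nanophase", every tool is flagged.
    print(tools_outside_env(report, "/home/xxx/anaconda3/envs/nanophase"))
```

With the full table from this post, only minimap2 (found under ont-guppy) falls outside the nanophase0.2.2 prefix, while a prefix ending in plain nanophase matches none of the listed locations.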