Extract “Tangled circles” #35

Open
Flooooooooooooower opened this issue Aug 29, 2024 · 15 comments

Comments

@Flooooooooooooower

Then, the circular contigs were left alone, and each tangled “circle” was re-assembled by Hifiasm-meta with default parameters, using the fragmented contigs that formed the “circle” as input reads, which resulted in seven extra circular contigs.

Hi, I am using hifiasm-meta for metagenome assembly and have divided the assembled results into three types. However, I am running into an issue with the step quoted above: I am unfamiliar with the software and don’t know how to extract and reassemble the “tangled circles.” I would greatly appreciate any help you can provide.
Best wishes
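
(A minimal sketch of the re-assembly step described in the quote above, in case it helps other readers. The file names, contig IDs, and the use of awk/samtools are illustrative assumptions, not something prescribed by hifiasm-meta; the idea is simply to collect the fragmented contigs that form one tangled "circle" and feed them back to hifiasm_meta as input reads with default parameters.)

# Dump all primary contigs from the assembly graph to FASTA (GFA S-lines).
awk '/^S/ {print ">"$2"\n"$3}' asm.p_ctg.gfa > asm.p_ctg.fa
samtools faidx asm.p_ctg.fa
# Extract the contigs that make up one tangled circle (hypothetical names;
# identify the real members by inspecting the graph, e.g. in Bandage).
samtools faidx asm.p_ctg.fa tig000001 tig000002 tig000003 > circle1.members.fa
# Re-assemble just that circle with default parameters.
hifiasm_meta -o circle1_reasm -t 16 circle1.members.fa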

@Flooooooooooooower
Author

Hi, I have resolved the problem.

@xfengnefx
Owner

Sorry for the late reply. Glad it's resolved. Please feel free to open a new issue if you have any questions :)

@Flooooooooooooower
Author

Hi, I'm sorry to bother you again. I encountered the following error while running hifiasm-meta, and I'm not sure what it means. It's also worth noting that memory usage can reach up to 600 GB during the run.

Writing reads to disk...
wrote cmd of length 387: hamt version=0.3-r073, ha base version=0.13-r308, CMD= /home/jiaoh/00.Software/hifiasm-meta/hifiasm_meta -o /home/jiaoh/10.meta-genome/04.Enviroment/01.PRJNA879921/02.Assembly/plant_gas -t 50 /home/jiaoh/10.meta-genome/04.Enviroment/01.PRJNA879921/01.host_remove/plant_gas.clean.fq.gz
Bin file was created on Mon Sep  9 09:59:05 2024
Hifiasm_meta 0.3-r073 (hifiasm code base 0.13-r308).
Reads has been written.
[hamt::write_All_reads] Writing per-read coverage info...
[hamt::write_All_reads] Finished writing.
Writing ma_hit_ts to disk...
ma_hit_ts has been written.
Writing ma_hit_ts to disk...
ma_hit_ts has been written.
bin files have been written.
[M::hamt_clean_graph] (peak RSS so far: 620.3 GB)
[M::hamt_clean_graph] no debug gfa
[M::hamt_clean_graph] (peak RSS so far: 620.3 GB)
MGA.sh: line 27: 30568 Killed                  ~/00.Software/hifiasm-meta/hifiasm_meta -o ${path}/02.Assembly/${i} -t ${threads} ${fq_clean}

@xfengnefx
Owner

Looks like an OOM kill, unless the job hit some time limit. Could you try resuming the run with ~/00.Software/hifiasm-meta/hifiasm_meta -o ${path}/02.Assembly/${i} -t ${threads} ${fq_clean} in the same directory, with all variables the same as before? The stage after the all-vs-all read overlap/error correction (ovec) should use less memory and will hopefully finish.

If you have other samples to run, or runs killed before the log could say "bin files have been written", please try the meta_dev branch (commit f98f1ad, r74) instead, which tries to fix the high peak RSS issue and is otherwise identical to r73. I will merge r74 into master and update bioconda this week.
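
(For reference, one way to try that branch; this assumes the standard from-source build of hifiasm-meta rather than the bioconda package, and that zlib is available.)

git clone https://github.com/xfengnefx/hifiasm-meta.git
cd hifiasm-meta
git checkout meta_dev   # or check out commit f98f1ad directly
make                    # builds the hifiasm_meta binary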

@xfengnefx reopened this Sep 9, 2024
@Flooooooooooooower
Author

Hi, has the latest r74 version been released? Does this version address the high peak RSS issue and help reduce memory usage? Thank you!

@xfengnefx
Owner

xfengnefx commented Sep 22, 2024

It is now merged into the master meta branch (the default branch here) and is in the current release. I also opened a PR at bioconda, which was waiting for review and is now merged. Thanks for your patience.

@Flooooooooooooower
Author

Hi, I saw that you released an update and immediately installed it using conda. Unfortunately, I encountered the following error. Could you please help me understand what might be the cause?

[M::main] Start: Mon Sep 23 14:42:49 2024

[M::hamt_assemble] Skipped read selection.
/opt/gridview/slurm/spool_slurmd/job9804618/slurm_script: line 53: 112549 Segmentation fault      hifiasm_meta -o ${output}/02.Assembly/${i} -t ${threads} ${fq_clean}

@xfengnefx
Owner

Are you using bin files from the OOM-killed run above? From the log I guess not? If it is indeed a new run: could you try simply rerunning the failed job with everything unchanged? I remember, from a very long time ago, seeing a segfault around this stage of the run that disappeared upon rerun; I failed to reproduce it afterwards, so whatever it was has remained unfixed...

If a rerun does not resolve this segfault, I might need to roll HEAD back to r73.

I wanted to say "share data if it can be shared and I will troubleshoot" as usual, but I do not have access to HPC clusters right now. Sorry.

@Flooooooooooooower
Author

Hello, I am still encountering the above error when rerunning locally on the HPC. The input is all of the Colorectum data in PRJNA748109.

hifiasm_meta -o 02.Assembly/PRJNA748109_Colorectum  -t 50 ../05.PRJNA748109_Colorectum/01.host_remove/PRJNA748109_Colorectum.clean.fq.gz
[M::main] Start: Tue Sep 24 11:20:12 2024

[M::hamt_assemble] Skipped read selection.
Segmentation fault

@Flooooooooooooower
Author

Hello, I'm sorry to bother you again. What should I do about the following log messages? I did still obtain assembly results, though.

********** checkpoint: post-assembly **********

[M::hamt_clean_graph] (peak RSS so far: 126.1 GB)
[M::hamt_ug_opportunistic_elementary_circuits] collected 0 circuits, used 0.02s
[M::hamt_ug_opportunistic_elementary_circuits] wrote all rescued circles, used 0.00s
[T::hamt_ug_opportunistic_elementary_circuits_helper_deduplicate_minhash] got the sequences, used 0.0s
[T::hamt_minhash_mashdist] sketched - 0.0s.
[T::hamt_minhash_mashdist] compared - 0.0s.
[T::hamt_ug_opportunistic_elementary_circuits_helper_deduplicate_minhash] collected mash distances for 0 seqs, used 0.0s
[M::hamt_ug_opportunistic_elementary_circuits_helper_deduplicate_minhash] had 0 paths, 0 remained (0 dropped by length diff, 0 by length abs),used 0.0s after sketching.
[M::hamt_ug_opportunistic_elementary_circuits] deduplicated rescued circles, used 0.01s
[M::hamt_ug_opportunistic_elementary_circuits] wrote deduplicated rescued circles, used 0.00s
[M::hamt_simple_binning] Will try to bin on 101 contigs (skipped 0 because blacklist).
Using random seed: 42
Perplexity too large for the number of data points!

@xfengnefx
Owner

Perplexity too large for the number of data points!

Sorry about the vague error. This run actually finished; both the assembly and the circle-finding results should have been produced. The error came from the built-in MAG binning, which failed to find any bins: either the sample was simple and there is nothing to bin, or the assembly was fragmented and there is nothing to bin with. I should've included handling of this case in the latest patch, but forgot to...
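
(Since only the binning step failed, the assembly can still be used directly; a small sketch, assuming the usual prefix.p_ctg.gfa output and that gfatools, a separate tool, is installed, for converting the graph to FASTA for any external binner. File names are illustrative.)

# Convert the primary contig graph to FASTA for downstream use.
gfatools gfa2fa asm.p_ctg.gfa > asm.p_ctg.fa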

Is this the PRJNA748109 that segfaulted, or a different sample?

@Flooooooooooooower
Author

Thank you for clarifying my confusion. The segmentation fault occurred during the run of Colorectum in PRJNA748109.

@xfengnefx
Owner

So did PRJNA748109 somehow manage to have a run without the segfault and reach the "checkpoint: post-assembly" part, as in the log posted above? Or did this sample always trigger the segfault, while your other samples assembled fine?

Thanks for letting me know, though; I will remember to test on PRJNA748109 when I have access to servers. Sorry there is no actual fix at the moment.

@Flooooooooooooower
Author

Hello, I'm not sure what the reason is, but when I switch to another server, it runs successfully. However, on some servers, including clusters, it hits a segmentation fault. All installations were via conda.

@xfengnefx
Owner

I see, thanks so much for the report. I will remember this when testing.
