Problems with running Liftoff on HPC environment #162
Comments
I also encountered this problem. Have you solved it?
Hi, I'm also having the same issue; any advice?
@yeeus Not yet. I heard from someone that Liftoff needs to be run with a GTF file rather than a GFF file, so I tried that, but I got the following error: 'GFF does not contain any gene features. Use -f to provide a list of other feature types to lift over.'
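One way to see what the error above is complaining about is to tally the feature types in column 3 of the annotation and check whether any row is of type gene. The following is a minimal sketch, assuming a standard 9-column tab-separated GFF/GTF file; the function names and the sample lines are hypothetical and not part of Liftoff.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) of GFF/GTF lines, skipping comments and blanks."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Same tally for a file on disk; plain text or gzip-compressed."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_feature_types(handle)


# Hypothetical sample: an annotation with transcript and exon rows but no gene rows.
sample = [
    "##gff-version 3\n",
    'chr1\tsrc\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t500\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
counts = count_feature_types(sample)
print(counts.most_common())
print("has gene features:", counts["gene"] > 0)
```

If the tally shows no gene rows, the types that are present are the candidates for the list that the error message says to pass with -f.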
We'll look into this, but Liftoff usually runs in no more than an hour or two on a mammalian genome, so if it's running for many hours, something is wrong. It doesn't need that much memory.
Hello,
I've been having issues running Liftoff. It runs for days and then terminates. I'm running it in an HPC environment with 100 GB of memory on a compute node that has 2000 cores. Below is the command I'm using to run Liftoff. The target genome is the rheMac10 FASTA, and I've also supplied the human genome hg38 FASTA and the human genome annotation GFF.
Here is the bsub command I used to submit my script
And here is my script
However, it has been running for several days and it's stuck on lifting features.
extracting features
2024-01-23 11:57:09,016 - INFO - Populating features
2024-01-23 12:04:20,319 - INFO - Populating features table and first-order relations: 4900134 features
2024-01-23 12:04:20,319 - INFO - Updating relations
2024-01-23 12:05:01,905 - INFO - Creating relations(parent) index
2024-01-23 12:05:05,589 - INFO - Creating relations(child) index
2024-01-23 12:05:10,210 - INFO - Creating features(featuretype) index
2024-01-23 12:05:14,158 - INFO - Creating features (seqid, start, end) index
2024-01-23 12:05:19,103 - INFO - Creating features (seqid, start, end, strand) index
2024-01-23 12:05:24,253 - INFO - Running ANALYZE features
aligning features
[M::main::16.311*0.41] loaded/built the index for 2939 target sequence(s)
[M::mm_mapopt_update::17.590*0.45] mid_occ = 596
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 2939
[M::mm_idx_stat::18.371*0.48] distinct minimizers: 101324913 (39.04% are singletons); average occurrences: 5.469; average spacing: 5.362; total length: 2971331530
[M::worker_pipeline::226.359*3.67] mapped 10628 sequences
[M::worker_pipeline::382.816*3.79] mapped 10362 sequences
[M::worker_pipeline::555.968*3.84] mapped 12280 sequences
[M::worker_pipeline::711.785*3.85] mapped 14834 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -o intermediate_files/reference_all_to_target_all.sam -a --end-bonus 5 --eqx -N 50 -p 0.5 -t 32 liftoff/rheMac10.fa.gz.mmi intermediate_files/reference_all_genes.fa
[M::main] Real time: 712.151 sec; CPU: 2743.497 sec; Peak RSS: 27.401 GB
lifting feature
Am I using enough memory or cores/threads for liftoff? Is there a typical runtime for lifting over features from one large genome to another?
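For what it's worth, the minimap2 summary line in the log above already says something about the alignment step on its own. The sketch below parses that one line and derives the wall time, the average number of busy cores (CPU time divided by real time), and the peak memory. The function name is hypothetical, and this is only a reading of the posted log, not a diagnosis of the later lifting step.

```python
import re

# Summary line copied from the log above.
SUMMARY = "[M::main] Real time: 712.151 sec; CPU: 2743.497 sec; Peak RSS: 27.401 GB"


def parse_minimap2_summary(line):
    """Extract wall time, CPU time and peak RSS from a minimap2 summary line."""
    match = re.search(
        r"Real time: ([\d.]+) sec; CPU: ([\d.]+) sec; Peak RSS: ([\d.]+) GB", line
    )
    if match is None:
        raise ValueError("not a minimap2 summary line")
    real, cpu, rss = map(float, match.groups())
    return {
        "real_sec": real,
        "cpu_sec": cpu,
        "peak_rss_gb": rss,
        # CPU seconds per wall-clock second: average number of cores kept busy.
        "avg_cores_busy": cpu / real,
    }


stats = parse_minimap2_summary(SUMMARY)
print(f"alignment wall time: {stats['real_sec'] / 60:.1f} min")
print(f"average cores busy: {stats['avg_cores_busy']:.2f} (minimap2 was run with -t 32)")
print(f"peak memory: {stats['peak_rss_gb']:.1f} GB")
```

On these numbers the alignment took about 12 minutes and peaked well under the 100 GB allocation, so the days of runtime are being spent after minimap2 has finished. The peak RSS figure covers only the minimap2 process, so it says nothing about memory used during lifting.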