Computation time #198

Open
Elisa89m opened this issue Jun 11, 2024 · 1 comment

Comments

@Elisa89m

  • spladder version: latest (I guess)
  • Python version: 3.9
  • Operating System: Ubuntu

I installed SplAdder with the command pip install spladder in a conda environment with Python 3.9, because installing it in a Python 2 environment gave me dependency errors.

I'm trying to analyze splicing variants in a Mus musculus cell line, using a GTF reference that I downloaded from Ensembl.

I ran the following command:
spladder build --parallel 2 --output-txt-conf --output-gff3-conf -o ./Panc02_test_2 -b Panc02_input/uniqueTUMOR_sorted.bam -a ../Mus_musculus.GRCm38.102.chr_mod.gtf -c 3

The following files were generated within a few hours:
genes_graph_conf3.merge_graphs.count.hdf5
genes_graph_conf3.merge_graphs.gene_exp.hdf5
genes_graph_conf3.merge_graphs.pickle
genes_graph_conf3.uniqueTUMOR_sorted.pickle

However, the HDF5 files for the splicing variants were not generated, even after I waited two days.
I also ran the example test as follows:
spladder build --parallel 2 --output-txt-conf --output-gff3-conf -o ./test_spladder -b input_spladder_test/testcase_events_1_sample1.bam -a input_spladder_test/testcase_events_spladder.gtf

and this was fine.

As with the cell line test, I also ran a human sample using the GTF reference downloaded from Ensembl, and I hit the same problem.

Am I doing something wrong? Why is the computation time so long?

Thanks in advance for your help.

Elisa

@akahles
Member

akahles commented Oct 3, 2024

Dear Elisa,

Sorry for the late reply. If your issue is still relevant, please ping me and I will have a look. If you run with --verbose, the output might give you a hint as to where the program hangs.

Further, I recommend running with --sparse-bam, which first summarises your BAM files into a format that SplAdder can work with more efficiently.

Lastly, try using the latest release. Depending on which aligner you used, you might have been affected by a bug that occurred when parsing CIGAR strings that contain = or X characters.
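
To check whether an alignment file contains such CIGAR strings at all, something like the following sketch might help. It is not part of SplAdder: the function names are made up for illustration, and it assumes the alignments arrive as SAM text lines (for example from samtools view), where the CIGAR string is the sixth tab-separated column.

```python
import re

# A CIGAR string is a sequence of <length><operation> pairs, where the
# operation is one of MIDNSHP=X as defined in the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_extended_match_ops(cigar):
    """Return True if the CIGAR string uses '=' (match) or 'X' (mismatch)."""
    return any(op in "=X" for _, op in CIGAR_OP.findall(cigar))


def count_extended_cigars(sam_lines):
    """Count alignments whose CIGAR (SAM column 6) contains '=' or 'X'.

    Returns a tuple (affected, total) over all alignment lines;
    header lines (starting with '@') and blank lines are skipped.
    """
    affected = 0
    total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        total += 1
        if has_extended_match_ops(fields[5]):
            affected += 1
    return affected, total
```

One way to use it is to pipe the output of samtools view for the BAM file into a small script that calls count_extended_cigars(sys.stdin) and prints the two counts. If the first count is zero, the CIGAR parsing bug is probably not what is slowing down the run.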

Best,

Andre
