RuntimeWarning: invalid value encountered in subtract #179

Open
kate-stankiewicz opened this issue Dec 2, 2022 · 2 comments


@kate-stankiewicz

  • spladder version: 3.0.3
  • Python version: 3.9.12
  • Operating System: Ubuntu 20

Description

I am trying to run SplAdder on 83 samples following the "Use on large cohorts" instructions ( https://spladder.readthedocs.io/en/latest/spladder_cohort.html ). At the testing step, I have several subgroups of samples to test based on different conditions. Some of the tests failed with the following error message:

raise ValueError(self.msg.format('endog'))
ValueError NaN, inf or invalid value detected in endog, estimation infeasible.

These error messages were accompanied by these warnings:
/users/kstankie/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
/users/kstankie/anaconda3/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
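
For readers unfamiliar with these two warnings, here is a minimal sketch of how NumPy emits them. It is a generic illustration, not SplAdder's code: the `counts` array and the `> 10` filter are made up. Taking the mean of an empty selection returns NaN, and a NaN reaching a model fit is the kind of value the `endog` error above complains about.

```python
import math
import warnings

import numpy as np

# Hypothetical counts for one event; the values and the threshold are illustrative only.
counts = np.array([3.0, 0.0, 5.0])
usable = counts[counts > 10]  # nothing passes the filter, so the selection is empty

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Mean of an empty array: NumPy warns "Mean of empty slice." and returns nan
    group_mean = np.mean(usable)

messages = [str(w.message) for w in caught]
print(group_mean)
print(messages)
```

The second warning in the list is the 0/0 division inside the mean; its exact wording depends on the NumPy version.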

Based on these warnings, I looked at the distribution of inserted events and found some samples that had very low or zero inserted events while all other samples in the condition group had hundreds or thousands. Example of two samples in the same condition group for a failed test:

Inserted:
cassette_exon: 0
intron_retention: 0
intron_in_exon: 0
alt_53_prime: 1
exon_skip: 0
gene_merge: 0
new_terminal_exon: 0

Inserted:
cassette_exon: 1357
intron_retention: 706
intron_in_exon: 3552
alt_53_prime: 12145
exon_skip: 17159
gene_merge: 0
new_terminal_exon: 41576
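
The check described above (finding samples with almost no inserted events) could be scripted roughly as follows. This is a sketch under assumptions: it parses the "Inserted:" block in the form pasted here, and `parse_inserted`, `flag_sparse`, the sample names, and the `min_total` threshold are all hypothetical helpers and values, not part of SplAdder.

```python
import re


def parse_inserted(log_text):
    """Return {event_type: count} for the first 'Inserted:' block in log_text."""
    counts = {}
    in_block = False
    for line in log_text.splitlines():
        line = line.strip()
        if line == "Inserted:":
            in_block = True
            continue
        if in_block:
            match = re.fullmatch(r"(\w+):\s*(\d+)", line)
            if match is None:
                break  # first line that is not "name: number" ends the block
            counts[match.group(1)] = int(match.group(2))
    return counts


def flag_sparse(samples, min_total=100):
    """Return names of samples whose total inserted count is below min_total."""
    return sorted(name for name, counts in samples.items()
                  if sum(counts.values()) < min_total)


# The two blocks from this issue; the sample names are placeholders.
low = """Inserted:
cassette_exon: 0
intron_retention: 0
intron_in_exon: 0
alt_53_prime: 1
exon_skip: 0
gene_merge: 0
new_terminal_exon: 0
"""
high = """Inserted:
cassette_exon: 1357
intron_retention: 706
intron_in_exon: 3552
alt_53_prime: 12145
exon_skip: 17159
gene_merge: 0
new_terminal_exon: 41576
"""
samples = {"sample_low": parse_inserted(low), "sample_high": parse_inserted(high)}
print(flag_sparse(samples))
```

The threshold of 100 is arbitrary; a cutoff relative to the group median may be more robust across datasets.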

I removed the sample with almost no inserted events and re-ran spladder test. This time there were no "Mean of empty slice" errors and I did receive output! However, for many of my tests I still receive this warning (it was also present before I removed the samples causing the previous error):

users/kstankie/anaconda3/lib/python3.9/site-packages/spladder/spladder_test.py:742: RuntimeWarning: invalid value encountered in subtract

The run finishes and produces output that looks similar to tests that do not produce this RuntimeWarning, so I am not sure what is causing it or whether it should raise alarm bells. As mentioned above, for this one dataset I run several tests with different groups of samples for different conditions, and only some of the tests produce this warning. I can't figure out why some tests produce it and others don't (it is not limited to the tests where I had to remove samples due to the previous "Mean of empty slice" issue). For each of my tests, each condition has 5-7 samples. I do not receive any warnings or errors in any previous steps (for spladder build); these RuntimeWarnings occur only at spladder test.
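
One generic way NumPy produces "invalid value encountered in subtract" is subtracting two infinities of the same sign, for example when a log fold-change is computed as a difference of logs and both groups have a zero mean. The sketch below only illustrates that mechanism; it is an assumption that something similar happens at spladder_test.py:742, and `mean_a` / `mean_b` are made-up values.

```python
import warnings

import numpy as np

# Hypothetical per-event mean counts for two condition groups (illustrative only).
mean_a = np.array([12.0, 0.0, 0.0, 7.0])
mean_b = np.array([3.0, 0.0, 5.0, 0.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # log2(0) is -inf; NumPy warns "divide by zero encountered in log2"
    log_a = np.log2(mean_a)
    log_b = np.log2(mean_b)
    # (-inf) - (-inf) is undefined: NumPy returns nan and warns
    # "invalid value encountered in subtract"
    log2fc = log_a - log_b

messages = [str(w.message) for w in caught]
print(log2fc)    # second entry is nan, third is -inf, fourth is inf
print(messages)
```

In this toy case the warning marks only the event where both groups are zero; the other entries are computed as usual, which would be consistent with output that otherwise looks normal.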

Are these warnings a concern, or can they be ignored as long as the run finishes and produces output?

Thanks in advance for the help!

What I Did

# Here I use a job array to run each of the 83 samples in parallel
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${1} --merge-strat single --no-extract-ase --parallel 2 -v

# next merge the splice graphs
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --merge-strat merge_graphs --no-extract-ase --parallel 40 -v

# next run quantification for each sample separately using a job array again
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${1} --merge-strat merge_graphs --no-extract-ase --quantify-graph --qmode single --parallel 2 -v

# aggregate them into a joint database
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --merge-strat merge_graphs --no-extract-ase --quantify-graph --qmode collect --parallel 40 -v

# call events
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --parallel 40 -v

# run 30 different tests comparing different groups of samples (again using a job array to automate running each comparison)
spladder test -o ${workdir}/array_spladder_out --out-tag Rem_Prob --conditionA ${workdir}/testing_contrasts/contrast_files/sym_${1}_${3}.txt --conditionB ${workdir}/testing_contrasts/contrast_files/sym_${2}_${3}.txt --labelA ${1}${3} --labelB ${2}${3} --diagnose-plots -v --parallel 5

@akahles
Member

akahles commented Feb 6, 2023

Dear @kate-stankiewicz ,

Thanks for reporting this. It has been on my TODO list for a while to catch these warnings early and give more informative feedback to the user. Depending on where the warnings occur, they might indicate different things, for instance that an event does not have a sufficient number of quantified events in a group, or that the gene expression or event fold-change contains NaNs. You can ignore these for now, but I will leave the ticket open as a reference (and reminder) for me to improve this.

Best,

Andre

@kate-stankiewicz
Author

Hi Andre,

Thanks so much for the reply and explanation! I do note that in the test_results_C3gene_unique.tsv files, the column 'log2FC_event_count' contains both 'nan' and 'inf' values for some events. Also, if I look at the merge_graphsC3.confirmed.txt files, I see that some samples show 'nan' for psi for some events.

In this case, can I still ignore these warnings for now? And perhaps simply exclude events that have NaN values from further analysis? (looking at #124 )

Thanks,
Kate
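
The exclusion idea raised in this comment could be sketched as below, using only the standard library. The column name `log2FC_event_count` comes from the comment above; the `event_id` column, the example rows, and the `finite_rows` helper are hypothetical, and real result files will have more columns.

```python
import csv
import io
import math


def finite_rows(tsv_text, column="log2FC_event_count"):
    """Return the rows whose value in `column` is a finite number."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            value = float(row[column])  # float() also parses 'nan', 'inf', '-inf'
        except (TypeError, ValueError):
            continue  # missing or non-numeric entry: skip the row
        if math.isfinite(value):
            kept.append(row)
    return kept


# Made-up example table in the same spirit as the result files.
example = (
    "event_id\tlog2FC_event_count\n"
    "exon_skip_1\t1.5\n"
    "exon_skip_2\tnan\n"
    "exon_skip_3\tinf\n"
    "exon_skip_4\t-0.25\n"
)
kept = finite_rows(example)
print([row["event_id"] for row in kept])
```

Note that dropping 'inf' rows also discards events that are present in one group and absent in the other, so it may be worth reviewing those separately rather than removing them outright.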
