feat: gene essentiality workflow #675

Merged: 24 commits merged on Oct 28, 2024

mihai-sysbio (Member) commented Jul 7, 2023

Main improvements in this PR:

This PR aims to automatise the gene essentiality checks as introduced in #563 with a GitHub workflow, also discussed in that PR thread.

I hereby confirm that I have:

  • Tested my code on my own computer for running the model
  • Selected develop as a target branch
  • Any removed reactions and metabolites have been moved to the corresponding deprecated identifier lists

- name: Run gene essentiality
  id: essentiality
  run: |
    TEST_RESULTS=$(/usr/local/bin/matlab -nodisplay -nosplash -nodesktop -r "ihuman = importYaml('Human-GEM.yml'); taskStruct = parseTaskList('data/metabolicTasks/metabolicTasks_Essential.txt'); eGenes = estimateEssentialGenes(ihuman, 'Hart2015_RNAseq.txt', taskStruct); disp(evaluateHart2015Essentiality(eGenes));" | awk 'NR>9 && !/^\.+/')
Member Author

@haowang-bioinfo the output of this line is very hard to capture/use. Do you have any suggestions to make this output more compact?

Member Author

As seen in this workflow run, the output is far too long to fit in a PR comment. @haowang-bioinfo how would you suggest changing it to end up with a nice summary like here?

Member

No, I've never thought about this, because these are non-binary indicators that are hard to use for evaluation purposes in Actions.
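
For readers less familiar with awk, the filter at the end of the workflow step above, `NR>9 && !/^\.+/`, keeps only lines after the ninth that do not begin with a dot. A minimal Python sketch of the same filter follows. The function name and the sample text are hypothetical, and what the first nine lines and the dotted lines actually contain (presumably MATLAB startup and progress output) is an assumption inferred from the awk expression, not something stated in this thread.

```python
import re


def filter_matlab_output(raw: str, skip_lines: int = 9) -> str:
    """Drop the first `skip_lines` lines and any line that starts with a dot."""
    kept = []
    for number, line in enumerate(raw.splitlines(), start=1):
        if number <= skip_lines:
            continue  # same as awk's NR>9 when skip_lines is 9
        if re.match(r"\.+", line):
            continue  # same as awk's negated /^\.+/ pattern
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    # Hypothetical input: nine banner lines, one line of dots, one result line.
    banner = [f"banner {i}" for i in range(1, 10)]
    sample = "\n".join(banner + ["....", "result row"])
    print(filter_matlab_output(sample))
```

This only trims leading lines and dotted lines; it does not shorten the result table itself, which is the problem raised in the comments above.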

mihai-sysbio (Member Author)

haowang-bioinfo (Member) commented Aug 13, 2023

> Also noting here this relevant research article: "Synthetic lethality in large-scale integrated metabolic and regulatory network models of human cells".

Yes, it mentions another method, gMCStool, that can predict essential genes in context-specific models.

Other checks than essentiality are also welcome

@mihai-sysbio mihai-sysbio marked this pull request as ready for review October 6, 2023 11:33
feiranl (Collaborator) commented Oct 9, 2023

@mihai-sysbio Could you explain the reason for the failing task?

mihai-sysbio (Member Author)

> @mihai-sysbio Could you explain the reason for the failing task?

The task isn't actually failing. As mentioned in the comment above, the output of the gene essentiality analysis is simply too long to be posted as a PR comment. I couldn't figure out how @haowang-bioinfo ran the calculations to obtain a much more compressed output. As visible in the workflow changes, the code here just calls functions defined elsewhere.

haowang-bioinfo (Member)

> I couldn't figure out how @haowang-bioinfo ran the calculations to obtain a much more compressed output. As visible in the workflow changes, the code here is just calling functions defined elsewhere.

The concise output was the result of a series of manual operations: running functions, copying and pasting results, and editing them in a text editor.

mihai-sysbio (Member Author)

If these last manual steps could be run as a function, then we would be ready to run the gene essentiality analysis on every PR 💪🏻
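
As a rough illustration of what automating those manual steps might look like, the sketch below parses printed table rows into records and emits one compact line per cell line. It is not the function used in this repository. The row format (`{'DLD1' } 36 2185 ...`) and the column order are assumptions based on the MATLAB table display that appears in the bot comment later in this thread, and `parse_rows` and `summarize` are hypothetical names.

```python
import re

# Assumed column order, taken from the table printed later in this thread.
COLUMNS = ["TP", "TN", "FP", "FN", "accuracy",
           "sensitivity", "specificity", "F1", "MCC"]
COUNT_COLUMNS = {"TP", "TN", "FP", "FN"}

# Matches rows such as: {'DLD1' } 36 2185 59 279 0.86792 ...
ROW_PATTERN = re.compile(r"^\s*\{\s*'([^']+)'\s*\}\s+(.+)$")


def parse_rows(raw: str) -> list:
    """Return one dict per table row; lines that do not look like rows are skipped."""
    rows = []
    for line in raw.splitlines():
        match = ROW_PATTERN.match(line)
        if not match:
            continue
        values = match.group(2).split()
        if len(values) != len(COLUMNS):
            continue  # unexpected number of columns, skip rather than guess
        row = {"cellLine": match.group(1)}
        for name, value in zip(COLUMNS, values):
            row[name] = int(value) if name in COUNT_COLUMNS else float(value)
        rows.append(row)
    return rows


def summarize(rows: list) -> str:
    """Build a short, comment-friendly summary with one line per cell line."""
    return "\n".join(
        f"{r['cellLine']}: accuracy={r['accuracy']:.3f}, MCC={r['MCC']:.3f}"
        for r in rows
    )
```

Calling `summarize(parse_rows(text))` on the captured MATLAB output would then give a few short lines that fit in a PR comment, leaving the full table for a separate file.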

@mihai-sysbio mihai-sysbio mentioned this pull request Jun 27, 2024
edkerk (Member) commented Sep 23, 2024

As suggested for the MACAW results in #840, a summary could be posted automatically in the PR, while a more detailed text file could be written and committed as part of each PR. This could be organized in the same folder as the MACAW results, with a README.md file indicating which PR each output file derives from.
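
The split between a short posted summary and a committed detail file could be sketched as follows. This is only an illustration of the idea, not the workflow that was eventually merged: `write_report`, the line limit, and the file name in the usage example are all hypothetical.

```python
from pathlib import Path


def write_report(full_output: str, detail_path: Path, max_summary_lines: int = 20) -> str:
    """Write the full output to `detail_path` and return a truncated summary."""
    # The detailed file holds everything and would be committed with the PR.
    detail_path.parent.mkdir(parents=True, exist_ok=True)
    detail_path.write_text(full_output, encoding="utf-8")

    # The summary keeps only the first lines so it fits in a PR comment.
    lines = full_output.splitlines()
    summary = lines[:max_summary_lines]
    hidden = len(lines) - max_summary_lines
    if hidden > 0:
        summary.append(f"... ({hidden} more lines in {detail_path.name})")
    return "\n".join(summary)


if __name__ == "__main__":
    # Hypothetical file name; the real results folder is decided by the workflow.
    text = "\n".join(f"line {i}" for i in range(50))
    print(write_report(text, Path("example_results") / "details.txt", max_summary_lines=5))
```

Keeping the full text in a file and only a bounded number of lines in the comment avoids the comment-length problem described earlier in the thread.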

@edkerk edkerk mentioned this pull request Oct 19, 2024
3 tasks
@edkerk edkerk force-pushed the feat/gene-essentiality-workflow branch from ef03efb to 497bd7b Compare October 19, 2024 10:06
github-actions bot commented Oct 19, 2024

This PR has been automatically tested with GH Actions. Here is the output of the MACAW test:

Starting dead-end test...
- Found 1514 dead-end metabolites.
- Found 1319 reactions incapable of sustaining steady-state fluxes in either direction due to these dead-ends.
- Found 1977 reversible reactions that can only carry steady-state fluxes in a single direction due to dead-ends.
Starting duplicate test...
- Skipping redox duplicates because no redox_pairs and/or proton_ids were provided.
- Found 447 reactions that were some type of duplicate:
    - 0 were completely identical to at least one other reaction.
    - 13 involve the same metabolites but go in the opposite direction or have the opposite reversibility as at least one other reaction.
    - 447 involve the same metabolites but with different coefficients as at least one other reaction.

This and a more detailed output from MACAW are also committed to data/macawResults/.

Note: In the case of multiple test runs, this post will be edited.

github-actions bot commented Oct 19, 2024

This PR has been automatically tested with GH Actions. Here is the output of the gene essentiality test:

    cellLine    TP    TN      FP    FN     accuracy    sensitivity    specificity    F1          MCC
    __________  __    ____    __    ___    ________    ___________    ___________    ________    ________

    {'DLD1'  }  36    2185    59    279    0.86792     0.11429        0.97371        0.17561     0.15291
    {'GBM'   }  34    2165    61    298    0.85966     0.10241        0.9726         0.15925     0.1333
    {'HCT116'}  46    2207    53    309    0.86157     0.12958        0.97655        0.20264     0.19047
    {'HELA'  }  30    2263    69    254    0.87653     0.10563        0.97041        0.15666     0.12398
    {'RPE1'  }  14    2204    81    259    0.86708     0.051282       0.96455        0.076087    0.025853
    {'all'   }  7     2408    92    109    0.92317     0.060345       0.9632         0.065116    0.0254

Note: In the case of multiple test runs, this post will be edited.
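
For reference, the derived columns in the table above follow from the four confusion-matrix counts using the standard definitions. The sketch below applies those textbook formulas. Whether `evaluateHart2015Essentiality` computes them in exactly this way is an assumption, and `confusion_metrics` is a hypothetical name.

```python
import math


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    # Matthews correlation coefficient denominator
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "F1": 2 * tp / (2 * tp + fp + fn),
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }


if __name__ == "__main__":
    # Counts from the DLD1 row of the table above.
    for name, value in confusion_metrics(tp=36, tn=2185, fp=59, fn=279).items():
        print(f"{name}: {value:.5f}")
```

With the DLD1 counts these formulas reproduce the row's reported values to the displayed precision. The table shows high accuracy and specificity alongside low sensitivity, which is why F1 and MCC stay low.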

@SysBioChalmers SysBioChalmers deleted a comment from github-actions bot Oct 19, 2024
@edkerk edkerk force-pushed the feat/gene-essentiality-workflow branch from 19b9a8e to 0ef23be Compare October 20, 2024 17:02
edkerk (Member) commented Oct 23, 2024

Finally 😄 it seems to work! Ready for review.

mihai-sysbio (Member Author)

This is amazing 🤩 thank you @edkerk for seeing this through 💪🏻

mihai-sysbio (Member Author) left a comment

As a contributor to this PR I cannot formally review it - only comments are allowed - so please consider this an approval 👍🏻

@edkerk edkerk merged commit 19af535 into develop Oct 28, 2024
6 checks passed
@edkerk edkerk deleted the feat/gene-essentiality-workflow branch October 28, 2024 19:26