Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/clin 2947 add exomiser #25

Merged
merged 1 commit into from
Oct 1, 2024
Merged

Conversation

LysianeBouchard
Copy link
Contributor

@LysianeBouchard LysianeBouchard commented Sep 18, 2024

This PR introduce the exomiser tool in the pipeline. Additionally, it introduces a new parameter called tools, which allows users to specify whether VEP, Exomiser, or both should be executed.

Additionally, there is some refactoring in the main workflow (postprocessing.nf) to use the def keyword and more standard variable names. I could not do it only for exomiser variables as, strangely, the code was not compiling when a channel with local scope was derived from a channel with non-local scope.

Here is the associated JIRA for exomiser:
https://ferlab-crsj.atlassian.net/browse/CLIN-2947

Limitations

For the schema, if-else blocks are used to conditionally require exomiser/vep parameters based on the value of the tools parameter.

If an exomiser or vep parameter is removed/added/modified, the logic in the if-else block will need to be manually updated. For example, if the removed parameter was required, it must be manually removed from the list of required parameters.

Tests

Running the pipeline:

  • Run the pipeline locally with the public test dataset (V3).
    • Check that vep output files are under results/vep and exomiser output files under results/exomiser/results.
    • Check that vep and exomiser output files are non empty.
  • Run the pipeline locally with the public test dataset (V3) again, this time commenting in the test config parameters exomiser_analysis_wes and exomiser_analysis_wgs. We want to see if the pipeline is correctly executed using the default analysis files.
  • Run the pipeline on juno with the public test dataset (V3).
  • Make sure that ci tests pass

Note: the V3 dataset covers REMM and CADD data sources and produces non empty exomiser files.

Linter:

  • Run the linter locally and compare with the main branch

Test the schema:

  • Check that the nf-core schema build commands still works and does not break the added if-else blocks.
    nf-core schema lint
    nf-core schema docs
    nf-core schema validate . data-test/paramsJuno.json
    nf-core schema build

     The validate command seem to take into account if-else blocks correctly according to my tests.
      I tweaked the paramsJuno.json file to remove a required exomiser parameter and it reported it correctly.

Test exomiser/vep parameter validation (stub mode)

  • exomiser tool present:
    • missing required parameter: should be detected and pipeline should stop
    • missing optional parameter (ex: cadd version): pipeline should run with exomiser
    • remm/cadd version specified:
      • missing other remm/cadd parameters: missing parameters should be detected and pipeline should stop
      • all remm/cadd paremeters specified: pipeline should run exomiser and pass remm/cadd options
    • remm/cadd version not specified:
      • missing other remm/cadd parameters: pipeline should run exomiser and not pass remm/cadd options
    • exomiser_analysis_wes / exomiser_analysis_wgs not specified: pipeline should use the default analysis files
  • exomiser tool absent:
    • missing required/optional exomiser parameter: pipeline should run without exomiser
    • remm/cadd versions specified:
      • missing other remm/cadd parameters: pipeline should run without exomiser
  • vep tool present:
    • missing required vep parameter: should be detected and pipeline should stop
  • vep tool absent:
    • missing required/optional vep parameters: pipeline should run without vep
  • check that exomiser is run if tools is set to vep,exomiser | exomiser,vep | exomiser
  • check that exomiser is skipped if tools is set to vep | empty string | null
  • check that vep is run if tools is set to vep,exomiser | exomiser,vep | vep
  • check that vep is skipped if tools is set to exomiser | empty string | null

Extra tests with CADD/REMM

  • If cadd/remm parameters are not passed (i.e. missing cadd/remm version) and the analysis file includes cadd/remm pathogenicity sources, the pipeline should be executed, but it should fail at the exomiser step
  • if cadd/remm parameters are passed and the analysis file does not include cadd/remm pathogenicity sources, the pipeline should be executed successfully

Validate json format is supported for analysis files and familyPheno files:

  • ask copilot to generate a json copy of one of the default analysis file and pass it at the command line
  • ask copilot to generate a json copy of one of the family file, create a copy of the test sample sheet pointing on the json family file and run the pipeline locally with it

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@LysianeBouchard LysianeBouchard force-pushed the feat/CLIN-2947-add-exomiser branch 9 times, most recently from 81c6414 to acb6e8f Compare September 23, 2024 19:44
"errorMessage": "Filename of the pedigree file, mandatory for exomiser",
"meta": ["familypheno"],
"format": "file-path",
"pattern": "^\\S*.y(a)?ml$",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on acceptait les json aussi?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah oui c'est vrai bon point. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also included a pattern for the analysis files in the parameters schema.

@LysianeBouchard LysianeBouchard force-pushed the feat/CLIN-2947-add-exomiser branch 6 times, most recently from ae7818e to a143836 Compare September 24, 2024 19:24
docs/reference_data.md Outdated Show resolved Hide resolved
docs/reference_data.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@@ -140,11 +95,10 @@ The documentation of the various tools used in this workflow are available here:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove last parenthesis + update link

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

docs/reference_data.md Show resolved Hide resolved
- 1000G reference file : 1000G_phase1.snps.high_confidence.hg38.vcf.gz
- SNP database : Homo_sapiens_assembly38.dbsnp138.vcf.gz

## Reference Genome
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can put this part in first before VQSR no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, good point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@LysianeBouchard LysianeBouchard force-pushed the feat/CLIN-2947-add-exomiser branch 2 times, most recently from 05fe1e9 to 5ba7519 Compare September 26, 2024 13:58
This commit introduces the exomiser tool in the pipeline.
Additionally, it introduces a new parameter called `tools`, which allows users to specify whether VEP, Exomiser, or both should be executed.

Other changes:
- Group vep output files (i.e. vcf + .tbi files) in subfolder `vep`.
- Improve documentation (README, OUTPUT, USAGE, REFERENCE_DATA).
- Refactor postprocessing workflow to use `def` for local variables and standardize variable names.
- Update GitHub CI nf-test command: remove local tag constraint and activate CI mode.
- Replace workflow diagram in README.
@LysianeBouchard LysianeBouchard marked this pull request as ready for review October 1, 2024 20:07
@LysianeBouchard LysianeBouchard merged commit 85af2f2 into main Oct 1, 2024
3 checks passed
@LysianeBouchard LysianeBouchard deleted the feat/CLIN-2947-add-exomiser branch October 1, 2024 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants