Releases: eastgenomics/eggd_dias_batch

v3.2.0

01 Aug 13:57
eea364e

Summary

Changes to improve unarchiving of files plus minor bug fixes and improvements

Changes

  • properly check for files to unarchive before running any jobs to ensure all required files are unarchived
  • improve details in readme
  • catch samples with no test codes and raise an error
  • fix the total number of samples run in the summary report
  • fix subsetting of the manifest to allow skipping samples; valid sample name checking is now restricted to just the subset
  • strip whitespace from string inputs so that jobs do not fail when inputs such as exclude samples contain extra spaces
  • add new input -iunarchive_only to allow unarchiving files without running any jobs
  • fix issue of single gene reports having : in the report filename
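
As an illustration of the whitespace fix listed above, here is a minimal sketch of how a comma-separated string input could be cleaned before use. The function name `clean_sample_list` and the comma delimiter are assumptions made for this example; they are not taken from the app's code.

```python
from typing import List


def clean_sample_list(raw: str) -> List[str]:
    """Split a comma-separated string input and strip stray whitespace.

    Hypothetical helper for illustration only; the comma delimiter is an
    assumption. Empty entries (e.g. from a trailing comma) are dropped.
    """
    return [name.strip() for name in raw.split(",") if name.strip()]


# Example: an exclude-samples style input with extra spaces
cleaned = clean_sample_list(" sampleA , sampleB ,, ")
```

Stripping once at the point where inputs are read means downstream matching against sample names does not have to account for padding.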

Issues closed

v3.1.0

05 Mar 11:27
b63a04e
  • new input added to exclude control samples from CNV calling by default
  • fix for reading in files from DNAnexus with a trailing blank line
  • pass through optional multiqc report for artemis
  • explicitly raise an error if two configs of the same version are found
  • properly handle research code
  • replace hard-coded dynamic string inputs (i.e. indications, panels, test codes) with config placeholders
  • handle current and new genepanels file
  • add SNV / mosaic as a string to the reports output folder

See https://cuhbioinformatics.atlassian.net/wiki/spaces/DV/pages/3100770334/eggd+dias+batch+v3.1.0

v3.0.1

29 Dec 15:31
18afe48

Changes

  • Allow a custom url_duration for artemis download links to be passed from the config to eggd_artemis jobs started by eggd_dias_batch
  • Delete unused warning to fix #165

v3.0.0

19 Dec 12:15
b2da2b3

Summary

Refactor of original tool into DNAnexus app, with various improvements and bug fixes

Changes

  • Refactor to DNAnexus app to remove reliance on running jobs from server
  • Refactor of whole code base to add in better handling of launching CNV calling + all reports workflow with one command
  • Added unit tests to cover the majority of functions (93% code coverage)
  • Optionally handle unarchiving any required files for analysis that are archived

Fixes

v2.1.0

12 Sep 09:16
a94521c

Adds mosaicreports subcommand. This will now look for a mutect2 vcf in a TNHaplotyper2 output directory & make an appropriate report (with extra excluded columns). Otherwise the report is the same as a normal SNV one.

See: https://cuhbioinformatics.atlassian.net/wiki/spaces/DV/pages/2983395507/dias+batch+running+v2.1.0

v2.0.2

26 May 08:51
e4d31f9

Summary

This is a minor bug fix update to improve:

  • creating the correct output_file_prefix input string from _HGNC gene IDs, rather than only from test-code_clinical-indication.

Changes

In reports.py and cnvreports.py:

  • provide a && joined string of test codes as input to all generate_bed stages (output_file_prefix input field)
  • provide this list of prefixes in a way that can record both single gene and clinical indication requests
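
The `&&` joined prefix described above can be sketched as follows. This is only an illustration: the function name `build_output_file_prefix` and the example test codes are assumptions, not the code in reports.py or cnvreports.py.

```python
def build_output_file_prefix(test_codes):
    """Join test codes into one '&&'-separated string.

    Hypothetical helper; works the same for clinical indication codes
    and single gene (_HGNC) requests.
    """
    return "&&".join(code.strip() for code in test_codes)


# Illustrative values only: one clinical indication code and one gene ID
prefix = build_output_file_prefix(["R208.1", "_HGNC:1100"])
```

A single joined string lets one input field record both request types for a sample.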

Bug fixes:

#104

For further information including development notes and testing evidence, see: https://cuhbioinformatics.atlassian.net/wiki/spaces/DV/pages/2936799233/dias+batch+running+v2.0.2

v2.0.1

18 May 09:10
1e21343

Summary

This is a minor bug fix update to improve:

  • finding clinical indications and panels against test codes
  • correctly adding multiple clinical indications to a list for a sample when present on multiple lines in the Gemini input file (manifest or reanalysis tsv)
  • naming panel-specific bed files with test_code only to avoid hitting the filesystem character limit, especially at the eggd_annotate_excluded_regions stage of the dias_cnvreports_workflow.

Changes

parse_Gemini_manifest:

  • restrict parsing to clinical indications that start with an R code, C code or _HGNC ID
  • correctly identify ALL test codes for a sample that may be across multiple lines within the reanalysis file
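
The two parsing changes above could look roughly like the sketch below. It assumes a tab-separated file with the sample in the first column and comma-separated clinical indications in the second; the function name, the regular expression and the example values are hypothetical and are not the real parse_Gemini_manifest implementation.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed pattern: "R" or "C" followed by a digit, or an _HGNC ID
VALID_START = re.compile(r"^(?:[RC]\d|_HGNC)")


def collect_indications(text):
    """Gather all valid clinical indications per sample across lines."""
    samples = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        codes = samples[row[0].strip()]
        for indication in row[1].split(","):
            indication = indication.strip()
            # keep only accepted entries, without duplicates
            if VALID_START.match(indication) and indication not in codes:
                codes.append(indication)
    return dict(samples)


example = (
    "X000001\tR208.1_Example indication_P\n"
    "X000001\t_HGNC:1100\n"
    "X000002\tnot a code\n"
)
parsed = collect_indications(example)
```

Samples whose entries are all rejected still appear in the result with an empty list, which makes them easy to report as having no valid test codes.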

create_job_report_file:

  • present list of samples and invalid test codes as one pair per line for easier copy-pasting to repeat analysis job with corrected test codes

In reports.py and cnvreports.py:

  • match only on the R code part of the clinical indication from a Gemini file to genepanels
  • provide a && joined string of test codes as optional input to all generate_bed stages (output_file_prefix input field)

Bug fixes:

#78
#101

For further information including development notes and testing evidence, see: https://cuhbioinformatics.atlassian.net/wiki/spaces/DV/pages/2888663263/dias+batch+running+v2.0.1

v2.0.0

15 May 17:28
e6345c4

Summary

This update reflects that eggd_conductor is now used routinely to set off dias_single, dias_multi and eggd_MultiQC jobs, and accommodates the lab-wide transition to Epic. Epic will be used routinely as a sample tracking system, including booking of samples against test codes/clinical indications. dias_batch_running needs to parse this information to set off dias_reports workflows for each sample against the required test codes/clinical indications.

Changes

Major changes to accommodate the transition to automated running of dias_single, dias_multi and QC steps, as well as updates to gathering input files for the dias_reports and dias_cnvreports workflows.

  • removed support for setting off dias_single, dias_multi and multiQC jobs as these are now handled by eggd_conductor
  • cnvcall command no longer relies on having "_single" in the dias_single workflow's output folder name
  • updated logic for determining which sample is to be analysed with which clinical indication/panel: the sample and test requirements are now always parsed from an input file provided as a command arg, and this file is uploaded to the dias_reports workflow’s output folder on DNAnexus
  • overall improvements to the code and general_functions, including better commenting throughout
  • reports and cnvreports commands expect a manifest file from Epic
    • manifest file from Epic is expected to be a semicolon-separated file with a batch ID in the first row, followed by column headers: sample identifiers (Re-analysis Specimen ID, Re-analysis Instrument ID, Specimen ID, Instrument ID) in this order and the final column containing comma-separated Test Codes (column headers are exact matched against these strings), with each row a separate sample with its set of test codes.
    • accepted test codes start with R, C or _HGNC
  • reanalysis and cnvreanalysis commands expect a manifest file from Gemini (see expected format below)
    • manifest file from Gemini is expected to be a tab-separated file with an X number in the first column and comma-separated full clinical indication names (matching entries in the genepanels file), or _HGNC:<ID> in the second column.
  • a job report file is uploaded to the dias_reports workflow’s output folder on DNAnexus, specifying:
    • the number of samples which have sentieon VCF files available for filtering and annotation
    • the number of samples for which reports were requested in the manifest file
    • the number of samples for which a reports job started successfully
    • any sample identifiers parsed from the manifest that could not be linked with a sample VCF filename (invalid sample ID)
    • a list of sample identifiers parsed from the manifest that were booked against tests that could not be identified (invalid test code)
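
To make the Epic manifest layout described above concrete, here is a minimal parsing sketch. It assumes the final column header is exactly "Test Codes" and that the batch ID row may carry trailing semicolons; the function name, the returned structure and the example values are hypothetical and not the tool's actual code.

```python
import csv

ID_COLUMNS = [
    "Re-analysis Specimen ID",
    "Re-analysis Instrument ID",
    "Specimen ID",
    "Instrument ID",
]
ACCEPTED_PREFIXES = ("R", "C", "_HGNC")


def parse_epic_manifest(text):
    """Return (batch_id, samples) from a semicolon-separated manifest."""
    lines = text.splitlines()
    # first row holds the batch ID (trailing ';' handling is an assumption)
    batch_id = lines[0].strip().strip(";")
    reader = csv.reader(lines[1:], delimiter=";")
    header = [column.strip() for column in next(reader)]
    # column headers are matched exactly, in the stated order
    if header[:4] != ID_COLUMNS or header[-1] != "Test Codes":
        raise ValueError(f"unexpected manifest columns: {header}")
    samples = []
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank rows
        record = dict(zip(header, (cell.strip() for cell in row)))
        codes = [c.strip() for c in record.get("Test Codes", "").split(",")]
        codes = [c for c in codes if c]
        samples.append({
            "ids": {name: record.get(name, "") for name in ID_COLUMNS},
            "valid_codes": [c for c in codes if c.startswith(ACCEPTED_PREFIXES)],
            "invalid_codes": [c for c in codes if not c.startswith(ACCEPTED_PREFIXES)],
        })
    return batch_id, samples


example = (
    "BATCH123;;;;\n"
    "Re-analysis Specimen ID;Re-analysis Instrument ID;Specimen ID;Instrument ID;Test Codes\n"
    ";;SP001;INS001;R208.1, _HGNC:1100\n"
    ";;SP002;INS002;X99\n"
)
batch_id, samples = parse_epic_manifest(example)
```

Keeping rejected codes alongside accepted ones gives the job report what it needs to list samples booked against unidentifiable tests.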

Bug fixes:

#93

For further information including development notes and testing evidence, see: https://cuhbioinformatics.atlassian.net/wiki/spaces/DV/pages/2888728586/dias+batch+running+v2.0.0

v1.10.2

29 Nov 15:57
a1cee88

Summary

Bug fix release to allow CNV report generation for all samples except those that were excluded from CNV calling. Two other improvements are included: the full sample name is now used in the excluded sample list to increase specificity when selecting samples, and the excluded sample list is uploaded to the CNV calling output directory for record keeping.

v1.10.1

05 Oct 09:55
2ec941d

Summary
Bug fix release to fix issue with finding new xlsx variant reports in project folder #87

Changes

  • Searches for both old xls reports and new xlsx reports
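
A search covering both report formats might be sketched as below. It filters a plain list of file names; the function name is hypothetical and the real tool's search of the project folder is not shown here.

```python
def find_variant_reports(filenames):
    """Return names ending in .xls or .xlsx (case-insensitive)."""
    return [
        name for name in filenames
        if name.lower().endswith((".xls", ".xlsx"))
    ]


reports = find_variant_reports(["a.xls", "b.xlsx", "c.vcf", "d.XLSX"])
```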