Releases: BD2KGenomics/protect
2.5.0
ProTECT 2.5.0 is here with some great features (and a docker too)!
Major new features include:
- ProTECT can now start from checkpoints. Alignments are a thing of the past. ProTECT can now be run in the following ways: a) fastq trios, b) any combination of bam and fastq trios + haplotypes (if at least one input is a bam), c) vcf + rna (bam or fastq) + haplotypes.
- ProTECT now provides rudimentary support for peptides originating from fusion-gene events. Fusions will only be called from input RNA fastqs.
- ProTECT now implements an updated version of Transgene that optionally allows the user to filter mutations for OxoG events.
- ProTECT now allows users to pull files directly from the NCI GDC, using the file UUID and a valid download token.
- ProTECT allows users to process only a subset of "chromosomes" in the bam. (This is useful if you want to drop the viral genomes in the GRCh38.d1.vd1 reference used by the GDC.)
- ProTECT now has 2 additional reporting modules, describing the status of published immunotherapy-related gene networks in the sample, and the expression of CAR-T targets used in relevant clinical trials that are currently recruiting new patients.
- ProTECT now allows users to specify an email to receive completion updates per sample in a run.
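The chromosome-subset feature above can be pictured with a minimal sketch. This is not ProTECT's implementation: `select_contigs`, `CANONICAL` and the two non-canonical contig names are hypothetical and exist only for illustration. It shows the idea of keeping a chosen set of contigs from a bam header and dropping everything else.

```python
def select_contigs(header_contigs, wanted):
    """Return the contigs from `header_contigs` that are in `wanted`, in header order."""
    wanted = set(wanted)
    return [name for name in header_contigs if name in wanted]


# Canonical human chromosomes using "chr"-prefixed names (an assumption about
# the naming convention of the reference in use).
CANONICAL = ["chr%s" % c for c in list(range(1, 23)) + ["X", "Y", "M"]]

# Illustrative header. "example_decoy" and "example_virus" are made-up names
# standing in for the non-chromosomal sequences a reference may carry.
header = ["chr1", "chr2", "chrX", "chrM", "example_decoy", "example_virus"]

print(select_contigs(header, CANONICAL))
```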
Minor changes include:
- Fixed a small bug in processing MHCII peptide binding predictions using the sturniolo method.
- Bumped rankboost to 2.0.3 to handle edge cases where normal peptides weren't being handled correctly.
- Bams in the filestore are now deleted in a better fashion to reduce pressure on the file store.
- STAR output is now sorted with samtools, fixing the --limitBAMsortRAM issue.
- Fixed a small issue with Transgene requesting the wrong requirements.
- Updated resource requirements to increase efficiency.
2.4.1
Uses a newer version of Rankboost that does not crash because of a malformed Error message when there are no actionable mutations.
2.4.0
ProTECT 2.4.0 is here with so many goodies to share!
Major new features include:
- HG38 support: ProTECT now supports neoantigen prediction on samples using the GRCh38/HG38 reference genome. Like hg19, hg38 references are provided in our s3 bucket at s3://cgl-protect-data/hg38_references/
- Altered-self support: ProTECT now has measures to calculate the binding scores for the corresponding wild-type peptides for each neoepitope and then consider that while ranking the results.
- User-specified dockerized-tool versions: ProTECT now allows the user to specify the version of a tool to use in the analysis. This requires the user to ensure that the tag exists in the provided dockerhub.
- http, https and ftp download support: ProTECT can now pull samples and references from http, https and ftp endpoints.
- Support for non-standard naming scheme for fastqs: ProTECT now allows users to optionally specify a separate link for the _2.fastq files (and ignores the previous requirement of the two _1 and _2 files sharing a common naming schema and existing in the same directory). This functionality along with https support allows users to pull signed aws s3 links into their pipelines.
- Added code that produces Dockerised versions of ProTECT. (It currently hardcodes the version to be 3.2.0, but this will be addressed in #173.)
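The fastq naming change above can be sketched as follows. This is a hedged illustration rather than ProTECT's code: `resolve_fastq_pair` and the example URLs are hypothetical. It assumes the older behaviour derived the second mate from the first by swapping `_1.fastq` for `_2.fastq`, and shows how an explicit second link overrides that.

```python
def resolve_fastq_pair(fastq_1, fastq_2=None):
    """Return (mate 1, mate 2) locations for a paired-end sample.

    If `fastq_2` is given it is used as-is. Otherwise the mate is derived
    from `fastq_1` by replacing the last "_1.fastq" with "_2.fastq".
    """
    if fastq_2 is not None:
        return fastq_1, fastq_2
    prefix, marker, suffix = fastq_1.rpartition("_1.fastq")
    if not marker:
        raise ValueError("cannot derive the mate of %r; pass fastq_2" % fastq_1)
    return fastq_1, prefix + "_2.fastq" + suffix


# Shared naming scheme: the mate is derived automatically.
print(resolve_fastq_pair("s3://example-bucket/sample_1.fastq.gz"))

# Unrelated names (for example two separately signed links): pass both.
print(resolve_fastq_pair("https://example.org/a?sig=111",
                         "https://example.org/b?sig=222"))
```

Signed links are a case where derivation is unlikely to work, because each link typically carries its own signature; that is the situation the explicit second link is meant to cover.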
Other minor features and bugfixes:
- Updated the flow chart, doc strings and all associated supporting material to more accurately describe ProTECT.
- Bumped Transgene to 2.0.0 (using the correct tag of the Docker image)
- Fixed rsem to correctly request disk space
- Bumped Toil dependency to 3.5.2 (handles some issues seen in a 90-sample scale run)
- Fixed a small bug where in certain cases star was asking for float disk requirements and causing Toil to crash.
2.3.2
This release targets 2 bugs
- Fixed a small issue where rsem (via the wrapper) was not dynamically deciding how much disk to use for the job.
- Bumped the Transgene version to 2.0.0. (This changes nothing since the Transgene 2.0.0 had previously been incorrectly tagged as 1.0.0.) ProTECT now uses the correct version of the docker image.
2.3.1
This release contains some bug fixes to handle issues seen in a large 90-sample run on an AWS cluster.
- Bwa now uses the correct number of cores
- STAR uses the --limitBAMsortRAM flag with an appropriate amount of memory
- Docker tools used in the pipeline now use specific versions from the dockerhub.
2.3.0
2.2.1
This release of ProTECT targeted a couple of features
- Issue #90 fixed a bug that wrote the results from all samples in the run to the same list of files in the output directory. Files are now written to their own individual directories under the output directory.
- Issue #83 added a little more description to the manual regarding s3am
- Issue #85 added overwrite support for uploads to s3
- Issue #95 fixed a log error where --max-cores-per-job was being grouped with the argument group used to specify whether the user wants to generate a config, or run ProTECT.
- Issue #97 reduced the memory requested by STAR to keep it under 50G for a full genome run.
2.1.5
This release featured
- A fix to the --max-cores-per-job command line flag, removing it from the mutually exclusive group concerning creating a config file, or running ProTECT.
- A slight tweak to the memory used by STAR, allowing for a sub-50G run with whole genome indexes.
2.1.4
Added a small section to the manual talking about the version of s3am we prefer.
2.2.0
ProTECT 2.2.0 brings in the fifth SNV caller in the pipeline, Strelka, closing issue #73.
SNVs are accepted for downstream processing if a mutation is called by any 2 of the 5 callers... for now. Issue #82 will be addressed in the future and will use more sophisticated machine learning approaches to get a consensus call.
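The "any 2 of the 5 callers" rule can be sketched as a simple vote count. This is a minimal illustration, not ProTECT's actual code: `consensus_snvs`, the `(chrom, pos, ref, alt)` tuple representation and the placeholder caller names (other than Strelka) are assumptions made for the example.

```python
from collections import Counter


def consensus_snvs(calls_by_caller, min_callers=2):
    """Return the set of mutations reported by at least `min_callers` callers.

    `calls_by_caller` maps a caller name to an iterable of mutations, each
    represented here as a hashable (chrom, pos, ref, alt) tuple.
    """
    votes = Counter()
    for calls in calls_by_caller.values():
        # set() ensures a caller votes at most once per mutation.
        votes.update(set(calls))
    return {mutation for mutation, n in votes.items() if n >= min_callers}


calls = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T")],
    "caller_c": [("chr3", 300, "C", "G")],
    "strelka": [("chr2", 200, "G", "C"), ("chr2", 200, "G", "C")],
    "caller_e": [],
}

accepted = consensus_snvs(calls)
print(sorted(accepted))
```

Here the chr1 and chr2 mutations are each supported by two callers and are kept, while the chr3 mutation has a single vote and is dropped. The duplicate Strelka entry counts once.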