v1.22.0

@tomkinsc tomkinsc released this 27 Nov 20:23
· 154 commits to master since this release
7947f7a

New:

  • Adding commands for working with kmer sets using the KMC tool. (#854)
    • new top-level python file: kmer_utils.py providing the following functions (see the documentation for more information):
      • build_kmer_db: Build a database of kmers occurring in given sequences
      • dump_kmer_counts: Dump kmers and their counts from kmer database to a text file
      • filter_reads: Filter reads based on their kmer content
      • kmers_binary_op: Perform a simple binary operation on kmer sets
      • kmers_set_counts: Copy the kmer database, setting all kmer counts in the output to the given value
  • add metagenomics.py::filter_bam_to_taxa (#883)
    • This function filters an input bam file to include only reads that have been mapped to specified taxonomic IDs or scientific names. It requires a classification TSV file, as produced by tools such as Kraken, as well as the NCBI taxonomy database. The column numbers of the tax ID and read ID can be specified, so classification files in formats other than Kraken's can be used; however, the read ID to tax ID relationship is assumed to be bijective.
  • add WDL for filter_bam_to_taxa
  • assembly.py::assemble_spades now has an option, --minContigLen, so that SPAdes-based de novo assembly yields only contigs longer than a specified length (#889)
  • assembly.py - added --alwaysSucceed option to SPAdes (#888)
  • allow RunInfo.xml override in illumina_demux WDL task (#891)
  • Added read_utils.py::read_names to extract read names from a sequence file
  • Added run-pipe_local.sh wrapper script for invoking the Snakemake-based pipeline on a single compute instance (#897)
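
To illustrate the read-selection step described for filter_bam_to_taxa above, here is a minimal Python sketch. It is not the code in metagenomics.py: the function name read_ids_for_taxa, its parameters, and the sample data are hypothetical, column indices are 0-based here, and the defaults assume a Kraken-style TSV with the read ID in the second column and the tax ID in the third. The sketch matches exact tax IDs only; it does not consult the NCBI taxonomy database, resolve scientific names, or touch the bam file itself.

```python
import csv
import io


def read_ids_for_taxa(classification_tsv, wanted_tax_ids,
                      read_id_col=1, tax_id_col=2):
    """Return the set of read IDs assigned to one of wanted_tax_ids.

    Hypothetical helper for illustration only. classification_tsv is any
    iterable of tab-separated lines (for example an open file handle).
    Column indices are 0-based, which is an assumption of this sketch.
    """
    wanted = {str(t) for t in wanted_tax_ids}
    selected = set()
    for row in csv.reader(classification_tsv, delimiter="\t"):
        # Skip blank or truncated rows that lack the required columns.
        if len(row) <= max(read_id_col, tax_id_col):
            continue
        if row[tax_id_col].strip() in wanted:
            selected.add(row[read_id_col].strip())
    return selected


# Made-up classification rows in a Kraken-style layout:
# status, read ID, tax ID, read length, k-mer mapping string.
SAMPLE = (
    "C\tread1\t11269\t150\t11269:120\n"
    "U\tread2\t0\t150\t0:120\n"
    "C\tread3\t186538\t150\t186538:120\n"
    "C\tread4\t11269\t150\t11269:120\n"
)

if __name__ == "__main__":
    keep = read_ids_for_taxa(io.StringIO(SAMPLE), [11269])
    print(sorted(keep))  # ['read1', 'read4']
```

The resulting set of read names could then be used to decide which records of the bam file to keep; passing different read_id_col and tax_id_col values covers classification files whose columns are laid out differently.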

Changed:

  • the Unmatched.bam file is now preserved in the illumina_demux WDL task (#887)
  • increase memory headroom requested for UGER jobs by 10% (#892)
  • (Broad only) change dotkit providing python-yaml (#890)
  • use python3 in easy-deploy script if available (#894)
  • Snakemake rules now specify their memory requirement via the mem_mb param, which is recognized by certain execution engines such as kubernetes (#897)

Fixed:

  • do not require chromosome names when checking whether a bam file is sorted (#898)
  • add --no-same-owner to tar -x in WDL tasks (#880)
  • safely build snpEff database (#881)
  • allow ints in Snakemake remote protocols ("s3://"...) (#895)
  • fix ncbi tbl parser for refseq accessions (#899)

Added/Upgraded:

  • coveralls 1.1 -> 1.3.0 (#876)
  • pytest 3.6.3 -> 3.7.1 (#876)
  • pytest-mock 1.5.0 -> 1.10.0 (#876)
  • pytest-xdist 1.15.0 -> 1.22.5 (#876)
  • coverage 4.4.1 -> 4.5.1 (#876)
  • spades 3.11.1 -> 3.12.0 (#878)
  • Added kmc 3.1.1rc1
  • update Docker viral-baseimage 0.1.12 -> 0.1.13 (#884)