Captus v1.1.0

@edgardomortiz edgardomortiz released this 23 Dec 17:49

New in the assemble module:

  • Contig depth of coverage is now calculated by mapping the reads back to the contigs using Salmon right after the assembly with MEGAHIT. This is now the default behavior unless --disable_mapping is enabled.
  • The assembly is then automatically filtered by contig depth of coverage. If --disable_mapping is used, only contigs with depth of coverage >1x are retained; otherwise, contigs with depth of coverage >=1.5x are retained. The filtering threshold can be changed with --min_contig_depth.
  • To replicate the behavior of previous versions use --disable_mapping and --min_contig_depth 0.
  • The filtering can be repeated with --redo_filtering, without the need to reassemble, to try different values for --max_contig_gc and --min_contig_depth.
  • The assembly HTML report has been completely rewritten to reflect these changes.
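
The default filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in these notes, not Captus' actual code: the function name keep_contig and its parameters are hypothetical, and treating an explicit --min_contig_depth value as an inclusive (>=) cutoff is an assumption.

```python
def keep_contig(depth, disable_mapping=False, min_contig_depth=None):
    """Return True if a contig passes the depth filter (illustrative sketch).

    `min_contig_depth=None` stands for "no explicit threshold given".
    """
    if min_contig_depth is not None:
        # Assumption: an explicit threshold is applied inclusively (>=)
        return depth >= min_contig_depth
    if disable_mapping:
        # Without read mapping: keep contigs with depth strictly above 1x
        return depth > 1.0
    # With Salmon mapping (the default): keep contigs with depth of at least 1.5x
    return depth >= 1.5
```

Under these assumptions, keep_contig(0.0, disable_mapping=True, min_contig_depth=0) returns True, which corresponds to the unfiltered behavior of previous versions.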

New in the extract module:

  • Options --nuc_depth_tolerance, --ptd_depth_tolerance, --mit_depth_tolerance, and --dna_depth_tolerance allow filtering contigs by depth of coverage during locus extraction. Among the contigs with hits to a particular marker type (e.g., nuclear), the median depth of coverage is calculated, and the tolerance factor sets the minimum (median / tolerance) and maximum (median * tolerance) depth allowed. The depth of coverage is taken from the contig names when they contain the pattern _cov_X.XX_.
  • To replicate the behavior of previous versions use --ignore_depth.
  • Added option --disable_stitching. By default, Captus recovers a locus across multiple contigs; this option forces the recovery of each locus from a single contig (for example, when providing chromosome-level genome assemblies).
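
The depth tolerance window described above (median / tolerance to median * tolerance) can be illustrated with a minimal Python sketch. It is not taken from the Captus source: the helper names, the example contig names, the inclusive bounds, and the choice to keep contigs whose names carry no depth are all assumptions made for illustration.

```python
import re
from statistics import median

# Matches the _cov_X.XX_ pattern in a contig name and captures the depth value
COV_PATTERN = re.compile(r"_cov_(\d+(?:\.\d+)?)_")


def contig_depth(name):
    """Return the depth encoded in a contig name, or None if the pattern is absent."""
    match = COV_PATTERN.search(name)
    return float(match.group(1)) if match else None


def depth_window(depths, tolerance):
    """Return (minimum, maximum) allowed depth: (median / tolerance, median * tolerance)."""
    mid = median(depths)
    return mid / tolerance, mid * tolerance


def filter_by_depth(names, tolerance):
    """Keep contigs whose depth lies inside the window (bounds assumed inclusive)."""
    depths = {name: contig_depth(name) for name in names}
    known = [d for d in depths.values() if d is not None]
    if not known:
        # No depth information in any name: nothing to filter on
        return list(names)
    low, high = depth_window(known, tolerance)
    # Assumption: contigs without a parsable depth are kept
    return [n for n in names if depths[n] is None or low <= depths[n] <= high]


# Hypothetical contig names with hits to one marker type
hits = [
    "c1_cov_10.00_x",
    "c2_cov_12.00_x",
    "c3_cov_100.00_x",
    "c4_cov_1.00_x",
    "c5_cov_11.00_x",
]
print(filter_by_depth(hits, tolerance=2))
```

In this example the median depth is 11, so a tolerance of 2 allows depths between 5.5 and 22; the contigs at 100x and 1x fall outside the window and are dropped.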

Other improvements or additions:

  • The accessory script filter_most_common_target_per_locus.py creates a new reference target file with only the most common target per locus found during the extraction step. This new reference target set can be used to re-extract the loci and potentially improve the informed paralog filtering.
  • All the reports have been updated to include the version and command of Captus used.
  • Updated installation instructions and documentation.
  • Some long output filenames have been shortened.