New in the assemble module:
- Contig depth of coverage is now calculated by mapping the reads back to the contigs using Salmon right after the assembly with MEGAHIT. This is now the default behavior unless --disable_mapping is enabled.
- The assembly is then automatically filtered by depth of contig, if --disable_mapping is used then only contigs with depth of coverage >1x are retained, otherwise contigs with depth of coverage >=1.5x are retained. The filtering threshold can be changed with --min_contig_depth.
- To replicate the behavior of previous versions use --disable_mapping and --min_contig_depth 0.
- The filtering can be repeated with --redo_filtering, without the need to reassemble, to try different values for --max_contig_gc and --min_contig_depth.
- The assembly HTML report has been completely rewritten to reflect these changes.
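
The default depth thresholds described above can be illustrated with a minimal Python sketch. This is not Captus code: the function name `filter_contigs_by_depth`, the dictionary input, and the choice to apply a user-supplied threshold inclusively are assumptions made only for illustration.

```python
from typing import Dict, Optional


def filter_contigs_by_depth(
    depths: Dict[str, float],
    mapping_disabled: bool,
    min_contig_depth: Optional[float] = None,
) -> Dict[str, float]:
    """Return the contigs that pass the depth-of-coverage filter.

    depths maps a contig name to its depth of coverage (hypothetical input).
    """
    if min_contig_depth is not None:
        # Assumption: a user-supplied threshold is applied inclusively.
        return {n: d for n, d in depths.items() if d >= min_contig_depth}
    if mapping_disabled:
        # Default without mapping: retain contigs with depth > 1x.
        return {n: d for n, d in depths.items() if d > 1.0}
    # Default with mapping: retain contigs with depth >= 1.5x.
    return {n: d for n, d in depths.items() if d >= 1.5}


if __name__ == "__main__":
    example = {"a": 1.0, "b": 1.2, "c": 1.5, "d": 4.0}
    print(sorted(filter_contigs_by_depth(example, mapping_disabled=True)))
    print(sorted(filter_contigs_by_depth(example, mapping_disabled=False)))
```

Passing `min_contig_depth=0` in this sketch keeps every contig, which mirrors the idea of turning the depth filter off.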
New in the extract module:
- Options --nuc_depth_tolerance, --ptd_depth_tolerance, --mit_depth_tolerance, and --dna_depth_tolerance allow filtering contigs by depth of coverage during locus extraction. Among the contigs with hits to a particular marker type (e.g., nuclear), the median depth of coverage is calculated, and the tolerance factor sets the minimum (median / tolerance) and maximum (median * tolerance) depth allowed. The depth of coverage is taken from the contig names when they contain the pattern _cov_X.XX_.
- To replicate the behavior of previous versions, use --ignore_depth.
- Added option --disable_stitching. By default, Captus recovers a locus across multiple contigs; this option forces the recovery of a locus from a single contig (for example, when providing chromosome-level genome assemblies).
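
The depth tolerance arithmetic described above (minimum = median / tolerance, maximum = median * tolerance) can be sketched in Python as follows. This is an illustration, not the Captus implementation: the contig names, the regular expression used to read the _cov_X.XX_ pattern, the inclusive bounds, and the decision to keep contigs whose names lack the pattern are all assumptions.

```python
import re
from statistics import median
from typing import List, Optional, Tuple

# Assumed regular expression for the "_cov_X.XX_" pattern in contig names.
COV_PATTERN = re.compile(r"_cov_(\d+(?:\.\d+)?)_")


def depth_from_name(name: str) -> Optional[float]:
    """Read the depth of coverage from a contig name, or None if absent."""
    match = COV_PATTERN.search(name)
    return float(match.group(1)) if match else None


def depth_bounds(depths: List[float], tolerance: float) -> Tuple[float, float]:
    """Return (median / tolerance, median * tolerance)."""
    med = median(depths)
    return med / tolerance, med * tolerance


def filter_by_tolerance(names: List[str], tolerance: float) -> List[str]:
    """Keep contigs whose depth lies within the tolerance bounds.

    Assumption: contigs without a depth in their name are kept unfiltered.
    """
    depths = {name: depth_from_name(name) for name in names}
    known = [d for d in depths.values() if d is not None]
    if not known:
        return list(names)
    low, high = depth_bounds(known, tolerance)
    return [n for n in names if depths[n] is None or low <= depths[n] <= high]


if __name__ == "__main__":
    # Hypothetical names of contigs with hits to a single marker type.
    contigs = [
        "c1_cov_10.00_len_500",
        "c2_cov_12.00_len_800",
        "c3_cov_100.00_len_300",
        "c4_cov_1.00_len_250",
    ]
    print(filter_by_tolerance(contigs, tolerance=2.0))
```

With these hypothetical depths the median is 11, so a tolerance of 2 allows depths between 5.5 and 22, and only the first two contigs are kept.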
Other improvements or additions:
- The accessory script filter_most_common_target_per_locus.py creates a new reference target file with only the most common target per locus found during the extraction step. This new reference target set can be used to re-extract the loci and potentially improve the informed paralog filtering.
- All the reports have been updated to include the version and command of Captus used.
- Updated installation instructions and documentation.
- Some long output filenames have been shortened.
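
The idea of selecting the most common target per locus can be sketched in a few lines of Python. This does not reproduce filter_most_common_target_per_locus.py: the input format (a list of locus and target pairs, one per extraction hit), the function name, and the alphabetical tie-breaking rule are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def most_common_target_per_locus(hits: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each locus to the target that was matched most often.

    hits is a hypothetical iterable of (locus, target) pairs, one per sample.
    Assumption: ties are broken alphabetically by target name.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for locus, target in hits:
        counts[locus][target] += 1
    return {
        # Highest count first; alphabetical order decides ties.
        locus: min(counter.items(), key=lambda item: (-item[1], item[0]))[0]
        for locus, counter in counts.items()
    }


if __name__ == "__main__":
    example_hits = [
        ("L1", "A"), ("L1", "B"), ("L1", "A"),
        ("L2", "C"), ("L2", "B"),
    ]
    print(most_common_target_per_locus(example_hits))
```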