Mabs 2.24

@shelkmike shelkmike released this 10 Aug 11:58
· 22 commits to master since this release
50a318b
  1. The optimization algorithm of Mabs-flye has been changed.
    Previously, Mabs-flye assumed that the Flye parameters "assemble_ovlp_divergence" and "repeat_graph_ovlp_divergence" were equal to each other, and optimized the resulting single parameter (called "max_divergence" in Mabs-flye) using the golden section method. Now Mabs-flye optimizes these two parameters independently using the Nelder-Mead method. This allows for a more thorough exploration of the parameter space.

    By default, Mabs-flye tests at most 10 points in the two-dimensional parameter space. One of the starting points corresponds to the default values of "assemble_ovlp_divergence" and "repeat_graph_ovlp_divergence" used by Flye.

    Fewer than 10 points may be tested if the optimization algorithm (the function scipy.optimize.minimize of the Python library SciPy) decides that convergence has been achieved.

    The maximum number of points to try can be set with a new Mabs-flye parameter "--maximum_number_of_points_to_try". Increasing the value of this parameter will increase the computation time of Mabs-flye, but may improve the assembly. By default, the value of "--maximum_number_of_points_to_try" is 10.
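
    The following is a minimal sketch of a two-parameter Nelder-Mead search with a capped evaluation budget, not Mabs-flye's actual code. The objective `negative_quality` is a toy stand-in for "run Flye and score the assembly", and the starting point `START` is a hypothetical placeholder, not the real Flye defaults. Only `scipy.optimize.minimize` with `method="Nelder-Mead"` and its `maxfev`, `xatol` and `fatol` options are real SciPy features; the tolerance values are assumptions.

```python
from scipy.optimize import minimize

# Hypothetical starting point (NOT the real Flye default values).
START = (0.10, 0.10)
# Mirrors the default of "--maximum_number_of_points_to_try".
MAX_POINTS = 10

tried = []  # every (point, score) pair the optimizer evaluated


def negative_quality(point):
    """Toy objective to minimize. A real objective would run Flye with
    these two divergence values and return a score derived from the
    assembly quality; here a simple bowl-shaped function is assumed."""
    assemble_div, repeat_graph_div = (float(v) for v in point)
    score = (assemble_div - 0.07) ** 2 + (repeat_graph_div - 0.12) ** 2
    tried.append(((assemble_div, repeat_graph_div), score))
    return score


result = minimize(
    negative_quality,
    x0=START,
    method="Nelder-Mead",
    # maxfev limits the number of objective evaluations; the search may
    # stop earlier if the xatol/fatol convergence criteria are met.
    options={"maxfev": MAX_POINTS, "xatol": 1e-3, "fatol": 1e-6},
)

# Keep the best point actually evaluated, whatever the stopping reason.
best_point, best_score = min(tried, key=lambda item: item[1])
print(f"points tried: {len(tried)}")
print(f"best point: {best_point}, score: {best_score:.6f}")
```

    SciPy uses the supplied `x0` as the first vertex of the initial simplex, which is why the first evaluated point in this sketch is the starting point itself.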

  2. The optimization algorithm of Mabs-hifiasm has been changed.
    Previously, if two "-s" values of Hifiasm led to the same AG, Mabs-hifiasm considered the assembly with the smaller "-s" better (a smaller "-s" corresponds to stricter pruning of possible haplotypic duplications). Now, if two "-s" values lead to the same AG, Mabs-hifiasm prefers the assembly with the larger N50. If both assemblies also have the same N50 then, as previously, the assembly with the smaller "-s" is preferred.
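
    The tie-breaking order described in this item can be sketched as a sort key. This is an illustration only, assuming that a larger AG is better; the `Candidate` type, the `pick_best` helper and all numbers are hypothetical and are not taken from Mabs-hifiasm.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    s: float    # value passed to Hifiasm via "-s"
    ag: float   # AG of the resulting assembly (assumed: larger is better)
    n50: int    # N50 of the resulting assembly (larger is better)


def pick_best(candidates):
    """Prefer the highest AG, then the highest N50, then the smallest -s."""
    return min(candidates, key=lambda c: (-c.ag, -c.n50, c.s))


# Made-up values, chosen only to exercise both tie-breaking rules.
candidates = [
    Candidate(s=0.55, ag=900, n50=2_000_000),
    Candidate(s=0.45, ag=900, n50=1_500_000),  # same AG, smaller N50
    Candidate(s=0.35, ag=900, n50=2_000_000),  # same AG and N50, smaller -s
    Candidate(s=0.25, ag=850, n50=3_000_000),  # lower AG loses despite N50
]

best = pick_best(candidates)
print(best)
```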

  3. Previously, Mabs-flye provided all reads to Flye via the option "--nano-raw". Now, Mabs-flye uses the Phred quality score lines of the FASTQ file to calculate the accuracy of each read, and then calculates the median accuracy among all reads.
    The median accuracy determines the option with which Mabs-flye provides reads to Flye:
    (0%; 95%] - "--nano-raw"
    (95%; 97%] - "--nano-hq"
    (97%; 99%] - "--nano-corr"
    (99%; 100%] - "--pacbio-hifi"

    To speed up the read accuracy calculation, Mabs-flye calculates the accuracy only for BUSCO reads (i.e. reads that correspond to BUSCO genes).
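
    A minimal sketch of this selection logic follows. The release notes do not state how Mabs-flye converts Phred scores into a per-read accuracy, so the formula below (one minus the mean per-base error probability, with Phred+33 encoding) is an assumption, and the function names `read_accuracy` and `flye_read_option` are hypothetical. Only the thresholds and the Flye option names come from the table above.

```python
import statistics


def read_accuracy(quality_line, phred_offset=33):
    """Accuracy of one read in percent, from its non-empty FASTQ quality
    line. Assumed formula: 100 * (1 - mean per-base error probability),
    where the error probability of a base is 10 ** (-Q / 10)."""
    error_probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_line]
    return 100.0 * (1.0 - sum(error_probs) / len(error_probs))


def flye_read_option(median_accuracy):
    """Map a median accuracy (percent) to a Flye read-type option,
    using the half-open intervals (lower; upper] from the table."""
    if median_accuracy <= 95:
        return "--nano-raw"
    if median_accuracy <= 97:
        return "--nano-hq"
    if median_accuracy <= 99:
        return "--nano-corr"
    return "--pacbio-hifi"


# Made-up quality lines: "+" encodes Q10, "5" encodes Q20, "I" encodes Q40.
quality_lines = ["+" * 20, "5" * 20, "I" * 20]
median_accuracy = statistics.median(read_accuracy(q) for q in quality_lines)
print(f"{median_accuracy:.2f}% -> {flye_read_option(median_accuracy)}")
```

    In Mabs-flye this calculation is restricted to BUSCO reads, as noted above; the sketch simply takes whatever quality lines it is given.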

  4. Mabs-hifiasm is now based on Hifiasm 0.19.5 instead of Hifiasm 0.19.3.

  5. Several small changes in names of subfolders and files produced by Mabs have been made.

  6. Several bugs that slightly decreased the assembly accuracy have been fixed.