Commit
Extensive changes aimed at speeding up tree loading & parameter optimisation (~2x) & tree search (~2x) by "serialising" their memory access patterns to improve spatial and temporal locality of reference.

1. PhyloTree alignment summaries may now be "borrowed" from another tree that has the same alignment. The relevant member is isSummaryBorrowed; if it is true, this instance doesn't own the summary, it is only "borrowing" a reference to a summary owned by another tree.

2. PhyloTree member functions copyPhyloTree and copyPhyloTreeMixlen take an extra parameter indicating whether the copy is to "borrow" the alignment summary of the original (if it has one). This matters a lot for ratefree.cpp and +R free rate models, and modelmixture.cpp! The temporary copies of the phylo tree that are used during parameter optimisation can now re-use the AlignmentSummary of the original, which means they can "linearise" their memory access to sites when they are optimising branch lengths (see changes listed below, e.g. #4, #5, #6, #7).

3. PhyloTree::setAlignment does its "name check" a different way. Rather than finding each sequence by name by scanning the tree, it asks MTree::getMapOfTaxonNameToNode for a name to leaf node map, and checks the array of sequence names against the map (updating the id on the node for each hit). The new approach is equivalent but is much faster, O(n log n) rather than O(n^2). This speeds up tree loads markedly (particularly for large trees), but it matters most for free rate parameter optimisation (on middling inputs the old check was a significant factor: about 10% of parameter optimisation time). The new check can be turned off by changing the FAST_NAME_CHECK symbol.

4. IQTree::optimizeModelParameters now calls prepareToComputeDistances(), so that AlignmentSummary's matrix of converted sequences will be available to be borrowed via calls to PhyloTree::copyPhyloTree (see change #2 above, and changes #5 through #7 below). Likewise IQTree::doNNISearch (so changes #5, #8 help tree searches too).

5. AlignmentPairwise::computeFunction and AlignmentPairwise::computeFuncDerv can now make use of AlignmentSummary's "matrix of converted sequences" (if it is available) via PhyloTree's accessor methods, e.g. PhyloTree::getConvertedSequenceByNumber(). For this to work as expected, callers have to ask AlignmentSummary to construct that matrix *including* even sites where there is no variety at all (I added the keepBoringSites parameter on the AlignmentSummary constructor for this).

6. RateMeyerDiscrete::computeFunction and RateMeyerDiscrete::computeFuncDerv likewise. And RateMeyerDiscrete::normalizeRates can make use of the "flat" frequency array exposed by PhyloTree::getConvertedSequenceFrequencies() too.

7. PhyloTree::computePartialLikelihoodGenericSIMD (in phylokernelnew.h) makes use of the matrix of converted sequences (if one is available), in about six (!) different places. In terms of actual effect, this is the most important change in this commit, but it needs changes #1, #2, and #4 committed too, if it is to have any effect. This change speeds up both parameter optimisation and tree searching significantly.

8. As well as inv_eigenvectors, there is now an inv_eigenvectors_transposed (using the transpose makes for some faster multiplications; see change #9 listed below). ModelMarkov::calculateSquareMatrixTranspose is used to calculate the transpose of the inverse eigenvectors. Unpleasant consequence: ModelMarkov::update_eigen_pointers has to take an extra parameter. Keeping this additional member set correctly is the only thing that forced changes to modelpomomixture.cpp (and .h), modelset.cpp, and modelsubst.h.

9. ModelMarkov::computeTransMatrix and ModelMarkov::computeTransDerv now use (a) calculateExponentOfScalarMultiply and (b) aTimesDiagonalBTimesTransposeOfC to calculate transition matrices (this is quite a bit faster than the Eigen code, since it doesn't bother to construct the diagonal matrix B.asDiagonal()...). (a) and (b) and the supporting functions, calculateHadamardProduct and dotProduct, are (for now) members of ModelMarkov.

10. Minor tweaks to vector processing code in phylokernelnew.h:
    (a) dotProductVec: hand-unrolled treatment of the V array.
    (b) dotProductPairAdd: the last item (in A and B) is treated as the special case when handling an odd number of items. Possibly the treatment of the AD and BD arrays should be hand-unrolled here, too, but I haven't tried that yet.
    (c) dotProductTriple: checking for odd uses & rather than % (faster!).

11. The aligned_free free function (from phylotree.h ?!) does the "pointer null?" check itself, and (because it takes a T*& rather than a T*) can itself set the pointer to nullptr. This means that client code that used to go...

        if (x) { aligned_free(x); x=NULL; }

    ... can now be simplified to just...

        aligned_free(x);

12. Next to it (in phylotree.h), there is now an ensure_aligned_allocated function. That lets you replace code like this:

        if (!eigenvalues) eigenvalues = aligned_alloc<double>(num_states);

    with:

        ensure_aligned_allocated(eigenvalues, num_states);

    which is, I reckon, more readable.

13. In many places where there was code of the form...

        if (x) { delete x; }

    I have replaced it with delete x (likewise delete [] x). Deleting a null pointer is a no-op (the C++ standard requires that), so "rolling your own check" merely duplicates the check that delete will do anyway! I've made similar "don't bother to check for null" changes in some other files that I haven't included in this commit (since there aren't any *material* changes to anything in those files).
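
A minimal sketch of the "borrowing" idea from changes #1 and #2. The Tree and Summary types and every method name here are hypothetical stand-ins, not the real PhyloTree or AlignmentSummary API; only the isSummaryBorrowed flag is named in the text above. The point is just that a borrower shares the owner's pointer and never deletes it.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for AlignmentSummary.
struct Summary {
    std::vector<int> convertedSites;
};

// Hypothetical stand-in for PhyloTree; only the ownership logic is shown.
class Tree {
public:
    Tree() = default;
    Tree(const Tree&) = delete;
    Tree& operator=(const Tree&) = delete;
    ~Tree() { releaseSummary(); }

    // Build a summary that this tree owns.
    void buildSummary(std::vector<int> sites) {
        releaseSummary();
        summary = new Summary{std::move(sites)};
        isSummaryBorrowed = false;
    }

    // Share the owner's summary. The owner must outlive this tree.
    void borrowSummaryFrom(const Tree& owner) {
        assert(&owner != this);
        releaseSummary();
        summary = owner.summary;
        isSummaryBorrowed = (summary != nullptr);
    }

    const Summary* getSummary() const { return summary; }
    bool borrowsSummary() const { return isSummaryBorrowed; }

private:
    // Delete the summary only if this tree owns it.
    void releaseSummary() {
        if (!isSummaryBorrowed) {
            delete summary;
        }
        summary = nullptr;
        isSummaryBorrowed = false;
    }

    Summary* summary = nullptr;
    bool isSummaryBorrowed = false;
};
```

A temporary copy created during parameter optimisation would call something like borrowSummaryFrom(original) and go out of scope without touching the original's summary.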
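
The name check in change #3 can be sketched roughly like this. LeafNode, mapNamesToLeaves and assignSequenceIds are hypothetical names for illustration (the real code uses MTree::getMapOfTaxonNameToNode, whose signature isn't shown in this message), and std::map is an assumption that is consistent with the O(n log n) figure.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a leaf node of the tree.
struct LeafNode {
    std::string name;
    int id = -1;  // index of the matching sequence, -1 if none matched
};

// Build the name -> leaf map once: n insertions at O(log n) each.
std::map<std::string, LeafNode*> mapNamesToLeaves(std::vector<LeafNode>& leaves) {
    std::map<std::string, LeafNode*> byName;
    for (LeafNode& leaf : leaves) {
        bool inserted = byName.emplace(leaf.name, &leaf).second;
        assert(inserted && "duplicate taxon name in tree");
        (void)inserted;  // silence the unused warning when NDEBUG is set
    }
    return byName;
}

// Look each sequence name up in the map and update the id on the node for
// each hit. Returns how many sequence names had no leaf in the tree.
int assignSequenceIds(const std::vector<std::string>& seqNames,
                      std::vector<LeafNode>& leaves) {
    std::map<std::string, LeafNode*> byName = mapNamesToLeaves(leaves);
    int missing = 0;
    for (std::size_t i = 0; i < seqNames.size(); ++i) {
        auto hit = byName.find(seqNames[i]);
        if (hit == byName.end()) {
            ++missing;
        } else {
            hit->second->id = static_cast<int>(i);
        }
    }
    return missing;
}
```

The old approach would have scanned every leaf for every sequence name; here each lookup is a single map search.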
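
To illustrate changes #8 and #9: a rough sketch of computing A * diag(b) * C^T without ever building the diagonal matrix. The function names and signatures below are hypothetical (the real members are calculateExponentOfScalarMultiply and aTimesDiagonalBTimesTransposeOfC, whose signatures aren't given here), and row-major n x n storage is an assumption. Passing C in already-transposed form means the inner loop walks a row of A and a row of C, both contiguous.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns a vector whose k-th entry is exp(values[k] * scalar), e.g. the
// eigenvalues scaled by a branch length.
std::vector<double> expOfScalarMultiply(const std::vector<double>& values,
                                        double scalar) {
    std::vector<double> result(values.size());
    for (std::size_t k = 0; k < values.size(); ++k) {
        result[k] = std::exp(values[k] * scalar);
    }
    return result;
}

// result = A * diag(b) * C^T, with A, C and result stored row-major (n x n),
// where n = b.size(). Entry (i,j) is sum over k of A[i][k] * b[k] * C[j][k].
std::vector<double> timesDiagonalTimesTranspose(const std::vector<double>& a,
                                                const std::vector<double>& b,
                                                const std::vector<double>& c) {
    const std::size_t n = b.size();
    assert(a.size() == n * n && c.size() == n * n);
    std::vector<double> result(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const double* rowA = a.data() + i * n;
        for (std::size_t j = 0; j < n; ++j) {
            const double* rowC = c.data() + j * n;
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += rowA[k] * b[k] * rowC[k];
            }
            result[i * n + j] = sum;
        }
    }
    return result;
}
```

With A holding the eigenvectors, b the output of expOfScalarMultiply, and C the transposed inverse eigenvectors, the result would be a transition matrix for that branch length.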
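
For changes #11 and #12, the helpers might look something like the following. This is a sketch under assumptions, not the code in phylotree.h: the bodies are guesses, the allocator assumes POSIX posix_memalign with 64-byte alignment, and everything sits in a hypothetical sketch namespace so it can't clash with the standard library's aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>  // posix_memalign (POSIX), free

namespace sketch {

// Assumed allocator: uninitialised, 64-byte aligned storage for `count` T's.
template <class T>
T* aligned_alloc(std::size_t count) {
    void* mem = nullptr;
    if (posix_memalign(&mem, 64, count * sizeof(T)) != 0) {
        return nullptr;
    }
    return static_cast<T*>(mem);
}

// Takes the pointer by reference, so it can do the null check itself
// and reset the caller's pointer afterwards.
template <class T>
void aligned_free(T*& ptr) {
    if (ptr != nullptr) {
        std::free(ptr);
        ptr = nullptr;
    }
}

// Allocates only if nothing has been allocated yet.
template <class T>
void ensure_aligned_allocated(T*& ptr, std::size_t count) {
    if (ptr == nullptr) {
        ptr = aligned_alloc<T>(count);
        assert(ptr != nullptr && "aligned allocation failed");
    }
}

} // namespace sketch
```

Because both helpers take a T*&, calling either of them twice in a row is harmless: the second ensure_aligned_allocated keeps the existing block, and the second aligned_free sees nullptr and does nothing.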