Major code refactoring and various cosmetic changes #11

StefanFlaumberg · 2024-09-07T23:59:58Z

This pull request depends on the noahares/GeneRaxCore#1 PR

Together they introduce the following changes:

Make code more logically organized, more unified and less complicated wherever possible
Fix the gradient search parameter setting in the thorough rate optimization mode
Fix the alpha parameter optimization in the rate mixture mode (speciation rate categories)
Fix internal species branch label update on the species tree topology changes
Introduce checkpoints for the filtered families, the mixture alpha and the highways (and they do work!)
Restrict using checkpoint if the run options are changed
Restrict concomitant usage of incompatible options and superfluous MPI rank usage (fixes [BUG] Stalling while using MPI #10 )
Update the help message
Add a newline char to the end of output files wherever it's missing
Give better control over the optimization verbosity
Make logs more pretty

StefanFlaumberg · 2024-09-08T03:08:59Z

Hi Benoit,

I have restricted some options not to be used together, but I am not completely sure regarding the following 2 points:

The RELDATED transfer constraint:
Tree topology search should clearly mix up the species relative ranks, so the HYBRID search seems to be logically incompatible with the constraint. One wouldn't either start a REROOT search from a dated species tree, but the root search algorithm itself implements some kind of dating search, so one can use the REROOT search to get a properly rooted and dated species tree.
Thus, I decided under the RELDATED contraint to restrict usage of the HYBRID search, but to allow for usage of the REROOT search if the --infer-speciation-order option is additionally invoked.
The PER-SPECIES rates:
Your wiki states that "different rates for different species are not compatible with species tree inference". However, the HYBRID tree search runs seemingly ok both with PER-SPECIES and ORIGINATION-PER-SPECIES rates (why shouldn't it?).
Thus, I decided to restrict only concomitant usage of tree search (both HYBRID and REROOT) and model parametrization with a custom file (for an obvious reason: the file uses ancestral clade names). All the other combinations are allowed.

Could you, please, tell if this looks sensible?

Best,
Stefan

noahares · 2024-10-03T07:40:06Z

Hi Stefan,

I will be continuing Benoit's work with AleRax. Thanks for opening this PR! I will look into your changes and questions.

Best,
Noah

StefanFlaumberg · 2024-10-08T21:11:08Z

Hi Noah,

It's good to know that AleRax will be actively maintained again!

Since I didn't expect any updates coming, I've been doing major code refactoring for a couple of months (fixing a number of small bugs, code structure optimization, making output more fancy and stuff mentioned in this PR description) and was going to publish the changes within this PR in the coming weeks.

Now I'll have to update my changes by including your recent commits and hope to publish the final version of the PR by the end of the next week.

Best,
Stefan

…ckpoints for the mixture alpha and highways

StefanFlaumberg · 2024-10-28T03:12:34Z

Dear @noahares and @BenoitMorel ,

The PR is ready for review.

I started the work in August. The main goal was refactoring the code to make it more logically organized, succinct and readable. Some functions changed their names, some were merged, splitted or even removed, but the algorithmic side is left 99% intact. The 1% goes to fixing small bugs like making the optimized rate gamma categories to be included to the actual evaluators, proper speciation date backuping, proper branch label update on changing topology etc. The output became more user-friendly, as the help menu did. Incompatible options now cannot be run together leading to explicit error messages. Checkpointing became more error-proof by remembering the filtered families, the rate mixture parameter and highways and by prohibiting running with changed options.

The code has been eye-checked countless times, and test runs in various modes have not uncovered any errors or unexpected LL changes compared to the current version performance. I have also run tests using corrected reconciliation models (not part of this PR, see below) to ensure everything works fine even where the current version fails.
It is huge, but should be fine. I will be rather surprised if any undetected error caused by my amends eventually shows up.

The only obvious problem we have is the style discrepancy. Most of the changes were done before Noah's publishing of the restyled code. Since my amendments are numerous and quite diffuse, I could not reproduce them on the current restyled code version from scratch. Thus, I only managed to include your HGT highway search improvements to my code, yet left all the code mostly in the original style of Benoit (barring extra whitespaces and newlines).
Personally, I find this original style more easy-to-read. But I totally acknowledge the advantages of adhering to the existing coding standards, so if you consider it critical, I could rewrite my version in the new style in some time. Yet, I would rather postpone the rewriting since I also have some changes to propose fixing quite serious algorithmic flaws in the reconciliation model files and these should be of higher priority. I'll publish the fixed models in a separate PR a bit later.

If you find something in the code that doesn't seem right, please contact me back :)

With best hopes for understanding,
Stefan

noahares · 2024-10-30T06:54:04Z

Hi @StefanFlaumberg,

Thank you very much for the work you put into this! I will have a look into the changes as soon as possible. I am currently working on some improvements and feature extensions myself which are semi time-critical so I hope to merge your changes before I work on mine (or deal with merging both later). The "restyling" is exclusively from the clangd-format file so unifying the styles should not be an issue. If you have suggestions to changes to the style file, let me know.

Best
Noah

StefanFlaumberg · 2024-10-31T04:39:44Z

Hi @noahares ,
Thank you for the reply!

I didn't know about automatically formatting the code. Well, great that no one will have to reformat the changes manually. I'm fine with that)

Considering your time constraints, just to synchronize:
Sorry for the inconvenience. I think I'll be ready to add another PR fixing the reconciliation models at the beginning of next week. While the current PR and the one covering the models substantially alter the code, they also fix the most prominent algorithmic flaws of the tool. There are also some other minor problems I've found, but they cannot significantly affect performance, so fixing them can wait. I believe that implementing and testing new features might be more efficient with the old flaws covered if possible (for example, the current model in certain cases estimates LL > 0, which I'm fixing in the coming PR).

Best wishes,
Stefan

StefanFlaumberg · 2024-11-19T20:52:23Z

Hi @noahares ,

Hope your work on the program is going well. Since it's already been 3 weeks, I just wanted to kindly remind you of this PR and to inform that I'm also ready with another PR fixing the reconciliation models. Could you please roughly outline when you think you'll be able to check it?
As I consider using AleRax in our group's projects, I find it important that the program have as few bugs as possible. And other users would probably desire the same from the program main version.

Thank you for your time!

Best,
Stefan

noahares · 2024-11-20T13:02:08Z

Hi @StefanFlaumberg,

My current workload and travels did not yet allow for sitting down to fully inspect the changes. I hope you understand that even though most changes are simple, I want to look through them closely for a codebase I am familiarizing myself with at the moment and plan to maintain in the long run.
I hope to look at it in the upcoming weeks.

Best
Noah

StefanFlaumberg closed this Oct 27, 2024

StefanFlaumberg force-pushed the main branch from 1c80ed3 to 790d110 Compare October 27, 2024 19:06

StefanFlaumberg added 3 commits October 28, 2024 03:52

Major refactoring and better checkpointing

8b6404b

Fix rate mixture optimization, fix mpi freezing on reldating, add che…

00499aa

…ckpoints for the mixture alpha and highways

Better output files

6a5b008

StefanFlaumberg reopened this Oct 28, 2024

StefanFlaumberg mentioned this pull request Oct 28, 2024

Auxiliary changes for AleRax refactoring noahares/GeneRaxCore#1

Open

StefanFlaumberg marked this pull request as ready for review October 28, 2024 03:01

StefanFlaumberg changed the title ~~Various cosmetic changes~~ Major code refactoring and various cosmetic changes Oct 28, 2024

Merge branch 'main' into main

5438068

Fix parameter indexing in checkpoints

4690892

StefanFlaumberg mentioned this pull request Nov 29, 2024

Assertion false failed error in UndatedDTLMultiModel.hpp #16

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Major code refactoring and various cosmetic changes #11

Major code refactoring and various cosmetic changes #11

StefanFlaumberg commented Sep 7, 2024 •

edited

Loading

StefanFlaumberg commented Sep 8, 2024

noahares commented Oct 3, 2024

StefanFlaumberg commented Oct 8, 2024

StefanFlaumberg commented Oct 28, 2024

noahares commented Oct 30, 2024

StefanFlaumberg commented Oct 31, 2024

StefanFlaumberg commented Nov 19, 2024 •

edited

Loading

noahares commented Nov 20, 2024

Major code refactoring and various cosmetic changes #11

Are you sure you want to change the base?

Major code refactoring and various cosmetic changes #11

Conversation

StefanFlaumberg commented Sep 7, 2024 • edited Loading

StefanFlaumberg commented Sep 8, 2024

noahares commented Oct 3, 2024

StefanFlaumberg commented Oct 8, 2024

StefanFlaumberg commented Oct 28, 2024

noahares commented Oct 30, 2024

StefanFlaumberg commented Oct 31, 2024

StefanFlaumberg commented Nov 19, 2024 • edited Loading

noahares commented Nov 20, 2024

StefanFlaumberg commented Sep 7, 2024 •

edited

Loading

StefanFlaumberg commented Nov 19, 2024 •

edited

Loading