Refactor and cleanup tree finetuning code #1250
"Is leaf and not root" should always hold for a leaf in a tree with more than one node, so the root check is probably redundant. Old version: nextclade/packages_rs/nextclade/src/tree/tree_builder.rs Lines 173 to 175 in 15757ec
I kept it as is in the new version, but I think we can remove the check for root, because nextclade/packages_rs/nextclade/src/tree/tree_builder.rs Lines 207 to 209 in a442999
(Seems |
Check for a potential bug here. For example, it might be that the original code meant to do something else. This is a round trip: node -> node key -> node, equivalent to a no-op: nextclade/packages_rs/nextclade/src/tree/tree_builder.rs Lines 169 to 171 in 15757ec
In the new version I replaced it with just the variable itself:
|
The control flow here is very convoluted:
Can this be simplified? Old version: nextclade/packages_rs/nextclade/src/tree/tree_builder.rs Lines 159 to 182 in 15757ec
In the new version, I kept it mostly as is, but it looks simpler (due to removal of assignment to mutable vars): nextclade/packages_rs/nextclade/src/tree/tree_builder.rs Lines 188 to 213 in a442999
|
The refactoring attempt in this PR involved extracting new scopes (functions) which breaks continuity of the original large scope, which happens to have mutable state. Because the original code relied on modifying the state a lot, I might have introduced bugs where the variables are not updated properly anymore. This needs to be double-checked. |
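To make the concern above concrete, here is a minimal sketch of how an extracted function can silently drop a state update. The `State` struct and both functions are hypothetical and are not taken from `tree_builder.rs`; the sketch only assumes that some loop state is cheap to copy.

```rust
/// Hypothetical loop state (illustrative only, not the real nextclade types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct State {
    best_node: usize,
    n_shared: usize,
}

/// Pitfall: `State` is `Copy`, so this function mutates its own copy.
/// The caller's state is left unchanged and the update is lost.
fn step_lossy(mut state: State, candidate: usize, n_shared: usize) {
    if n_shared > state.n_shared {
        state.best_node = candidate;
        state.n_shared = n_shared;
    }
}

/// Safer shape: the extracted function returns the new state,
/// so the caller has to decide what to do with it.
fn step(state: State, candidate: usize, n_shared: usize) -> State {
    if n_shared > state.n_shared {
        State { best_node: candidate, n_shared }
    } else {
        state
    }
}
```

Returning the new state (or taking `&mut State`) makes every update visible at the call site, which is one way to double-check that no assignment was lost during extraction.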
By checking first whether the candidate_node equals the current best node, the set of conditionals gets simplified and multiple previous conditions are combined into one. There are now three possible results:
- move to the parent node. Happens whenever best_node==candidate_node and all mutations that lead to the candidate_node are also found in the private mutations of best_node
- move to a child (candidate_node!=best_node). This only happens when there is at least one mutation shared.
- stay: in this case, the final placing is either a direct child of candidate_node, or splits the branch leading to it.
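The three outcomes described in this commit message can be sketched as a small decision function. This is only an illustration: the `Step` enum, the `next_step` function, and its boolean/count parameters are hypothetical names, and the extra requirement that at least one mutation is shared before moving to the parent is an assumption, not something confirmed by the PR.

```rust
/// Hypothetical outcome of one fine-tuning iteration (names are illustrative).
#[derive(Debug, PartialEq, Eq)]
enum Step {
    MoveToParent,
    MoveToChild,
    Stay,
}

/// Decide the next step.
/// - `candidate_is_best`: whether candidate_node == best_node
/// - `n_shared_muts`: branch mutations also present in the private mutations
/// - `n_left_muts`: branch mutations NOT present in the private mutations
fn next_step(candidate_is_best: bool, n_shared_muts: usize, n_left_muts: usize) -> Step {
    if candidate_is_best {
        // Assumption: require at least one shared mutation, so that an
        // empty branch does not trigger a move to the parent.
        if n_shared_muts > 0 && n_left_muts == 0 {
            Step::MoveToParent
        } else {
            Step::Stay
        }
    } else if n_shared_muts > 0 {
        // A different candidate is only followed when something is shared.
        Step::MoveToChild
    } else {
        Step::Stay
    }
}
```

For example, `next_step(true, 2, 0)` yields `Step::MoveToParent`, while `next_step(true, 1, 1)` yields `Step::Stay`, which corresponds to splitting the branch or attaching as a direct child.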
- replace `.left.nuc_muts.is_empty()` with a check for `n_left_muts==0`, which is calculated using the same function as `n_shared_muts`.
- fix left/right error in split result when attaching to the root
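A rough sketch of why counting both sides with the same function matters. Everything here is hypothetical: the `Mut` and `Split` types, the `split` helper, and in particular the assumption that the counting function ignores some entries (mutations to `N` in this example). Under that assumption, a list can be non-empty and still count as zero, so `is_empty()` and `count == 0` may disagree. The left/right orientation used here (left = branch only, right = private only) is also an assumption.

```rust
/// Illustrative mutation: position plus derived state (not the real nextclade type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Mut {
    pos: usize,
    qry: char,
}

/// Illustrative split result of branch mutations vs. private mutations.
#[derive(Debug, Default)]
struct Split {
    left: Vec<Mut>,   // only on the branch
    shared: Vec<Mut>, // on the branch and in the private mutations
    right: Vec<Mut>,  // only in the private mutations
}

/// Partition the two mutation lists into left / shared / right.
fn split(branch: &[Mut], private: &[Mut]) -> Split {
    let mut s = Split::default();
    for m in branch {
        if private.contains(m) {
            s.shared.push(m.clone());
        } else {
            s.left.push(m.clone());
        }
    }
    for m in private {
        if !branch.contains(m) {
            s.right.push(m.clone());
        }
    }
    s
}

/// Hypothetical counter, used for both the shared and the left side.
/// Assumption for illustration: mutations to 'N' are not counted.
fn count_muts(muts: &[Mut]) -> usize {
    muts.iter().filter(|m| m.qry != 'N').count()
}
```

With a branch carrying `A` at position 10 and `N` at position 20, and private mutations `A` at 10 and `T` at 30, the left side is non-empty but `count_muts` on it returns 0, so the two checks give different answers.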
Followup of #1249
Inspired by the refactoring in 15757ec (#1249), I decided to continue, because the code in `finetune_nearest_node()` can be clearly divided into independent functions.
Change of terminology, to clarify the meaning of variables:
- `current_best_node` -> `best_node` (this is the loop's external state)
- `best_node` -> `candidate_node`
- `best_split_result` -> `candidate_split`
I tried to make each commit contain only one minimal change which compiles and runs. Plus there are a few pink elephants I encountered, reported in the PR comments below.