Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorder atoms in label data (Fixes 502) + documenting C++ #521

Merged
merged 21 commits into from
Sep 23, 2024

Conversation

ndickson-nvidia
Copy link
Collaborator

@ndickson-nvidia ndickson-nvidia commented Jul 22, 2024

Changelogs


Checklist:

  • _Was this PR discussed in an issue? Wrong ordering of nodes/edges in multitasking for node/edge level tasks #467
  • Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
  • Update the API documentation is a new function is added, or an existing one is deleted.
  • Write concise and explanatory changelogs above.
  • If possible, assign one of the following labels to the PR: feature, fix or test (or ask a maintainer to do it for you).

discussion related to that PR

Copy link
Collaborator

@DomInvivo DomInvivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that canonicalization was used for the re-ordering at dataloading and data preparation time. The only issue is that we can't simply use the "predict mode" for real-world predictions without un-ordering. But I think it's fine. We should probably warn about that in the docs and the code

  • Log warning about canonicalization cannot be used for predictions without un-ordering
  • Warn in the documentation about the point above

@DomInvivo DomInvivo mentioned this pull request Jul 23, 2024
28 tasks
@ndickson-nvidia ndickson-nvidia changed the title Draft: Reorder atoms in label data (Fixes 502) Reorder atoms in label data (Fixes 502) Sep 23, 2024
@ndickson-nvidia ndickson-nvidia marked this pull request as ready for review September 23, 2024 14:08
ndickson-nvidia and others added 15 commits September 23, 2024 14:17
…ame molecule appears in multiple tasks with different atom orders, to be consistent with the first task's atom order
…gs, compute_mol_keys, compute_stats, save_non_label_data, and save_label_data, to make the code more manageable.
…sks with the same molecule and one of them has edge-level label data. This should order bonds consistent with the first copy of the molecule.
… atom classes in the SMILES strings, and only if they represent a complete atom order, to fix the accidental removal of support for explcit ordering. The atom classes (atom map numbers) are then removed, so that they don't cause equivalent molecules to have different canonical atom orders. This allows the ordered parameter to default to true again. Some callers may want to set this to false, for performance.
…_laplacian_eigendecomp to accept `const int32_t*` instead of `const std::vector<int32_t>*`
@DomInvivo DomInvivo changed the title Reorder atoms in label data (Fixes 502) Reorder atoms in label data (Fixes 502) + documenting C++ Sep 23, 2024
@DomInvivo DomInvivo merged commit e887176 into datamol-io:graphium_3.0 Sep 23, 2024
2 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants