added branch composition to the modelsystem normalizer #76

Merged
JFRudzinski merged 12 commits into develop from 73-modelsystem-for-h5md
Jun 4, 2024

Conversation

JFRudzinski
Collaborator

No description provided.

@JFRudzinski JFRudzinski linked an issue May 28, 2024 that may be closed by this pull request
@coveralls

coveralls commented May 28, 2024

Pull Request Test Coverage Report for Build 9268882832

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 10 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.3%) to 98.22%

Files with Coverage Reduction | New Missed Lines | %
tests/conftest.py | 10 | 89.62%
Totals Coverage Status
Change from base Build 9254335937: -0.3%
Covered Lines: 607
Relevant Lines: 618

💛 - Coveralls

@JFRudzinski
Collaborator Author

@JosePizarro3, @Bernadette-Mohr
I added the attribute composition_formula to ModelSystem. It describes the children of each ModelSystem with the notation X(n)Y(m), where X and Y are the branch labels and n and m are the numbers of repeated branches with the same label. This is applied during normalization, and only for representative systems. At the lowest level of the hierarchy, i.e., when a model system has no model_system subsection, the composition formula corresponds to the normal chemical formula in terms of the atom_labels.

I am not attached to this notation, but I think it is useful for understanding what each branch contains. Let me know what you think. There are TODOs and ?'s that warrant addressing; please have a look at those.
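For readers skimming the thread, the X(n)Y(m) notation can be sketched in a few lines. This is only an illustration, not the code in this PR: the helper name `format_composition` is hypothetical, and the alphabetical ordering of labels is an assumption, since the ordering is not stated above.

```python
from collections import Counter


def format_composition(names):
    """Return a string such as 'X(n)Y(m)' from a list of labels.

    Hypothetical helper for illustration; alphabetical ordering is an assumption.
    """
    counts = Counter(names)
    return "".join(f"{name}({counts[name]})" for name in sorted(counts))


# Branch labels of the children of a parent system.
branch_formula = format_composition(["H2O", "H2O", "Na"])
# Atom labels of a system without a model_system subsection.
leaf_formula = format_composition(["H", "H", "O"])

print(branch_formula)  # H2O(2)Na(1)
print(leaf_formula)    # H(2)O(1)
```

The same helper covers both cases described above: at a branch it is fed the children's labels, and at the lowest level it is fed the atom labels.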

Collaborator

@JosePizarro3 JosePizarro3 left a comment

Very good, just some minor reorganization comments.

I am mainly interested in moving the couple of functions you have under normalize() directly to class methods and combining them into something called resolve_composition_formula. Still, you will need to define a specific function inside there to do the recursion.

I would also change the .get('...') calls to our typical style for resolving quantities and sub-sections, e.g., system.model_system instead of system.get('model_system'). I think the second would return a dictionary, but I might be wrong.

Let me know if you need help or something is not clear.
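The reorganization suggested in this review could look roughly like the sketch below. It is an assumption-laden stand-in, not the merged code: `SystemNode` is a hypothetical dataclass replacing ModelSystem, the inner helper names are invented, the alphabetical ordering is assumed, and the "Unknown" placeholder for a missing branch label is an illustrative choice.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SystemNode:
    """Hypothetical stand-in for ModelSystem (not the NOMAD class)."""

    branch_label: Optional[str] = None
    atom_labels: List[str] = field(default_factory=list)
    model_system: List["SystemNode"] = field(default_factory=list)
    composition_formula: Optional[str] = None

    def resolve_composition_formula(self) -> None:
        """Set composition_formula on this node and on every descendant."""

        def format_names(names: List[str]) -> str:
            # Assumed ordering: alphabetical by label.
            counts = Counter(names)
            return "".join(f"{n}({counts[n]})" for n in sorted(counts))

        def recurse(node: "SystemNode") -> None:
            # Attribute access (node.model_system) instead of node.get(...).
            if node.model_system:
                # "Unknown" is a placeholder chosen for this sketch.
                names = [c.branch_label or "Unknown" for c in node.model_system]
                node.composition_formula = format_names(names)
                for child in node.model_system:
                    recurse(child)
            else:
                node.composition_formula = format_names(node.atom_labels)

        recurse(self)


waters = [SystemNode("H2O", ["H", "H", "O"]) for _ in range(2)]
unlabeled = SystemNode(None, ["Na"])
root = SystemNode("solution", model_system=[*waters, unlabeled])
root.resolve_composition_formula()
print(root.composition_formula)       # H2O(2)Unknown(1)
print(waters[0].composition_formula)  # H(2)O(1)
```

Keeping the recursion as an inner function means the public surface is a single method, while the traversal logic stays private to it.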

Resolved review threads (all outdated): src/nomad_simulations/model_system.py (7), src/nomad_simulations/utils/utils.py (1), tests/test_model_system.py (1)
@JFRudzinski
Collaborator Author

@JosePizarro3 In terms of system.model_system instead of system.get('model_system'), they return the same thing if system.model_system exists. However, the .get() returns None if it doesn't exist, whereas the former breaks with an error. I guess it is somehow guaranteed that these archive sections are populated with all their possible attributes, but I am just used to using .get() to be safe.

I will change it, just wanted to clarify.

@JosePizarro3
Collaborator

Yeah, this is actually kind of annoying. @TLCFEM @ladinesa could this be implemented or is there any reason why we don't want system.model_system in the example above to return None?

@ladinesa
Collaborator

Since model_system is a repeating subsection, it should return an empty list, right?

@JFRudzinski
Collaborator Author

I actually wasn't saying it doesn't. I was asking if all metainfo sections and attributes are automatically populated with None (or empty lists) so that you don't ever need to do .get() because you won't get an error. Is that true? At what point are they populated, at instantiation?

@ladinesa
Collaborator

Yes, that should be the case. I have not encountered a case where I needed to use .get.
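The distinction discussed in this exchange can be illustrated with a plain-Python stand-in. This does not reproduce NOMAD's metainfo machinery: the `Section` class, its `declared` mapping, and the default values are all assumptions made for the sketch. It only shows why attribute access is safe when every declared field resolves to None or an empty list, and where it still fails.

```python
class Section:
    """Hypothetical stand-in for a metainfo section (not NOMAD's implementation)."""

    # Declared fields and what an unset field resolves to (assumed defaults):
    # None for a quantity, a fresh empty list for a repeating subsection.
    declared = {"name": None, "model_system": list}

    def __init__(self, **values):
        for key, default in self.declared.items():
            fallback = default() if callable(default) else default
            setattr(self, key, values.get(key, fallback))

    def get(self, key, default=None):
        """Dictionary-style lookup that never raises."""
        return getattr(self, key, default)


system = Section(name="bulk")

# Declared but unset: attribute access and .get() agree.
print(system.model_system)         # []
print(system.get("model_system"))  # []

# Undeclared: .get() falls back to None, attribute access raises.
print(system.get("not_declared"))  # None
try:
    system.not_declared
    raised = False
except AttributeError:
    raised = True
print(raised)                      # True
```

Under these assumptions, .get() is only needed for names that the section definition does not declare at all.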

@JFRudzinski JFRudzinski requested a review from JosePizarro3 June 4, 2024 07:52
@JFRudzinski
Collaborator Author

@JosePizarro3 This is ready for another review. No rush though, you can leave it till after the project meeting if you don't have time.

I did make some improvements to the testing. I know it's not exactly what you had in mind, and I still feel that it is not ideal, in the sense that the generation of the hierarchy is sort of complicated (I guess I could move this to some template generator?), but I am not exactly sure of an alternative.

That being said, I think the functionality is much better now in the sense that I actually test a bunch of cases where quantities might be missing, and I tried to document this in the description.

Collaborator

@JosePizarro3 JosePizarro3 left a comment

Very good, just some final minor details that will make your life easier when writing docstrings for functions.

I really liked the testing; it made it very easy to understand the implementation, and I am surprised that this actually decreased the coverage... I have to ask about the details of this package, it is very strange sometimes.

Resolved review threads: src/nomad_simulations/general.py (4, of which 3 outdated), src/nomad_simulations/utils/utils.py (1, outdated), tests/test_model_system.py (2, outdated)
@JFRudzinski
Collaborator Author

@JosePizarro3 Thanks for the tips, I think I have addressed everything. Let me know if you think we are ready to consider merging.

btw - have you applied ruff to this repo or do I need to do it manually? My automatic ruff formatting is not being triggered for this repo; not sure if that is something that I need to address locally or something with the repo.

@JosePizarro3
Collaborator

Ruff is added in the pyproject.toml, so autoformatting should work. You also have a couple of files with conflicts with develop; maybe you need to rebase.

@JFRudzinski JFRudzinski force-pushed the 73-modelsystem-for-h5md branch from 3799690 to c60c66c Compare June 4, 2024 12:25
@coveralls

coveralls commented Jun 4, 2024

Pull Request Test Coverage Report for Build 9367142112

Details

  • 49 of 49 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.07%) to 98.749%

Totals Coverage Status
Change from base Build 9365151737: 0.07%
Covered Lines: 947
Relevant Lines: 959

💛 - Coveralls

@JFRudzinski
Collaborator Author

ok rebased and applied ruff manually (not sure what's going on there)...looks good?

@JosePizarro3
Collaborator

Yeah, please, merge.

Maybe there is something off with your virtual environment and the rules applied (I am guessing some conflict in that direction). We can check it out in the office this week; it should be automatically applied when saving a file.

@JFRudzinski
Collaborator Author

cool, thanks a lot for all your help with this!

@JFRudzinski JFRudzinski merged commit e032021 into develop Jun 4, 2024
4 checks passed
@JFRudzinski JFRudzinski deleted the 73-modelsystem-for-h5md branch June 4, 2024 12:57
Development

Successfully merging this pull request may close these issues.

ModelSystem for H5MD
4 participants