[JOSS review] Code feedback #112

ml-evs · 2023-02-10T19:28:47Z

Hi @ajmedford and other authors! Apologies for the slightly slow uptake on my review (openjournals/joss-reviews#5035) -- I have now been able to install the package and try the examples and have enough to give a first round of feedback.

I think it would be very helpful if the config passed to AtomsTrainer used in many of the examples could be wrapped into a class (perhaps a dataclass). The structure of the class is basically already provided in https://amptorch.readthedocs.io/en/latest/usage.html but it is currently very easy to make a mistake or get lost amongst default values. This should also aid development going forwards.
The usage documentation should explain the relevant classes involved and requirements for e2e training and evaluating a potential. Currently much of this data is stuffed into comments in the large config code snippet.
Overall the examples could be better motivated (why you might want to do such a thing, step by step), both in the online documentation and in the code itself. Currently they showcase the novel features of Amptorch (GMP, lmdb etc.) but I felt a basic tutorial for just training the "default" network was missing. This is kinda provided on the "usage" page but it takes some work to piece it together in its current state. It would be nice if the examples generated some figures to actually see what is happening, executing the code snippets which provide little feedback.
If this package was used in the several linked preprints about GMP, singleNN etc. then perhaps code examples to generate some of the (simpler?) results from these papers would be appropriate, if at all possible.
The code itself has no docstrings describing the functionality of parameters of any of the classes or functions (that I could find). API documentation is somewhat of a hard requirement for JOSS

The GMP example should be made simpler by specifying the path to the already-provided pseudodensity files:

amptorch/examples/GMP/GMP_example.py

Lines 29 to 31 in 78cb2f1

    
           path_to_psp = "<path>/pseudodensity_psp/" 
        
           # path to the GMP pseudopotential (.g)files 
        
           # please copy the "pseudodensity_psp" folder to somehere and edit the path to it here

as e.g., Path(__file__).parent / "pseudodensity_psp".

With a default installation I was not able to get all of the tests to pass (consistency_test.py fails, as well as several in test_script.py, in both CPU and GPU mode). Could you perhaps provide some documentation for executing the tests, in case it is my set up that is wrong? (test failures hidden below)
The last code release was in July 2021 with the name "initial release". The code should be released in its complete state (which it seems to be!) before publication in JOSS. Any of the suggestions above could then be implemented in a future released version.

Test failures

 =============================================================== test session starts ================================================================platform linux -- Python 3.9.16, pytest-7.2.1, pluggy-1.0.0 -- /home/mevans/.local/conda/envs/amptorch/bin/python                                   cachedir: .pytest_cache
rootdir: /home/mevans/src/amptorch                                                                                                                  collected 23 items

amptorch/tests/consistency_test.py::test_energy_force_consistency FAILED                                                                     [  4%] amptorch/tests/cp_uncertainty_calibration_test.py::test_cp_uncertainty_calibration PASSED                                                    [  8%] amptorch/tests/cutoff_funcs_test.py::test_cutoff_funcs PASSED                                                                                [ 13%]
amptorch/tests/gaussian_descriptor_set_test.py::test_gaussian_descriptor_set PASSED                                                          [ 17%]
amptorch/tests/pretrained_test.py::test_pretrained PASSED                                                                                    [ 21%] amptorch/tests/pretrained_test.py::test_pretrained_no_config PASSED                                                                          [ 26%] amptorch/tests/test_script.py::test_cutoff_funcs PASSED                                                                                      [ 30%]
amptorch/tests/test_script.py::test_gaussian_descriptor_set PASSED                                                                           [ 34%] amptorch/tests/test_script.py::test_pretrained PASSED                                                                                        [ 39%]
amptorch/tests/test_script.py::test_pretrained_no_config PASSED                                                                              [ 43%]
amptorch/tests/test_script.py::test_lmdb_pretrained PASSED                                                                                   [ 47%]
amptorch/tests/test_script.py::test_lmdb_pretrained_no_config PASSED                                                                         [ 52%]
amptorch/tests/test_script.py::test_training PASSED                                                                                          [ 56%]
amptorch/tests/test_script.py::test_training_gmp PASSED                                                                                      [ 60%]
amptorch/tests/test_script.py::test_cp_uncertainty_calibration FAILED                                                                        [ 65%]
amptorch/tests/test_script.py::TestMethods::test_cosine_and_polynomial_cutoff_funcs PASSED                                                   [ 69%] amptorch/tests/test_script.py::TestMethods::test_gds PASSED                                                                                  [ 73%] amptorch/tests/test_script.py::TestMethods::test_load_retrain PASSED                                                                         [ 78%] amptorch/tests/test_script.py::TestMethods::test_load_retrain_lmdb PASSED                                                                    [ 82%] amptorch/tests/test_script.py::TestMethods::test_training_scenarios PASSED                                                                   [ 86%] amptorch/tests/test_script.py::TestMethods::test_training_scenarios_gmp PASSED                                                               [ 91%] amptorch/tests/test_script.py::TestMethods::test_uncertainty_cp FAILED                                                                       [ 95%] amptorch/tests/training_test.py::test_training PASSED                                                                                        [100%]

===================================================================== FAILURES =====================================================================

The text was updated successfully, but these errors were encountered:

ajmedford · 2023-05-19T18:28:19Z

@ml-evs - I think we addressed these comments in openjournals/joss-reviews#5035 by mistake. Let us know if any of these issues still need to be addressed before publication.

ml-evs · 2023-05-19T20:25:10Z

Hi @ajmedford, thanks for pointing this out, and thanks to you and @nicoleyghu for the work implementing my suggestions. I've ticked off the things that are fixed above, and although I think some basic examples would be helpful (e.g., showing plots that describe the output of the sweep example), I'm satisfied that the examples cover your advanced features well.

I've just been going through my initial testing scripts and I've run into a couple of minor issues.

The default fp_scheme seems to have changed since my last attempts, so the docs are now out of date here:

-     "fp_scheme": str,             # Fingerprinting scheme to feature dataset, "gaussian" or "gmp" (default: "gaussian")
+     "fp_scheme": str,             # Fingerprinting scheme to feature dataset, "gaussian", "gmp" or "gmpordernorm" (default: "gmpordernorm")

My old training script didn't set fp_scheme and now fails with the error

  File "/home/mevans/src/amptorch/amptorch/dataset.py", line 49, in __init__
    self.descriptor = construct_descriptor(descriptor_setup)
  File "/home/mevans/src/amptorch/amptorch/dataset.py", line 131, in construct_descriptor
    descriptor = GMPOrderNorm(MCSHs=fp_params, elements=elements)
  File "/home/mevans/src/amptorch/amptorch/descriptor/GMPOrderNorm/__init__.py", line 35, in __init__
    self.default_cutoff()
  File "/home/mevans/src/amptorch/amptorch/descriptor/GMPOrderNorm/__init__.py", line 59, in default_cutoff
    sigmas = self.MCSHs["MCSHs"]["sigmas"]

There is of course no requirement that your API stays fixed during review, but just thought I'd mention it in case it was unintentional! Setting fp_scheme to gaussian allows me to train potentials once again.

ajmedford · 2023-06-05T14:29:32Z

@ml-evs Thanks for these suggestions and pointing out the conflicts in the documentation. I think everything is now taken care of, except for the config data class, which is a work in progress but will hopefully be finished soon. I have opened a separate issue for that here: #123, and, for logistical reasons, I'm hoping we can close this issue and finalize the paper while we finish this up.

Regarding the other suggestions:

usage - we added a short demo of training on a simple dataset with exactly 2 degrees of freedom, and visualized the results with respect to one degree of freedom (the bond length).
examples - we reorganized the /examples folder upon the initial submission, and hope now the example scripts resonate better with usage documentation.

Let us know if you would like to see any additional changes before getting this officially published.

ml-evs · 2023-06-05T16:14:31Z

Thanks @ajmedford and @nicoleyghu for the hard work on this, I think they really help (my own) understanding of the code -- I hope it wasn't too arduous. I'm happy to close this issue and will give my recommendation over in the JOSS issue.

ajmedford · 2023-06-05T16:23:07Z

Thanks @ml-evs for all the detailed feedback! While the changes were a lot of work, I think they made the package much stronger, and were worth the effort. We also appreciate your patience as we navigated this process and learned a lot about open-source software development along the way.

ml-evs mentioned this issue Mar 13, 2023

[REVIEW]: AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification openjournals/joss-reviews#5035

Closed

ianfhunter mentioned this issue May 3, 2023

[JOSS] Issue with examples - inhomogeneous shape #115

Closed

ml-evs closed this as completed Jun 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[JOSS review] Code feedback #112

[JOSS review] Code feedback #112

ml-evs commented Feb 10, 2023 •

edited

Loading

ajmedford commented May 19, 2023

ml-evs commented May 19, 2023 •

edited

Loading

ajmedford commented Jun 5, 2023

ml-evs commented Jun 5, 2023

ajmedford commented Jun 5, 2023

[JOSS review] Code feedback #112

[JOSS review] Code feedback #112

Comments

ml-evs commented Feb 10, 2023 • edited Loading

ajmedford commented May 19, 2023

ml-evs commented May 19, 2023 • edited Loading

ajmedford commented Jun 5, 2023

ml-evs commented Jun 5, 2023

ajmedford commented Jun 5, 2023

ml-evs commented Feb 10, 2023 •

edited

Loading

ml-evs commented May 19, 2023 •

edited

Loading