Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converting GAMESS output with multistate calculation #40

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

neelravi
Copy link
Contributor

@neelravi neelravi commented Oct 24, 2024

This pull request offers a solution (Issue #39) to handle the conversion of gamess or gaussian multistate calculation into multiple trexio files.

The summary of the work:

  • The original code is put under the loop over nstates for s in range(res.num_states):
  • Trexio files are generated for each state with names ending with _state_0.hdf5 or _state_1.hdf5
  • Tests are modified for these new naming convention
  • The following new block is added to store the state specific information:
        # State group
        # ---------
        trexio.write_state_num(trexio_file,res.num_states)
        trexio.write_state_current_label(trexio_file, "State "+ str(s) + " of " + str(res.num_states-1))
        trexio.write_state_label(trexio_file, [f"state{i}" for i in range(res.num_states)])
        trexio.write_state_file_name(trexio_file, [f"{trexio_basename}_state_{i}.hdf5" for i in range(res.num_states)])

        # CSF group
        # ---------
        try:
            num_csfs = len(res.csf_coefficients[s])
        except:
            num_csfs = len(res.det_coefficients[0])

        offset_file = 0
        trexio.write_csf_coefficient(trexio_file, offset_file, num_csfs, res.csf_coefficients[s])

@neelravi neelravi added the enhancement New feature or request label Oct 24, 2024
Copy link
Member

@q-posev q-posev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neelravi !

I left a few comments. My main concern is about custom _state suffix which breaks the way the converter used to work. Not sure if this is the way to go.

Also, if this PR closes issue #39 , you can add "- Closes #39" line to the PR description and this issue will be closed automatically when the PR is merged.

.github/workflows/test.yml Outdated Show resolved Hide resolved
# ------
# Get the basename of the TREXIO file
try:
trexio_basename = os.path.splitext(os.path.basename(trexio_filename))[0]
Copy link
Member

@q-posev q-posev Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is a good idea to do it this way. The user provided trexio_filename via CLI for a reason - they would expect it to be the name of the output TREXIO file. The moment you start to modify this logic with custom suffixes (like _state used here) - it becomes confusing and requires extra effort from the user. Now the ground state trexio files are all suffixed with _state_0 which basically breaks how the converter used to work and might affect the user workflows.
I personally think that it might be better to provide the state ID via CLI and let the user do the the conversion N times where N is the desired number of states. Yes, it will require N calls to resultsFile but at least the user will have full control of the pipeline. @scemama @neelravi what do you think?

trexio.write_state_file_name(trexio_file, [f"{trexio_basename}_state_{i}.hdf5" for i in range(res.num_states)])

# CSF group
# ---------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it works as expected? What happens if resultsFile instance does not contain csf_coefficients array? What will be written in the TREXIO file in this case when you call write_csf_coefficient?

raise NotImplementedError(f"Please remove the {trexio_filename} directory manually.")
# Check if the TREXIO file already exists, remove it if it does
trexio_basename = os.path.splitext(os.path.basename(trexio_filename))[0]
existing_file = glob.glob(trexio_basename + "_state_" + "*.hdf5")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about custom suffix

@TREX-CoE TREX-CoE deleted a comment from q-posev Oct 31, 2024
@q-posev
Copy link
Member

q-posev commented Nov 1, 2024

@neelravi after a quick chat with @scemama , we believe that the best option would be to

  1. keep the original TREXIO output filename as is for ground states (no state_0 suffix). This way the converter remains backwards compatible + no unexpected surprises for the user
  2. offer to user the possibility to specify the suffix to be added for excited state TREXIO filename. This can be an optional CLI argument (e.g. state_suffix) and the default can be "state" as in this PR.

@q-posev
Copy link
Member

q-posev commented Nov 14, 2024

@neelravi we are planning a PyPI release (for pip install) next week. Do you have time to make the necessary changes in this PR before that?

@neelravi
Copy link
Contributor Author

@q-posev I will work on it today.

@neelravi
Copy link
Contributor Author

  • The updated pull request keeps backward compatibility for the names of trexio files generated.
  • Extra checks to see if csf_coefficients are present before attempting to write
  • User can provide state suffix string, if the GAMESS output file contains multiple states.
  • Test added to check these new additions
  • closes Converting GAMESS output with multistate calculation #39

@q-posev
Copy link
Member

q-posev commented Nov 15, 2024

@neelravi thanks ! I left a few comments.

The updated pull request keeps backward compatibility for the names of trexio files generated.

This is not correct (see my comments) for the ground state files. Please fix and adapt the CI so that I can merge this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Converting GAMESS output with multistate calculation
2 participants