
Add functions for 1rdms and 2-body exchange hole #165

Open · wants to merge 10 commits into master
Conversation

@sanchezw17

This PR adds functions to evaluate one-electron reduced density matrices (1-RDMs) and two-body exchange holes as functions of their atomic orbitals. The functions that evaluate the 1-RDMs are very similar to the pre-existing functions that evaluate the density; the major difference is that the new functions take a list of point sets as one of their arguments, as opposed to a single set of points.

I also made small edits to variable names in density.py. There were a few places where the variable names mentioned in the docstrings did not match those used in the actual function bodies.

Checklist

  • Write a good description of what the PR does.
  • Add tests for each unit of code added (e.g. function, class)
  • Update documentation
  • Squash commits that can be grouped together
  • Rebase onto master

Type of Changes

✨ New feature

@PaulWAyers (Member)

This has shown up in workflows three times in the last year, so I told @sanchezw17 to implement it in gbasis rather than reinventing the wheel once more. It also addresses #101.

@msricher (Contributor) left a comment

Looks good to me!

@leila-pujal (Collaborator)

Hi @sanchezw17, thanks a lot for your PR. I am currently going through the code. I was testing your new function evaluate_dm_using_evaluated_orbs with an oxygen-atom fchk using a cc-pVTZ basis (30 atomic orbitals in total). Starting from a random molecular grid of ~5000 grid points, it tried to allocate 190 GB of memory to do the einsum on line 151 of density.py. You were probably already aware of this, since the output array for that function is (Npoints_1, Npoints_2, K_orb, K_orb), but I just wanted to note it here. Maybe we should add a warning or an assertion that limits how big the output matrix can be, so einsum does not crash? Please share any thoughts you may have on this topic, and the same goes for anyone else tagged on this PR.
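Something along these lines is what I mean; the helper name and the 1 GiB threshold here are only illustrative assumptions, not actual gbasis code:

```python
import numpy as np

def _check_dm_output_size(n_points_1, n_points_2, n_orbs, max_bytes=2**30):
    """Guard against huge (Npoints_1, Npoints_2, K_orb, K_orb) intermediates (illustrative)."""
    n_bytes = n_points_1 * n_points_2 * n_orbs**2 * np.dtype(np.float64).itemsize
    if n_bytes > max_bytes:
        raise MemoryError(
            f"Output would need ~{n_bytes / 2**30:.1f} GiB of memory; "
            "evaluate on smaller batches of points instead."
        )
```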

@leila-pujal (Collaborator) left a comment

I left comments from my first review. Your new tests pass on my laptop, but I haven't checked them in detail. Overall it looks really good to me, so if we work out the comments, I will be good to merge it on my side. Thanks again for the PR @sanchezw17.

@@ -625,7 +783,7 @@ def evaluate_posdef_kinetic_energy_density(
min_output = np.min(output)

I think you forgot to change output to posdef_kindetic_energy_density. Also, maybe you have a typo there and actually wanted posdef_kinetic_energy_density? Just double checking.


orb_evals = []
# evaluate basi(e)s on the point set(s)
for grid in points_list:

You are using a list to pass the points arrays. Lists are mutable objects, and I think gbasis tries to reduce their use in favour of numpy arrays. I was wondering whether this list will ever be longer than 2 sets of points. If it will always be either 1 or 2 sets of points (meaning 1 or 2 numpy arrays), I would suggest providing the sets as separate arguments to the function so you don't have to loop (see the sketch below). If not, I would at least change it to a tuple, which is immutable in Python.
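For illustration, a rough sketch of what I mean by passing the sets separately; the helper name and the evaluate_basis import path are my assumptions, not code from this PR:

```python
from gbasis.evals.eval import evaluate_basis  # assumed import path

def _evaluate_orbs_on_two_grids(basis, points_1, points_2, transform=None):
    """Evaluate the basis on two point sets passed as separate arrays (sketch)."""
    orb_1 = evaluate_basis(basis, points_1, transform=transform)
    # re-use the first evaluation when both arguments refer to the same grid
    orb_2 = orb_1 if points_2 is points_1 else evaluate_basis(basis, points_2, transform=transform)
    return orb_1, orb_2
```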

if not np.allclose(one_density_matrix, one_density_matrix.T):
raise ValueError("One-electron density matrix must be symmetric.")

# Tensor product for \gamma(\mathbf{r}_1,\mathbf{r}_2) = \sum_{pq} \gamma_{pq} \chi_p(\mathbf{r}_1) \chi_q(\mathbf{r}_2)

This is probably just me not being familiar with the math, but based on this equation I would guess the output should have dimensions (Npoints_1, Npoints_2), i.e. k, l in your notation. You are, however, returning an array of shape (Npoints_1, Npoints_2, K_orb, K_orb). Let me know your thoughts, @sanchezw17, @PaulWAyers.
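To make the dimensionality question concrete, here is a tiny self-contained check with random placeholder arrays; contracting over both orbital indices p, q in the equation above gives an (Npoints_1, Npoints_2) array:

```python
import numpy as np

n_orb, n_pts_1, n_pts_2 = 30, 40, 50      # placeholder sizes
gamma_pq = np.random.rand(n_orb, n_orb)   # 1-RDM in the AO basis
chi_r1 = np.random.rand(n_orb, n_pts_1)   # chi_p evaluated on the first point set
chi_r2 = np.random.rand(n_orb, n_pts_2)   # chi_q evaluated on the second point set

# gamma(r1, r2) = sum_pq gamma_pq chi_p(r1) chi_q(r2)
gamma_r1_r2 = np.einsum("pq,pk,ql->kl", gamma_pq, chi_r1, chi_r2)
print(gamma_r1_r2.shape)                  # (40, 50)
```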



def evaluate_dm_density(one_density_matrix, basis, points_list, transform=None):
r"""Return the density of the given basis set at the given points.

If I checked correctly, this is the same docstring as in evaluate_density. I know the functions are similar, but here two sets of points are used instead of one; maybe we can mention that in the docstring.

dm_eval = evaluate_dm_density(one_density_matrix, basis, points_list, transform=transform)

# evaluate hole function
numerator = np.einsum("ijkl,ijkl->ijkl", dm_eval, dm_eval)

If I am not wrong, you are just multiplying dm_eval by dm_eval elementwise here, right? If that is the case, I was wondering whether it would be more efficient to write dm_eval * dm_eval directly. I know einsum makes the notation clear, so I am not against it.
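Just to illustrate what I mean, with a random placeholder array the two spellings agree, since einsum with identical input and output subscripts is an elementwise product:

```python
import numpy as np

dm_eval = np.random.rand(3, 4, 5, 5)  # placeholder for the evaluated density matrix
assert np.allclose(np.einsum("ijkl,ijkl->ijkl", dm_eval, dm_eval), dm_eval * dm_eval)
```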

@sanchezw17 (Author)

Hi @leila-pujal, thank you for the review. I can add either a warning or, at the very least, a comment (whichever you think is most appropriate in this situation). I am still working on implementing these functions, but my idea is to use them as callables that are integrated point by point, so I believe the entire 4D matrix is never stored. At the moment, I do not see the usefulness of storing the entire matrix other than for bug-hunting or testing with small bases.

@leila-pujal (Collaborator)

Okay, I see... just to check that I understood your message: you mean you want to use elements of this matrix on the fly, so you think you won't need to store the whole array? To give more context to my earlier message, I got the error at this line: density = np.einsum("ij,ik,jl->klij", one_density_matrix, orb_eval_list[0], orb_eval_list[1]), basically when einsum tries to allocate/compute the 4D matrix. Given that your goal is not to save this matrix, and that maybe you just need to generate the elements klij on the fly (meaning, for each pair of orbitals, generate k, l), maybe you could change the code to work on subsets of orbitals? Maybe I am missing some of the math here, but let me know what you think.

@sanchezw17 (Author)

@leila-pujal you are 100% correct; I can have a function take 2 points instead of 2 sets of points. Since I am in the early stages of working with n-dimensional integration, which hasn't been fully tested yet, I want the option of having the full matrix on a grid so it closely matches the shape of evaluate_density (which works with grid integration) to help with debugging. I will add another function that takes two points and returns the density matrix evaluated at that pair of points (rough sketch below).
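A rough sketch of the kind of point-pair function I mean; the function name and the evaluate_basis import path are tentative assumptions rather than final code:

```python
import numpy as np
from gbasis.evals.eval import evaluate_basis  # assumed import path

def evaluate_dm_at_point_pair(one_density_matrix, basis, point_1, point_2, transform=None):
    """Evaluate gamma(r1, r2) for a single pair of 3D points (tentative sketch)."""
    chi_1 = evaluate_basis(basis, np.asarray([point_1]), transform=transform)[:, 0]  # (K_orb,)
    chi_2 = evaluate_basis(basis, np.asarray([point_2]), transform=transform)[:, 0]  # (K_orb,)
    # scalar: sum_pq gamma_pq chi_p(r1) chi_q(r2)
    return chi_1 @ one_density_matrix @ chi_2
```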

@PaulWAyers (Member)

I think we should not try to vectorize over orbitals, but just accumulate the sum over the orbitals; otherwise the memory cost is simply too high. Alternatively, loop over the orbitals whenever the number of grid points is very large.

@PaulWAyers (Member)

I think a reasonable API is to read in a set of 3D or 6D points, and then export the DM/xHOLE either on the 3Dx3D tensor-product grid or on the 6D points.

@PaulWAyers (Member)

The density matrix is given in terms of its basis set expansion,

$$ \gamma(\mathbf{r},\mathbf{r}') = \sum_{p=1}^{N_{orbs}} \sum_{q=1}^{N_{orbs}} \gamma_{pq} \phi_p(\mathbf{r})\phi_q(\mathbf{r}') $$

where we have assumed that the orbitals are real. To avoid too much memory, a sensible procedure is:

  1. Evaluate $\phi_p(\mathbf{r})$ at the grid points for $\mathbf{r}$.
  2. If needed, evaluate $\phi_q(\mathbf{r}')$ at the grid points for $\mathbf{r'}$ (usually, maybe always, the same, so not required).
  3. Evaluate the density matrix by looping over the orbitals; this saves the memory needed to form the gigantic $N_{orbs}^2 \cdot N_{6d-grid}$ object. (I.e., vectorize over points but not over orbitals; a rough sketch follows at the end of this comment.) It also lets us screen by not including a contribution from $\phi_p(\mathbf{r})$ if it is centered at a point far from $\mathbf{r}$, but that's a task for the future.

It would be ideal (not required) to allow passing a set of 3-D points (in which case the 6D tensor-product grid is formed internally) or a set of 6-D points (where two 3D sets which may not be the same are appended to each other).
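A rough sketch of step 3 under those assumptions (all names here are illustrative): vectorize over points, accumulate over orbitals, so no $N_{orbs}^2 \cdot N_{6d-grid}$ intermediate is ever formed.

```python
import numpy as np

def accumulate_dm_on_grids(gamma_pq, chi_r1, chi_r2):
    """Accumulate gamma(r, r') by looping over the orbital index p (illustrative sketch).

    gamma_pq : (K_orb, K_orb) 1-RDM in the orbital basis
    chi_r1   : (K_orb, Npoints_1) orbitals evaluated on the r grid
    chi_r2   : (K_orb, Npoints_2) orbitals evaluated on the r' grid
    """
    out = np.zeros((chi_r1.shape[1], chi_r2.shape[1]))
    for p in range(gamma_pq.shape[0]):
        # contract over q for this p, then add the outer product over the two point sets
        out += np.outer(chi_r1[p], gamma_pq[p] @ chi_r2)
    return out
```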

@leila-pujal leila-pujal self-requested a review July 3, 2024 20:50
@leila-pujal (Collaborator) left a comment

Hi @sanchezw17, thanks a lot for updating your code, and sorry for the late reply. I reviewed your code and everything looks good. Just one comment: I think it would be more appropriate to say reduced density matrix (rdm) instead of density matrix. For example, I would rename evaluate_dm_using_evaluated_orbs to evaluate_rdm_using_evaluated_orbs, or, to be even more specific, evaluate_1rdm_using_evaluated_orbs. I also think it would be better to reflect this in the docstrings, so instead of writing "density matrix", specify "first-order reduced density matrix". What do you think @FarnazH, @PaulWAyers? This is a personal-preference suggestion, so if you think it is good as it is, I don't have any problem with it.

I also pushed a commit changing the use of numpy.all(), because I was checking the tests locally with numpy 2.0, and np.all for an array of None values now returns False, so your assertion fails (https://numpy.org/doc/stable/reference/generated/numpy.all.html).

I checked your tests for the first-order reduced density matrix and they look good, but for the exchange hole I don't know where this assertion comes from: np.allclose(eh, np.ones((len(points1), len(points1))) * -1). I am sure @PaulWAyers gave you some literature and this is 100% correct, but I just wanted to note that I only checked that the test passes.

I will merge this in the coming days, probably after I merge the pytest PR #187. Feel free to leave any comments/doubts about my review. Thanks again for working on this.
