Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Green's Functions #13

Open
wants to merge 39 commits into
base: master
Choose a base branch
from
Open

Green's Functions #13

wants to merge 39 commits into from

Conversation

CarlosMejZ
Copy link
Collaborator

These are the basic functionalities to compute one-body Green's functions for a selected CI wave function, as described in Phys. Rev. B 100, 125165 (2019) and J. Chem. Phys. 154, 121101 (2021). It implements both a continuous fraction based Lanczos and band Lanczos method. The console output needs to be adapted to match the one used in the rest of the code, and unit testing is required (so far, only a small example testing is possible within the standalone_driver.cxx test code).

@CarlosMejZ CarlosMejZ added the enhancement New feature or request label Jun 2, 2023
@CarlosMejZ CarlosMejZ requested a review from wavefunction91 June 2, 2023 15:59
Copy link
Owner

@wavefunction91 wavefunction91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial round of comments - I can get things to compile locally. I'll fix CI this evening

include/macis/gf/bandlan.hpp Outdated Show resolved Hide resolved
include/macis/gf/bandlan.hpp Outdated Show resolved Hide resolved
include/macis/gf/bandlan.hpp Outdated Show resolved Hide resolved
include/macis/gf/bandlan.hpp Outdated Show resolved Hide resolved
include/macis/gf/bandlan.hpp Outdated Show resolved Hide resolved
include/macis/gf/inn_prods.hpp Outdated Show resolved Hide resolved
include/macis/gf/lanczos.hpp Outdated Show resolved Hide resolved
src/macis/gf/bandlan.cxx Outdated Show resolved Hide resolved
src/macis/gf/eigsolver.cxx Outdated Show resolved Hide resolved
src/macis/gf/gf.cxx Outdated Show resolved Hide resolved
@wavefunction91
Copy link
Owner

@CarlosMejZ Do you have any ideas on how you might want to unit test this code? Are there exact results we can compare to / things we could reliably reproduce?

@wavefunction91 wavefunction91 changed the title Feature/gf Green's Functions Jun 4, 2023
CarlosMejZ and others added 2 commits June 5, 2023 17:12
…Band Lanczos routines to use matrices which are guaranteed contiguous in memory.
src/macis/gf/bandlan.cxx Outdated Show resolved Hide resolved
delete[] WORK;
delete[] E;
D.resize(N);
lapack::heev_2stage(JOBZ, UPLO, N, A.data(), LDA, D.data());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call on the 2stage, but that' actually isn't faster for all general cases. Brings up an interesting point about how to select the algorithm though... Might be good to have a switch. I'll give it some thought.

include/macis/gf/eigsolver.hpp Outdated Show resolved Hide resolved
include/macis/gf/gf.hpp Outdated Show resolved Hide resolved
@@ -32,14 +32,7 @@ namespace macis {
*/
inline double MyInnProd(const std::vector<double>& vecR,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point it might make sense to just remove these functions entirely since all overloads call the same blaspp function. Not a high-priority now, but something to keep in mind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you rather have the full, lengthy blaspp call everytime, or a single wrapper to make it a little easier to read?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Globally, yes, but having local (e.g. lambda) aliases would be fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can make local aliases.

@@ -293,57 +159,42 @@ bool GetEigsysBand(std::vector<std::vector<double> > &mat, int nSupDiag,
eigvals.clear();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good piece of functionality. It might make sense to wrap it up into a nice LAPACK-style function for use elsewhere.

// temp[coeff] -= bandH[it - 1][jt - 1] * qs[band_indx_j][coeff];
temp[coeff] -= bandH[it - 1][jt - 1] * qs[N * band_indx_j + coeff];
temp[coeff] -=
bandH[(it - 1) * nLanIts + jt - 1] * qs[N * band_indx_j + coeff];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you chose (what looks like to me) row-major for bandH?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably just the similarity with how I was linearizing qs (where each vector is stored contiguously). In the case of bandH, since it's a Hermitian matrix, it does not really matter. But I can change it if you prefer to have all matrices in column-major.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All depends on what is the most efficient storage scheme - i.e. which scheme will most likely give you the most opportunities for contiguous data access. It doesn't look like it particularly matters here.

However, it looks like this operation can be made a bit more efficient - correct me if I'm wrong, but your computing something that looks like

temp(:) -= alpha * qs(:,band_idx_j)

where alpha = bandH(it-1, jt-1). That's an axpy. Looks like it can be replaced on line 207 too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

// temp[coeff] -= bandH[it - 1][jt - 1] * qs[band_indx_j][coeff];
temp[coeff] -= bandH[it - 1][jt - 1] * qs[N * band_indx_j + coeff];
temp[coeff] -=
bandH[(it - 1) * nLanIts + jt - 1] * qs[N * band_indx_j + coeff];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All depends on what is the most efficient storage scheme - i.e. which scheme will most likely give you the most opportunities for contiguous data access. It doesn't look like it particularly matters here.

However, it looks like this operation can be made a bit more efficient - correct me if I'm wrong, but your computing something that looks like

temp(:) -= alpha * qs(:,band_idx_j)

where alpha = bandH(it-1, jt-1). That's an axpy. Looks like it can be replaced on line 207 too

* @date 25/04/2022
*/
template <class Cont, class Functor>
void MyBandLan(const Functor &H, std::vector<Cont> &qs,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming conventions of My<Func> should get swapped out for something more meaningful. I think BandLan works fine here.

<< std::endl;
} else {
#pragma omp parallel for
for(size_t coeff = 0; coeff < temp.size(); coeff++)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an axpy with alpha = 1.0/bandH(it-1,jt-1) and beta = 0

bandH.resize(nLanIts * nLanIts);
break;
#pragma omp parallel for
for(size_t coeff = 0; coeff < temp.size(); coeff++)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd either use scal with alpha=0 or a memset here since it's contiguous


for(int it = 1; it <= nLanIts; it++) {
int band_indx_i =
true_indx[it]; // TO WHAT ELEMENT OF THE VECTOR SET DO WE APPLY THIS
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this indirection here? Generally we want to avoid things like this (to e.g. allow for better usage of Level-3 BLAS when possible), but if its completely necessary, it's fine.

Copy link
Collaborator Author

@CarlosMejZ CarlosMejZ Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the use of the effective index true_indx? (I'm afraid I don't know what you mean by indirection)

This is what I came up with to apply the matvecs in the right order to the right vector, since in band Lanczos, unlike in the regular one, at every iteration you act on a different vector in your set of nbands. Why is this a problem for BLAS? Shouldn't it be fine as long as you pass the right memory sector to act on?

I guess that a way to avoid this true_indx variable could be to turn the for loop over it into two, one external loop going over nLanIts // nbands and one internal loop going over nbands. I think that should be completely equivalent, and then the index in the internal loop would be essentially iterating over which vector in the basis to act on with the Hamiltonian. Do you think this would make the code better?

CarlosMejZ and others added 3 commits June 21, 2023 10:08
…ze the evaluation of G[w] for each frequency in the single band case. Removed MyInnProd header, substituted by lambdas for blas::dot.
…d core space determinants into the ASCI ranking (to allow for them to be substituted).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants