-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Green's Functions #13
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial round of comments - I can get things to compile locally. I'll fix CI this evening
@CarlosMejZ Do you have any ideas on how you might want to unit test this code? Are there exact results we can compare to / things we could reliably reproduce? |
…Band Lanczos routines to use matrices which are guaranteed contiguous in memory.
delete[] WORK; | ||
delete[] E; | ||
D.resize(N); | ||
lapack::heev_2stage(JOBZ, UPLO, N, A.data(), LDA, D.data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the 2stage
, but that' actually isn't faster for all general cases. Brings up an interesting point about how to select the algorithm though... Might be good to have a switch. I'll give it some thought.
include/macis/gf/inn_prods.hpp
Outdated
@@ -32,14 +32,7 @@ namespace macis { | |||
*/ | |||
inline double MyInnProd(const std::vector<double>& vecR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point it might make sense to just remove these functions entirely since all overloads call the same blaspp
function. Not a high-priority now, but something to keep in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you rather have the full, lengthy blaspp
call everytime, or a single wrapper to make it a little easier to read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Globally, yes, but having local (e.g. lambda) aliases would be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can make local aliases.
@@ -293,57 +159,42 @@ bool GetEigsysBand(std::vector<std::vector<double> > &mat, int nSupDiag, | |||
eigvals.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good piece of functionality. It might make sense to wrap it up into a nice LAPACK-style function for use elsewhere.
… all matrices to avoid using vector<vector> data types.
include/macis/gf/bandlan.hpp
Outdated
// temp[coeff] -= bandH[it - 1][jt - 1] * qs[band_indx_j][coeff]; | ||
temp[coeff] -= bandH[it - 1][jt - 1] * qs[N * band_indx_j + coeff]; | ||
temp[coeff] -= | ||
bandH[(it - 1) * nLanIts + jt - 1] * qs[N * band_indx_j + coeff]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you chose (what looks like to me) row-major for bandH
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just the similarity with how I was linearizing qs
(where each vector is stored contiguously). In the case of bandH
, since it's a Hermitian matrix, it does not really matter. But I can change it if you prefer to have all matrices in column-major.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All depends on what is the most efficient storage scheme - i.e. which scheme will most likely give you the most opportunities for contiguous data access. It doesn't look like it particularly matters here.
However, it looks like this operation can be made a bit more efficient - correct me if I'm wrong, but your computing something that looks like
temp(:) -= alpha * qs(:,band_idx_j)
where alpha = bandH(it-1, jt-1)
. That's an axpy
. Looks like it can be replaced on line 207 too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
include/macis/gf/bandlan.hpp
Outdated
// temp[coeff] -= bandH[it - 1][jt - 1] * qs[band_indx_j][coeff]; | ||
temp[coeff] -= bandH[it - 1][jt - 1] * qs[N * band_indx_j + coeff]; | ||
temp[coeff] -= | ||
bandH[(it - 1) * nLanIts + jt - 1] * qs[N * band_indx_j + coeff]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All depends on what is the most efficient storage scheme - i.e. which scheme will most likely give you the most opportunities for contiguous data access. It doesn't look like it particularly matters here.
However, it looks like this operation can be made a bit more efficient - correct me if I'm wrong, but your computing something that looks like
temp(:) -= alpha * qs(:,band_idx_j)
where alpha = bandH(it-1, jt-1)
. That's an axpy
. Looks like it can be replaced on line 207 too
include/macis/gf/bandlan.hpp
Outdated
* @date 25/04/2022 | ||
*/ | ||
template <class Cont, class Functor> | ||
void MyBandLan(const Functor &H, std::vector<Cont> &qs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming conventions of My<Func>
should get swapped out for something more meaningful. I think BandLan
works fine here.
include/macis/gf/bandlan.hpp
Outdated
<< std::endl; | ||
} else { | ||
#pragma omp parallel for | ||
for(size_t coeff = 0; coeff < temp.size(); coeff++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an axpy
with alpha = 1.0/bandH(it-1,jt-1)
and beta = 0
include/macis/gf/bandlan.hpp
Outdated
bandH.resize(nLanIts * nLanIts); | ||
break; | ||
#pragma omp parallel for | ||
for(size_t coeff = 0; coeff < temp.size(); coeff++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd either use scal
with alpha=0
or a memset
here since it's contiguous
|
||
for(int it = 1; it <= nLanIts; it++) { | ||
int band_indx_i = | ||
true_indx[it]; // TO WHAT ELEMENT OF THE VECTOR SET DO WE APPLY THIS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this indirection here? Generally we want to avoid things like this (to e.g. allow for better usage of Level-3 BLAS when possible), but if its completely necessary, it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the use of the effective index true_indx
? (I'm afraid I don't know what you mean by indirection)
This is what I came up with to apply the matvecs in the right order to the right vector, since in band Lanczos, unlike in the regular one, at every iteration you act on a different vector in your set of nbands
. Why is this a problem for BLAS? Shouldn't it be fine as long as you pass the right memory sector to act on?
I guess that a way to avoid this true_indx
variable could be to turn the for loop over it
into two, one external loop going over nLanIts // nbands
and one internal loop going over nbands
. I think that should be completely equivalent, and then the index in the internal loop would be essentially iterating over which vector in the basis to act on with the Hamiltonian. Do you think this would make the code better?
…ze the evaluation of G[w] for each frequency in the single band case. Removed MyInnProd header, substituted by lambdas for blas::dot.
…d core space determinants into the ASCI ranking (to allow for them to be substituted).
…y models. Test case added.
… be used in GF calculation.
…ackpp was not returning the right eigenvectors, but the eigenvectors to an intermediat tridiagonal matrix.
…ls LOBPCG with random guess vectors
…ompatible with comlpex wave functions. The normal Lanczos calculation of off-diagonal elements is not appropriate for complex wave functions, though.
These are the basic functionalities to compute one-body Green's functions for a selected CI wave function, as described in Phys. Rev. B 100, 125165 (2019) and J. Chem. Phys. 154, 121101 (2021). It implements both a continuous fraction based Lanczos and band Lanczos method. The console output needs to be adapted to match the one used in the rest of the code, and unit testing is required (so far, only a small example testing is possible within the standalone_driver.cxx test code).