Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Abstract the Monte Carlo coordinates #3767

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions src/Particle/ParticleBase/RandomSeqGenerator.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,12 @@ inline void makeGaussRandomWithEngine(std::vector<TinyVector<T, D>>& a, RG& rng)
assignGaussRand(&(a[0][0]), a.size() * D, rng);
}

template<typename T, class RG>
inline void makeGaussRandomWithEngine(std::vector<T>& a, RG& rng)
{
assignGaussRand(&(a[0]), a.size(), rng);
}

template<typename T, class RG>
inline void makeGaussRandomWithEngine(ParticleAttrib<T>& a, RG& rng)
{
Expand Down
1 change: 1 addition & 0 deletions src/QMCDrivers/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ set(QMCDRIVERS
QMCUpdateBase.cpp
GreenFunctionModifiers/DriftModifierBuilder.cpp
GreenFunctionModifiers/DriftModifierUNR.cpp
MCCoords.cpp
VMC/VMCUpdatePbyP.cpp
VMC/VMCUpdateAll.cpp
VMC/VMCFactory.cpp
Expand Down
22 changes: 6 additions & 16 deletions src/QMCDrivers/ContextForSteps.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
// This file is distributed under the University of Illinois/NCSA Open Source License.
// See LICENSE file in top directory for details.
//
// Copyright (c) 2019 developers.
// Copyright (c) 2022 developers.
//
// File developed by: Peter Doak, [email protected], Oak Ridge National Laboratory
//
Expand All @@ -14,22 +14,12 @@

namespace qmcplusplus
{

ContextForSteps::ContextForSteps(int num_walkers,
int num_particles,
std::vector<std::pair<int, int>> particle_group_indexes,
RandomGenerator& random_gen)
int num_particles,
std::vector<std::pair<int, int>> particle_group_indexes,
RandomGenerator& random_gen)
: particle_group_indexes_(particle_group_indexes), random_gen_(random_gen)
{
/** glambda to create type T with constructor T(int) and put in it unique_ptr
*
* captures num_particles to use as argument to constructor
* gets T for type unique_ptr unique is templated on
*/
auto constructT = [num_particles](auto& unique) {
unique.reset(new typename std::remove_pointer<decltype(unique.get())>::type(num_particles));
};

walker_deltas_.resize(num_walkers * num_particles);
}
{}

} // namespace qmcplusplus
17 changes: 3 additions & 14 deletions src/QMCDrivers/ContextForSteps.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
// This file is distributed under the University of Illinois/NCSA Open Source License.
// See LICENSE file in top directory for details.
//
// Copyright (c) 2019 developers.
// Copyright (c) 2022 developers.
//
// File developed by: Peter Doak, [email protected], Oak Ridge National Laboratory
//
Expand All @@ -19,9 +19,11 @@
#include "Particle/Walker.h"
#include "QMCDrivers/Crowd.h"
#include "ParticleBase/RandomSeqGenerator.h"
#include "MCCoords.hpp"

namespace qmcplusplus
{

/** Thread local context for moving walkers
*
* created once per driver per crowd
Expand All @@ -46,22 +48,9 @@ class ContextForSteps
int get_num_groups() const { return particle_group_indexes_.size(); }
RandomGenerator& get_random_gen() { return random_gen_; }

void nextDeltaRs(size_t num_rs)
{
// hate to repeat this pattern, this should never resize.
walker_deltas_.resize(num_rs);
makeGaussRandomWithEngine(walker_deltas_, random_gen_);
}

std::vector<PosType>& get_walker_deltas() { return walker_deltas_; }
auto deltaRsBegin() { return walker_deltas_.begin(); };

int getPtclGroupStart(int group) const { return particle_group_indexes_[group].first; }
int getPtclGroupEnd(int group) const { return particle_group_indexes_[group].second; }

protected:
std::vector<PosType> walker_deltas_;

/** indexes of start and stop of each particle group;
*
* Seems like these should be iterators but haven't thought through the implications.
Expand Down
84 changes: 56 additions & 28 deletions src/QMCDrivers/DMC/DMCBatched.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
#include "Utilities/ProgressReportEngine.h"
#include "QMCDrivers/DMC/WalkerControl.h"
#include "QMCDrivers/SFNBranch.h"
#include "QMCDrivers/ContextForSteps.h"
#include "MemoryUsage.h"

namespace qmcplusplus
Expand Down Expand Up @@ -60,6 +61,7 @@ void DMCBatched::setNonLocalMoveHandler(QMCHamiltonian& golden_hamiltonian)
dmcdriver_input_.get_alpha(), dmcdriver_input_.get_gamma());
}

template<class MCCOORDS>
void DMCBatched::advanceWalkers(const StateForThread& sft,
Crowd& crowd,
DriverTimers& timers,
Expand Down Expand Up @@ -103,13 +105,13 @@ void DMCBatched::advanceWalkers(const StateForThread& sft,
const int num_walkers = crowd.size();

//This generates an entire steps worth of deltas.
step_context.nextDeltaRs(num_walkers * sft.population.get_num_particles());
auto it_delta_r = step_context.deltaRsBegin();
std::size_t num_deltas = num_walkers * sft.population.get_num_particles();

std::vector<TrialWaveFunction::GradType> grads_now(num_walkers, TrialWaveFunction::GradType(0.0));
std::vector<TrialWaveFunction::GradType> grads_new(num_walkers, TrialWaveFunction::GradType(0.0));
std::vector<TrialWaveFunction::PsiValueType> ratios(num_walkers, TrialWaveFunction::PsiValueType(0.0));
std::vector<PosType> drifts(num_walkers, 0.0);
auto deltas = generateDeltas<MCCOORDS>(step_context.get_random_gen(), num_deltas);

std::vector<RealType> log_gf(num_walkers, 0.0);
std::vector<RealType> log_gb(num_walkers, 0.0);
std::vector<RealType> prob(num_walkers, 0.0);
Expand All @@ -130,18 +132,28 @@ void DMCBatched::advanceWalkers(const StateForThread& sft,
ScopedTimer pbyp_local_timer(timers.movepbyp_timer);
for (int ig = 0; ig < step_context.get_num_groups(); ++ig)
{
RealType tauovermass = sft.qmcdrv_input.get_tau() * sft.population.get_ptclgrp_inv_mass()[ig];
RealType oneover2tau = 0.5 / (tauovermass);
RealType sqrttau = std::sqrt(tauovermass);
auto createTaus = [&](int ig) -> Taus<RealType, MCCOORDS::mct> {
if constexpr (std::is_same<MCCOORDS, MCCoords<MCCoordsTypes::RSSPINS>>::value)
return Taus<RealType, MCCOORDS::mct>(sft.qmcdrv_input.get_tau(), sft.population.get_ptclgrp_inv_mass()[ig],
sft.qmcdrv_input.get_spin_mass());
else
return Taus<RealType, MCCOORDS::mct>(sft.qmcdrv_input.get_tau(), sft.population.get_ptclgrp_inv_mass()[ig]);
};

Taus<RealType, MCCOORDS::mct> taus = createTaus(ig);

twf_dispatcher.flex_prepareGroup(walker_twfs, walker_elecs, ig);

int start_index = step_context.getPtclGroupStart(ig);
int end_index = step_context.getPtclGroupEnd(ig);
int start_index = step_context.getPtclGroupStart(ig);
int end_index = step_context.getPtclGroupEnd(ig);
std::size_t delta_offset = 0;
// end_index is one beyond last index to the group
MCCOORDS drifts;
drifts.resize(end_index - start_index - 1);
for (int iat = start_index; iat < end_index; ++iat)
{
auto delta_r_start = it_delta_r + iat * num_walkers;
auto delta_r_end = delta_r_start + num_walkers;
auto delta_start = delta_offset + iat * num_walkers;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to avoid these intermediate _start _end. You know how much I dislike iterators.

auto delta_end = delta_start + num_walkers;

//This is very useful thing to be able to look at in the debugger
#ifndef NDEBUG
Expand All @@ -154,17 +166,22 @@ void DMCBatched::advanceWalkers(const StateForThread& sft,
#endif
//get the displacement
twf_dispatcher.flex_evalGrad(walker_twfs, walker_elecs, iat, grads_now);
sft.drift_modifier.getDrifts(tauovermass, grads_now, drifts);
sft.drift_modifier.getDrifts(taus.tauovermass, grads_now, drifts);

std::transform(drifts.begin(), drifts.end(), delta_r_start, drifts.begin(),
[sqrttau](PosType& drift, PosType& delta_r) { return drift + (sqrttau * delta_r); });

// std::transform(drifts.begin(), drifts.end(), delta_r_start, drifts.begin(),
// [sqrttau](PosType& drift, PosType& delta_r) { return drift + (sqrttau * delta_r); });

// only DMC does this
// TODO: rr needs a real name
std::vector<RealType> rr(num_walkers, 0.0);
assert(rr.size() == delta_r_end - delta_r_start);
std::transform(delta_r_start, delta_r_end, rr.begin(),
[tauovermass](auto& delta_r) { return tauovermass * dot(delta_r, delta_r); });
assert(rr.size() == delta_end - delta_start - 1);
for(std::size_t ip = delta_start; ip < delta_end; ++ip)
{
rr[ip] = taus.tauovermass * dot(deltas.rs[ip], deltas.rs[ip]);
}
//std::transform(delta_start, delta_end, rr.begin(),
// [tauovermass](auto& delta_r) { return tauovermass * dot(delta_r, delta_r); });

// in DMC this was done here, changed to match VMCBatched pending factoring to common source
// if (rr > m_r2max)
Expand All @@ -176,7 +193,7 @@ void DMCBatched::advanceWalkers(const StateForThread& sft,
for (int i = 0; i < rr.size(); ++i)
assert(std::isfinite(rr[i]));
#endif
ps_dispatcher.flex_makeMove(walker_elecs, iat, drifts);
ps_dispatcher.flex_makeMove(walker_elecs, iat, drifts.rs);

twf_dispatcher.flex_calcRatioGrad(walker_twfs, walker_elecs, iat, ratios, grads_new);

Expand All @@ -196,18 +213,25 @@ void DMCBatched::advanceWalkers(const StateForThread& sft,
rr_proposed[iw] += rr[iw];
}

std::transform(delta_r_start, delta_r_end, log_gf.begin(), [](auto& delta_r) {
for(std::size_t ip = delta_start; ip < delta_end; ++ip)
{
constexpr RealType mhalf(-0.5);
return mhalf * dot(delta_r, delta_r);
});
log_gf[ip] = mhalf * dot(deltas.rs[ip], deltas.rs[ip]);
}
// std::transform(delta_r_start, delta_r_end, log_gf.begin(), [](auto& delta_r) {
// constexpr RealType mhalf(-0.5);
// return mhalf * dot(delta_r, delta_r);
// });

sft.drift_modifier.getDrifts(tauovermass, grads_new, drifts);
sft.drift_modifier.getDrifts(taus.tauovermass, grads_new, drifts);

std::transform(crowd.beginElectrons(), crowd.endElectrons(), drifts.begin(), drifts.begin(),
[iat](auto& elecs, auto& drift) { return elecs.get().R[iat] - elecs.get().getActivePos() - drift; });
std::transform(crowd.beginElectrons(), crowd.endElectrons(), drifts.rs.begin(), drifts.rs.begin(),
[iat](auto& elecs, auto& drift) {
return elecs.get().R[iat] - elecs.get().getActivePos() - drift;
});

std::transform(drifts.begin(), drifts.end(), log_gb.begin(),
[oneover2tau](auto& drift) { return -oneover2tau * dot(drift, drift); });
std::transform(drifts.rs.begin(), drifts.rs.end(), log_gb.begin(),
[taus](auto& drift) { return -taus.oneover2tau * dot(drift, drift); });

for (int iw = 0; iw < num_walkers; ++iw)
prob[iw] = std::norm(ratios[iw]) * std::exp(log_gb[iw] - log_gf[iw]);
Expand Down Expand Up @@ -344,7 +368,11 @@ void DMCBatched::runDMCStep(int crowd_id,
// Are we entering the the last step of a block to recompute at?
const bool recompute_this_step = (sft.is_recomputing_block && (step + 1) == max_steps);
const bool accumulate_this_step = true;
advanceWalkers(sft, crowd, timers, dmc_timers, *context_for_steps[crowd_id], recompute_this_step,
if(sft.population.get_golden_electrons()->isSpinor())
advanceWalkers<MCCoords<MCCoordsTypes::RSSPINS>>(sft, crowd, timers, dmc_timers, *context_for_steps[crowd_id], recompute_this_step,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to have advanceWalkers<MCCoordsTypes::RSSPIN>

accumulate_this_step);
else
advanceWalkers<MCCoords<MCCoordsTypes::RS>>(sft, crowd, timers, dmc_timers, *context_for_steps[crowd_id], recompute_this_step,
accumulate_this_step);
}

Expand Down Expand Up @@ -410,7 +438,7 @@ bool DMCBatched::run()
{ // walker initialization
ScopedTimer local_timer(timers_.init_walkers_timer);
ParallelExecutor<> section_start_task;
section_start_task(crowds_.size(), initialLogEvaluation, std::ref(crowds_), std::ref(step_contexts_));
section_start_task(crowds_.size(), initialLogEvaluation, std::ref(crowds_), step_contexts_);
}

print_mem("DMCBatched after initialLogEvaluation", app_summary());
Expand Down Expand Up @@ -444,7 +472,7 @@ bool DMCBatched::run()
{
ScopedTimer local_timer(timers_.run_steps_timer);
dmc_state.step = step;
crowd_task(crowds_.size(), runDMCStep, dmc_state, timers_, dmc_timers_, std::ref(step_contexts_),
crowd_task(crowds_.size(), runDMCStep, dmc_state, timers_, dmc_timers_, step_contexts_,
std::ref(crowds_));

{
Expand Down
5 changes: 3 additions & 2 deletions src/QMCDrivers/DMC/DMCBatched.h
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ class DMCBatched : public QMCDriverNew
void process(xmlNodePtr cur) override;

bool run() override;

// This is the task body executed at crowd scope
// it does not have access to object members by design
static void runDMCStep(int crowd_id,
Expand Down Expand Up @@ -132,6 +132,7 @@ class DMCBatched : public QMCDriverNew
///walker controller for load-balance
std::unique_ptr<WalkerControl> walker_controller_;

template<class MCOORDS>
static void advanceWalkers(const StateForThread& sft,
Crowd& crowd,
DriverTimers& timers,
Expand All @@ -142,7 +143,7 @@ class DMCBatched : public QMCDriverNew

friend class qmcplusplus::testing::DMCBatchedTest;
};

} // namespace qmcplusplus

#endif
8 changes: 6 additions & 2 deletions src/QMCDrivers/GreenFunctionModifiers/DriftModifierBase.h
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,13 @@
#include "Particle/ParticleSet.h"
#include "QMCWaveFunctions/TrialWaveFunction.h"
#include "QMCHamiltonians/QMCHamiltonian.h"
#include "MCCoords.hpp"

namespace qmcplusplus
{
/// this class implements drift modification
/** this class implements drift modification
* its a completely pointless type erasue.
*/
class DriftModifierBase
{
public:
Expand All @@ -38,7 +41,8 @@ class DriftModifierBase

virtual void getDrift(RealType tau, const ComplexType& qf, ParticleSet::Scalar_t& drift) const = 0;

virtual void getDrifts(RealType tau, const std::vector<GradType>& qf, std::vector<PosType>&) const = 0;
virtual void getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RS>&) const = 0;
virtual void getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RSSPINS>&) const = 0;

virtual bool parseXML(xmlNodePtr cur) { return true; }

Expand Down
12 changes: 10 additions & 2 deletions src/QMCDrivers/GreenFunctionModifiers/DriftModifierUNR.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,11 +82,19 @@ void DriftModifierUNR::getDrift(RealType tau, const ComplexType& qf, ParticleSet
#endif
}

void DriftModifierUNR::getDrifts(RealType tau, const std::vector<GradType>& qf, std::vector<PosType>& drift) const
void DriftModifierUNR::getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RS>& drift) const
{
for (int i = 0; i < qf.size(); ++i)
{
getDrift(tau, qf[i], drift[i]);
getDrift(tau, qf[i], drift.rs[i]);
}
}

void DriftModifierUNR::getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RSSPINS>& drift) const
{
for (int i = 0; i < qf.size(); ++i)
{
getDrift(tau, qf[i], drift.rs[i]);
}
}

Expand Down
7 changes: 5 additions & 2 deletions src/QMCDrivers/GreenFunctionModifiers/DriftModifierUNR.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
#define QMCPLUSPLUS_DRIFTMODIFIER_UNR_H

#include "QMCDrivers/GreenFunctionModifiers/DriftModifierBase.h"
#include "QMCDrivers/MCCoords.hpp"

namespace qmcplusplus
{
Expand All @@ -23,8 +24,9 @@ class DriftModifierUNR : public DriftModifierBase
using RealType = QMCTraits::RealType;
using PosType = QMCTraits::PosType;

void getDrifts(RealType tau, const std::vector<GradType>& qf, std::vector<PosType>&) const final;

void getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RSSPINS>&) const final;
void getDrifts(RealType tau, const std::vector<GradType>& qf, MCCoords<MCCoordsTypes::RS>&) const final;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this pattern much more than conexpr if. Namely, API becomes reactive. Cody has refactored the ParticleSet APIs
mw_makeMoveWithSpin has been replaced with mw_makeSpinMove.
The old mw_makeMoveWithSpin can be achieved with mw_makeMove and mw_makeSpinMove.
Then I think we can introduce

makeMove(MCCoords<MCCoordsTypes::RSSPINS>&)
makeMove(MCCoords<MCCoordsTypes::RS>&)

on top of existing APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much prefer a templated function but because of the unecessary virtual inheritance in the DriftModifiers we are stuck with the overloading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you want is a templates function with specialization without the class being templated. We have to go for CRTP I guess. Then do we really want the whole class being templated? Since we anyway need to specialize the function. Just putting distinct functions is totally fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in many cases we don't want the whole class templated. If we do maybe CRTP is useful maybe we actually really gain something from the virtual inheritance.

But drift modifier would be better without inheritance. I think this a case that clearly would be better written with if constrexpr and no vtable. The overloads are going to cost several vtable calls without repeating code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean a ParticleSet::makeMove overload set?
That I think makes sense.
Hiding the QMC algorithm in another state machine I'm not in favor of.


void getDrift(RealType tau, const GradType& qf, PosType& drift) const final;

void getDrift(RealType tau, const ComplexType& qf, ParticleSet::Scalar_t& drift) const final;
Expand All @@ -38,6 +40,7 @@ class DriftModifierUNR : public DriftModifierBase
RealType a_;
};


} // namespace qmcplusplus

#endif
Loading