Modified CLASS not working with parallelization #590

subhajitghosh-phy · 2024-08-28T16:14:56Z

Hi,

I am modifying the latest version of CLASS to incorporate massive neutrino self-interaction. I implemented a new tight coupling approximation (TCA) for the ncdm species.

My code works fine on a single core. However, when I turn on parallelization (with export OMP_NUM_THREADS=n, where n>1), the code fails randomly.

I say randomly because, for the same parameter values, the code sometimes runs and sometimes fails. The error typically looks like the one below, which I am guessing comes from the new TCA for ncdm. It seems that the code is not going through the TCA conditions properly (this error is expected if there is no TCA).

Error in perturbations_init
=>operator()(L:954) :error in perturbations_solve(ppr, pba, pth, ppt, index_md, index_ic, index_k, &pw);
=>perturbations_solve(L:3264) :error in generic_evolver(perturbations_derivs, interval_limit[index_interval], interval_limit[index_interval+1], ppw->pv->y, ppw->pv->used_in_sources, ppw->pv->pt_size, &ppaw, ppr->tol_perturbations_integration, ppr->smallest_allowed_variation, perturbations_timescale, ppr->perturbations_integration_stepsize, ppt->tau_sampling, tau_actual_size, perturbations_sources, perhaps_print_variables, ppt->error_message);
=>evolver_ndf15(L:468) :condition (absh <= hmin) is true; Step size too small: step:9.0382e-14, minimum:9.0382e-14, in interval: [0.279812:56.4887]

I tried to debug this by turning on a higher perturbation verbosity. It seems that when parallelization is on, perturbations_solve randomly gives an error for some of the k values in the index_k loop: sometimes just one k, sometimes several. To reiterate, the code works just fine on a single core when I set 'export OMP_NUM_THREADS=1'.
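Since the failure is intermittent, one way to quantify it is to run the same parameter file many times and count the failed runs. Below is a minimal sketch of such a helper, not part of CLASS. The function name count_failures and the file name my_model.ini are hypothetical, and the sketch assumes the executable returns a nonzero exit status whenever it prints an error.

```shell
# Hypothetical helper (not part of CLASS): run a command n times and
# print how many of those runs exited with a nonzero status.
# Usage: count_failures <n_runs> <command> [args...]
# Note: plain POSIX sh, so the variables below are global to the shell.
count_failures() {
  n_runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$n_runs" ]; do
    # Discard the output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, after 'export OMP_NUM_THREADS=4', calling 'count_failures 20 ./class my_model.ini' would print the number of failed runs out of 20. Repeating this with OMP_NUM_THREADS=1 gives a baseline to compare against.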

Any idea what the source of this may be, or any suggestion on how to debug it? To give more detail: in the TCA for ncdm, I am just restricting l_max for ncdm.

Thanks in advance.

Best,
Subhajit

subhajitghosh-phy · 2024-09-12T23:02:40Z

I have an update regarding this. If I choose the rk evolver (setting evolver = 0), the code runs fine.

However, this is not a practical solution for me: after switching off TCA, the equations are still a bit stiff for the rk evolver in some relevant regions of parameter space (for most of the parameter space it works fine).

Now this is confusing: why does the rk evolver work with parallelization while ndf15 doesn't in my modified code?

I noticed that in the last few commits the OpenMP parallelization has been modified (actually removed). Could that be related to the error? Help would be very much appreciated.

Best,
Subhajit
