Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stochasticity #349

Merged
merged 212 commits into from
Mar 27, 2024
Merged

Stochasticity #349

merged 212 commits into from
Mar 27, 2024

Conversation

daviesje
Copy link
Contributor

@daviesje daviesje commented Dec 4, 2023

DRAFT PR FOR v4

This update focuses around two things:

  • The new sampler for sub-grid halo sources, meant to include stochasticity below the grid scale
  • reorganisation of the C code to be more modular, making things easier for future development

The state of the halo model currently:

  • We can produce halo populations which conform to a given conditional mass function, and which remain self-consistent across cosmic time. There is a drift from the given CMF at very high redshifts (~10% z=20) which is worse for small timesteps. work to improve the accuracy and speed of this method is ongoing.

  • There is an increased memory requirement. While halo catalogues down to 10^8 are of order ~2GB, feedback requires that halo mass catalogues are generated backward in time, while the stellar component is generated forward in time. This can be fixed with a proper purging routine implemented for the HaloField structs (in progress).

  • Halos larger than the cell size are generated with DexM, and there are some issues there. I find that trying to generate these catalogues on the low-density grid produces a factor of ~10 too few halos, and generating with Press-Schechter mass function results in a factor of ~3 too many. Only Sheth-Tormen on the highres grid is consistent, I am working to improve both the threading of this calculation and flexibility with resolution and HMF (while we expect the model to perform worse on low resolution I believe there are also some bugs here).

  • The only available CMF currently is EPS with rescaling to fit the collapsed fraction of another MF. I plan to include fits for a Sheth-Torment CMF, as well as rescaling options which have been used in the literature to create a CMF (both rescalings of the EPS CMF i.e tree augmentation methods, and rescalings of UMF to a CMF i.e analytic models).

  • I have placed test outputs in a single function at the end of Stochasticity.c which obviously needs to be converted to proper tests before the merge

State of the C modularisation

  • The spin temperature code has been the main focus of modularisation. there is still work to be done there but I believe it is much cleaner than before.
  • I have created a new file, interpolation.c, to hold interpolation routines and table functions. This will allow us to combine a few tables which currently hold the same values but have different generation functions. I have moved most of the Ts.c tables here and will continue to move tables over as I cleanup the other functions
  • modularisation of IonisationBox.c has not yet started

Biggest Issues

  • Going backward in time with the halo model makes the recursive generation very painful, everything runs fine with run_coeval or run_lightcone however calling Ionisation_box at z=6 with HALO_STOCHASTICITY on does not work well, it has been disabled until I find a way to do this properly
  • There is a new output object, XraySourceBox, due to the requirement of halos at multiple redshifts I have separated the source box generation (filtered SFRD) from the spin temperature calculation. I aim to move the non-halo case here as well, which will allow us to no-longer rely on linear growth when making these boxes, and use annular filtering for more self-consistency. However the implementation is very hacky at the moment, calling _compute() multiple times over a series of inputs, and setting the _computed_in_mem to False. There should be a better way to calculate a box with multiple input redshifts (In my opinion if we want to move some functions to Jax this is probably a good first candidate).
  • I have implemented minihalos in a way which is similar to the default case, where there are two source components with different stellar/spectral parameters based on halo mass. This should produce similar results but is conceptually strange, where each halo has both ACG and MCG components in different fractions based on the halo mass.
  • The implementation of minihalos also presents a memory issue, as the number of halos can increase dramatically at lower masses. There is a mean source box calculation I can alter to implement a sub-resolution model. I also want to look into chunking via Dask or similar, as the halo calculations are independent.

MISC

  • The photon conservation model has given us a lot of grief. Shifting the redshift of a halo-based source box has required a lot of messy implementation, and its use with recombination and minihalo flags is difficult. As a stopgap there is a very simple model which adjust the escape fraction parameters instead of the redshift and produces expected results, this simple model needs some cleanup, and the original photon conservation model needs some work to be useful with the new model.

@daviesje
Copy link
Contributor Author

When the halo sampler is switched off, results very closely resemble the master branch, I have noticed only < 10^-3 level differences when examining global quantities and individual cells. However there seem to be bigger differences in various power spectra. Attached are paired lightcone and global plots for three cases, default parameters (titled _KS_OLD[_MASTER]), setting USE_MASS_DEPENDENT_ZETA=True (titled _KS[_MASTER]) and USE_MINI_HALOS=TRUE (titled KS_MINI[MASTER]).

lc_global

lc_slice

Most of the tests fail since the interpolation tables and integral structure has changed, causing floating point errors (some float -> double conversion has been done, and many interpolation tables have different binning). The failures without the spin temperature fluctuations have very small differences at the smallest scales in the power spectra.

tests.test_integration_features.py--test_power_spectra_lightcone[mdzeta].pdf

With the spin temperature fluctuations, global values appear identical but power spectra show significant differences at moderate and small scales. This difference is not something I've been able to track down looking at individual cells, but it seems to not affect the history very much.

tests.test_integration_features.py--test_power_spectra_lightcone[tsfluct].pdf

tests.test_integration_features.py--test_power_spectra_lightcone[mini_halos].pdf

Obviously, with USE_HALO_FIELD=True, the results are very different. I am currently performing some runs with just DexM on this branch to examine but these are very slow. I would expect differences from the new gridding structure and the different implementation of the halos in ComputeIonizedBox.

tests.test_integration_features.py--test_power_spectra_lightcone[halo_field].pdf

@steven-murray
Copy link
Member

@daviesje this is looking great, thanks for the plots!

I think we need to bring in @andreimesinger here to weigh in on what kind of tolerance we should have for changes. The good thing is that the simplest cases (no spin temp, no halos) is working great. Happy to give that my stamp of approval.

Even with the spin temperature fluctuations, I think we're doing pretty well. The differences, as expected, are only evident in the fields dependent on Ts. The mini-halos example is a little concerning (~100% differences in some spectra), so I'd like @andreimesinger to weigh in there.

When using the halo field, we expect differences, and we see them. However, we should be careful to understand what kind of differences we might expect, and whether their magnitude is reasonable here. I'm seeing ~20% differences in the globally-averaged quantities which I don't think I expected, but perhaps I'm too naive.

@daviesje
Copy link
Contributor Author

To give a little more context, the halo field differences will always be a bit of an apples-to-oranges comparison. In the master branch, the halo field is converted to collapsed fraction by dividing the halo mass in a cell by the cell mass, then either the constant zeta is applied (if USE_MASS_DEPENDENT_ZETA=False) or it is assumed all that mass is at 10^10 solar and the constant escape and stellar fractions are used (if USE_MASS_DEPENDENT_ZETA=True). The new model assigns stellar mass and sfr to the halos in accordance with the Park+19 model, with or without the lognormal scatter depending on parameters. For debugging I could create an option which uses the constant zeta with the halo boxes but I don't see this ever being useful.

@daviesje daviesje marked this pull request as ready for review March 27, 2024 11:17
@steven-murray steven-murray merged commit b727e9a into 21cmfast:v4-prep Mar 27, 2024
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants