In the initial `SWIFTGalaxies` iterator class, masks are calculated for each galaxy here:

swiftgalaxy/swiftgalaxy/iterator.py, lines 310 to 312 in 2e9e477

within the loop over galaxies. This means that for each galaxy we evaluate:

swiftgalaxy/swiftgalaxy/halo_catalogues.py, lines 274 to 289 in 2e9e477

The `==` operation is fairly expensive. Perhaps the masks can be pre-computed for all target galaxies in a region just after the data preloading loop:

swiftgalaxy/swiftgalaxy/iterator.py, line 304 in 2e9e477
Here, instead of looping over the galaxies with `==` and finding the matches in `group_nr_bound`, a more efficient solution needs to be found. The inputs are:

- the target galaxies in the region (contained in `solution["region_target_indices"]`, need to be converted to halo catalogue indices by looking up the corresponding rows in `self.halo_catalogue.input_halos.halo_catalogue_index`);
- the particle group membership information, accessible as `self._server.gas._particle_dataset.group_nr_bound` and similar for other particle types.

The desired output is:

- a list of masks (`[True, False, False, ...]`), one for each particle type, that pick out the particles bound to each galaxy in the list of targets for this region.

This needs to be calculated more efficiently than a loop over `==` or similar operation for this improvement to make sense. Probably this is a clever usage of `numpy.unique(..., return_inverse=True)`.

A good starting point would be making some dummy data for some target IDs (say an array of ~10 integers), then a big array of integers containing those 10 integers many times each (plus some other integers that are not the ones searched for), and trying to get out the corresponding masks as efficiently as possible (see if `numpy.unique` outperforms a loop over `==`, for example).

All of this optimization only makes sense for the `bound_only` mask option, so will need to consider if/how to support other modes, and definitely only do this in the `bound_only` mode.
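As a possible starting point for the dummy-data experiment, here is a minimal sketch. It compares the baseline loop over `==` with one candidate alternative: sorting the group numbers once with `numpy.argsort` and locating each target's block with `numpy.searchsorted` (a sort-based grouping, which is also what `numpy.unique` relies on internally). The function names `masks_by_loop` and `masks_by_sort` and all the array contents are made up for illustration and are not part of swiftgalaxy. Whether the sorted version is actually faster is exactly what would need to be measured, since the one-off sort has its own cost.

```python
import numpy as np


def masks_by_loop(group_nr, target_ids):
    """Baseline: one full-array comparison per target galaxy."""
    return [group_nr == t for t in target_ids]


def masks_by_sort(group_nr, target_ids):
    """Sort the group numbers once, then look up each target's block of particles."""
    # Particle indices reordered so that equal group numbers are contiguous.
    order = np.argsort(group_nr, kind="stable")
    sorted_nr = group_nr[order]
    # Half-open range [start, end) of each target in the sorted array;
    # a target with no particles gets start == end.
    starts = np.searchsorted(sorted_nr, target_ids, side="left")
    ends = np.searchsorted(sorted_nr, target_ids, side="right")
    masks = []
    for start, end in zip(starts, ends):
        mask = np.zeros(group_nr.size, dtype=bool)
        mask[order[start:end]] = True
        masks.append(mask)
    return masks


# Dummy data (hypothetical values, chosen only for this test).
rng = np.random.default_rng(0)
target_ids = np.arange(100, 110)  # ~10 target galaxies
group_nr = rng.integers(0, 1000, size=1_000_000)  # includes many non-target groups

loop_masks = masks_by_loop(group_nr, target_ids)
sort_masks = masks_by_sort(group_nr, target_ids)
all_equal = all(np.array_equal(a, b) for a, b in zip(loop_masks, sort_masks))
print("masks agree:", all_equal)
```

Both functions can be timed with `timeit` for different numbers of targets and particles. The sort is paid once per region, so it may only pay off when a region contains many target galaxies. The slices `order[start:end]` are already integer index arrays, which might be usable directly in place of boolean masks.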