-
Notifications
You must be signed in to change notification settings - Fork 16
Neighbourhood methods
Neighbourhood methods compute for each gridpoint a statistic based on values within a neighbourhood surrounding this gridpoint.
The function in gridpp looks like this:
values = gridpp.neighbourhood(input, radius, statistic)
These methods can be computationally expensive. A naiive method has computational complexity of O(GR^2), where G is the number of gridpoints and R is the full-width of the neighbourhood. Gridpp uses several tricks and approximations to improve the computational speed. To improve speed, the main strategy is to find ways that reuse calculations that have been done for one neighbourhood when processing another neighbourhood.
To find the neighbourhood mean, the domain is accumulated from the lower left corner to the upper right. This algorithm is O(G) and is independent of the neighbourhood size. A neighbourhood mean can be computed by the values at the 4 corners of the neighbourhood:
v = (upper_left + lower_left - upper_right - lower_right) / R / R
To find the neighbourhood standard deviation or variance, the following relationship is used:
variance = E[x^2] - E[x]^2
That is, the variance is efficiently compute by calculating two types of neighbourhood means and taking the difference.
To find the neighbourhood minimum or maximum, the domain is split into Nx1 slices. A neighbourhood is then represented by R of these Rx1 slices. The minimum (maximum) is computed for each slice, and then a minimum (maximum) of R of the Rx1 slices is computed. This saves time because the Nx1 minimums (maximums) are reused R times. This algorithm has time complexity O(GRlog(R)). Neighbourhood min, mean, and max for on square neighbourhoods can be calculated as follows:
radius = 7
ovalues = gridpp.neighbourhood(ivalues, radius, gridpp.Mean)
To find a neighbourhood quantile (such as the median), there is no way to reuse the calculations from one neighbourhood when processing the next. This is because the whole neighbourhood must be sorted. Instead, gridpp uses a method that gives an approximate value. Gridpp computes the fraction of cells that exceed a set of T thresholds. The quantile can then be approximated by linearly interpolating between these fractions. Gridpp can also compute the exact neighbourhood quantile, without approximation if desired. This has time complexity O(GT). [Brute force].
Neighbourhoods for ensembles add the ensebmle dimension to the neighbourhood. When computing a statistic, the statistic is calculated by pooling all ensemble members together, creating a single field (instead of one field for each ensemble member).
To calculate quantiles, use the neighbourhood_quantile
method:
radius = 7
quantile = 0.9
num_thresholds = 20
ovalues = gridpp.neighbourhood_quantile(ivalues, quantile, radius, num_thresholds)
To calculate ensemble neighbourhoods, use the _ens
functions:
ivalues_ens = np.random.rand([100, 80, 10]) # 10 member ensemble
radius = 7
ovalues = gridpp.neighbourhood_ens(ivalues_ens, radius, gridpp.Mean)