Skip to content
Cristian Lussana edited this page Nov 21, 2024 · 12 revisions

[Under development]

Overview

The package can be loaded as follows (from the build/swig/R folder):

dyn.load("gridpp.so")
source("gridpp.R")
cacheMetaData(1)

NOTE: A SWIG version newer than 4.1.1 (tested with version 4.2.1) must be used to create the R bindings. Using older versions can cause issues with some gridpp features, such as creating structure functions. There is a bug in SWIG Versions 4.0.2 and 4.2.1 (we have not tested other versions), it creates a build/swig/R/gridpp.R that is not correct. A quick fix is:

cd build/swig/R
for i in {1..50}; do sed -i "s/all(sapply(argv\[\[$i\]\] , is.integer) || sapply(argv\[\[$i\]\], is.numeric))/all(sapply(argv\[\[$i\]\] , is.integer) | sapply(argv\[\[$i\]\], is.numeric))/g" gridpp.R; done

Functions can be called in the same way as in python. Objects can also be created in the same way. However, class member functions are called in a different way.

structure <- BarnesStructure(10000, 0)
p1 <- Point(0, 0)
p2 <- Point(0, 0.1)
correlation <- structure$corr(p1, p2)
print(correlation) 

Alternatively, for Cartesian coordinates:

structure <- BarnesStructure(10000, 0)
p1 <- Point(0, 0, 0, 0, "Cartesian")
p2 <- Point(30000, 0, 0, 0, "Cartesian")
correlation <- structure$corr(p1, p2)
print(correlation) 

Structure Functions Overview

There are five main analytical forms for defining structure functions in gridpp, all of which depend on the distance between two points ($d$) and a scaling length parameter ($L$): Barnes, SOAR, TOAR, Powerlaw, and Cressman.

  • Barnes (Barnes, 1973): This function is of the form $e^{-x^2}$, where $x = d / L$. It resembles a Gaussian function. For $L = 1$, a correlation value of 0.0013 is reached at $d = 3.65$.
  • SOAR (Second-order autoregressive function, Gasparri and Cohn, 1999): Defined as $(1 + x)⋅e^{-x}$, with $x = d / L$. For $L = 1$, a correlation value of 0.0013 is reached at $d = 8.54$.
  • TOAR (Third-order autoregressive function, Gasparri and Cohn, 1999): Has the form $(1 + x + 1/3 ⋅ x^2)⋅e^{-x}$, with $x = d / L$. For $L = 1$, a correlation value of 0.0013 is reached at $d = 9.49$.
  • Powerlaw (Gasparri and Cohn, 1999): This function is defined as $(1 + 1/2 ⋅ x^2)^{-1}$, with $x = d / L$. For $L = 1$, a correlation value of 0.0013 is reached at $d = 39.20$.
  • Cressman: The function takes the form $(L^2 - d^2) / (L^2 + d^2)$. For $L = 1$, a correlation value of 0.0013 is reached at $d = 1.002$.
Examples of structure correlation functions for $L=1$ and using a 1D distance between two points.

The sixth structure function in GridPP is a special linear function, designed for variables with values strictly between 0 and 1, inclusive. A common use case is for variables like land area fraction, which represents the fraction of land surrounding a point. Unlike typical correlation functions, this function returns a damping factor (a value smaller or equal to 1) rather than a correlation, allowing the correlation between two points to be adjusted based on differences in their land area fractions, for example. In this context, $d$ represents the difference in land area fraction between two points, and the function is defined as: $1 − (1−min)⋅∣d∣$ where $min$ is the minimum value of the damping coefficient. If two points have identical land area fractions, the damping coefficient is 1. If the difference is maximal (e.g., one point has a fraction of 0 and the other 1), the damping coefficient is set to $min$. This structure function allows users to reduce the correlation between points with differing land area fractions while maintaining some level of correlation, particularly if the points are spatially close.

The R-code used to obtain the figure above is reported in the following:

coord  <- seq(0,10,by=0.1)
ncoord <- length(coord)
structure_Barnes   <- BarnesStructure(1, 0, 0)  # Example: horizontal structure
structure_Soar     <- SoarStructure(1, 0, 0)  # Example: horizontal structure
structure_Toar     <- ToarStructure(1, 0, 0)  # Example: horizontal structure
structure_Powerlaw <- PowerlawStructure(1, 0, 0)  # Example: horizontal structure
structure_Cressman <- CressmanStructure(1, 0, 0)  # Example: horizontal structure
correlation_Barnes   <- vector( mode="numeric", length=ncoord)
correlation_Soar     <- vector( mode="numeric", length=ncoord)
correlation_Toar     <- vector( mode="numeric", length=ncoord)
correlation_Powerlaw <- vector( mode="numeric", length=ncoord)
correlation_Cressman <- vector( mode="numeric", length=ncoord)
p1 <- Point(0, 0, 0, 0, "Cartesian")
for (i in 1:ncoord) {
  x <- coord[i]
  p2 <- Point(x, 0, 0, 0, "Cartesian")
  correlation_Barnes[i]   <- structure_Barnes$corr(p1, p2)
  correlation_Soar[i]     <- structure_Soar$corr(p1, p2)
  correlation_Toar[i]     <- structure_Toar$corr(p1, p2)
  correlation_Powerlaw[i] <- structure_Powerlaw$corr(p1, p2)
  correlation_Cressman[i] <- structure_Cressman$corr(p1, p2)
}
png(file="correlations.png",width=800,height=600)
par(mar=c(5,5,1,1))
plot(coord,correlation_Barnes, ylim=c(0,1),xlim=c(0,10),col="white",cex.axis=2,xlab="Spatial 1D Coordinate", ylab="Correlation", cex.lab=2)
abline(v=0:100,h=seq(0,1,by=0.1),lty=2,col="gray")
abline(v=0,h=c(0,1),lty=1,col="darkgray")
lines(coord,correlation_Barnes,lwd=15,col="Blue")
lines(coord,correlation_Soar,lwd=12,col="Sienna")
lines(coord,correlation_Toar,lwd=10,col="Tan")
lines(coord,correlation_Powerlaw,lwd=10,col="Plum2")
lines(coord,correlation_Cressman,lwd=10,col="Chartreuse4")
legend(x="topright",col=c("Blue","Sienna","Tan","Plum2","Chartreuse4"),lwd=15,legend=c("Barnes","SOAR","TOAR","Powerlaw","Cressman"),cex=2)
dev.off()

Setup for OI examples

To prepare for the following examples, run the code below to set up the necessary variables. This code generates a grid with Easting and Northing coordinates, where elevation increases with the Easting coordinate. The grid is split into land and sea: points where $x&gt;y$ are classified as land, and all others as sea. Additionally, 30 observations are generated at random locations on the grid. The background values at grid points are set to 0, and the observations are all set to 1. In this setup, the analysis corresponds to the Integral Data Influence (IDI, Uboldi et al. 2008).

dyn.load("/home/cristianl/projects/gridpp/build/swig/R/gridpp.so")
source("/home/cristianl/projects/gridpp/build/swig/R/gridpp.R")
cacheMetaData(1)
# set grid parameters
min_x <- 1
max_x <- 10
min_y <- 1
max_y <- 10
min_elev <- 0
max_elev <- 20
by_x <- .1
by_y <- .1
# elevation is assumed to be a function of x coord
m <- (max_elev - min_elev) / (max_x - min_x)
q <- min_elev - m * min_x
# generate the regular 2D grid
grid_x_coord <- seq( min_x, max_x, by=by_x)
grid_y_coord <- seq( min_y, max_y, by=by_y)
ngrid_x <- length(grid_x_coord)
ngrid_y <- length(grid_y_coord)
grid_points <- expand.grid( grid_x_coord, grid_y_coord)
grid_x <- as.numeric(grid_points[,1])
grid_y <- as.numeric(grid_points[,2])
ngrid <- length( grid_x)
grid_z <- m * grid_x + q
grid_laf <- rep(0, ngrid)
grid_laf[which(grid_x>grid_y)] <- 1
pgrid  <- Points( grid_x, grid_y, grid_z, grid_laf, "Cartesian")
# generate observations
nobs <- 30
set.seed(1)
obs_x <- runif( nobs, min=min_x, max=max_y)
set.seed(2)
obs_y <- runif( nobs, min=min_y, max=max_y)
set.seed(3)
obs_z <- m * obs_x + q + runif( nobs, min=-3, max=3)
obs_laf <- rep(0, nobs)
obs_laf[which(obs_x>obs_y)] <- 1
points <- Points( obs_x, obs_y, obs_z, obs_laf, "Cartesian")
# set background to 0 everywhere
background <- rep(0, ngrid)
pbackground <- rep(0, nobs)
# set all observations to 1 
obs <- rep(1, nobs)
# OI parameters
ratios <- rep( 0.1, nobs)
max_points <- 5

OI example: Combining Multiple Structure Functions

In this example, we illustrate how to use the optimal_interpolation() function and define a multiple structure function that allows the user to combine three different structure functions into a single one.

To define a multiple structure function, you can use one of the following approaches. Typically, a Barnes structure function is a good choice for describing correlations based on vertical distances (elevation differences) between points. The linear structure function can be used to adjust correlations between points in different surroundings, such as between inland and island locations. Finally, different structure functions (e.g., Barnes, TOAR, Cressman) can be applied to describe correlations based on the radial or horizontal distance between two points.

# Create multiple structure functions
multiple_structure_Barnes <- MultipleStructure( BarnesStructure(0.2, 0, 0), BarnesStructure(0, 1, 0), LinearStructure(0, 0, 0.5))
multiple_structure_Soar <- MultipleStructure( SoarStructure(0.2, 0, 0), BarnesStructure(0, 1, 0), LinearStructure(0, 0, 0.5))
multiple_structure_Toar <- MultipleStructure( ToarStructure(0.2, 0, 0), BarnesStructure(0, 1, 0), LinearStructure(0, 0, 0.5))
multiple_structure_Powerlaw <- MultipleStructure( PowerlawStructure(0.2, 0, 0), BarnesStructure(0, 1, 0), LinearStructure(0, 0, 0.5))
multiple_structure_Cressman <- MultipleStructure( CressmanStructure(0.2, 0, 0), BarnesStructure(0, 1, 0), LinearStructure(0, 0, 0.5))

Gridpp allows you to perform OI by calling the optimal_interpolation() function.

# analysis, in this case is equal to the Integral Data Influence IDI 
analysis_Barnes <- optimal_interpolation(pgrid, background, points, obs, ratios, pbackground, multiple_structure_Barnes, max_points)
analysis_Soar <- optimal_interpolation(pgrid, background, points, obs, ratios, pbackground, multiple_structure_Soar, max_points)
analysis_Toar <- optimal_interpolation(pgrid, background, points, obs, ratios, pbackground, multiple_structure_Toar, max_points)
analysis_Powerlaw <- optimal_interpolation(pgrid, background, points, obs, ratios, pbackground, multiple_structure_Powerlaw, max_points)
analysis_Cressman <- optimal_interpolation(pgrid, background, points, obs, ratios, pbackground, multiple_structure_Cressman, max_points)

Once defined, you can plot the analysis fields obtained from the interpolation. In the analysis, values will be close to 1 near observation locations (indicated by warm colors in the figures) and approach 0 further from the observations (shown by yellow-ish colors, with beige indicating values smaller than 0.1).

The analysis fields effectively show the correlation between grid points near observations and their surrounding grid points. The spatial patterns of different horizontal correlations are clearly marked in the figures. Given the same value of $L$, Cressman provides the smallest neighborhoods influenced by observations, while TOAR or Powerlaw extend the influence over larger areas. The right panels of the figures show how elevation differences affect the correlations. The dashed line in the left panels marks the sharp boundary between land and sea, and you can see how correlations diminish after crossing this boundary.

idi_Barnes <- array(data=analysis_Barnes, dim=c(ngrid_x,ngrid_y))
idi_Soar <- array(data=analysis_Soar, dim=c(ngrid_x,ngrid_y))
idi_Toar <- array(data=analysis_Toar, dim=c(ngrid_x,ngrid_y))
idi_Powerlaw <- array(data=analysis_Powerlaw, dim=c(ngrid_x,ngrid_y))
idi_Cressman <- array(data=analysis_Cressman, dim=c(ngrid_x,ngrid_y))
# Figures
breaks <- c( -1, seq( 0,1, by=0.1), 2)
col <- c( "beige", rev( heat.colors(length(breaks)-2)))
for (i in 1:5) {
  if (i == 1) { str <- "Barnes"; idi <- idi_Barnes; analysis <- analysis_Barnes } else 
  if (i == 2) { str <- "SOAR"; idi <- idi_Soar; analysis <- analysis_Soar } else 
  if (i == 3) { str <- "TOAR"; idi <- idi_Toar; analysis <- analysis_Toar } else 
  if (i == 4) { str <- "Powerlaw"; idi <- idi_Powerlaw; analysis <- analysis_Powerlaw } else 
  if (i == 5) { str <- "Cressman"; idi <- idi_Cressman; analysis <- analysis_Cressman }
  png( file=paste0("fa_",str,".png"), width=800, height=800)
  par( mar = c( 5, 5, 1, 1))
  image( z=idi, x=grid_x_coord, y=grid_y_coord, breaks=breaks, col=col,
         xlab="Easting Coordinate", ylab="Northing Coordinate", cex.lab=2, cex.axis=2)
  points( obs_x, obs_y, pch=21, bg="cornflowerblue", cex=2)
  lines(-1000:1000,-1000:1000,lty=2)
  text(x=8.8,y=8,labels="Land",cex=2.5)
  text(x=8,y=8.7,labels="Sea",cex=2.5)
  dev.off()
  #
  png( file=paste0("fb_",str,".png"), width=800, height=800)
  par( mar = c( 5, 5, 1, 1))
  plot( grid_x, grid_z,
        xlab="Easting Coordinate", ylab="Elevation", cex.lab=2, cex.axis=2)
  for (i in 1:length(col)) {
    if (length(ix <- which(analysis >= breaks[i] & analysis < breaks[i+1]))>0) {
      points( grid_x[ix], grid_z[ix], col=col[i], bg=col[i], pch=21, cex=4)
    }
  }
  text(x=3,y=18,labels=str,cex=4.5)
  points(obs_x,obs_z,pch=21, bg="cornflowerblue", cex=3)
  dev.off()
  system( paste0("convert +append fa_",str,".png fb_",str,".png fig_",str,".png"))
  system( paste0("rm fa_",str,".png fb_",str,".png"))
}
Example of OI using multiple structure functions: Barnes as the radial structure function (to handle spatial correlations); Barnes as the vertical structure function (to handle vertical correlations); Linear damping to adjust the correlation based on the land area fraction. Example of OI using multiple structure functions: SOAR as the radial structure function (to handle spatial correlations); Barnes as the vertical structure function (to handle vertical correlations); Linear damping to adjust the correlation based on the land area fraction. Example of OI using multiple structure functions: TOAR as the radial structure function (to handle spatial correlations); Barnes as the vertical structure function (to handle vertical correlations); Linear damping to adjust the correlation based on the land area fraction. Example of OI using multiple structure functions: Powerlaw as the radial structure function (to handle spatial correlations); Barnes as the vertical structure function (to handle vertical correlations); Linear damping to adjust the correlation based on the land area fraction. Example of OI using multiple structure functions: Powerlaw as the radial structure function (to handle spatial correlations); Barnes as the vertical structure function (to handle vertical correlations); Linear damping to adjust the correlation based on the land area fraction.

EnSI example: Ensemble-based Statistical Interpolation

Setup for EnSI examples

To prepare for the following examples, run the code below to set up the necessary variables. This code generates a grid with Easting and Northing coordinates, where elevation increases with the Easting coordinate. The grid is split into land and sea: points where $x&lt;2$ are classified as sea, and all others as land. Sea points have elevation equal to 0 m. A true field for the variable used (i.e. truth) is generated on the regular grid. This truth will serve as the basis for generating observed and background values and is modeled as a multi-variate normal 2D field, as described by Wilks (2019), cap 12.4.1 "Simulating independent MVN Variates" (Eq.(12.19)). Note that we are using two different de-correlation length for Easting and Northing distances. Thirty observations are generated at random locations on the grid, with their values extracted from the true field at those locations. The observation errors are spatially independent, following a normal distribution with mean 0 and standard deviation 0.1 units. The ensemble background consists of 30 members. We assume the background has a location-dependent bias, modeled as a linear function of the Easting coordinate. Additionally, the background members include spatially coherent noise and differ from the truth by random values drawn from a normal distribution with mean 0 and standard deviation 2 units.

# functions
vec2listXY <- function(vec, nx, ny) {
    ar <- array( data=vec, dim=c(nx,ny))
    q <- lapply(seq(dim(ar)[2]), function(x) ar[ , x])
    return(q)
}
# set fixed parameters
min_x <- 1
max_x <- 10
min_y <- 1
max_y <- 10
min_elev <- 0
max_elev <- 20
by_x <- .25
by_y <- .25
land_sea_xlim <- 2
nens <- 30
nobs <- 30
#==============================================================================
# generate the regular 2D grid
grid_x_coord <- seq( min_x, max_x, by=by_x)
grid_y_coord <- seq( min_y, max_y, by=by_y)
ngrid_x <- length(grid_x_coord)
ngrid_y <- length(grid_y_coord)
grid_points <- expand.grid( grid_x_coord, grid_y_coord)
grid_x <- as.numeric(grid_points[,1])
grid_y <- as.numeric(grid_points[,2])
ngrid <- length( grid_x)
# elevation is assumed to be a linear function of x coord
m <- max_elev / (max_x - land_sea_xlim)
q <- -m * land_sea_xlim
grid_z <- m * grid_x + q
grid_laf <- rep(1, ngrid)
grid_laf[which(grid_x<land_sea_xlim)] <- 0
grid_z[which(grid_laf==0)] <- 0
pgrid  <- Points( grid_x, grid_y, grid_z, grid_laf, "Cartesian")
bgrid  <- Grid( vec2listXY(grid_x, ngrid_x, ngrid_y), 
                vec2listXY(grid_y, ngrid_x, ngrid_y), 
                vec2listXY(grid_z, ngrid_x, ngrid_y), 
                vec2listXY(grid_laf, ngrid_x, ngrid_y), "Cartesian")
#==============================================================================
# generate observation locations
set.seed(1)
obs_x <- runif( nobs, min=min_x, max=max_y)
set.seed(2)
obs_y <- runif( nobs, min=min_y, max=max_y)
set.seed(3)
obs_z <- m * obs_x + q + runif( nobs, min=-3, max=3)
obs_laf <- rep(1, nobs)
obs_laf[which(obs_x<land_sea_xlim)] <- 0
obs_z[which(obs_laf==0)] <- 0
obs_z[which(obs_z<0)] <- 0
points <- Points( obs_x, obs_y, obs_z, obs_laf, "Cartesian")
#==============================================================================
# generate truth
# see Wilks (2019) cap 12.4.1 "Simulating independen MVN Variates" (Eq.(12.19))
dhx_truth <- 4
dhy_truth <- 2
var_truth <- 1
mu <- rep(0, ngrid) 
distx2 <- outer( grid_x, grid_x, FUN="-")**2
disty2 <- outer( grid_y, grid_y, FUN="-")**2
CovMat_truth <- var_truth * exp(-0.5 * distx2 / dhx_truth**2 -0.5 * disty2 / dhy_truth**2)
CovMat_eig_truth <- eigen(CovMat_truth,symmetric=T)
if (any(CovMat_eig_truth$values<0)) CovMat_eig_truth$values[CovMat_eig_truth$values<0]<-0
CovMat_sqrt_truth <- tcrossprod( tcrossprod( CovMat_eig_truth$vectors, diag( sqrt(CovMat_eig_truth$values))), CovMat_eig_truth$vectors)
set.seed(0)
truth <- mu + as.vector( crossprod(CovMat_sqrt_truth, rnorm(ngrid))) - 0.5 * grid_z 
#==============================================================================
# generate background ensemble
bkg_bias_at_min_x <- 5
bkg_bias_at_max_x <- -2
# background bias is assumed to be a linear function of x coord
m_bkg_bias <- (bkg_bias_at_max_x - bkg_bias_at_min_x) / (max_x - min_x)
q_bkg_bias <- bkg_bias_at_min_x - m_bkg_bias * min_x
# add a MVN noise to the background
dhx_ens <- 0.5
dhy_ens <- 0.25
var_ens <- 0.1
CovMat_ens <- var_ens * exp(-0.5 * distx2 / dhx_ens**2 -0.5 * disty2 / dhy_ens**2)
CovMat_eig_ens <- eigen(CovMat_ens,symmetric=T)
if (any(CovMat_eig_ens$values<0)) CovMat_eig_ens$values[CovMat_eig_ens$values<0]<-0
CovMat_sqrt_ens <- tcrossprod( tcrossprod( CovMat_eig_ens$vectors, diag( sqrt(CovMat_eig_ens$values))), CovMat_eig_ens$vectors)
background_ens <- array(data=NA, dim=c(ngrid, nens))
pbackground_ens <- array(data=NA, dim=c(nobs, nens))
background_bias <- m_bkg_bias * grid_x + q_bkg_bias
# background = truth + location dependent bias + bias + spatially coherent noise
for (e in 1:nens) {
  set.seed(e)
  background_ens[,e] <- background_bias + truth + rnorm( 1, mean=0, sd=2) + as.vector( crossprod(CovMat_sqrt_ens, rnorm(ngrid))) 
  pbackground_ens[,e] <- bilinear(bgrid, points, vec2listXY(background_ens[,e], ngrid_x, ngrid_y))
}
background_std <- mean( apply( background_ens, FUN=function(x){sd(x)}, MAR=1), na.rm=T)
pbackground_std <- mean( apply( pbackground_ens, FUN=function(x){sd(x)}, MAR=1), na.rm=T)
#==============================================================================
# generate (perturbed) observations
var_obs <- 0.1
obs <- array(data=NA, dim=c(nobs, nens))
obs_aux <- bilinear(bgrid, points, vec2listXY(truth, ngrid_x, ngrid_y))
set.seed(nens+1)
for (e in 1:nens) { set.seed(nens+e); obs[,e] <- obs_aux + rnorm(nobs, mean=0, sd=sqrt(var_obs))}
obs_mean <- rowMeans(obs)
obs1 <- obs
for (e in 2:nens) obs1[,e] <- obs[,1]

The following figures display the truth and the background ensemble mean field. The left panels show the fields on the regular 2D grid, while the right panels depict the vertical profile. Dots indicate observation locations. The colors represent the values of the variable. The x-axis in the right panels also serves as a legend.

Truth Background ensemble mean

EnSI examples

Define the multiple structure function as it follows:

dh_multi <- 4
dz_multi <- 3
lafmin_multi <- 0.8
multiple_structure_Soar <- MultipleStructure( SoarStructure(dh_multi, 0, 0), BarnesStructure(0, dz_multi, 0), LinearStructure(0, 0, lafmin_multi))

Run EnSI using the function optimal_interpolation_ensi(). The spatial analysis method is described by Lussana et al. (2019, DOI: 10.1002/qj.3646), and it combines the background ensemble with the observations, utilizing covariances derived from the background ensemble. The user must specify the standard deviation of the observation error (obs_standard_deviations), which is a vector matching the number of observations:

vec2listPE <- function(vec, np, ne) {
    ar <- array( data=vec, dim=c(np,ne))
    q <- lapply( seq(dim(ar)[1]), function(x) ar[x,])
    return(q)
}
max_points <- 30
obs_standard_deviations <- rep( 0.1, nobs)
res <- optimal_interpolation_ensi(pgrid, vec2listPE(background_ens, ngrid, nens), points, as.vector(obs[,1]), obs_standard_deviations, vec2listPE(pbackground_ens, nobs, nens), multiple_structure_Soar, max_points)
ensi <- matrix( unlist(res), ncol=nens, byrow=T)

The next examples will focus on the EnSI-multi method. Methodologically, the main difference between EnSI and EnSI-multi is that EnSI calculates the analysis using the ensemble background covariance, while EnSI-multi relies on the ensemble background correlations. There are additional differences as well. EnSI-multi can be applied iteratively, where the output of one iteration becomes the input for the next, in a more effective way that EnSI. EnSI-multi is designed to adjust the background of a variable at a specific observation time by considering the background from different variables or observation times. To achieve this, EnSI-multi uses up to four types of background information: 1) background, the background field to be updated; 2) pbackground, the background values at observation locations, containing the new information for updating background (must correspond to the observations in units and time); 3) background_corr, the background ensemble at grid points, used to compute correlations between background errors at grid points and pbackground errors at observation points; 4) pbackground_corr, the background ensemble at observation points, used in conjunction with background_corr to compute the correlations. Note that background and pbackground can have different units or belong to different times. Therefore, background_corr and pbackground_corr are used to compute their respective correlations on the fly. An additional feature of EnSI-multi, not available in EnSI, is the use of perturbed observations. This allows the analysis to incorporate observation uncertainty through user-specified perturbations of the observed values.

Run EnSI using the function optimal_interpolation_ensi_multi_ebesc(). This method is nearly equivalent to applying OI independently to each ensemble member, with the key difference being that it supports the use of perturbed observations. It is part of the EnSI-multi method because it enables the use of hybrid correlation matrices (partially flow-dependent and partially static) in a straightforward way:

bratios <- rep( 1, ngrid)
pratios <- rep( 0.1, nobs)
max_points <- 30
res <- optimal_interpolation_ensi_multi_ebesc(pgrid, bratios, vec2listPE(background_ens, ngrid, nens), points, vec2listPE(obs, nobs, nens), pratios, vec2listPE(pbackground_ens, nobs, nens), multiple_structure_Soar, max_points)
ensi_multista <- matrix( unlist(res), ncol=nens, byrow=T)

Notes on: 1) bratios, vector representing the ratio of background error standard deviation at grid points (background) to that at observation points (pbackground). This vector contains coefficients (0-1) that adjust the analysis at grid points, accounting for differences in units and variability between the innovations (observation minus background) and the background at grid points. For example, if trusting the ensemble spread, bratios can be set as the ratio between the ensemble spread of the background to be updated and that used to compute the innovations. If the ensemble spread is not trusted at specific times or grid points, the bratios can be based on a typical expected ratio of spreads from multiple ensemble background realizations.

Run EnSI-multi using the function optimal_interpolation_ensi_multi_ebe(). In this case, the correlations used for OI are derived from the background ensemble, as described earlier. In this example, EnSI-multi is not being applied in a loop:

bratios <- rep( 1, ngrid)
pratios <- rep( 0.1, nobs)
max_points <- 30
res <- optimal_interpolation_ensi_multi_ebe(pgrid, bratios, vec2listPE(background_ens, ngrid, nens), vec2listPE(background_ens, ngrid, nens), points, vec2listPE(obs, nobs, nens), pratios, vec2listPE(pbackground_ens, nobs, nens), vec2listPE(pbackground_ens, nobs, nens), multiple_structure_Soar, max_points)
ensi_multidyn <- matrix( unlist(res), ncol=nens, byrow=T)

Run EnSI-multi by combining optimal_interpolation_ensi_multi_ebe() and optimal_interpolation_ensi_multi_ebesc() to apply hybrid correlations. This approach can be advantageous when the ensemble background is overconfident, causing the analysis to stay too close to the background, while observations in the same area may suggest that the truth lies outside the ensemble envelope. Note that in this case, we increase pratios:

bratios <- rep( 1, ngrid)
pratios <- rep( 1, nobs)
pensi_multidyn <- array(data=NA, dim=c(nobs, nens))
res <- optimal_interpolation_ensi_multi_ebe(pgrid, bratios, vec2listPE(background_ens, ngrid, nens), vec2listPE(background_ens, ngrid, nens), points, vec2listPE(obs, nobs, nens), pratios, vec2listPE(pbackground_ens, nobs, nens), vec2listPE(pbackground_ens, nobs, nens), multiple_structure_Soar, max_points)
ensi_multidyn <- matrix( unlist(res), ncol=nens, byrow=T)
for (e in 1:nens) 
  pensi_multidyn[,e] <- bilinear(bgrid, points, vec2listXY(ensi_multidyn[,e], ngrid_x, ngrid_y))
res <- optimal_interpolation_ensi_multi_ebesc(pgrid, bratios, vec2listPE(ensi_multidyn, ngrid, nens), points, vec2listPE(obs, nobs, nens), pratios, vec2listPE(pensi_multidyn, nobs, nens), multiple_structure_Soar, max_points)
ensi_multihyb <- matrix( unlist(res), ncol=nens, byrow=T)

The following figures display the fields as indicated. The left panels show the fields on the regular 2D grid, while the right panels depict the vertical profile. Dots indicate observation locations. The colors represent the values of the variable. The x-axis in the right panels also serves as a legend.

Truth Background ensemble mean EnSI ensemble mean EnSI-multi static ensemble mean EnSI-multi dyn ensemble mean EnSI-multi hyb ensemble mean

The following figures are similar to the figure above but represent an individual ensemble member:

Truth Bkg member 1 Ensi member 1 Ensi-multi stat member 1 Ensi-multi dyn member 1 Ensi-multi hyb member 1

Code used for plotting

plot_field <- function( ffout, x, col1=NA, breaks1=NA) {
  if (any(is.na(breaks1))) { col1 <- col; breaks1 <- breaks}
  png( file=ffout, width=800, height=800)
  par(mar=c(4,5,1,1))
  image( x=grid_x_coord, y=grid_y_coord, 
         z=array( data=x, dim=c(ngrid_x,ngrid_y)), 
         breaks=breaks1, col=col1,
         xlab="Easting Coordinate (elevation increases from West to East)", ylab="Northing Coordinate", cex.lab=2, cex.axis=2)
  for (i in 1:length(col1)) {
    if (length(ix <- which(obs[,1] >= breaks1[i] & obs[,1] < breaks1[i+1]))>0) {
      points( obs_x[ix], obs_y[ix], bg=col1[i], pch=21, cex=3, lwd=3)
    }
  }
  abline(v=land_sea_xlim, lty=2, lwd=3)
  mtext( side=1, line=-2, at=1.5, text="Sea", cex=3)
  mtext( side=1, line=-2, at=2.6, text="Land", cex=3)
  devnull <- dev.off()
  print(ffout)
}

plot_vertical <- function( ffout, x, str=NA, str1=NA) {
  png( file=ffout, width=800, height=800)
  par( mar = c( 5, 5, 1, 1))
  plot( x, grid_z, xlim=xlim, ylim=ylim,
        xlab="Variable values", ylab="Elevation", col="white", cex.lab=2, cex.axis=2)
  for (i in 1:length(col)) {
    if (length(ix <- which(x >= breaks[i] & x < breaks[i+1]))>0) {
      points( x[ix], grid_z[ix], col="lightgray", bg=col[i], pch=22, cex=3)
    }
  }
  for (i in 1:length(col)) {
    if (length(ix <- which(obs[,1] >= breaks[i] & obs[,1] < breaks[i+1]))>0) {
      points( obs[ix,1], obs_z[ix], bg=col[i], pch=21, col="black", lwd=3, cex=4)
    }
  }
  if (!is.na(str)) {
    text(x=-6, y=20, str, cex=4, adj = c(0,0))
  }
  if (!is.na(str1)) {
    text(x=-6, y=18.5, str1, cex=3, adj = c(0,0))
  }
  devnull <- dev.off()
  print(ffout)
}
bkg_mean <- rowMeans(background_ens)
ensi_mean <- rowMeans(ensi)
ensi_multista_mean <- rowMeans(ensi_multista)
ensi_multidyn_mean <- rowMeans(ensi_multidyn)
ensi_multihyb_mean <- rowMeans(ensi_multihyb)

all <- c( truth, bkg_mean, ensi_mean, ensi_multista_mean, ensi_multidyn_mean, ensi_multihyb_mean)
xlim <- range( all, na.rm=T)
ylim <- range( c( grid_z, obs_z), na.rm=T)

breaks <- seq( floor( min( all, na.rm=T)), ceiling( max( all, na.rm=T)), by=2 )
breaks[1] <- (-100); breaks[length(breaks)] <- 100
col <- rev(rainbow(length(breaks)-1))

fftruth <- paste0("map_01_truth.png")
plot_field( fftruth, truth)
fftruthv <- paste0("vert_01_truth.png")
plot_vertical( fftruthv, truth, str="Truth")

ffbkg <- paste0("map_02_bkg.png")
plot_field( ffbkg, bkg_mean)
ffbkgv <- paste0("vert_02_bkg.png")
plot_vertical( ffbkgv, bkg_mean, str="Background", str1="ensemble mean")

ffensi <- paste0("map_03_ensi.png")
plot_field( ffensi, ensi_mean)
ffensiv <- paste0("vert_03_ensi.png")
plot_vertical( ffensiv, ensi_mean, str="EnSI", str1="ensemble mean")

ffensi_multista <- paste0("map_04_ensi_multista.png")
plot_field( ffensi_multista, ensi_multista_mean)
ffensi_multistav <- paste0("vert_04_ensi_multista.png")
plot_vertical( ffensi_multistav, ensi_multista_mean, str="EnSI multi Static", str1="ensemble mean")

ffensi_multidyn <- paste0("map_05_ensi_multidyn.png")
plot_field( ffensi_multidyn, ensi_multidyn_mean)
ffensi_multidynv <- paste0("vert_05_ensi_multidyn.png")
plot_vertical( ffensi_multidynv, ensi_multidyn_mean, str="EnSI multi Dynamic", str1="ensemble mean")

ffensi_multihyb <- paste0("map_06_ensi_multihyb.png")
plot_field( ffensi_multihyb, ensi_multihyb_mean)
ffensi_multihybv <- paste0("vert_06_ensi_multihyb.png")
plot_vertical( ffensi_multihybv, ensi_multihyb_mean, str="EnSI multi Hybrid", str1="ensemble mean")

system( paste0("convert +append map_01_truth.png vert_01_truth.png combo_01_truth.png"))
system( paste0("convert +append map_02_bkg.png vert_02_bkg.png combo_02_bkg.png"))
system( paste0("convert +append map_03_ensi.png vert_03_ensi.png combo_03_ensi.png"))
system( paste0("convert +append map_04_ensi_multista.png vert_04_ensi_multista.png combo_04_ensi_multista.png"))
system( paste0("convert +append map_05_ensi_multidyn.png vert_05_ensi_multidyn.png combo_05_ensi_multidyn.png"))
system( paste0("convert +append map_06_ensi_multihyb.png vert_06_ensi_multihyb.png combo_06_ensi_multihyb.png"))
system( paste0("rm map_0*.png vert_0*.png"))

From the previous Wiki

Required setup for examples

To get ready for the examples in the next sections, run the code below to set up necessary variables (you will also need the test datasets). This retrieves air temperature and precipitation from the observation, analysis, and forecast files as well as metadata about the grids.

dyn.load("R/gridpp.so")
source("R/gridpp.R")
cacheMetaData(1)

require(ncdf4)

array2list = function(ar) {
    q = lapply(seq(dim(ar)[2]), function(x) ar[ , x])
    return(q)
}

nc = nc_open("analysis.nc")
ilats = ncvar_get(nc, 'latitude')
ilons = ncvar_get(nc, 'longitude')
ielevs = ncvar_get(nc, 'surface_geopotential')
igrid = Grid(array2list(ilats), array2list(ilons), array2list(ielevs))
temp_analysis = ncvar_get(nc, 'air_temperature_2m')
nc_close(nc)

nc = nc_open("output.nc")
olats = ncvar_get(nc, 'latitude')
olons = ncvar_get(nc, 'longitude')
oelevs = ncvar_get(nc, 'altitude')
ogrid = Grid(array2list(olats), array2list(olons), array2list(oelevs))
nc_close(nc)

nc = nc_open("obs.nc")
plats = ncvar_get(nc, 'latitude')
plons = ncvar_get(nc, 'longitude')
pelevs = ncvar_get(nc, 'altitude')
points = Points(plats, plons, pelevs)
temp_obs = ncvar_get(nc, 'air_temperature_2m')
precip_obs = ncvar_get(nc, 'precipitation_amount')
nc_close(nc)

Optimal interpolation example

We have verified that the following example works with swig version 4.2.1 while it does not work with version 4.0.2 (our best guess: there is a problem with the way cpp shared pointers are handled in 4.0.2, which has been fixed after version 4.1.1).

structure = BarnesStructure(10000, 0)
pbackground = nearest(igrid, points, array2list(temp_analysis[,,1]))
ratios = rep(0.1 , length(pbackground))
max_points = 5
ovalues = optimal_interpolation(igrid, array2list(temp_analysis[,,1]), points, temp_obs, ratios, pbackground, structure, max_points)
Clone this wiki locally