Hello,
I have two xarray.DataArray objects and am interested in the overall correlation between them. Both are 2D arrays of shape 6001 x 6001 that align spatially, share the same CRS, and contain NaNs, i.e.:
arr1:
arr2:
When quantifying their correlation, I get very different results depending on whether I set the dim parameter in xr.corr().
Without setting dim, i.e., xr.corr(arr1, arr2) I receive a correlation of array(0.08744669).
When setting dim, i.e., xr.corr(arr1, arr2, dim = ['x', 'y']) I receive a correlation of array(0.4891213, dtype=float32).
I checked the result against pearsonr from scipy.stats:
import numpy as np
import xarray as xr
from scipy.stats import pearsonr
# If null in either array, set to null in both (compute the shared mask once)
either_null = arr1.isnull() | arr2.isnull()
arr1 = xr.where(either_null, np.nan, arr1)
arr2 = xr.where(either_null, np.nan, arr2)
# Flatten
arr1 = arr1.to_numpy().flatten()
arr2 = arr2.to_numpy().flatten()
# Drop null
arr1 = arr1[~np.isnan(arr1)]
arr2 = arr2[~np.isnan(arr2)]
# Calculate Pearson r
pearsonr(arr1, arr2)  # PearsonRResult(statistic=np.float64(0.4891210562797157), pvalue=np.float32(0.0))
This indicates that setting dim explicitly gives the correct result (it matches the pearsonr result to within rounding error).
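The same three-way comparison can be sketched on small synthetic data, which may help separate NaN handling from anything specific to the real arrays. This is only a rough sketch under stated assumptions: the data are random float64 values, the dimension names `y` and `x` are chosen to match the call above, and `make_pair` and `masked_pearson` are hypothetical helpers written for this example, not part of xarray or SciPy.

```python
import numpy as np
import xarray as xr


def make_pair(n=50, seed=0):
    """Hypothetical helper: two correlated 2D float64 DataArrays with NaNs in different cells."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n, n))
    b = 0.5 * a + rng.normal(size=(n, n))
    # Knock out roughly 10% of the cells in each array, independently.
    a[rng.random((n, n)) < 0.1] = np.nan
    b[rng.random((n, n)) < 0.1] = np.nan
    dims = ("y", "x")
    return xr.DataArray(a, dims=dims), xr.DataArray(b, dims=dims)


def masked_pearson(a, b):
    """Hypothetical helper: Pearson r over cells that are valid in both NumPy arrays."""
    valid = np.isfinite(a) & np.isfinite(b)
    return np.corrcoef(a[valid], b[valid])[0, 1]


arr1, arr2 = make_pair()

r_default = float(xr.corr(arr1, arr2))                  # dim not set
r_dims = float(xr.corr(arr1, arr2, dim=["x", "y"]))     # dim set explicitly
r_numpy = float(masked_pearson(arr1.to_numpy(), arr2.to_numpy()))

print(r_default, r_dims, r_numpy)
```

If the three values agree on data like this but diverge on the real arrays, the difference would more likely come from a property of those arrays (for instance their dtype or size) than from how NaNs are masked. That is a guess to test, not something established here.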
I am not sure whether this is a bug in xr.corr or a misunderstanding of the function on my part. In limited testing, I also found differences (although much smaller) when I subset to tiny portions of each array, including portions without any NaNs. Any comments appreciated!