minmax reports strange values, when all the array is masked (2.12) #20

jypeter · 2018-03-22T10:10:32Z

I have come across some slightly anomalous data (see CDAT/cdms#235 for details) that led to a completely masked array. But minmax reports very large values instead of masked:

>>> genutil.minmax(U_avg)
(1.7976931348623157e+308, -1.7976931348623157e+308)

2 interesting things to note here:

1.7976931348623157e+308 is the exact max value for IEEE real*8
the max reported above, -1.797... is negative and therefore lower than the min...

Should minmax report (np.ma.masked, np.ma.masked) instead?

Input data for the file

The following lines generated masked data

>>> import cdms2, MV2, cdutil, genutil, numpy as np
>>> f = cdms2.open('/home/scratch01/jypeter/time_counter_bounds_pb.nc')
>>> U = f('U')
>>> f.close()
>>> genutil.minmax(U)
(-31.189350128173828, 23.517915725708008)
>>> U.shape
(10, 1, 90, 180)
>>> 10*90*180
162000
>>> MV2.count(U)
162000
>>> U_avg = cdutil.averager(U, axis='t')
>>> U_avg.shape
(1, 90, 180)
>>> MV2.count(U_avg)
0
>>> genutil.minmax(U_avg)
(1.7976931348623157e+308, -1.7976931348623157e+308)
>>> U_avg.min()
masked
>>> U_avg.max()
masked

Note that I also get the same crazy big values when working with some dummy data:

>>> U.dtype
dtype('float32')
>>> U_avg.dtype
dtype('float64')
>>> dummy_ma = np.ma.zeros(U_avg.shape, U_avg.dtype)
>>> MV2.count(dummy_ma)
16200
>>> dummy_ma[...] = np.ma.masked
>>> MV2.count(dummy_ma)
0
>>> genutil.minmax(dummy_ma)
(1.7976931348623157e+308, -1.7976931348623157e+308)
>>> dummy_ma.min()
masked
>>> dummy_ma.max()
masked

Oh, I also get the same crazy big values when working with real*4 data:

>>> dummy_ma = np.ma.zeros(U_avg.shape, np.float32)
>>> dummy_ma[...] = np.ma.masked
>>> MV2.count(dummy_ma)
0
>>> dummy_ma.dtype
dtype('float32')
>>> genutil.minmax(dummy_ma)
(1.7976931348623157e+308, -1.7976931348623157e+308)
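For readers without the test file or a CDAT install, the NumPy side of the sessions above can be checked on its own. The following is a minimal sketch using only NumPy (it does not call genutil); it shows that the reported value matches the largest finite float64 even though the array is float32, and that the masked array's own min() and max() return the masked constant.

```python
import numpy as np

# Largest finite values for each floating-point type.
f64_max = np.finfo(np.float64).max
f32_max = np.finfo(np.float32).max
print(f64_max)   # 1.7976931348623157e+308
print(f32_max)   # 3.4028235e+38

# A fully masked float32 array, built the same way as dummy_ma above.
dummy = np.ma.zeros((1, 90, 180), np.float32)
dummy[...] = np.ma.masked

print(np.ma.count(dummy))            # 0
print(dummy.min() is np.ma.masked)   # True
print(dummy.max() is np.ma.masked)   # True
```

So the value printed by minmax is the float64 limit regardless of the input dtype, while NumPy itself reports masked for both extremes.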


durack1 · 2018-04-03T18:30:42Z

@dnadeau4 I am wondering whether this masking behaviour is related to the other issues we've been having with masks with regridding?

@gleckler1 @doutriaux1 @taylor13

github-actions · 2020-08-27T16:18:38Z

Marking issue as stale, since there has been no activity in 30 days.

Unless the issue is updated or the 'stale' tag is removed, this issue will be closed in 7 days.

durack1 · 2020-08-27T17:14:07Z

@jypeter is this still an issue with CDAT 8.2.1?

jypeter · 2020-08-28T12:47:59Z

@durack1 I have not installed 8.2.1 yet (and no time for that now, unfortunately). I have shared my time_counter_bounds_pb.nc test file again, with a valid link.

Can somebody give this a try?

The strange behavior is still here with the python 3 version of CDAT 8.1:

>>> import cdms2, MV2, cdutil, genutil, numpy as np
>>> f = cdms2.open('/home/scratch01/jypeter/time_counter_bounds_pb.nc')
>>> U = f('U')
>>> f.close()
>>> genutil.minmax(U)
(-31.189350128173828, 23.517915725708008)
>>> MV2.count(U)
162000
>>> U_avg = cdutil.averager(U, axis='t')
/home/share/unix_files/cdat/miniconda3/envs/cdatm_py3/lib/python3.6/site-packages/numpy/ma/core.py:3174: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  dout = self.data[indx]
>>> U_avg.shape
(1, 90, 180)
>>> MV2.count(U_avg)
0
>>> genutil.minmax(U_avg)
(1.7976931348623157e+308, -1.7976931348623157e+308)
>>> U_avg.min()
masked
>>> U_avg.max()
masked

jasonb5 · 2020-08-31T16:08:49Z

I've been able to replicate this with the latest version 8.2.1.

You are correct: the returned values in this case are the largest finite float64 value and its negative. This behavior can be traced back to these lines:

genutil/Lib/minmax.py, lines 34 to 35 in d0ed149:

    if count(d) == 0:
        return mx, mn

I don't think this is correct behavior, as I assume masked values don't exist. I believe the goal of this function is to accept any number or array of numbers and return a package-agnostic value for min and max, so returning np.ma.masked would not work in this case.

I think returning (None, None) would be more appropriate; alternatively we could return (inf, -inf).
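To make the (None, None) option concrete, here is a rough sketch of what it could look like. The function name minmax_or_none is hypothetical and this is not the genutil implementation; it assumes only NumPy and simply skips any input that has no unmasked values.

```python
import numpy as np


def minmax_or_none(*arrays):
    """Hypothetical helper, not part of genutil.

    Returns (min, max) as plain floats over all unmasked values,
    or (None, None) if no input contains any unmasked value.
    """
    mn, mx = None, None
    for a in arrays:
        a = np.ma.asarray(a)
        # Skip inputs with no valid (unmasked) data.
        if np.ma.count(a) == 0:
            continue
        lo = float(a.min())
        hi = float(a.max())
        mn = lo if mn is None else min(mn, lo)
        mx = hi if mx is None else max(mx, hi)
    return mn, mx


all_masked = np.ma.masked_all((2, 3))
partial = np.ma.array([5.0, -2.0, 9.0], mask=[False, False, True])

print(minmax_or_none(all_masked))           # (None, None)
print(minmax_or_none([1, 2, 3]))            # (1.0, 3.0)
print(minmax_or_none(partial, all_masked))  # (-2.0, 5.0)
```

Starting from None rather than from a huge sentinel value means a fully masked input can never leak the sentinel into the result, and the (min, max) order is the same in every code path.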

jypeter · 2020-08-31T16:26:06Z

Thanks for testing this!

You could also return (np.ma.masked, np.ma.masked). It is best to try to stay consistent with numpy. Could you check the return value of U_avg.asma().min() (and .max())?

taylor13 · 2020-09-01T15:14:43Z

It seems that the order of the returned values described in #20 (comment) is inconsistent with the order in #20 (comment) , which probably explains why in #20 (comment) the reported max is less than the reported min.

jypeter mentioned this issue Mar 22, 2018

cdutil.averager could do a sanity check of the delta of the dimension bounds CDAT/cdms#235

Open

github-actions bot added the stale label Aug 27, 2020

github-actions bot removed the stale label Aug 27, 2020

jasonb5 added the kind/bug Categorizes issue related to bug. label Sep 28, 2020
