Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: add_missing_bounds not working when time type is numpy.timedelta64 #659

Closed
lee1043 opened this issue Jun 6, 2024 · 4 comments · Fixed by #660
Closed

[Bug]: add_missing_bounds not working when time type is numpy.timedelta64 #659

lee1043 opened this issue Jun 6, 2024 · 4 comments · Fixed by #660
Assignees
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@lee1043
Copy link
Collaborator

lee1043 commented Jun 6, 2024

What happened?

xarray's tutorial sample data for air temperature has its time axis typed numpy.timedelta64 while numpy.timedelta is expected from xcdat

What did you expect to happen? Are there are possible answers you came across?

The issue can be addressed by converting numpy.timedelta64 to numpy.timedelta if needed when checking the difference between the interval of the second and first time steps.

Minimal Complete Verifiable Example (MVCE)

import xcdat as xc
import xarray as xr

# Load xarray tutorial sample data
ds = xr.tutorial.load_dataset("air_temperature")
data_var = "air"

# Add axis attributes to lat / lon axes, which was missing in the sample data
new_y = ds.lat.assign_attrs(axis='Y')
new_x = ds.lon.assign_attrs(axis='X')
ds = ds.assign_coords(lat=new_y, lon=new_x)

# Add bounds
ds = ds.bounds.add_missing_bounds(["X", "Y", "T"])

Relevant log output

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 ds = ds.bounds.add_missing_bounds(["X", "Y", "T"])

File ~/mambaforge/envs/xcdat_dev_20240401/lib/python3.11/site-packages/xcdat/bounds.py:189, in BoundsAccessor.add_missing_bounds(self, axes)
    187         bounds = create_bounds(axis, coord)
    188     elif axis == "T":
--> 189         bounds = self._create_time_bounds(coord)
    190 except (ValueError, TypeError) as e:
    191     logger.warning(e)

File ~/mambaforge/envs/xcdat_dev_20240401/lib/python3.11/site-packages/xcdat/bounds.py:607, in BoundsAccessor._create_time_bounds(self, time, freq, daily_subfreq, end_of_month)
    605 if daily_subfreq is None:
    606     diff = time.values[1] - time.values[0]
--> 607     hrs = diff.seconds / 3600
    608     daily_subfreq = int(24 / hrs)  # type: ignore
    610 time_bnds = self._create_daily_time_bounds(timesteps, obj_type, freq=daily_subfreq)  # type: ignore

AttributeError: 'numpy.timedelta64' object has no attribute 'seconds'

Anything else we need to know?

Narrow down to the issue:

time = ds.time
diff = time.values[1] - time.values[0]
diff.seconds

'numpy.timedelta64' object has no attribute 'seconds'

type(diff)

numpy.timedelta64

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:51:20) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.3.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: None
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: None
sparse: 0.15.1
flox: None
numpy_groupies: None
setuptools: 69.2.0
pip: 24.0
conda: None
pytest: 8.1.1
mypy: 1.4.0
IPython: 8.22.2
sphinx: 7.2.6

@lee1043 lee1043 added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Jun 6, 2024
@lee1043 lee1043 self-assigned this Jun 6, 2024
@lee1043
Copy link
Collaborator Author

lee1043 commented Jun 6, 2024

This will be resolved by #655, with its commit 27b13bd.

@lee1043 lee1043 linked a pull request Jun 6, 2024 that will close this issue
9 tasks
@pochedls
Copy link
Collaborator

pochedls commented Jun 6, 2024

@lee1043 – what type of objects are in the time axis itself (I am guessing they are not np.timedelta64)?

Note that xcdat's missing bounds functionality expects cftime of datetime64 objects for the time axis (here).

Is it possible to convert the time axis before using add_missing_bounds? Something like this could work:

my_dataset['time'] = X(my_dataset.time) (where X is some pandas/numpy/cftime operation that might convert the time axis)

I think this would be preferable to adding in pandas functionality in the missing bounds routines. If you do add this functionality, it would be ideal to do it in a separate PR.

@pochedls
Copy link
Collaborator

pochedls commented Jun 6, 2024

@lee1043 : does this resolve this issue:

ds = xr.tutorial.load_dataset("air_temperature", use_cftime=True)

@lee1043
Copy link
Collaborator Author

lee1043 commented Jun 6, 2024

@pochedls Thank you for the thoughts. add_missing_bounds is an example here to show the issue, while the initial intent was to do temporal averaging. type(ds.time.values[0]) shows they are numpy.datetime64.

Good to learn about use_cftime=True, which resolved the issue.

The current bounds.py already has a part to convert object to timedelta at below, so to make it consistent I think it won't be harmful to add it where it was missed (as I did in 27b13bd).

diffs = pd.to_timedelta(diffs)

I however see @tomvothecoder left a comment below the above line

    # FIXME: These lines produces the warning: `PerformanceWarning:
    # Adding/subtracting object-dtype array to TimedeltaArray not
    # vectorized` after converting diffs to `timedelta`. I (Tom) was not
    # able to find an alternative, vectorized solution at the time of this
    # implementation.

I will open a separate PR as suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants