
Allow zoneinfo objects #916

Merged
merged 8 commits into from
Apr 22, 2024

Conversation

mroeschke

closes #915

        raise ValueError("Time-zone information could not be serialised: "
                         "%s, please use another" % str(dtype.tz)) from e
elif isinstance(dtype, pd.DatetimeTZDtype):
    if isinstance(dtype.tz, zoneinfo.ZoneInfo):
Member


Suggested change
-    if isinstance(dtype.tz, zoneinfo.ZoneInfo):
+    if getattr(dtype.tz, "zone", False):

?
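
For context on the two checks being compared: pytz timezone objects carry their name in a `zone` attribute, while `zoneinfo.ZoneInfo` stores it in `key`. The sketch below shows one way to read a name from either kind of object. The helper `tz_name` is hypothetical and is not the code in this PR.

```python
import datetime
import zoneinfo


def tz_name(tz):
    """Return a string name for a tzinfo object, or None if none is found.

    Hypothetical helper for illustration only.
    """
    # zoneinfo.ZoneInfo exposes the IANA name as `.key`
    if isinstance(tz, zoneinfo.ZoneInfo):
        return tz.key
    # pytz-style timezone objects expose the name as `.zone`
    zone = getattr(tz, "zone", None)
    if zone:
        return zone
    # Other tzinfo objects (e.g. datetime.timezone.utc) have neither attribute
    return None
```

With this helper, `tz_name(zoneinfo.ZoneInfo("Europe/London"))` would go through the `key` branch, whereas an object with only a `zone` attribute falls through to the second check.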

@martindurant
Member

update cibuildwheel to 2.16.5 ?

Aside from the wheels, a column name is coming out as type int in the writer, which is not allowed by parquet (in test_tz_zoneinfo)

@mroeschke
Author

Sorry about the delay here. I fixed the test where the column name was coming out as an int.

@martindurant
Member

There is a dask config option that says "don't use dask-expr", which we'll need to set, since they didn't bother to implement the fastparquet stuff (that already existed).
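
For readers looking for the option: in Dask releases from around the time of this PR, the relevant config key is, as far as I know, `dataframe.query-planning`, and it has to be set before `dask.dataframe` is first imported. The snippet below is a minimal sketch under that assumption. Where it would live (for example a `conftest.py`) is hypothetical, and newer Dask releases may ignore or reject the setting.

```python
# Hypothetical test-setup snippet; not taken from this PR.
# Assumption: the Dask version in use honours this config key.
QUERY_PLANNING_KEY = "dataframe.query-planning"

try:
    import dask
except ImportError:
    # Dask is optional here; without it there is nothing to configure
    dask = None

if dask is not None:
    # Must run before `import dask.dataframe`, which reads the setting
    dask.config.set({QUERY_PLANNING_KEY: False})
```

The same setting can also be supplied through the environment variable `DASK_DATAFRAME__QUERY_PLANNING=False`, which avoids any dependence on import order inside the test suite.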

@martindurant
Member

Still hitting dask-expr code somehow:

../../../micromamba-root/envs/test_env/lib/python3.10/site-packages/dask_expr/_collection.py:3152: in to_parquet
    return to_parquet(self, path, **kwargs)
../../../micromamba-root/envs/test_env/lib/python3.10/site-packages/dask_expr/io/parquet.py:335: in to_parquet
    engine = _set_parquet_engine(engine=engine, meta=df._meta)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

engine = 'fastparquet', meta = Empty DataFrame
Columns: [unique, id]
Index: []

    def _set_parquet_engine(engine=None, meta=None):
        # Use `engine` or `meta` input to set the parquet engine
        if engine == "fastparquet":
>           raise NotImplementedError("Fastparquet engine is not supported")
E           NotImplementedError: Fastparquet engine is not supported

@mroeschke
Author

Added an xfail on test_sorted_row_group_columns_with_filters for now, since it doesn't seem directly related to the zoneinfo support.

@martindurant
Member

Would you also like to fix the

ValueError: assignment destination is read-only

errors from new pandas?

@mroeschke
Author

I would prefer to address that in a follow-up to avoid scope creep in this PR.

@martindurant martindurant merged commit ec26733 into dask:main Apr 22, 2024
20 of 21 checks passed
@mroeschke mroeschke deleted the enh/zoneinfo branch April 22, 2024 17:27
Successfully merging this pull request may close these issues.

Support zoneinfo.ZoneInfo timezones
2 participants