-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dagster-snowflake] config to determine how to handle timestamp data #13097
[dagster-snowflake] config to determine how to handle timestamp data #13097
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 2 Ignored Deployments
|
Current dependencies on/for this PR: This comment was auto-generated by Graphite. |
python_modules/libraries/dagster-snowflake/dagster_snowflake/snowflake_io_manager.py
Outdated
Show resolved
Hide resolved
...libraries/dagster-snowflake-pandas/dagster_snowflake_pandas/snowflake_pandas_type_handler.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Content-wise, this looks good to me 👍
c6f515a
to
74458b4
Compare
Deploy preview for dagster ready! Preview available at https://dagster-cq7fphxcv-elementl.vercel.app Direct link to changed pages: |
@benpankow @sryza the last thing to decide for this PR is what version to list as the deprecation version for converting to strings. some options: dagster-snowflake 0.20.0 (or 0.21.0 etc) |
4391816
to
fd3ec99
Compare
Deploy preview for dagit-core-storybook ready! ✅ Preview Built with commit 1251aea. |
@benpankow @sryza I'd like to get this in for the 1.3 code freeze (so ideally merge tomorrow afternoon). I'm getting buildkite passing after some nasty merge conflicts, could you do a final review pass and let me know if you have opinions on a deprecation date? I lean toward punting or doing |
I think we can punt on specifying a deprecation version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing: I think it would be helpful for the errors to include the names of the tables. Otherwise, this looks great.
2378f8a
to
1251aea
Compare
Deploy preview for dagit-storybook ready! ✅ Preview Built with commit 1251aea. |
Summary & Motivation
There is an issue with storing pandas Timestamp values in snowflake where the year will get converted to a non-valid year (example 48399). In pr #8760 i got around this by storing pandas timestamps as strings, but i finally found this snowflakedb/snowflake-connector-python#319 that indicates that the real fix is to include timezone information in your pandas timestamps snowflakedb/snowflake-connector-python#319 (comment).
This PR adds a new configuration value to the snowflake io manager that allows the user to specify if they want to convert timestamp data to strings, or add a timezone to it. We also check the column types of the table to ensure that the chosen conversion aligns with the type of the corresponding table.
Right now I have deprecation warnings in place so that we can remove the string conversion at some point. I'm not sure what version to choose as the deprecation version, or if committing to deprecating makes sense right now.
I also added documentation explaining the behavior of the configuration value and providing SQL commands that will allow a user to migrate a column from string to timestamp if they wish to move from converting timestamp data to strings to using timezones.
See #12190 for additional context
How I Tested These Changes