Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust usage of utcnow() and utcfromtimestamp() for Python 3 #440

Closed
wants to merge 5 commits into from

Conversation

amotl
Copy link
Member

@amotl amotl commented Jul 21, 2022

Motivation

This patch further sets the stage for #437.

About

On the transition from Python 2 to Python 3, this subtle adjustment sometimes slips through. It is needed for properly working with the now() variants of native Python date and datetime objects.

  • Instead of datetime.utcnow(), use datetime.now(timezone.utc)
  • Instead of utcfromtimestamp(), use fromtimestamp(..., tz=timezone.utc)

See also: https://blog.ganssle.io/articles/2019/11/utcnow.html

Please note that this patch still reflects the fact that only timezone-unaware (naive) datetime objects are handled yet. In order to compensate that, a number of .replace(tzinfo=None) calls have been added along the lines.

On the transition from Python 2 to Python 3, this subtle adjustment
needed when working with date and datetime objects sometimes slips
through.

- Instead of `datetime.utcnow()`, use `datetime.now(timezone.utc)`
- Instead of `utcfromtimestamp()`, use `fromtimestamp(..., tz=timezone.utc)`

See also: https://blog.ganssle.io/articles/2019/11/utcnow.html

Please note that this patch still reflects the fact that only
timezone-unaware (naive) datetime objects are handled yet. In order to
compensate that, a number of `.replace(tzinfo=None)` calls have been
added along the lines.
@amotl amotl marked this pull request as ready for review July 21, 2022 23:35
Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx @amotl left a couple of comments.

@@ -56,6 +55,7 @@
)
from .sqlalchemy.tests import test_suite as sqlalchemy_test_suite
from .sqlalchemy.types import ObjectArray
from ..testing.util import datetime_now_utc_naive, date_now_utc_naive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is relative?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyCharm created that import line for me. The import statements above are also relative.

from datetime import datetime, timezone


def datetime_now_utc_naive():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you choose naive in the naming?

Copy link
Member Author

@amotl amotl Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation where this patch is based upon also uses the same nomenclature "aware" vs. "naive" for refering to "timezone-aware" vs. "timezone-unaware" native Python datetime objects 12.

Date and time objects may be categorized as “aware” or “naive” depending on whether or not they include timezone information.

Footnotes

  1. https://docs.python.org/3/library/datetime.html

  2. https://blog.ganssle.io/articles/2019/11/utcnow.html

@amotl amotl requested a review from matriv July 22, 2022 09:56
>>> from datetime import datetime
>>> now = datetime.utcnow()
>>> from datetime import datetime, timezone
>>> now = datetime.now(timezone.utc).replace(tzinfo=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the change here. Isn't now(timezone.utc).replace(tzinfo=None) the same as utcnow()?

Copy link
Member Author

@amotl amotl Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is really sad. In situations when the system time zone setting, defined through the TZ environment variable, is expressed in UTC, everything is fine. However, it is not the case if the system time zone is anything other than UTC.

With c1f183e, I've tried to improve the situation to outline the problem, on behalf of some test cases. 5894c2e quotes and references all important information about this topic, from both the Python documentation and articles by other capacities.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the case.

utcnow:

Return the current UTC date and time, with tzinfo None.

Yes, it is a naive datetime without timezone info attached, but it is using utc.

>>> print(datetime.utcnow(), '\n', datetime.now(tz=timezone.utc).replace(tzinfo=None), '\n', datetime.now())
2022-07-22 18:27:27.055192
 2022-07-22 18:27:27.055200
 2022-07-22 20:27:27.055213
↪  date +"%Z %z"
CEST +0200

Copy link
Member

@mfussenegger mfussenegger Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also as the articles point out, the problem is the use of the naive datetime (lack of tzinfo) in subsequent operations. If we remove the tzinfo anyways via .replace(tzinfo=None) we don't gain anything by using now(tz=timezone.utc) but only make things more complicated.

And I'm not sure if attaching a timezone is even the correct thing to do, because the timestamp stored in CrateDB also doesn't have one attached. Assuming that timestamp values returned by the server are UTC may not be safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. You convinced me that this goes too far. I've submitted #441 instead, which solely focuses on mitigating a fluke I regularly occasionally observed when running the test suite on CI after midnight.

@@ -90,7 +90,8 @@ def process(value):
if not value:
return
try:
return datetime.utcfromtimestamp(value / 1e3).date()
return datetime.fromtimestamp(value / 1e3, tz=timezone.utc) \
.replace(tzinfo=None).date()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we throw away the tzinfo why not keep using utcfromtimestamp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is to be a preparation step to actually allow the user to pass a tzinfo here, later on?



def datetime_now_utc_naive():
return datetime.now(timezone.utc).replace(tzinfo=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this just utcnow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I far from expert in python but I also read this: https://blog.ganssle.io/articles/2019/11/utcnow.html

Those test cases try to outline the situation why 5a94e36 is needed.
They specifically exercise some scenarios where the system time zone
is expressed in UTC, or not, and shows the corresponding deviances.

Other test cases attempt to outline that the new helper functions
`datetime_now_utc_naive` and `date_now_utc_naive` do the right thing
in all situations, regardless of any corresponding system locale or time
zone setting.
@amotl amotl force-pushed the amo/remove-datetime-utcnow branch from 5894c2e to 10a2c7b Compare July 22, 2022 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants