-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(web-analytics): Turn timezones into UTC offsets instead of timezone names #26409
feat(web-analytics): Turn timezones into UTC offsets instead of timezone names #26409
Conversation
properties={"$session_id": session_id, "$pathname": f"/path{i}", "$timezone": distinct_id}, | ||
) | ||
|
||
with self.assertRaisesRegex(Exception, "Cannot load time zone"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable error?
@@ -1006,50 +1006,67 @@ def test_cohort_test_filters(self): | |||
assert results == [["/path1", 1, 1]] | |||
|
|||
def test_timezone_filter(self): | |||
d1, s1 = "d1", str(uuid7("2024-07-30")) | |||
d2, s2 = "d2", str(uuid7("2024-07-30")) | |||
date = "2024-07-30" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me to be suuuper confidant this works, I'd like to see some tests that cover before / after the daylight savings change for the same timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just added it, good call
Size Change: 0 B Total Size: 1.16 MB ℹ️ View Unchanged
|
# Get the difference between the UNIX timestamp at UTC and the UNIX timestamp at the event's timezone | ||
# Value is in milliseconds, turn it to hours, works even for fractional timezone offsets (I'm looking at you, Australia) | ||
return parse_expr( | ||
"if(isNull(properties.$timezone), NULL, (toUnixTimestamp64Milli(parseDateTimeBestEffort(assumeNotNull(toString(timestamp, properties.$timezone)))) - toUnixTimestamp64Milli(timestamp)) / 3600000)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a test for key not present, value is NULL, and value is an empty string, I think you might need some extra logic to handle the empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one more test IMO but happy for this to be a follow-up :)
Timezone names on their own aren't that useful because they're basically a harder-to-use version of displaying events by city/region. Displaying them by offsets, however, can prove to be valuable. We're doing some magic to calculate the offset instead of simply using something like `timezoneOffset` because `toString` is one of the only functions in Clickhouse that allows us to use a dynamic timezone versus a hardcoded one. This might have some performance implications, we'll figure it out soon :). To workaround that, we're converting to a string, then parsing that back to a date, and getting the unix timestamp. Comparing the UNIX timestamp in a specific timezone versus the value at UTC allows us to get the offset at that day. See #26376
4a43be8
to
d6361fd
Compare
Timezone names on their own aren't that useful because they're basically a harder-to-use version of displaying events by city/region. Displaying them by offsets, however, can prove to be valuable. We're doing some magic to calculate the offset instead of simply using something like
timezoneOffset
becausetoString
is one of the only functions in Clickhouse that allows us to use a dynamic timezone versus a hardcoded one. This might have some performance implications, we'll figure it out soon :). To workaround that, we're converting to a string, then parsing that back to a date, and getting the unix timestamp. Comparing the UNIX timestamp in a specific timezone versus the value at UTC allows us to get the offset at that day.See #26376
Changes
Does this work well for both Cloud and self-hosted?
Yep
How did you test this code?
Comprehensive unit test + manual UI testing