feat(data-warehouse): integrating data warehouse with trends insight #20320

Merged
merged 91 commits into from
Feb 29, 2024
Changes from 6 commits
Commits
ec4f4e8
wip with new datawarehousenode type in trendsquerybuilder
EDsCODE Feb 12, 2024
1b3dc17
split out query builder
EDsCODE Feb 13, 2024
74b8056
add test
EDsCODE Feb 13, 2024
ed55bd9
add property support
EDsCODE Feb 14, 2024
637d82d
add test for entity property and non entity property filter
EDsCODE Feb 14, 2024
4cf27a2
basic breakdown working
EDsCODE Feb 14, 2024
ff669d0
merge master
EDsCODE Feb 20, 2024
388a81f
typing
EDsCODE Feb 20, 2024
6c778a4
typing
EDsCODE Feb 20, 2024
4017a43
more typing
EDsCODE Feb 20, 2024
7395667
use default args
EDsCODE Feb 20, 2024
b28ac36
more typing
EDsCODE Feb 20, 2024
df6d297
resolved mypy
EDsCODE Feb 20, 2024
e118dba
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 20, 2024
99dc8f4
Update query snapshots
github-actions[bot] Feb 20, 2024
7b649ae
remove config
EDsCODE Feb 20, 2024
620a98a
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 20, 2024
3a0ee07
generate schema properly
EDsCODE Feb 20, 2024
4c72ca0
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 20, 2024
b9bb5c9
add breakdown type
EDsCODE Feb 20, 2024
28ffe84
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 20, 2024
bc1e18f
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 20, 2024
cd12eb3
trim data warehouse query builder
EDsCODE Feb 20, 2024
a60ce92
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 20, 2024
07aab0f
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 20, 2024
4d23026
update schema.json
EDsCODE Feb 20, 2024
06b5d89
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 20, 2024
2f02e7f
add ci config
EDsCODE Feb 20, 2024
4b3618a
try localhost
EDsCODE Feb 20, 2024
8c8e1aa
mapping for linux
EDsCODE Feb 20, 2024
e746769
Update query snapshots
github-actions[bot] Feb 20, 2024
eb2594c
more typing
EDsCODE Feb 21, 2024
e8b38f5
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 21, 2024
1e77abd
types
EDsCODE Feb 21, 2024
bfe3490
frontend typing
EDsCODE Feb 21, 2024
bc76a82
backend typing
EDsCODE Feb 21, 2024
cacbc3c
add data warehouse logic path to funnel
EDsCODE Feb 21, 2024
16df2aa
typo
EDsCODE Feb 21, 2024
d92fb55
typing
EDsCODE Feb 21, 2024
3630645
more typo
EDsCODE Feb 21, 2024
326cfda
more types
EDsCODE Feb 21, 2024
3ec2020
sync
EDsCODE Feb 21, 2024
ebfc5fd
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 21, 2024
21bd400
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 21, 2024
602d5e8
Update UI snapshots for `webkit` (2)
github-actions[bot] Feb 21, 2024
041ab8c
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 21, 2024
ba5bb6d
finally
EDsCODE Feb 21, 2024
9dc04ed
Update UI snapshots for `webkit` (2)
github-actions[bot] Feb 21, 2024
34f0068
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 21, 2024
8144cc5
Update query snapshots
github-actions[bot] Feb 21, 2024
b273f93
Update query snapshots
github-actions[bot] Feb 21, 2024
8275372
add breakdown prop test
EDsCODE Feb 22, 2024
207f064
update tests
EDsCODE Feb 22, 2024
bff4a05
add test
EDsCODE Feb 22, 2024
5081bc2
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 22, 2024
f0bf1dc
add type check
EDsCODE Feb 22, 2024
e546b65
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 22, 2024
4d9bde8
Update query snapshots
github-actions[bot] Feb 22, 2024
f70e43c
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
d43b40a
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
302acd9
fix tests
EDsCODE Feb 22, 2024
c84d866
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 22, 2024
9f6d568
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
91dce70
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 22, 2024
ae41ec4
abstract class
EDsCODE Feb 22, 2024
7e53759
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
77f762b
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
51a7f27
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 22, 2024
77aba07
Update query snapshots
github-actions[bot] Feb 22, 2024
02fd6a0
more tpying..
EDsCODE Feb 22, 2024
71d7c48
Update query snapshots
github-actions[bot] Feb 22, 2024
735dd55
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 22, 2024
03be49e
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
a693548
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 22, 2024
67b0148
match signature
EDsCODE Feb 22, 2024
4b831f0
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 22, 2024
6e2165a
match signature
EDsCODE Feb 22, 2024
e19bc4a
update test
EDsCODE Feb 22, 2024
9345626
Update query snapshots
github-actions[bot] Feb 22, 2024
f5d20ea
add missing fields
EDsCODE Feb 27, 2024
46b2a0e
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 27, 2024
fac83eb
add schema
EDsCODE Feb 27, 2024
6277d39
update typing
EDsCODE Feb 28, 2024
37df9fd
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 28, 2024
8154be1
Update UI snapshots for `chromium` (2)
github-actions[bot] Feb 28, 2024
e1d8672
merge master
EDsCODE Feb 28, 2024
dfb6933
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 28, 2024
45806c2
Update query snapshots
github-actions[bot] Feb 28, 2024
66c4cd1
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 29, 2024
4b35e03
Merge branch 'master' into dw-test-insight-integration
EDsCODE Feb 29, 2024
11f7671
Merge branch 'dw-test-insight-integration' of github.com:PostHog/post…
EDsCODE Feb 29, 2024
2 changes: 2 additions & 0 deletions posthog/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ class AvailableFeature(str, Enum):

TREND_FILTER_TYPE_ACTIONS = "actions"
TREND_FILTER_TYPE_EVENTS = "events"
TREND_FILTER_TYPE_DATA_WAREHOUSE = "data_warehouse"

SESSION_RECORDINGS_FILTER_IDS = "session_ids"

Expand Down Expand Up @@ -117,6 +118,7 @@ class AvailableFeature(str, Enum):
ENTITIES = "entities"
ACTIONS = "actions"
EVENTS = "events"
DATA_WAREHOUSE_ENTITIES = "data_warehouse_entities"
EXCLUSIONS = "exclusions"
PROPERTIES = "properties"
PROPERTY_GROUPS = "property_groups"
Expand Down
2 changes: 1 addition & 1 deletion posthog/hogql/functions/mapping.py
Expand Up @@ -169,7 +169,7 @@ class HogQLFunctionMeta:
"parseDateTime64BestEffortOrNull",
1,
1,
overloads=[((ast.DateTimeType, ast.DateType), "toDateTime")],
overloads=[((ast.DateTimeType, ast.DateType, ast.IntegerType), "toDateTime")],
tz_aware=True,
),
"toUUID": HogQLFunctionMeta("toUUIDOrNull", 1, 1),
Expand Down
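The `overloads` change above adds `ast.IntegerType` to the argument types that resolve to `toDateTime` instead of `parseDateTime64BestEffortOrNull`. The following is a minimal sketch of that kind of dispatch, assuming the overload is chosen from the type of the first argument. The type classes, `FunctionMeta`, and `resolve_name` are hypothetical stand-ins and not the HogQL API; the hunk does not show the function key this entry belongs to, so the sketch does not name it.

```python
from dataclasses import dataclass
from typing import List, Tuple, Type


# Hypothetical stand-ins for the HogQL constant types named in the diff.
class DateTimeType:
    pass


class DateType:
    pass


class IntegerType:
    pass


class StringType:
    pass


@dataclass
class FunctionMeta:
    """Simplified stand-in: a default ClickHouse name plus typed overloads."""

    clickhouse_name: str
    overloads: List[Tuple[Tuple[Type, ...], str]]


def resolve_name(meta: FunctionMeta, first_arg_type: object) -> str:
    """Return the overload name if the first argument's type matches, else the default.

    Assumption: dispatch looks only at the first argument's type.
    """
    for types, name in meta.overloads:
        if isinstance(first_arg_type, types):
            return name
    return meta.clickhouse_name


# Mirrors the entry after the change: integers now take the toDateTime path.
meta = FunctionMeta(
    "parseDateTime64BestEffortOrNull",
    [((DateTimeType, DateType, IntegerType), "toDateTime")],
)
```

Under these assumptions an integer argument (for example a Unix timestamp column in a warehouse table) resolves to `toDateTime`, while a string argument still falls through to the best-effort parser.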
8 changes: 7 additions & 1 deletion posthog/hogql/property.py
Expand Up @@ -121,7 +121,11 @@
if property.type == "hogql":
return parse_expr(property.key)
elif (
property.type == "event" or property.type == "feature" or property.type == "person" or property.type == "group"
property.type == "event"
or property.type == "feature"

Check failure on line 125 in posthog/hogql/property.py
GitHub Actions / Python code quality checks
Non-overlapping equality check (left operand type: "Literal['person', 'cohort', 'element', 'static-cohort', 'precalculated-cohort', 'group', 'recording', 'behavioral', 'session', 'data_warehouse']", right operand type: "Literal['feature']")
or property.type == "person"
or property.type == "group"
or property.type == "data_warehouse"
):
if scope == "person" and property.type != "person":
raise NotImplementedException(
Expand All @@ -134,6 +138,8 @@
chain = ["person", "properties"]
elif property.type == "group":
chain = [f"group_{property.group_type_index}", "properties"]
elif property.type == "data_warehouse":
chain = []
else:
chain = ["properties"]
field = ast.Field(chain=chain + [property.key])
Expand Down
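The hunk above gives `data_warehouse` properties an empty chain prefix, so the property key is used directly as a column name rather than being looked up under `properties`. A rough sketch of that selection follows. `property_chain` is a hypothetical helper, and the `person` branch is simplified: its exact condition is elided in the diff.

```python
from typing import List, Optional


def property_chain(
    property_type: str,
    key: str,
    group_type_index: Optional[int] = None,
) -> List[str]:
    """Build the field chain for a property filter (simplified stand-in)."""
    if property_type == "person":
        # Simplification: the real condition for this branch is not shown in the hunk.
        prefix = ["person", "properties"]
    elif property_type == "group":
        prefix = [f"group_{group_type_index}", "properties"]
    elif property_type == "data_warehouse":
        # Warehouse columns are top-level fields, so there is no prefix.
        prefix = []
    else:
        prefix = ["properties"]
    return prefix + [key]
```

With this shape, a warehouse filter on a hypothetical `amount` column yields the chain `["amount"]`, whereas an event property with the same key yields `["properties", "amount"]`.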
23 changes: 17 additions & 6 deletions posthog/hogql_queries/insights/trends/aggregation_operations.py
Expand Up @@ -3,7 +3,8 @@
from posthog.hogql.parser import parse_expr, parse_select
from posthog.hogql_queries.utils.query_date_range import QueryDateRange
from posthog.models.team.team import Team
from posthog.schema import ActionsNode, EventsNode
from posthog.schema import SeriesType, DataWarehouseNode
from posthog.models.filters.mixins.utils import cached_property


class QueryAlternator:
Expand Down Expand Up @@ -49,14 +50,14 @@ def replace_select_from(self, join_expr: ast.JoinExpr) -> None:

class AggregationOperations:
team: Team
series: EventsNode | ActionsNode
series: SeriesType
query_date_range: QueryDateRange
should_aggregate_values: bool

def __init__(
self,
team: Team,
series: EventsNode | ActionsNode,
series: SeriesType,
query_date_range: QueryDateRange,
should_aggregate_values: bool,
) -> None:
Expand All @@ -65,11 +66,18 @@ def __init__(
self.query_date_range = query_date_range
self.should_aggregate_values = should_aggregate_values

@cached_property
def _id_field(self) -> ast.Expr:
if isinstance(self.series, DataWarehouseNode):
return ast.Field(chain=["e", self.series.id_field])

return ast.Field(chain=["e", "uuid"])

def select_aggregation(self) -> ast.Expr:
if self.series.math == "hogql" and self.series.math_hogql is not None:
return parse_expr(self.series.math_hogql)
elif self.series.math == "total":
return parse_expr("count(e.uuid)")
return parse_expr("count({id_field})", placeholders={"id_field": self._id_field})
elif self.series.math == "dau":
actor = "e.distinct_id" if self.team.aggregate_users_by_distinct_id else "e.person.id"
return parse_expr(f"count(DISTINCT {actor})")
Expand Down Expand Up @@ -99,7 +107,9 @@ def select_aggregation(self) -> ast.Expr:
elif self.series.math == "p99":
return self._math_quantile(0.99, None)

return parse_expr("count(e.uuid)") # All "count per actor" get replaced during query orchestration
return parse_expr(
"count({id_field})", placeholders={"id_field": self._id_field}
) # All "count per actor" get replaced during query orchestration

def requires_query_orchestration(self) -> bool:
math_to_return_true = [
Expand Down Expand Up @@ -330,13 +340,14 @@ def _events_query(
query = parse_select(
"""
SELECT
count(e.uuid) AS total
count({id_field}) AS total
FROM events AS e
SAMPLE {sample}
WHERE {events_where_clause}
GROUP BY {person_field}
""",
placeholders={
"id_field": self._id_field,
"events_where_clause": where_clause_combined,
"sample": sample_value,
"person_field": ast.Field(
Expand Down
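The `_id_field` property introduced in this file replaces the hard-coded `e.uuid` in the count expressions with a field that depends on the series type. Below is a small sketch of that choice using plain strings. The stub classes and helper names are hypothetical, and the real code passes an `ast.Field` through `placeholders` rather than formatting a string.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class EventsNodeStub:
    """Hypothetical stand-in for an events series."""

    event: str


@dataclass
class DataWarehouseNodeStub:
    """Hypothetical stand-in carrying the two fields the diff reads."""

    table_name: str
    id_field: str


SeriesStub = Union[EventsNodeStub, DataWarehouseNodeStub]


def id_field_chain(series: SeriesStub) -> List[str]:
    """Pick the column to count: the warehouse id column, or uuid for events."""
    if isinstance(series, DataWarehouseNodeStub):
        return ["e", series.id_field]
    return ["e", "uuid"]


def count_expression(series: SeriesStub) -> str:
    """Render the count as text, for illustration only."""
    return f"count({'.'.join(id_field_chain(series))})"
```

The table names and column names used with these stubs are illustrative. The point is only that every `count(e.uuid)` site in the diff now reads the identifier from one place.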
9 changes: 4 additions & 5 deletions posthog/hogql_queries/insights/trends/breakdown.py
Expand Up @@ -12,18 +12,17 @@
from posthog.hogql_queries.insights.trends.display import TrendsDisplay
from posthog.hogql_queries.insights.trends.utils import (
get_properties_chain,
series_event_name,
)
from posthog.hogql_queries.utils.query_date_range import QueryDateRange
from posthog.models.filters.mixins.utils import cached_property
from posthog.models.team.team import Team
from posthog.schema import ActionsNode, EventsNode, HogQLQueryModifiers, InCohortVia, TrendsQuery
from posthog.schema import SeriesType, HogQLQueryModifiers, InCohortVia, TrendsQuery


class Breakdown:
query: TrendsQuery
team: Team
series: EventsNode | ActionsNode
series: SeriesType
query_date_range: QueryDateRange
timings: HogQLTimings
modifiers: HogQLQueryModifiers
Expand All @@ -34,7 +33,7 @@
self,
team: Team,
query: TrendsQuery,
series: EventsNode | ActionsNode,
series: SeriesType,
query_date_range: QueryDateRange,
timings: HogQLTimings,
modifiers: HogQLQueryModifiers,
Expand Down Expand Up @@ -222,7 +221,7 @@
with self.timings.measure("breakdown_values_query"):
breakdown = BreakdownValues(
team=self.team,
event_name=series_event_name(self.series) or "",
series=self.series,
events_filter=self.events_filter,
chart_display_type=self._trends_display().display_type,
breakdown_filter=self.query.breakdownFilter,
Expand Down Expand Up @@ -281,7 +280,7 @@
@cached_property
def _properties_chain(self):
return get_properties_chain(
breakdown_type=self.query.breakdownFilter.breakdown_type,

Check failure on line 283 in posthog/hogql_queries/insights/trends/breakdown.py
GitHub Actions / Python code quality checks
Argument "breakdown_type" to "get_properties_chain" has incompatible type "BreakdownType | Any | None"; expected "Literal['person', 'session', 'group', 'event', 'data_warehouse']"
breakdown_field=self.query.breakdownFilter.breakdown,
group_type_index=self.query.breakdownFilter.breakdown_group_type_index,
)
Expand Down
30 changes: 23 additions & 7 deletions posthog/hogql_queries/insights/trends/breakdown_values.py
Expand Up @@ -5,7 +5,8 @@
from posthog.hogql.query import execute_hogql_query
from posthog.hogql_queries.insights.trends.utils import get_properties_chain
from posthog.models.team.team import Team
from posthog.schema import BreakdownFilter, BreakdownType, ChartDisplayType
from posthog.schema import BreakdownFilter, BreakdownType, ChartDisplayType, SeriesType, DataWarehouseNode
from functools import cached_property

BREAKDOWN_OTHER_STRING_LABEL = "$$_posthog_breakdown_other_$$"
BREAKDOWN_OTHER_NUMERIC_LABEL = 9007199254740991 # pow(2, 53) - 1, for JS compatibility
Expand All @@ -15,7 +16,7 @@

class BreakdownValues:
team: Team
event_name: str
series: SeriesType
breakdown_field: Union[str, float, List[Union[str, float]]]
breakdown_type: BreakdownType
events_filter: ast.Expr
Expand All @@ -28,13 +29,13 @@ class BreakdownValues:
def __init__(
self,
team: Team,
event_name: str,
series: SeriesType,
events_filter: ast.Expr,
chart_display_type: ChartDisplayType,
breakdown_filter: BreakdownFilter,
):
self.team = team
self.event_name = event_name
self.series = series
self.breakdown_field = breakdown_filter.breakdown # type: ignore
self.breakdown_type = breakdown_filter.breakdown_type # type: ignore
self.events_filter = events_filter
Expand Down Expand Up @@ -88,9 +89,8 @@ def get_breakdown_values(self) -> List[str | int]:
"""
SELECT
{select_field},
count(e.uuid) as count
FROM
events e
count({id_field}) as count
FROM {table} e
WHERE
{events_where}
GROUP BY
Expand All @@ -104,6 +104,8 @@ def get_breakdown_values(self) -> List[str | int]:
"events_where": self.events_filter,
"select_field": select_field,
"breakdown_limit": ast.Constant(value=breakdown_limit),
"table": self._table,
"id_field": self._id_field,
},
)

Expand Down Expand Up @@ -166,3 +168,17 @@ def _to_bucketing_expression(self) -> ast.Expr:
qunatile_expression = f"quantiles({','.join([f'{quantile:.2f}' for quantile in quantiles])})(value)"

return parse_expr(f"arrayCompact(arrayMap(x -> floor(x, 2), {qunatile_expression}))")

@cached_property
def _id_field(self) -> ast.Field:
if isinstance(self.series, DataWarehouseNode):
return ast.Field(chain=["e", self.series.id_field])

return ast.Field(chain=["e", "uuid"])

@cached_property
def _table(self) -> ast.Field:
if isinstance(self.series, DataWarehouseNode):
return ast.Field(chain=[self.series.table_name])

return ast.Field(chain=["events"])
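
Both `_id_field` and `_table` in this file are declared with `functools.cached_property`, so each is computed once per `BreakdownValues` instance and then read back as an ordinary attribute. A short sketch of that behaviour is below. The stub classes and the `table_lookups` counter are hypothetical and exist only to make the caching visible.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import List


@dataclass
class DataWarehouseNodeStub:
    """Hypothetical stand-in for a warehouse-backed series."""

    table_name: str


@dataclass
class EventsNodeStub:
    """Hypothetical stand-in for an events series."""

    event: str


class BreakdownValuesStub:
    def __init__(self, series: object) -> None:
        self.series = series
        self.table_lookups = 0  # counts how often the property body runs

    @cached_property
    def table_chain(self) -> List[str]:
        self.table_lookups += 1
        if isinstance(self.series, DataWarehouseNodeStub):
            return [self.series.table_name]
        return ["events"]


warehouse = BreakdownValuesStub(DataWarehouseNodeStub("payments"))
first = warehouse.table_chain
second = warehouse.table_chain  # served from the cache, body does not run again
```

One consequence of this design is that changing `series` after the first access would not change the cached table, which is acceptable here because the series is set once in `__init__`.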