Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: selecting from cohort people #17610

Merged
merged 7 commits into from
Sep 26, 2023
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,9 @@ export function DatabaseTables<T extends DatabaseSceneRow>({
// TODO: Use `hogql` tag?
query: `SELECT ${obj.columns
.filter(({ table, fields, chain }) => !table && !fields && !chain)
.map(({ key }) => key)} FROM ${table} LIMIT 100`,
.map(({ key }) => key)} FROM ${
table === 'numbers' ? 'numbers(0, 10)' : table
} LIMIT 100`,
},
}
return (
Expand Down
6 changes: 1 addition & 5 deletions posthog/hogql/database/schema/cohort_people.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,11 +23,7 @@ def select_from_cohort_people_table(requested_fields: Dict[str, List[str]]):

table_name = "raw_cohort_people"

requested_fields = {"person_id": ["person_id"], "cohort_id": ["cohort_id"], **requested_fields}

fields: List[ast.Expr] = [
ast.Alias(alias=name, expr=ast.Field(chain=[table_name] + chain)) for name, chain in requested_fields.items()
]
fields: List[ast.Expr] = [ast.Field(chain=[table_name] + chain) for name, chain in requested_fields.items()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was there so that we always request the person and cohort fields. Without them, some queries and/or joins will just fail.

Thus, if you now select just the person_id, or the cohort_id or just something else that's not those two, will this fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work the same with those two fields always bolted back on... e.g. select team_id from cohort_people works either way.

Is there an obvious test case to try?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I've put the line ensuring they're present back in)

Copy link
Collaborator

@mariusandra mariusandra Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... actually I'll take that back. The issue I was remembering is valid only in the case of automatic joins (e.g. events.person.bla). The join constraint would fail if we didn't always select the fields we're going to use in the constraint. However... that's not relevant here, so insert homer bush gif here 🤣

Probably good/ok to select them anyway, so nothing to change here now... 🤷


return ast.SelectQuery(
select=fields,
Expand Down
16 changes: 16 additions & 0 deletions posthog/hogql/database/test/test_database.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
from unittest.mock import patch
import pytest
from django.test import override_settings
from parameterized import parameterized

from posthog.hogql.database.database import create_hogql_database, serialize_database
from posthog.test.base import BaseTest
Expand All @@ -26,6 +27,21 @@ def test_serialize_database_with_person_on_events_enabled(self):
serialized_database = serialize_database(create_hogql_database(team_id=self.team.pk))
assert json.dumps(serialized_database, indent=4) == self.snapshot

@parameterized.expand([False, True])
def test_can_select_from_each_table_at_all(self, poe_enabled: bool) -> None:
with override_settings(PERSON_ON_EVENTS_OVERRIDE=poe_enabled):
serialized_database = serialize_database(create_hogql_database(team_id=self.team.pk))
for table, possible_columns in serialized_database.items():
if table == "numbers":
execute_hogql_query("SELECT number FROM numbers(10) LIMIT 100", self.team)
else:
columns = [
x["key"]
for x in possible_columns
if "table" not in x and "chain" not in x and "fields" not in x
]
execute_hogql_query(f"SELECT {','.join(columns)} FROM {table}", team=self.team)

@patch("posthog.hogql.query.sync_execute", return_value=(None, None))
@pytest.mark.usefixtures("unittest_snapshot")
def test_database_with_warehouse_tables(self, patch_execute):
Expand Down
Loading