feat(hogql): Allow a placeholder to be used in place of a select statement #19767

Merged
merged 13 commits into from
Jan 18, 2024

Conversation

Gilbert09
Member

Problem

We weren't able to use a {placeholder} in place of a SELECT statement. We had only added support for placeholders in table expressions, which sadly doesn't cover cases like a SELECT inside a UNION ALL. The workaround was to write SELECT * FROM {placeholder}, where {placeholder} was itself another SELECT ... statement, which caused unnecessary SQL wrapping.
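To make the wrapping cost concrete, here is a toy sketch that uses plain string substitution. It is not how HogQL resolves placeholders (the real code substitutes AST nodes, not text); the `render` helper and the sample queries are hypothetical, and the explicit parentheses are only there so the textual result is valid SQL.

```python
def render(template: str, **placeholders: str) -> str:
    """Naive textual substitution; a hypothetical stand-in for placeholder resolution."""
    out = template
    for name, sql in placeholders.items():
        out = out.replace("{" + name + "}", sql)
    return out


# Hypothetical sub-queries standing in for the real per-series SELECTs.
series_a = "SELECT 1 AS value"
series_b = "SELECT 2 AS value"

# Old workaround: the placeholder sits in a table expression, so every
# branch of the UNION ALL gains an extra SELECT * FROM (...) layer.
wrapped = render(
    "SELECT * FROM ({a}) UNION ALL SELECT * FROM ({b})",
    a=series_a,
    b=series_b,
)

# With this change: the placeholder stands in for the SELECT itself.
direct = render("{a} UNION ALL {b}", a=series_a, b=series_b)

print(wrapped)
print(direct)
```

The second form produces the same rows without the extra layer of subqueries.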

Changes

  • Creates a new ANTLR rule, selectStmtWithPlaceholder, which matches either selectStmt or placeholder
  • Implements the change in both the Python and C++ versions of the parser
  • Updates the stickiness query runner to remove the workaround as a testing ground for this change
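As a rough illustration of what "selectStmt or placeholder" means for each branch of a UNION ALL, here is a toy recognizer. The regular expressions and the `classify_branch` / `classify_union` helpers are simplified assumptions for this sketch; the real parsing is done by the generated ANTLR parser, not by regexes.

```python
import re
from typing import List

# Simplified patterns, assumed for illustration only (not the HogQL grammar).
PLACEHOLDER = re.compile(r"\{\s*[A-Za-z_][A-Za-z0-9_]*\s*\}")
SELECT_STMT = re.compile(r"SELECT\b.+", re.IGNORECASE | re.DOTALL)


def classify_branch(text: str) -> str:
    """Label one UNION ALL branch as 'placeholder' or 'selectStmt'."""
    branch = text.strip()
    if PLACEHOLDER.fullmatch(branch):
        return "placeholder"
    if SELECT_STMT.fullmatch(branch):
        return "selectStmt"
    raise ValueError(f"not a select statement or placeholder: {branch!r}")


def classify_union(query: str) -> List[str]:
    """Split on UNION ALL and classify every branch."""
    branches = re.split(r"\s+UNION\s+ALL\s+", query, flags=re.IGNORECASE)
    return [classify_branch(b) for b in branches]


labels = classify_union("SELECT 1 UNION ALL {series} UNION ALL select 2")
print(labels)
```

Before this change, only the `selectStmt` alternative was accepted at these positions.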

How did you test this code?

  • Used the new antlr rule in a query

@Gilbert09 Gilbert09 requested review from mariusandra, Twixes and a team January 15, 2024 18:20
@posthog-bot
Contributor

Hey @Gilbert09! 👋
This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.

@posthog-bot
Contributor

It looks like the code of hogql-parser has changed since last push, but its version stayed the same at 1.0.2. 👀
Make sure to resolve this in hogql_parser/setup.py before merging!

@Gilbert09 Gilbert09 temporarily deployed to pypi-hogql-parser January 15, 2024 18:39 — with GitHub Actions Inactive
Comment on lines 12 to 13
selectStmtWithParens: selectStmtWithPlaceholder | LPAREN selectUnionStmt RPAREN;
selectStmtWithPlaceholder: selectStmt | placeholder;
Member

Is there a benefit to splitting selectStmtWithPlaceholder out as another node in the hierarchy? Because we could also do selectStmtWithParens: selectStmt | LPAREN selectUnionStmt RPAREN | placeholder;

Member Author

Great question, no real reason. Happy to change if we think this is important?

Collaborator

I don't think it's critical (this works after all), but it'd definitely be nice. Fewer new nodes means fewer potential issues 🤷.

Member Author

Okay, this is done now ✅

@Gilbert09 Gilbert09 requested a review from Twixes January 17, 2024 17:14
Member

@Twixes Twixes left a comment

A couple of readability suggestions with some C++ context, but that's very minor. Feel free to merge!

@@ -172,14 +172,16 @@ def visitSelectUnionStmt(self, ctx: HogQLParser.SelectUnionStmtContext):
        flattened_queries.append(query)
    elif isinstance(query, ast.SelectUnionQuery):
        flattened_queries.extend(query.select_queries)
    elif isinstance(query, ast.Placeholder):
Member

You can simplify this case by folding it into the first condition: just replace if isinstance(query, ast.SelectQuery): with if isinstance(query, (ast.SelectQuery, ast.Placeholder)):
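A minimal sketch of the folded condition, for reference. The three dataclasses are bare stand-ins for the `ast` classes named in the diff (the real ones carry more fields, and the attribute names `sql` and `name` are assumptions), and `flatten` is a hypothetical free function standing in for the loop body of visitSelectUnionStmt.

```python
from dataclasses import dataclass
from typing import List, Union


# Bare stand-ins for ast.SelectQuery, ast.Placeholder and ast.SelectUnionQuery.
@dataclass
class SelectQuery:
    sql: str  # assumed field, for illustration only


@dataclass
class Placeholder:
    name: str  # assumed field, for illustration only


@dataclass
class SelectUnionQuery:
    select_queries: List[Union[SelectQuery, Placeholder]]


def flatten(queries: list) -> list:
    """Flatten nested unions; selects and placeholders share one branch."""
    flattened_queries = []
    for query in queries:
        # isinstance accepts a tuple of types, so no separate elif is needed.
        if isinstance(query, (SelectQuery, Placeholder)):
            flattened_queries.append(query)
        elif isinstance(query, SelectUnionQuery):
            flattened_queries.extend(query.select_queries)
        else:
            raise TypeError(f"Unexpected query node type {type(query).__name__}")
    return flattened_queries


result = flatten(
    [
        SelectQuery("a"),
        SelectUnionQuery([SelectQuery("b"), Placeholder("c")]),
        Placeholder("d"),
    ]
)
print(result)
```

Behavior is unchanged from the three-branch version; the tuple form just removes a duplicated `append` branch.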

@@ -344,6 +350,9 @@ class HogQLParseTreeConverter : public HogQLParserBaseVisitor {
    int extend_code = X_PyList_Extend(flattened_queries, sub_select_queries);
    if (extend_code == -1) goto select_queries_loop_py_error;
    Py_DECREF(sub_select_queries);
} else if (is_ast_node_instance(query, "Placeholder")) {
Member

Same as in the Python version, this would be more readable if we did || is_ast_node_instance(query, "Placeholder") in the first if () branch. (I know readability is not the C++ code's strong suit, but that makes reining it in all the more important.)

BTW is_ast_node_instance(query, "SelectQuery"); is called twice in this function, which is an oversight on my side. The result is intentionally assigned to is_select_query so that we can check it against the error value (-1), which is great for thorough error handling. However, two lines later we repeat that call without the check…
These operations are very low risk, so in this case I suggest we remove lines 341 and 342. If we were to run the same error checks for all branches in an else if fashion, we'd need quite a bit of if nesting…

@Gilbert09 Gilbert09 merged commit d946f66 into master Jan 18, 2024
95 checks passed
@Gilbert09 Gilbert09 deleted the tom/hogql-union-placeholder branch January 18, 2024 15:14