-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: Normalizing literals in SELECT is broken #15043
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
@@ -2398,7 +2398,7 @@ type ( | |||
FuncExpr struct { | |||
Qualifier IdentifierCS | |||
Name IdentifierCI | |||
Exprs SelectExprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Back in the days, we used FuncExpr
for all expressions, including count(*)
. Since then, we have replaced most functions with custom AST types (such as CountStar
), and this means that we no longer need to parse *
as an argument to a function call.
Using pure expressions simplifies a lot of code and makes it possible to know that when we are looking at an *sqlparser.AliasedExpr
, we know we are in the SELECT clause and nowhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this change
This comment was marked as outdated.
This comment was marked as outdated.
1bd8dae
to
2a18da2
Compare
@@ -170,7 +170,7 @@ func TestSubqueryInAggregation(t *testing.T) { | |||
mcmp.Exec("insert into t1(id1, id2) values(0,0),(1,1)") | |||
mcmp.Exec("insert into t2(id3, id4) values(1,2),(5,7)") | |||
mcmp.Exec(`SELECT max((select min(id2) from t1)) FROM t2`) | |||
mcmp.Exec(`SELECT max((select group_concat(id1, id2) from t1 where id1 = 1)) FROM t1 where id1 = 1`) | |||
mcmp.Exec(`SELECT max((select group_concat(id1, id2) from t1 where id1 = 1)) as x FROM t1 where id1 = 1`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why an alias is needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the alias we will not normalize the subquery, since it is in a SELECT expression. We are however normalizing the WHERE
clause, so the planner doesn't know that the outer query and the inner query are going to the same shard. That turns this into a correlated query that we don't support. By adding the alias, we are free to normalize both and the planner knows it safe to merge the two sides of the subquery
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a failing query to the unsupported
list if it is not already there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not very easy to do. this is unsupported only after the normalizer has had a go at it. it's unsupported because it's a correlated subquery when we can't merge the two sides, and we already have other examples of correlated subqueries not being supported at the moment
Signed-off-by: Andres Taylor <[email protected]>
Signed-off-by: Andres Taylor <[email protected]>
de8fc91
to
3c92aed
Compare
Needs a backport to 19.0 label? |
Signed-off-by: Andres Taylor <[email protected]>
Not sure we want to backport this. @deepthi @harshit-gangal wdyt? |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #15043 +/- ##
==========================================
- Coverage 67.29% 67.27% -0.02%
==========================================
Files 1560 1560
Lines 192123 192109 -14
==========================================
- Hits 129283 129249 -34
- Misses 62840 62860 +20 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM!
fd38aab
to
e831863
Compare
Signed-off-by: Andres Taylor <[email protected]>
Signed-off-by: Andres Taylor <[email protected]>
e831863
to
96c3ed3
Compare
I feel like we should given this is a bug fix, but i have no strong opinions. EDIT: I think we should also backport to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good change. Using sqlparser.Expr
instead of sqlparser.SelectExpr
interface is awesome
some part of it is regression from v17 and some are new support. So, we should backport it till v18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with 1 comment about adding an unsupported
query test.
Signed-off-by: Andres Taylor <[email protected]>
Signed-off-by: Andres Taylor <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, this will cause regression for aggregation subquery without an alias.
Description
Vitess does auto-parameterization of queries. This means replacing literal values with arguments, so that a plan cache can be re-used between queries, if the only difference is the literal value.
Sometimes, this rewrite messes with queries, such as in the issue reported below.
In this PR, I wanted to stop the
Normalizer
from rewriting expressions if they are inside unalised select expressions.To do that, first I needed to clean up our AST a little bit.
Back in the days, we used FuncExpr for all expressions, including
count(*)
. Since then, we have replaced most functions with custom AST types (such as CountStar), and this means that we no longer need to parse*
as an argument to a function call.Using pure expressions simplifies a lot of code and makes it possible to know that when we are looking at an
*sqlparser.AliasedExpr
, we know we are in the SELECT clause and nowhere else.Related Issue(s)
Fixes #15020
Checklist
Deployment Notes