-
-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query properties rule filtering #2613
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general idea is sound and good, but this implementation conflates two separate concerns, which is undesirable:
- The analyzer wants to skip / alter certain steps based on previously determined facts about the plan
- The row spooler wants to shave some time by avoiding the overhead of a loop when it's not necessary
The first concern should be encapsulated in a data structure local to a given analyzer invocation. Leaking it to the query context as a whole is not ideal. It considerably tightens the analyzer / context contract in a way that will be easy to introduce bugs into.
The second concern seems more appropriate for this implementation, but it's still not great. A better solution imo would be something like a QueryProps top-level node that the analyzer uses to communicate things of this nature to the rest of the engine. A new return type from analysis that contains [Node, Props]
would also work well.
But I definitely want to push back on cramming more things into the context / session, it's already quite crowded in there. And more importantly, the lifecycle of that object is much longer / broader than the narrow concerns you're trying to address here.
server/handler.go
Outdated
@@ -499,6 +501,33 @@ func resultForEmptyIter(ctx *sql.Context, iter sql.RowIter, resultFields []*quer | |||
return &sqltypes.Result{Fields: resultFields}, nil | |||
} | |||
|
|||
// resultForMax1RowIter ensures that an empty iterator returns at most one row | |||
func resultForMax1RowIter(ctx *sql.Context, schema sql.Schema, iter sql.RowIter, resultFields []*querypb.Field) (*sqltypes.Result, error) { | |||
defer trace2.StartRegion(ctx, "Handler.resultForMax1RowIter").End() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use a better import alias for this
sql/analyzer/costed_index_scan.go
Outdated
// Strict index lookup without a join or subquery scope will return | ||
// at most one row. We could also use some sort of scope counting | ||
// to check for single scope. | ||
ctx.QProps.Set(sql.QPropMax1Row) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of concerned about the lifecycle of these properties introducing correctness issues over time.
Do you have a good sense about e.g.
- contexts are always created fresh for every new query (even in multiquery mode)
- subcontexts don't inherit them?
sql/analyzer/indexed_joins.go
Outdated
@@ -155,6 +155,8 @@ func replanJoin(ctx *sql.Context, n *plan.JoinNode, a *Analyzer, scope *plan.Sco | |||
j := memo.NewJoinOrderBuilder(m) | |||
j.ReorderJoin(n) | |||
|
|||
ctx.QProps.Set(sql.QPropInnerJoin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just means there's one or more inner joins somewhere in the query?
sql/query_props.go
Outdated
|
||
package sql | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should document these with comments
…o-mysql-server into max/query-props-rule-filtering
@zachmu I refactored this to pass flags through rule handlers. Equivalent behavior, but the lifecycle is separate from the context now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one note on this implementation
qp.Flags.Add(flag) | ||
} | ||
|
||
func (qp *QueryFlags) IsSet(flag int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Everything is set if it's nil" is kind of insane bro
If you want this kind of default behavior, put it in a function up one abstraction layer (in the analyzer package), let this be a dumb data object.
Edit most of the analyzer interfaces to pass a new context object that accumulates query specific properties. Currently the object is called
QueryFlags
, and accumulates information about the query to inform better rule filtering and more efficient spooling strategies.The change that has the biggest effect on
oltp_point_select
perf is thesql.QFlagMax1Row
setting, which lets us skip the default results iter boilerplate when we're only returning one row. Added a couple other skips for rules that are easy to whitelist correctly and show prominently on CPU profiles, like aggregations and subqueries.