Implement Presto SQLGlot Optimizers #1300

Merged: 3 commits, Jul 31, 2023

kgopal492
Contributor

  • Create a PrestoOptimizingValidator that extends the PrestoExplainValidator to run validators that suggest rewrites to optimize the query

  • New suggestions to make queries run faster:

    • Use APPROX_DISTINCT(x) instead of COUNT(DISTINCT x)
    • Combine multiple LIKE clauses to use REGEXP_LIKE
    • Use UNION ALL instead of UNION
  • Suggestions are made by tokenizing the query with the sqlglot library and searching the tokens for matching patterns

  • Update the query/validation endpoint's return type, QueryValidationResult, to allow for a suggestion type (specifying the start and end coordinates of the text to change, and the suggestion text)

  • Update the logic from "Implement developer assistance query rewriting" (#1298) to return the start/end coordinates of the text to change and a suggestion string
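
The token-pattern search described in the list above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the PR's code: a small regex tokenizer stands in for sqlglot's tokenizer, and the names `Suggestion`, `tokenize`, and `suggest_union_all` are hypothetical. Coordinates are assumed to be 0-based with an inclusive end character.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Stand-in tokenizer (assumption): words or single non-space characters.
# The PR uses sqlglot's tokenizer, which also understands string literals.
TOKEN_RE = re.compile(r"\w+|\S")


@dataclass
class Suggestion:
    # Hypothetical shape: the range of text to replace, plus the new text.
    start_line: int
    start_ch: int
    end_line: int
    end_ch: int  # inclusive (assumption)
    suggestion: str


def tokenize(query: str) -> List[Tuple[str, int, int]]:
    """Return (text, line, ch) triples with 0-based positions."""
    tokens = []
    for line_no, line in enumerate(query.split("\n")):
        for match in TOKEN_RE.finditer(line):
            tokens.append((match.group(), line_no, match.start()))
    return tokens


def suggest_union_all(query: str) -> List[Suggestion]:
    """Flag every UNION keyword that is not followed by ALL."""
    tokens = tokenize(query)
    suggestions = []
    for i, (text, line, ch) in enumerate(tokens):
        if text.upper() != "UNION":
            continue
        next_text = tokens[i + 1][0].upper() if i + 1 < len(tokens) else ""
        if next_text != "ALL":
            suggestions.append(
                Suggestion(line, ch, line, ch + len(text) - 1, "UNION ALL")
            )
    return suggestions
```

Because the stand-in tokenizer does not recognize string literals, it would also flag the word `union` inside a quoted string; a real SQL tokenizer avoids that. Note too that UNION ALL keeps duplicate rows, so the suggestion only preserves results when duplicates are acceptable or impossible.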

kgopal492 requested review from czgu and jczhong84 on July 27, 2023 19:30
return Trino.Tokenizer().tokenize(query)


class BaseSQLGlotValidator(metaclass=ABCMeta):
Collaborator

Will this base class only be used for the Presto optimizing validators?

Contributor Author

Right now it's only used for Presto, but good point. I moved the BaseSQLGlotValidator to a separate file so it can be used for other custom implementations.
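
To illustrate the split discussed here, the sketch below shows one way an engine-agnostic base class could leave only tokenization to its subclasses. It is an assumption-laden illustration rather than the PR's implementation: `BaseTokenValidator`, `RegexPrestoValidator`, and `get_suggestions` are hypothetical names, and a regex stands in for `Trino.Tokenizer().tokenize(query)`.

```python
import re
from abc import ABCMeta, abstractmethod
from typing import List


class BaseTokenValidator(metaclass=ABCMeta):
    """Hypothetical engine-agnostic base: subclasses choose the tokenizer."""

    @abstractmethod
    def tokenize(self, query: str) -> List[str]:
        """Split the query into tokens with an engine-specific tokenizer."""

    def get_suggestions(self, query: str) -> List[str]:
        # Shared rule: look for the token sequence COUNT ( DISTINCT.
        tokens = [t.upper() for t in self.tokenize(query)]
        suggestions = []
        for i in range(len(tokens) - 2):
            if tokens[i:i + 3] == ["COUNT", "(", "DISTINCT"]:
                suggestions.append(
                    "Use APPROX_DISTINCT(x) instead of COUNT(DISTINCT x)"
                )
        return suggestions


class RegexPrestoValidator(BaseTokenValidator):
    """Hypothetical Presto subclass; only the tokenizer is engine-specific."""

    def tokenize(self, query: str) -> List[str]:
        # Stand-in (assumption) for Trino.Tokenizer().tokenize(query).
        return re.findall(r"\w+|\S", query)
```

With this layout, supporting another engine would mean adding a subclass that overrides `tokenize`, while the shared rule logic stays in the base class. Instantiating `BaseTokenValidator` directly raises `TypeError` because `tokenize` is abstract.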

Comment on lines +150 to +151
end_line: number | null;
end_ch: number | null;
Collaborator

Where are end_line and end_ch used?

Contributor Author

They'll be used in the upcoming UI PR (since we'll need the start and end coordinates of the text that needs to be changed).
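
As a rough illustration of why both ends of the range are needed, the sketch below applies a suggestion to the query text. It is not taken from the UI PR: `apply_suggestion` is a hypothetical helper, coordinates are assumed to be 0-based with an inclusive `end_ch`, and the case where `end_line` or `end_ch` is null is deliberately left out.

```python
def apply_suggestion(
    query: str,
    start_line: int,
    start_ch: int,
    end_line: int,
    end_ch: int,
    suggestion: str,
) -> str:
    """Replace the text from (start_line, start_ch) through
    (end_line, end_ch), inclusive, with the suggestion string."""
    lines = query.split("\n")
    # Text on the first affected line that precedes the replaced range.
    prefix = lines[start_line][:start_ch]
    # Text on the last affected line that follows the replaced range.
    suffix = lines[end_line][end_ch + 1:]
    merged = prefix + suggestion + suffix
    return "\n".join(lines[:start_line] + [merged] + lines[end_line + 1:])
```

For example, replacing the range (1, 0) to (1, 4) in a three-line query whose second line is `UNION` with the string `UNION ALL` rewrites only that keyword and leaves the surrounding lines untouched. Without the end coordinates, the caller would have to guess how much of the original text the suggestion is meant to replace.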

kgopal492 merged commit 9a984c9 into pinterest:dev-query-optimization on Jul 31, 2023
4 checks passed
kgopal492 added a commit that referenced this pull request Aug 14, 2023
* Implement Presto SQLGlot Optimizers (#1300)

Create a PrestoOptimizingValidator that extends the PrestoExplainValidator to run validators that provide suggestions to rewrite & optimize the query

* Implement UI for query optimization suggestions (#1302)

Create a new tooltip for users to accept query optimization suggestions

* Update querybook version, fix PrestoOptimizingValidator (#1304)

* Fix minor UI suggestions (#1305)
aidenprice pushed a commit to arrowtail-precision/querybook that referenced this pull request Jan 3, 2024
* Implement Presto SQLGlot Optimizers (pinterest#1300)

Create a PrestoOptimizingValidator that extends the PrestoExplainValidator to run validators that provide suggestions to rewrite & optimize the query

* Implement UI for query optimization suggestions (pinterest#1302)

Create a new tooltip for users to accept query optimization suggestions

* Update querybook version, fix PrestoOptimizingValidator (pinterest#1304)

* Fix minor UI suggestions (pinterest#1305)