Translate PPL-builtin functions to Spark-builtin functions #448
Conversation
Signed-off-by: Lantao Jin <[email protected]>
@@ -263,17 +263,20 @@ logicalExpression

 comparisonExpression
     : left = valueExpression comparisonOperator right = valueExpression # compareExpr
+    | valueExpression IN valueList # inExpr
Added `inExpr` back since it is required by `position` functions.
Signed-off-by: Lantao Jin <[email protected]>
please add additional documentation (markdown reference page) for each new command
    assertEquals(expectedPlan, logPlan)
  }

  test("test arithmetic: + - * / %") {
please add a few more complex tests for such operators
done
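For context, a more complex test for these operators might look roughly like the sketch below. It only builds the Catalyst plan that the translator would be expected to produce for a nested-arithmetic filter; the table name `t`, the column names, and the PPL query in the comment are illustrative, and the project's actual parser/transformer test helpers are omitted.

```scala
import org.apache.spark.sql.catalyst.analysis.{UnresolvedAttribute, UnresolvedRelation}
import org.apache.spark.sql.catalyst.expressions.{Add, GreaterThan, Literal, Multiply}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

object ArithmeticTranslationSketch {
  // Expected Catalyst plan for the illustrative PPL query:
  //   source = t | where a + b * 2 > 10 | fields a, b
  // Operator precedence must survive translation: b * 2 binds tighter than the addition.
  val expectedPlan: LogicalPlan = {
    val table = UnresolvedRelation(Seq("t"))
    val condition = GreaterThan(
      Add(UnresolvedAttribute("a"), Multiply(UnresolvedAttribute("b"), Literal(2))),
      Literal(10))
    Project(Seq(UnresolvedAttribute("a"), UnresolvedAttribute("b")), Filter(condition, table))
  }
}
```

A real test in the suite would parse the query with the PPL parser, run it through the plan transformer, and assert equality against a plan like this.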
import org.apache.spark.sql.catalyst.expressions.{EqualTo, Literal}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, Project}

class PPLLogicalPlanMathFunctionsTranslatorTestSuite
please add additional IT tests that use these commands in a more complex context of full-fledged query use cases
done
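As an illustration of what such a use case could exercise, the sketch below builds an expected plan for a query that nests two math functions. It assumes the translator emits the functions as unresolved Catalyst calls for Spark's analyzer to bind; the query, table, and column names are made up for the example.

```scala
import org.apache.spark.sql.catalyst.analysis.{UnresolvedAttribute, UnresolvedFunction, UnresolvedRelation, UnresolvedStar}
import org.apache.spark.sql.catalyst.expressions.{EqualTo, Literal}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

object MathFunctionTranslationSketch {
  // Expected plan for the illustrative PPL query:
  //   source = t | where sqrt(pow(a, 2)) = 4
  // pow and sqrt are assumed to map onto Spark builtins of the same name,
  // emitted as UnresolvedFunction so the Spark analyzer resolves them.
  val expectedPlan: LogicalPlan = {
    val table = UnresolvedRelation(Seq("t"))
    val pow = UnresolvedFunction("pow", Seq(UnresolvedAttribute("a"), Literal(2)), isDistinct = false)
    val sqrt = UnresolvedFunction("sqrt", Seq(pow), isDistinct = false)
    Project(Seq(UnresolvedStar(None)), Filter(EqualTo(sqrt, Literal(4)), table))
  }
}
```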
import org.apache.spark.sql.catalyst.expressions.{EqualTo, Like, Literal, NamedExpression}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, Project}

class PPLLogicalPlanStringFunctionsTranslatorTestSuite
please add additional IT tests that use these commands in a more complex context of full-fledged query use cases
done
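Similarly, a full-query use case for a string function could be sketched as follows. Based on the `Like` import in this suite, it assumes PPL's `like()` is translated into Spark's `Like` expression with the default `\` escape character; the table, column, and pattern are invented for the example.

```scala
import org.apache.spark.sql.catalyst.analysis.{UnresolvedAttribute, UnresolvedRelation}
import org.apache.spark.sql.catalyst.expressions.{Like, Literal}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

object StringFunctionTranslationSketch {
  // Expected plan for the illustrative PPL query:
  //   source = t | where like(name, 'A%') | fields name
  // like() is assumed to become Spark's Like expression with the default escape character.
  val expectedPlan: LogicalPlan = {
    val table = UnresolvedRelation(Seq("t"))
    val condition = Like(UnresolvedAttribute("name"), Literal("A%"), '\\')
    Project(Seq(UnresolvedAttribute("name")), Filter(condition, table))
  }
}
```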
 import org.apache.spark.sql.catalyst.expressions.{Alias, ExprId}
 import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Project}

 /**
  * general utility functions for ppl to spark transformation test
  */
-trait LogicalPlanTestUtils {
+trait LogicalPlanTestUtils extends AnalysisTest {
What / where will this trait be used?
Oops, I tried to use `checkAnalysis` in the unit tests I added, but it failed with a table-not-found error. It is never used; I will remove it.
Which file or where?
Signed-off-by: Lantao Jin <[email protected]>
Part of the expressions removed by #478 are added back in order to support builtin function translation.
High-level question:
What is the relationship between OpenSearch PPL and Spark PPL? How do we ensure the syntax remains consistent with OpenSearch PPL? Should the function specifications be aligned?
There is no "Spark PPL" at all as far as I'm concerned; the source of truth for PPL should be "OpenSearch PPL". I don't know why opensearch-spark copied the PPL syntax out of the SQL repository. From a long-term perspective, both should be generated from a unified grammar.
This PR only translates PPL-builtin functions to Spark-builtin functions, so it is required by the opensearch-spark repository only. It won't introduce any syntax split between the SQL and opensearch-spark repositories as long as the builtin functions in the two repositories are the same. If we want to add new builtin functions for any reason, in the short term we should submit PRs to this repository as well as to the SQL one.
+1, we should do it ASAP; otherwise the grammar will diverge.
great work - thanks !!
FYI, I think this repo is for that purpose: https://github.com/opensearch-project/piped-processing-language. For the long-term solution, something like https://github.com/partiql/partiql-scribe is probably an option to consider.
The backport to `0.5` failed. To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-spark/backport-0.5 0.5
# Navigate to the new working tree
pushd ../.worktrees/opensearch-spark/backport-0.5
# Create a new branch
git switch --create backport/backport-448-to-0.5
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 1f81ddf88ac96cd0130c9bd8f32abc4454e90699
# Push it to GitHub
git push --set-upstream origin backport/backport-448-to-0.5
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-spark/backport-0.5
Then, create a pull request where the `base` branch is `0.5` and the `compare`/`head` branch is `backport/backport-448-to-0.5`.
Description
Add a `BuiltinFunctionTranslator` to translate PPL builtin functions to Spark builtin functions in the parser. This PR only allows PPL-builtin functions to be translated. In other words, using a non-PPL-builtin function in a PPL query will throw an exception, even if it is a native Spark-builtin function. To support more functions in PPL, we should extend the set of PPL builtin functions later. A sketch of the name-mapping idea follows the function list below.
For example, the following PPL builtin functions will be translated to Spark builtin functions:
String functions:
Math Functions:
Time Functions:
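Conceptually, the translation is a lookup from a PPL builtin function name to a Spark builtin function name, rejecting anything outside the PPL builtin set. The sketch below only illustrates that idea; the object name, the `translate` helper, and the tiny sample mapping are hypothetical and are not the actual `BuiltinFunctionTranslator` added in this PR.

```scala
object BuiltinFunctionTranslatorSketch {
  // Hypothetical, abbreviated PPL-builtin -> Spark-builtin name mapping.
  // The real translator in this PR covers far more functions.
  private val pplToSpark: Map[String, String] = Map(
    "lower" -> "lower",
    "concat" -> "concat",
    "sqrt" -> "sqrt",
    "abs" -> "abs")

  /** Returns the Spark builtin name, or throws for non-PPL-builtin functions. */
  def translate(pplName: String): String =
    pplToSpark.getOrElse(
      pplName.toLowerCase,
      throw new IllegalArgumentException(s"Unsupported PPL builtin function: $pplName"))
}
```

In the parser, the looked-up name would then typically be wrapped in Catalyst's `UnresolvedFunction` so that Spark's analyzer resolves it to the concrete builtin implementation.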
Issues Resolved
Resolves #419, #420, #422, #423, #424, #425, #426, #427, #428, #491, #492, #480, #481, #469, #466, #459, #457, #456, #455
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.