Ppl spark join command #35

Closed
wants to merge 43 commits into from
62c12cc
adding support for containerized flint with spark / Livy docker-compo…
YANG-DB Aug 23, 2023
9e6ecfc
adding support for containerized flint with spark / Livy docker-compo…
YANG-DB Aug 23, 2023
0808ea5
adding support for containerized flint with spark / Livy docker-compo…
YANG-DB Sep 1, 2023
1b2ae06
Merge branch 'main' into containerize_flint
YANG-DB Sep 1, 2023
91defa0
adding support for containerized flint with spark / Livy docker-compo…
YANG-DB Sep 1, 2023
0febc09
update ppl ast builder
YANG-DB Sep 1, 2023
18cd83f
add ppl ast components
YANG-DB Sep 1, 2023
605f1bf
populate ppl test suite for covering different types of PPL queries
YANG-DB Sep 1, 2023
d54530d
update additional tests
YANG-DB Sep 1, 2023
72dc5f7
separate ppl-spark code into a dedicated module
YANG-DB Sep 6, 2023
d953b19
add ppl translation of simple filter and data-type literal expression
YANG-DB Sep 6, 2023
9fce31e
remove unused ppl ast builder
YANG-DB Sep 6, 2023
a299bdf
add log-plan test results validation
YANG-DB Sep 6, 2023
019f690
add support for multiple table selection using union
YANG-DB Sep 6, 2023
0c7ccec
add support for multiple table selection using union
YANG-DB Sep 7, 2023
14fa7e5
update sbt with new IT test suite for PPL module
YANG-DB Sep 7, 2023
d55b774
update ppl IT suite test
YANG-DB Sep 7, 2023
8bbe0d9
update ppl IT suite dependencies
YANG-DB Sep 7, 2023
af065f7
add tests for ppl IT with
YANG-DB Sep 7, 2023
5819dc7
update literal transformations according to catalyst's convention
YANG-DB Sep 7, 2023
7db7213
separate unit-tests into a dedicated file per each test category
YANG-DB Sep 7, 2023
32573ab
add IT tests for additional filters
YANG-DB Sep 7, 2023
eec0e4a
mark unsatisfied tests as ignored until supporting code is ready
YANG-DB Sep 7, 2023
3f9d9d1
add README.md design and implementation details
YANG-DB Sep 8, 2023
3dbb5bb
Merge branch 'main' into ppl-spark-translation
YANG-DB Sep 8, 2023
67fd56a
remove docker related files
YANG-DB Sep 8, 2023
89dd114
add text related unwrapping bug - fix
YANG-DB Sep 10, 2023
65f4372
add AggregatorTranslator support
YANG-DB Sep 11, 2023
00e2a76
resolve group by issues
YANG-DB Sep 11, 2023
17e93fb
add generic ppl extension chain which registers a chain of parsers
YANG-DB Sep 11, 2023
69df8ad
update some tests
YANG-DB Sep 11, 2023
4a4d73a
add filter test with stats
YANG-DB Sep 12, 2023
ca5ec65
add support for AND / OR
YANG-DB Sep 12, 2023
ce70d19
add additional unit tests support for AND / OR
YANG-DB Sep 12, 2023
d5f33b0
Merge branch 'main' into ppl-spark-translation
YANG-DB Sep 12, 2023
fe11134
add Max,Min,Count,Sum aggregation functions support
YANG-DB Sep 12, 2023
7e5e0d1
add basic span support for aggregate based queries
YANG-DB Sep 13, 2023
dbfd82a
update supported PPL and roadmap for future support ppl commands...
YANG-DB Sep 13, 2023
eaa4e33
update readme doc
YANG-DB Sep 13, 2023
157bbb7
add `head` support
YANG-DB Sep 13, 2023
20385c1
add support for sort command
YANG-DB Sep 14, 2023
7c0fd36
update supported command in readme
YANG-DB Sep 14, 2023
aaa4831
add initial join command for ppl grammar
YANG-DB Sep 18, 2023
46 changes: 42 additions & 4 deletions build.sbt
@@ -43,7 +43,7 @@ lazy val commonSettings = Seq(
   Test / test := ((Test / test) dependsOn testScalastyle).value)

 lazy val root = (project in file("."))
-  .aggregate(flintCore, flintSparkIntegration, sparkSqlApplication)
+  .aggregate(flintCore, flintSparkIntegration, pplSparkIntegration, sparkSqlApplication)
   .disablePlugins(AssemblyPlugin)
   .settings(name := "flint", publish / skip := true)

@@ -61,8 +61,46 @@ lazy val flintCore = (project in file("flint-core"))
         exclude ("com.fasterxml.jackson.core", "jackson-databind")),
     publish / skip := true)

+lazy val pplSparkIntegration = (project in file("ppl-spark-integration"))
+  .enablePlugins(AssemblyPlugin, Antlr4Plugin)
+  .settings(
+    commonSettings,
+    name := "ppl-spark-integration",
+    scalaVersion := scala212,
+    libraryDependencies ++= Seq(
+      "com.amazonaws" % "aws-java-sdk" % "1.12.397" % "provided"
+        exclude ("com.fasterxml.jackson.core", "jackson-databind"),
+      "org.scalactic" %% "scalactic" % "3.2.15" % "test",
+      "org.scalatest" %% "scalatest" % "3.2.15" % "test",
+      "org.scalatest" %% "scalatest-flatspec" % "3.2.15" % "test",
+      "org.scalatestplus" %% "mockito-4-6" % "3.2.15.0" % "test",
+      "com.stephenn" %% "scalatest-json-jsonassert" % "0.2.5" % "test",
+      "com.github.sbt" % "junit-interface" % "0.13.3" % "test"),
+    libraryDependencies ++= deps(sparkVersion),
+    // ANTLR settings
+    Antlr4 / antlr4Version := "4.8",
+    Antlr4 / antlr4PackageName := Some("org.opensearch.flint.spark.ppl"),
+    Antlr4 / antlr4GenListener := true,
+    Antlr4 / antlr4GenVisitor := true,
+    // Assembly settings
+    assemblyPackageScala / assembleArtifact := false,
+    assembly / assemblyOption ~= {
+      _.withIncludeScala(false)
+    },
+    assembly / assemblyMergeStrategy := {
+      case PathList(ps @ _*) if ps.last endsWith ("module-info.class") =>
+        MergeStrategy.discard
+      case PathList("module-info.class") => MergeStrategy.discard
+      case PathList("META-INF", "versions", xs @ _, "module-info.class") =>
+        MergeStrategy.discard
+      case x =>
+        val oldStrategy = (assembly / assemblyMergeStrategy).value
+        oldStrategy(x)
+    },
+    assembly / test := (Test / test).value)
+
 lazy val flintSparkIntegration = (project in file("flint-spark-integration"))
-  .dependsOn(flintCore)
+  .dependsOn(flintCore, pplSparkIntegration)
   .enablePlugins(AssemblyPlugin, Antlr4Plugin)
   .settings(
     commonSettings,
@@ -102,7 +140,7 @@ lazy val flintSparkIntegration = (project in file("flint-spark-integration"))

 // Test assembly package with integration test.
 lazy val integtest = (project in file("integ-test"))
-  .dependsOn(flintSparkIntegration % "test->test")
+  .dependsOn(flintSparkIntegration % "test->test", pplSparkIntegration % "test->test" )
   .settings(
     commonSettings,
     name := "integ-test",
@@ -115,7 +153,7 @@ lazy val integtest = (project in file("integ-test"))
       "com.stephenn" %% "scalatest-json-jsonassert" % "0.2.5" % "test",
       "org.testcontainers" % "testcontainers" % "1.18.0" % "test"),
     libraryDependencies ++= deps(sparkVersion),
-    Test / fullClasspath += (flintSparkIntegration / assembly).value)
+    Test / fullClasspath ++= Seq((flintSparkIntegration / assembly).value, (pplSparkIntegration / assembly).value))

 lazy val standaloneCosmetic = project
   .settings(
@@ -5,14 +5,13 @@

 package org.apache.spark

-import org.opensearch.flint.spark.FlintSparkExtensions
-
 import org.apache.spark.sql.catalyst.expressions.CodegenObjectFactoryMode
 import org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation
 import org.apache.spark.sql.flint.config.FlintConfigEntry
 import org.apache.spark.sql.flint.config.FlintSparkConf.HYBRID_SCAN_ENABLED
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SharedSparkSession
+import org.opensearch.flint.spark.FlintSparkExtensions

 trait FlintSuite extends SharedSparkSession {
   override protected def sparkConf = {