[POC] Perform a Flint-based PPL join query using OpenSearch indices #998

Open
YANG-DB opened this issue Dec 20, 2024 · 0 comments
Labels: enhancement, Lang:PPL, Roadmap:Ease of Use, untriaged

YANG-DB commented Dec 20, 2024

Is your feature request related to a problem?
Create a POC that runs a Spark Flint-based join query over tables mapped to OpenSearch indices.
It will demonstrate how Spark can join OpenSearch indices using the Spark engine, without relying on the legacy OpenSearch-hadoop plugin.

What solution would you like?
Flint currently offers the following capability for communicating with OpenSearch:

  • The OpenSearchCatalog class lets Spark interact with OpenSearch indices as tables. It supports both read and write operations, so data can be processed and queried across Spark and OpenSearch.

To configure and initialize the catalog in your Spark session, set the following configurations:

spark.conf.set("spark.sql.catalog.dev", "org.apache.spark.opensearch.catalog.OpenSearchCatalog")
spark.conf.set("spark.sql.catalog.dev.opensearch.port", "9200")
spark.conf.set("spark.sql.catalog.dev.opensearch.scheme", "http")
spark.conf.set("spark.sql.catalog.dev.opensearch.auth", "noauth")
val df = spark.sql("source=dev.default.customer | join ON c_custkey = o_custkey dev.default.orders | join ON c_nationkey = n_nationkey dev.default.nation | fields c_custkey, c_mktsegment, o_orderkey, o_orderstatus, o_totalprice, n_name | head 10")
...
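
For the POC, it may help to keep the catalog settings and the PPL join text in one place so the same query can be rebuilt for other catalogs or tables. The following is a minimal sketch under stated assumptions: `FlintJoinPoc`, `catalogConf`, and `joinQuery` are hypothetical helper names that are not part of Flint, and the default values simply mirror the snippet above. Only the configuration keys and the PPL syntax are taken from this issue.

```scala
// Sketch only: FlintJoinPoc, catalogConf and joinQuery are hypothetical helpers,
// not part of Flint. Configuration keys and PPL syntax are copied from this issue.
object FlintJoinPoc {

  // Settings that register an OpenSearch-backed catalog under `name`.
  def catalogConf(
      name: String,
      port: Int = 9200,
      scheme: String = "http",
      auth: String = "noauth"): Map[String, String] = {
    val prefix = s"spark.sql.catalog.$name"
    Map(
      prefix -> "org.apache.spark.opensearch.catalog.OpenSearchCatalog",
      s"$prefix.opensearch.port" -> port.toString,
      s"$prefix.opensearch.scheme" -> scheme,
      s"$prefix.opensearch.auth" -> auth
    )
  }

  // Builds `source=<table> | join ON <cond> <table> | ... | fields ... | head <n>`.
  def joinQuery(
      source: String,
      joins: Seq[(String, String)], // (join condition, right-hand table)
      fields: Seq[String],
      limit: Int): String = {
    val joinStages = joins.map { case (on, table) => s"join ON $on $table" }
    val stages = (s"source=$source" +: joinStages) ++
      Seq(s"fields ${fields.mkString(", ")}", s"head $limit")
    stages.mkString(" | ")
  }
}
```

In a Spark session with the Flint extensions loaded, the settings could be applied with `FlintJoinPoc.catalogConf("dev").foreach { case (k, v) => spark.conf.set(k, v) }`, and the string returned by `joinQuery` passed to `spark.sql`. Both helpers are pure string and map builders, so they can be unit tested without a cluster.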

Do you have any additional context?

YANG-DB added the enhancement, untriaged, Roadmap:Ease of Use, and Lang:PPL labels on Dec 20, 2024
YANG-DB self-assigned this on Dec 20, 2024
YANG-DB moved this to Design in PPL Commands on Dec 20, 2024