[Feature] OpenSearch and Apache Spark Integration #3

Closed
penghuo opened this issue Jan 26, 2023 · 2 comments
Labels: Meta (Meta issue, not directly linked to a PR)

Comments

@penghuo
Collaborator

penghuo commented Jan 26, 2023

More reading at #4

Phase-0: Proof of Concepts

Goals

  • Verify the solution for an end-to-end production use case
  • Demo the solution

Tasks

Deliverables


Phase-1: Spark Connector and Flint API Support

Goals

  • OpenSearch Release: 2.9.0
    • Support creating a spark-submit datasource in OpenSearch.
    • Support building visualizations in OpenSearch Dashboards using the Spark datasource.
  • OpenSearch Spark Extension Release.
    • Support creating a skipping index for a Hive table.
// 1. Define the context
USING FLINT
PROPERTIES (
  OPENSEARCH = "https://my-opensearch.com"
)

// 2. Create the skipping index
CREATE SKIPPING INDEX index_name ON TABLE [db_name.]table_name
FOR COLUMNS (col1 INDEX_TYPE, ...)
[REFRESH_ON_CREATE]            <- automatically refresh the table after it is created
[AUTO_REFRESH TRUE]            <- automatically refresh the index when new data is appended
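
As a rough illustration of the proposed grammar, the sketch below renders a CREATE SKIPPING INDEX statement from its parts. The helper name `build_skipping_index_ddl` and its parameters are hypothetical and not part of any OpenSearch or Spark API; `INDEX_TYPE` is the placeholder used in the syntax above, since this issue does not list concrete index types. It only produces text that follows the draft syntax and does not submit anything to Spark.

```python
def build_skipping_index_ddl(index_name, table, columns, db=None,
                             refresh_on_create=False, auto_refresh=False):
    """Render a statement following the draft grammar in this issue.

    Hypothetical helper for illustration only: `columns` is a list of
    (column_name, index_type) pairs, and the two flags map to the optional
    REFRESH_ON_CREATE and AUTO_REFRESH TRUE clauses.
    """
    if not columns:
        raise ValueError("at least one (column, index_type) pair is required")

    # The database name is optional in the proposed syntax.
    qualified_table = f"{db}.{table}" if db else table
    column_list = ", ".join(f"{name} {index_type}" for name, index_type in columns)

    lines = [
        f"CREATE SKIPPING INDEX {index_name} ON TABLE {qualified_table}",
        f"FOR COLUMNS ({column_list})",
    ]
    if refresh_on_create:
        lines.append("REFRESH_ON_CREATE")
    if auto_refresh:
        lines.append("AUTO_REFRESH TRUE")
    return "\n".join(lines)


if __name__ == "__main__":
    # INDEX_TYPE is the placeholder from the proposal, not a real index type.
    print(build_skipping_index_ddl(
        "index_name", "table_name", [("col1", "INDEX_TYPE")],
        db="db_name", auto_refresh=True,
    ))
```

Keeping the optional clauses as boolean flags mirrors the bracketed parts of the draft syntax, so a caller can see at a glance which clauses a given statement will carry.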

Tasks

Deliverables


Phase-2: Covering Index and Materialized View Support

Goals

Non-Goals

  • Covering indexes and materialized views (MVs) are not accessible in Spark, so query rewrite is not supported

Tasks

penghuo added the enhancement label (New feature or request) on Jan 26, 2023
penghuo removed the enhancement label (New feature or request) on Jan 26, 2023
ps48 self-assigned this on Mar 7, 2023
dai-chen changed the title from "Integrate with Spark" to "[Feature] OpenSearch and Apache Spark Integration" on Apr 3, 2023
dai-chen transferred this issue from opensearch-project/sql on Jul 11, 2023
dai-chen added the Meta label (Meta issue, not directly linked to a PR) on Aug 30, 2023
dai-chen pinned this issue on Aug 31, 2023
@dai-chen
Collaborator

dai-chen commented Oct 31, 2023

Phase-3: Flint Index Enhancement and Optimization

Goals

Non-Goals

  • N/A

Tasks

@dai-chen
Collaborator

dai-chen commented Jun 3, 2024

Follow-up tracking issue: #365

dai-chen closed this as completed on Jun 3, 2024
dai-chen unpinned this issue on Jun 3, 2024

5 participants