Add create, drop and refresh covering index SQL support #32

Merged

@dai-chen dai-chen commented Sep 6, 2023

Description

  1. Added SQL support for the CREATE, DROP, and REFRESH statements for covering indexes.
  2. Updated the user manual: https://github.com/dai-chen/opensearch-spark/blob/add-covering-index-sql-support/docs/index.md#covering-index
  3. Refactored the Flint SQL AST builder into separate AST builders, one per kind of derived dataset, that are mixed in together.

TODO

The SHOW and DESC statements will be published in a separate PR to make review easier.

Example

Create a covering index with auto refresh enabled, then drop it when it is no longer needed:

spark-sql> CREATE INDEX orderkey_and_quantity 
... ON stream.lineitem_tiny (l_orderkey, l_quantity)
... WITH (auto_refresh = true);

spark-sql> DROP INDEX orderkey_and_quantity ON stream.lineitem_tiny;

The resulting OpenSearch index looks like this:

GET flint_stream_lineitem_tiny_orderkey_and_quantity_index/_mapping
{
  "flint_stream_lineitem_tiny_orderkey_and_quantity_index": {
    "mappings": {
      "_meta": {
        "name": "orderkey_and_quantity",
        "source": "stream.lineitem_tiny",
        "kind": "covering",
        "indexedColumns": [
          {
            "columnType": "bigint",
            "columnName": "l_orderkey"
          },
          {
            "columnType": "float",
            "columnName": "l_quantity"
          }
        ]
      },
      "properties": {
        "l_orderkey": {
          "type": "long"
        },
        "l_quantity": {
          "type": "float"
        }
      }
    }
  }
}
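
For readers who want to see how the mapping above relates to the CREATE INDEX statement, here is a minimal Python sketch. It is not Flint's implementation: the function names, the `flint_<table>_<index>_index` naming rule, and the two-entry type table are all assumptions inferred from this single example.

```python
# Hypothetical helpers, inferred from the one example in this PR description.
# The type table only covers the two column types that appear there.
SPARK_TO_OPENSEARCH = {"bigint": "long", "float": "float"}


def flint_index_name(table, index):
    """Build the OpenSearch index name seen in the example (assumed rule)."""
    return "flint_{}_{}_index".format(table.replace(".", "_"), index)


def covering_index_mapping(table, index, columns):
    """Build a mapping document shaped like the one shown above.

    columns is a list of (column_name, spark_type) pairs.
    """
    return {
        flint_index_name(table, index): {
            "mappings": {
                "_meta": {
                    "name": index,
                    "source": table,
                    "kind": "covering",
                    "indexedColumns": [
                        {"columnType": spark_type, "columnName": name}
                        for name, spark_type in columns
                    ],
                },
                "properties": {
                    name: {"type": SPARK_TO_OPENSEARCH[spark_type]}
                    for name, spark_type in columns
                },
            }
        }
    }


mapping = covering_index_mapping(
    "stream.lineitem_tiny",
    "orderkey_and_quantity",
    [("l_orderkey", "bigint"), ("l_quantity", "float")],
)
```

The `_meta` section keeps the Spark-side column types, while `properties` holds the OpenSearch field types, so only the indexed columns appear in either place.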

GET flint_stream_lineitem_tiny_orderkey_and_quantity_index/_search
{
     ...
    "max_score": 1,
    "hits": [
      {
        "_index": "flint_stream_lineitem_tiny_orderkey_and_quantity_index",
        "_id": "a_hXqooBWdGHpYUCwr6x",
        "_score": 1,
        "_source": {
          "l_orderkey": 22828645,
          "l_quantity": 26
        }
      },
      {
        "_index": "flint_stream_lineitem_tiny_orderkey_and_quantity_index",
        "_id": "bPhXqooBWdGHpYUCwr6x",
        "_score": 1,
        "_source": {
          "l_orderkey": 59498081,
          "l_quantity": 16
        }
      },
      ...
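
As a rough illustration of what the search output contains, the sketch below pulls the indexed column values out of a response. The function name `covered_rows` is hypothetical, and the outer `hits` envelope in the sample is assumed to follow the standard search response layout, since the output above elides it.

```python
def covered_rows(response):
    """Return the _source of every hit, i.e. the indexed column values."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# The two hits are copied from the example above; the surrounding
# "hits" envelope is an assumption because the original output elides it.
sample_response = {
    "hits": {
        "max_score": 1,
        "hits": [
            {
                "_index": "flint_stream_lineitem_tiny_orderkey_and_quantity_index",
                "_id": "a_hXqooBWdGHpYUCwr6x",
                "_score": 1,
                "_source": {"l_orderkey": 22828645, "l_quantity": 26},
            },
            {
                "_index": "flint_stream_lineitem_tiny_orderkey_and_quantity_index",
                "_id": "bPhXqooBWdGHpYUCwr6x",
                "_score": 1,
                "_source": {"l_orderkey": 59498081, "l_quantity": 16},
            },
        ],
    }
}

rows = covered_rows(sample_response)
```

Each row carries only `l_orderkey` and `l_quantity`, the two columns named in the CREATE INDEX statement.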

Issues Resolved

#23

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dai-chen dai-chen added the enhancement New feature or request label Sep 6, 2023
@dai-chen dai-chen self-assigned this Sep 6, 2023
@dai-chen dai-chen changed the title Add create covering index SQL support Add create, drop and refresh covering index SQL support Sep 6, 2023
@dai-chen dai-chen force-pushed the add-covering-index-sql-support branch from 88ca8dd to ad7f353 Compare September 18, 2023 20:51
Signed-off-by: Chen Dai <[email protected]>
@dai-chen dai-chen marked this pull request as ready for review September 18, 2023 22:14
@@ -10,10 +10,11 @@ A Flint index is ...

### Feature Highlights

-- Skipping Index
+- Skipping Index: accelerate data scan by maintaining compact aggregate data structure which includes
Member

It would be very helpful if you could add a diagram showing the content of the skipping index and how it relates to the original table ...

Collaborator Author

@dai-chen dai-chen Sep 20, 2023

Sure, I've created a test for the doc update. I will add a doctest with examples in #28. Thanks!

@dai-chen dai-chen merged commit bd68653 into opensearch-project:main Sep 20, 2023
4 checks passed