Update doc
Signed-off-by: Chen Dai <[email protected]>
dai-chen committed Sep 18, 2023
1 parent ad7f353 commit 8779c66
Showing 1 changed file with 45 additions and 2 deletions.
47 changes: 45 additions & 2 deletions docs/index.md
@@ -10,10 +10,11 @@ A Flint index is ...

### Feature Highlights

- Skipping Index
- Skipping Index: accelerate data scans by maintaining a compact aggregate data structure per file, which includes:
  - Partition: skip data scans by maintaining and filtering on the partition column values per file.
  - MinMax: skip data scans by maintaining the lower and upper bounds of the indexed column per file.
  - ValueSet: skip data scans by building a set of the unique values of the indexed column per file.
- Covering Index: create an index on selected columns of the source dataset to improve query performance.
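
The three skipping index kinds listed above can be sketched in plain Scala. This is only an illustration of the general idea, not Flint's implementation: `Row`, `FileSummary`, the column names, and the file names are all hypothetical, and the sketch assumes each file holds a single partition value.

```scala
object SkippingIndexSketch {
  // One source row, tagged with the file it is stored in (hypothetical schema).
  final case class Row(file: String, year: Int, status: Int, clientIp: String)

  // Compact per-file summary kept by the index (hypothetical layout).
  final case class FileSummary(
      file: String,
      year: Int,               // Partition: partition column value of the file
      minStatus: Int,          // MinMax: lower bound of the indexed column
      maxStatus: Int,          // MinMax: upper bound of the indexed column
      clientIps: Set[String])  // ValueSet: unique values of the indexed column

  // Index building: aggregate each file's rows into one summary.
  // Assumes all rows of a file share the same partition value.
  def build(rows: Seq[Row]): Seq[FileSummary] =
    rows.groupBy(_.file).toSeq.sortBy(_._1).map { case (file, rs) =>
      val statuses = rs.map(_.status)
      FileSummary(file, rs.head.year, statuses.min, statuses.max, rs.map(_.clientIp).toSet)
    }

  // Query rewrite: each filter returns only the files that may hold matching rows.
  def filesForYear(index: Seq[FileSummary], year: Int): Seq[String] =
    index.filter(_.year == year).map(_.file)

  def filesForStatus(index: Seq[FileSummary], status: Int): Seq[String] =
    index.filter(s => s.minStatus <= status && status <= s.maxStatus).map(_.file)

  def filesForClientIp(index: Seq[FileSummary], ip: String): Seq[String] =
    index.filter(_.clientIps.contains(ip)).map(_.file)

  // Made-up sample data: two files with two rows each.
  val rows: Seq[Row] = Seq(
    Row("part-0.parquet", 2022, 200, "10.0.0.1"),
    Row("part-0.parquet", 2022, 404, "10.0.0.2"),
    Row("part-1.parquet", 2023, 500, "10.0.0.3"),
    Row("part-1.parquet", 2023, 503, "10.0.0.1"))

  val index: Seq[FileSummary] = build(rows)
}
```

With this sample data, `filesForStatus(index, 302)` still returns `part-0.parquet` even though no row has status 302, because 302 falls inside that file's `[200, 404]` range. A MinMax summary can only rule files out; it cannot confirm a match, so the surviving files still have to be scanned.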

Please see the following example, in which the Index Building Logic and Query Rewrite Logic columns show the basic idea behind each skipping index implementation.

@@ -117,7 +118,7 @@ High level API is dependent on query engine implementation. Please see Query Eng

### SQL

DDL statement:
#### Skipping Index

```sql
CREATE SKIPPING INDEX
@@ -157,6 +158,38 @@ DESCRIBE SKIPPING INDEX ON alb_logs
DROP SKIPPING INDEX ON alb_logs
```

#### Covering Index

```sql
CREATE INDEX name ON <object>
( column [, ...] )
WHERE <filter_predicate>
WITH (auto_refresh = (true|false))

REFRESH INDEX name ON <object>

SHOW INDEX ON <object>

DESCRIBE INDEX name ON <object>

DROP INDEX name ON <object>
```

Example:

```sql
CREATE INDEX elb_and_requestUri
ON alb_logs ( elb, requestUri )

REFRESH INDEX elb_and_requestUri ON alb_logs

SHOW INDEX ON alb_logs

DESCRIBE INDEX elb_and_requestUri ON alb_logs

DROP INDEX elb_and_requestUri ON alb_logs
```
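
As a rough illustration of what a covering index such as `elb_and_requestUri` holds, the sketch below copies only the indexed columns of each source record and checks whether a query's columns are all covered. It shows the general covering-index idea under stated assumptions, not Flint's actual build or query rewrite logic: `Record`, `build`, `covers`, and the sample values are hypothetical.

```scala
object CoveringIndexSketch {
  // A record as column name -> value (hypothetical representation).
  type Record = Map[String, String]

  // Index building: keep only the indexed columns of each source record.
  def build(source: Seq[Record], indexedColumns: Set[String]): Seq[Record] =
    source.map(_.filter { case (column, _) => indexedColumns.contains(column) })

  // A query can be answered from the index alone only if every column
  // it references is part of the index.
  def covers(indexedColumns: Set[String], queryColumns: Set[String]): Boolean =
    queryColumns.subsetOf(indexedColumns)

  // Made-up sample data using the column names from the example above.
  val indexedColumns: Set[String] = Set("elb", "requestUri")

  val source: Seq[Record] = Seq(
    Map("elb" -> "lb-1", "requestUri" -> "/index.html", "clientIp" -> "10.0.0.1"),
    Map("elb" -> "lb-2", "requestUri" -> "/health", "clientIp" -> "10.0.0.2"))

  val index: Seq[Record] = build(source, indexedColumns)
}
```

Here a query that reads only `elb` is covered, while one that also reads `clientIp` is not and would have to fall back to the source table.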

## Index Store

### OpenSearch
@@ -264,6 +297,7 @@ Here is an example for Flint Spark integration:
```scala
val flint = new FlintSpark(spark)

// Skipping index
flint.skippingIndex()
.onTable("alb_logs")
.filterBy("time > 2023-04-01 00:00:00")
@@ -273,6 +307,15 @@ flint.skippingIndex()
.create()

flint.refresh("flint_alb_logs_skipping_index", FULL)

// Covering index
flint.coveringIndex()
.name("elb_and_requestUri")
.onTable("alb_logs")
.addIndexColumns("elb", "requestUri")
.create()

flint.refresh("flint_alb_logs_elb_and_requestUri_index")
```

#### Skipping Index Provider SPI