forked from opensearch-project/documentation-website
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add documentation for star tree index feature (opensearch-project#8598)
* Adding documentation for star tree index feature Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* fixes and addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* fixing json Signed-off-by: Bharathwaj G <[email protected]>
* fixing json Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* addressing comments Signed-off-by: Bharathwaj G <[email protected]>
* Add edits for star tree field page Signed-off-by: Naarcha-AWS <[email protected]>
* Add index edit Signed-off-by: Naarcha-AWS <[email protected]>
* Update improving-search-performance.md Signed-off-by: Naarcha-AWS <[email protected]>
* Update star-tree-index.md Signed-off-by: Naarcha-AWS <[email protected]>
* Update star-tree.md Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Update _field-types/supported-field-types/star-tree.md Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Update star-tree-index.md Signed-off-by: Naarcha-AWS <[email protected]>
---------
Signed-off-by: Bharathwaj G <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Eric Pugh <[email protected]>
Showing 5 changed files with 393 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,199 @@ | ||
--- | ||
layout: default | ||
title: Star-tree | ||
nav_order: 61 | ||
parent: Supported field types | ||
--- | ||
|
||
# Star-tree field type | ||
|
||
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, join the discussion on the [OpenSearch forum](https://forum.opensearch.org/). | ||
{: .warning} | ||
|
||
A [star-tree index](https://docs.pinot.apache.org/basics/indexing/star-tree-index) precomputes aggregations, accelerating the performance of aggregation queries. | ||
If a star-tree index is configured as part of an index mapping, the star-tree index is created and maintained as data is ingested in real time. | ||
|
||
OpenSearch will automatically use the star-tree index to optimize aggregations if the queried fields are part of star-tree index dimension fields and the aggregations are on star-tree index metric fields. No changes are required in the query syntax or the request parameters. | ||
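
To make the idea of precomputed aggregations concrete, the following is a minimal conceptual sketch in Python. It is not how OpenSearch implements the star-tree index; the documents, the `precompute` helper, and the `sum_where` helper are all hypothetical and only illustrate grouping by dimension fields and storing base metrics per group so that a filtered aggregation can be answered without rescanning documents.

```python
from collections import defaultdict

# Hypothetical log documents; the field names mirror the mapping example on this page.
docs = [
    {"status": 200, "port": 443, "request_size": 120},
    {"status": 200, "port": 443, "request_size": 80},
    {"status": 200, "port": 80, "request_size": 50},
    {"status": 500, "port": 443, "request_size": 10},
]


def precompute(docs, dimensions, metric):
    """Group documents by their dimension values and store base metrics per group."""
    table = defaultdict(lambda: {"sum": 0, "value_count": 0})
    for doc in docs:
        key = tuple(doc[d] for d in dimensions)
        table[key]["sum"] += doc[metric]
        table[key]["value_count"] += 1
    return dict(table)


def sum_where(table, dimensions, filters):
    """Answer a filtered sum from the precomputed groups only."""
    total = 0
    for key, stats in table.items():
        values = dict(zip(dimensions, key))
        if all(values[name] == wanted for name, wanted in filters.items()):
            total += stats["sum"]
    return total


dimensions = ["status", "port"]
table = precompute(docs, dimensions, "request_size")
print(sum_where(table, dimensions, {"status": 200}))  # prints 250
```

In this sketch, four documents collapse into three precomputed groups, and the filtered sum reads only those groups.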
|
||
For more information, see [Star-tree index]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/). | ||
|
||
## Prerequisites | ||
|
||
To use a star-tree index, follow the instructions in [Enabling a star-tree index]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index#enabling-a-star-tree-index). | ||
|
||
## Limitations | ||
|
||
The star-tree index feature has the following limitations: | ||
|
||
- A star-tree index should only be enabled on indexes whose data is not updated or deleted because standard updates and deletions are not accounted for in a star-tree index. | ||
- Currently, only one star-tree index can be created per index. Support for multiple star-trees will be added in a future version. | ||
|
||
## Examples | ||
|
||
The following examples show how to use a star-tree index. | ||
|
||
### Star-tree index mappings | ||
|
||
Define star-tree index mappings in the `composite` section in `mappings`. | ||
|
||
The following example API request creates a star-tree index named `request_aggs`. To compute metric aggregations for the `request_size` and `latency` fields with queries on the `port` and `status` fields, configure the following mappings: | ||
|
||
```json | ||
PUT logs | ||
{ | ||
  "settings": { | ||
    "index.number_of_shards": 1, | ||
    "index.number_of_replicas": 0, | ||
    "index.composite_index": true | ||
  }, | ||
  "mappings": { | ||
    "composite": { | ||
      "request_aggs": { | ||
        "type": "star_tree", | ||
        "config": { | ||
          "max_leaf_docs": 10000, | ||
          "skip_star_node_creation_for_dimensions": [ | ||
            "port" | ||
          ], | ||
          "ordered_dimensions": [ | ||
            { | ||
              "name": "status" | ||
            }, | ||
            { | ||
              "name": "port" | ||
            } | ||
          ], | ||
          "metrics": [ | ||
            { | ||
              "name": "request_size", | ||
              "stats": [ | ||
                "sum", | ||
                "value_count", | ||
                "min", | ||
                "max" | ||
              ] | ||
            }, | ||
            { | ||
              "name": "latency", | ||
              "stats": [ | ||
                "sum", | ||
                "value_count", | ||
                "min", | ||
                "max" | ||
              ] | ||
            } | ||
          ] | ||
        } | ||
      } | ||
    }, | ||
    "properties": { | ||
      "status": { | ||
        "type": "integer" | ||
      }, | ||
      "port": { | ||
        "type": "integer" | ||
      }, | ||
      "request_size": { | ||
        "type": "integer" | ||
      }, | ||
      "latency": { | ||
        "type": "scaled_float", | ||
        "scaling_factor": 10 | ||
      } | ||
    } | ||
  } | ||
} | ||
``` | ||
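
As an illustration of a request that fits the `request_aggs` mapping above, the following Python sketch builds a search body that filters on a dimension field (`status`) and aggregates a metric field (`request_size`), then checks that every referenced field is covered by the star-tree configuration. The aggregation name `total_request_size` and the field-checking logic are assumptions made for this example. Whether a particular query type is eligible for star-tree optimization depends on the supported queries and aggregations linked at the end of this page.

```python
import json

# Assumed example: a term filter on a dimension field plus a sum aggregation on a metric field.
search_body = {
    "size": 0,
    "query": {"term": {"status": 200}},
    "aggs": {"total_request_size": {"sum": {"field": "request_size"}}},
}

# Dimension and metric fields taken from the `request_aggs` mapping above.
dimension_fields = {"status", "port"}
metric_fields = {"request_size", "latency"}

# Fields referenced by the term query and by each aggregation.
queried_fields = set(search_body["query"]["term"])
aggregated_fields = {
    spec["field"] for agg in search_body["aggs"].values() for spec in agg.values()
}

# True when the query touches only dimensions and the aggregations touch only metrics.
covered = queried_fields <= dimension_fields and aggregated_fields <= metric_fields
print(covered)
print(json.dumps(search_body, indent=2))
```

The printed JSON can be sent as the body of a `GET logs/_search` request. No star-tree-specific syntax is needed in the query itself.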
|
||
|
||
|
||
## Star-tree mapping parameters | ||
|
||
Specify any star-tree configuration mapping options in the `config` section. Parameters cannot be modified without reindexing documents. | ||
|
||
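The star-tree `config` section supports the following properties. | ||
|
||
| Parameter | Required/Optional | Description | | ||
| :--- | :--- | :--- | | ||
| `ordered_dimensions` | Required | The dimension fields based on which metrics are aggregated. See [Ordered dimensions](#ordered-dimensions). | | ||
| `metrics` | Required | The metric fields on which aggregations are precomputed. See [Metrics](#metrics). | | ||
| `max_leaf_docs` | Optional | See [Star-tree configuration parameters](#star-tree-configuration-parameters). | | ||
| `skip_star_node_creation_for_dimensions` | Optional | See [Star-tree configuration parameters](#star-tree-configuration-parameters). | | ||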
|
||
### Ordered dimensions | ||
|
||
The `ordered_dimensions` parameter contains fields based on which metrics will be aggregated in a star-tree index. The star-tree index will be selected for querying only if all the fields in the query are part of the `ordered_dimensions`. | ||
|
||
When using the `ordered_dimensions` parameter, follow these best practices: | ||
|
||
- The order of dimensions matters. Define the dimensions in order from the highest cardinality to the lowest cardinality for efficient storage and query pruning. | ||
- Avoid using high-cardinality fields as dimensions. High-cardinality fields adversely affect storage space, indexing throughput, and query performance. | ||
- Currently, fields supported by the `ordered_dimensions` parameter are all [numeric field types](https://opensearch.org/docs/latest/field-types/supported-field-types/numeric/), with the exception of `unsigned_long`. For more information, see [GitHub issue #15231](https://github.com/opensearch-project/OpenSearch/issues/15231). | ||
- Support for other field types, such as `keyword` and `ip`, will be added in future versions. For more information, see [GitHub issue #16232](https://github.com/opensearch-project/OpenSearch/issues/16232). | ||
- A minimum of `2` and a maximum of `10` dimensions are supported per star-tree index. | ||
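
The ordering and count guidelines above can be sketched as a small Python helper. The `order_dimensions` function and the cardinality estimates are hypothetical; OpenSearch does not provide this helper, and you would need to estimate cardinalities from your own data.

```python
MIN_DIMENSIONS = 2   # minimum number of dimensions per star-tree index
MAX_DIMENSIONS = 10  # maximum number of dimensions per star-tree index


def order_dimensions(cardinalities):
    """Return ordered_dimensions entries sorted from highest to lowest cardinality."""
    if not MIN_DIMENSIONS <= len(cardinalities) <= MAX_DIMENSIONS:
        raise ValueError(
            f"expected {MIN_DIMENSIONS} to {MAX_DIMENSIONS} dimensions, "
            f"got {len(cardinalities)}"
        )
    ranked = sorted(cardinalities.items(), key=lambda item: item[1], reverse=True)
    return [{"name": name} for name, _ in ranked]


# Hypothetical cardinality estimates for the fields used on this page.
estimated = {"port": 10, "status": 50}
print(order_dimensions(estimated))
```

The returned list has the same shape as the `ordered_dimensions` array in the mapping example.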
|
||
The `ordered_dimensions` parameter supports the following property. | ||
|
||
| Parameter | Required/Optional | Description | | ||
| :--- | :--- | :--- | | ||
| `name` | Required | The name of the field. The field name should be present in the `properties` section as part of the index `mapping`. Ensure that the `doc_values` setting is `enabled` for any associated fields. | | ||
|
||
|
||
### Metrics | ||
|
||
Configure any metric fields on which you need to perform aggregations. The `metrics` parameter is required as part of a star-tree configuration. | ||
|
||
When using `metrics`, follow these best practices: | ||
|
||
- Currently, fields supported by `metrics` are all [numeric field types](https://opensearch.org/docs/latest/field-types/supported-field-types/numeric/), with the exception of `unsigned_long`. For more information, see [GitHub issue #15231](https://github.com/opensearch-project/OpenSearch/issues/15231). | ||
- Supported metric aggregations include `min`, `max`, `sum`, `avg`, and `value_count`. | ||
- `avg` is a derived metric based on `sum` and `value_count`. It is not indexed; instead, it is calculated when a query is run. The remaining base metrics are indexed. | ||
- A maximum of `100` base metrics are supported per star-tree index. | ||
|
||
If `min`, `max`, `sum`, and `value_count` are defined as `metrics` for each field, then up to 25 such fields can be configured (4 base metrics × 25 fields = 100 base metrics), as shown in the following example: | ||
|
||
```json | ||
{ | ||
  "metrics": [ | ||
    { | ||
      "name": "field1", | ||
      "stats": [ | ||
        "sum", | ||
        "value_count", | ||
        "min", | ||
        "max" | ||
      ] | ||
    }, | ||
    ..., | ||
    ..., | ||
    { | ||
      "name": "field25", | ||
      "stats": [ | ||
        "sum", | ||
        "value_count", | ||
        "min", | ||
        "max" | ||
      ] | ||
    } | ||
  ] | ||
} | ||
``` | ||
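
The arithmetic behind the 25-field limit can be checked with a short Python sketch. The `count_base_metrics` helper is hypothetical, and it assumes that only `sum`, `value_count`, `min`, and `max` count toward the 100 base metric limit, while the derived `avg` does not.

```python
MAX_BASE_METRICS = 100
BASE_STATS = {"sum", "value_count", "min", "max"}
DEFAULT_STATS = ["sum", "value_count"]  # documented default when `stats` is omitted


def count_base_metrics(metrics):
    """Count base metrics across all metric fields (assumes avg is not counted)."""
    return sum(
        len(BASE_STATS & set(metric.get("stats", DEFAULT_STATS))) for metric in metrics
    )


# 25 fields, each configured with all four base metrics.
metrics = [
    {"name": f"field{i}", "stats": ["sum", "value_count", "min", "max"]}
    for i in range(1, 26)
]
total = count_base_metrics(metrics)
print(total, total <= MAX_BASE_METRICS)  # prints: 100 True
```

Adding a twenty-sixth field with any base metric would push the total past the limit.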
|
||
|
||
#### Properties | ||
|
||
The `metrics` parameter supports the following properties. | ||
|
||
| Parameter | Required/Optional | Description | | ||
| :--- | :--- | :--- | | ||
| `name` | Required | The name of the field. The field name should be present in the `properties` section as part of the index `mapping`. Ensure that the `doc_values` setting is `enabled` for any associated fields. | | ||
| `stats` | Optional | A list of metric aggregations computed for each field. You can choose between `min`, `max`, `sum`, `avg`, and `value_count`.<br/>Default is `sum` and `value_count`.<br/>`avg` is a derived metric statistic that will automatically be supported in queries if `sum` and `value_count` are present as part of metric `stats`. | | ||
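
Because `avg` is derived from `sum` and `value_count`, it can be reconstructed from those two base metrics. The following Python sketch shows the arithmetic only; the `derived_avg` function and the sample values are hypothetical and do not represent an OpenSearch API.

```python
def derived_avg(stats):
    """Derive the average from precomputed sum and value_count base metrics."""
    if stats["value_count"] == 0:
        return None  # no values, so the average is undefined
    return stats["sum"] / stats["value_count"]


# Hypothetical precomputed base metrics for one group of documents.
print(derived_avg({"sum": 250, "value_count": 4}))  # prints 62.5
```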
|
||
### Star-tree configuration parameters | ||
|
||
The following parameters are optional and cannot be modified following index creation. | ||
|
||
| Parameter | Description | | ||
| :--- | :--- | | ||
| `max_leaf_docs` | The maximum number of star-tree documents that a leaf node can point to. After the maximum number of documents is reached, the nodes will be split based on the value of the next dimension. Default is `10000`. A lower value will use more storage but result in faster query performance. Inversely, a higher value will use less storage but result in slower query performance. For more information, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). | | ||
| `skip_star_node_creation_for_dimensions` | A list of dimensions for which a star-tree index will skip star node creation. Skipping star node creation for a dimension reduces storage size at the expense of query performance. By default, star nodes are created for all dimensions. For more information about star nodes, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). | | ||
|
||
## Supported queries and aggregations | ||
|
||
For more information about supported queries and aggregations, see [Supported queries and aggregations for a star-tree index]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#supported-queries-and-aggregations). | ||
|