Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional star tree feedback. #8650

Merged
merged 15 commits into from
Nov 5, 2024
Merged

Conversation

Naarcha-AWS
Copy link
Collaborator

Adds additional feedback post merging #8131.

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

github-actions bot commented Nov 4, 2024

Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@@ -109,13 +109,8 @@ PUT logs

## Star-tree mapping parameters

Specify any star-tree configuration mapping options in the `config` section. Parameters cannot be modified without reindexing documents.
Specify any star-tree configuration mapping options in the `mappings` section. Parameters cannot be modified without reindexing documents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we differentiate from mappings since its present a bit deeper in config section.

under corresponding STIX config section under mappings

or something similar ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, let me rework the order of this section.


| Parameter | Description |
| :--- | :--- |
| `max_leaf_docs` | The maximum number of star-tree documents that a leaf node can point to. After the maximum number of documents is reached, the nodes will be split based on the value of the next dimension. Default is `10000`. A lower value will use more storage but result in faster query performance. Inversely, a higher value will use less storage but result in slower query performance. For more information, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). |
| `skip_star_node_creation_for_dimensions` | A list of dimensions for which a star-tree index will skip star node creation. When `true`, this reduces storage size at the expense of query performance. Default is `false`. For more information about star nodes, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). |
| `max_leaf_docs` | The maximum number of STIX documents that a leaf node can point to. After the maximum number of documents is reached, children nodes of the leaf node will be created based on the unique `doc_value`. Default is `10000`. A lower value will use more storage but result in faster query performance. Inversely, a higher value will use less storage but result in slower query performance. For more information, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum number of STIX documents that a leaf node can point to. Once the node crosses this threshold, children nodes will be created based on the unique values of the next field in `ordered_dimension` (if any).

@@ -39,9 +39,9 @@ The following image illustrates a standard star-tree index structure.

<img src="{{site.url}}{{site.baseurl}}/images/star-tree-index.png" alt="A star-tree index containing two dimensions and two metrics" width="700">

Sorted and aggregated star-tree documents are backed by `doc_values` in an index. `doc_values` use the following pattern:
Sorted and aggregated STIX documents are backed by `doc_values` in an index. `doc_values` use the following pattern:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorted and aggregated STIX documents are backed by doc_values in an index. The columnar data part of doc_values are stored with following properties :

- When the port equals `8443` and the status equals `200`. Because the status equals `200`, the query does not traverse through a star node, and the aggregated metric is stored at the end of a star-tree document.
- When the status equals `200`. The query traverses through a star node in the `port` dimension because `port` is not present as part of the query.
- When the port equals `5600`. The query traverses through a star node in the `status` dimension because `status` is not present as part of the query.
- **Blue**: In a `terms` query searching for the computed average request size aggregation, the port equals `8443` and the status equals `200`. Because the query contains values in both the `status` and `port` dimensions, the query returns the aggregation from a non-star node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Blue: Example of a terms query searching for the average request size aggregation, with query of port equals 8443 and the status equals 200. The query traverses through 200 node in status dimension and 8443 node in port dimensions, and returns the aggregation result.

- When the status equals `200`. The query traverses through a star node in the `port` dimension because `port` is not present as part of the query.
- When the port equals `5600`. The query traverses through a star node in the `status` dimension because `status` is not present as part of the query.
- **Blue**: In a `terms` query searching for the computed average request size aggregation, the port equals `8443` and the status equals `200`. Because the query contains values in both the `status` and `port` dimensions, the query returns the aggregation from a non-star node.
- **Green**: In a `term` query searching for the computed count of requests aggregation, the status equals `200`. Because the query only contains a value from the `status` dimension, the query returns the aggregation from a star node in the `port` dimension.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing

- When the status equals `200`. The query traverses through the child star node of `200` node as it contains the aggregated value of all other children port nodes.

- When the port equals `5600`. The query traverses through a star node in the `status` dimension because `status` is not present as part of the query.
- **Blue**: In a `terms` query searching for the computed average request size aggregation, the port equals `8443` and the status equals `200`. Because the query contains values in both the `status` and `port` dimensions, the query returns the aggregation from a non-star node.
- **Green**: In a `term` query searching for the computed count of requests aggregation, the status equals `200`. Because the query only contains a value from the `status` dimension, the query returns the aggregation from a star node in the `port` dimension.
- **Red**: In a `term` query searching for the computed average request size aggregation, the port equals `5600`. Because the query does not contain a value from the `status` dimension, the query traverses through a star node in the `status` dimension and returns the aggregation from a star node in the `port` dimension.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""
When the port equals 5600. The query traverses through a star node in the status dimension because status is not present as part of the query and returns aggregated result from the 5600 child node
"""

@@ -73,7 +73,7 @@ To use a star-tree index, modify the following settings:

## Example mapping

In the following example, index mappings define the star-tree configuration. This star-tree index precomputes aggregations in the `log` index. The aggregations are calculated using the `size` and `latency` fields for all the combinations of values indexed in the `port` and `status` fields:
In the following example, index mappings define the star-tree configuration. This star-tree index precomputes aggregations in the `logs` index. The aggregations are calculated using the `size` and `latency` fields for all the combinations of values indexed in the `port` and `status` fields:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:
In the following example, index mappings define the star-tree configuration. This star-tree index precomputes aggregations in the logs index. The aggregations are calculated for the size and latency fields for all the combinations of values indexed in the port and status fields:

Signed-off-by: Archer <[email protected]>
@Naarcha-AWS
Copy link
Collaborator Author

@bharath-techie: Ready for a second review.

Signed-off-by: Archer <[email protected]>
## Star-tree mapping parameters
| Parameter | Description |
| :--- | :--- |
| `mappings` | A list of [mapping](#mapping-parameters) parameters. Required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inaccurate, the config is part of mapping. And config contains 4 parameters , 2 simple ones max leaf docs and skip star node. And 2 complex ones, ordered dimensions and metrics.

If needed we can give a brief about ordered dimensions and metrics in the table here. And then we can continue with separate sections for ordered dimensions and metrics as already present

Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS added 5 - Editorial review PR: Editorial review in progress and removed 3 - Tech review PR: Tech review in progress labels Nov 5, 2024
@@ -23,10 +23,10 @@ To use a star-tree index, follow the instructions in [Enabling a star-tree index

## Limitations

The star-tree index feature has the following limitations:
The STIX feature has the following limitations:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can either have all the limitations here or we can link up the limitations section in the other page.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move all the limitations here.

Copy link
Collaborator

@natebower natebower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Naarcha-AWS Please see my comments and changes and let me know if you have any questions. Globally, for consistency, please define STIX where it first appears in a file and then use only the acronym thereafter (most often, "a STIX"). Additionally, please ensure that any remaining instances of "traverses to" or "traverses through" are changed to just "traverses". Tag me for approval when addressed. Thanks!

_field-types/supported-field-types/star-tree.md Outdated Show resolved Hide resolved
_field-types/supported-field-types/star-tree.md Outdated Show resolved Hide resolved
_field-types/supported-field-types/star-tree.md Outdated Show resolved Hide resolved
_field-types/supported-field-types/star-tree.md Outdated Show resolved Hide resolved
_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
_search-plugins/star-tree-index.md Show resolved Hide resolved
@natebower
Copy link
Collaborator

@Naarcha-AWS @bharath-techie We need to wrap up and implement any remaining tech feedback ASAP so that I can read again and approve. Thank you.

@bharath-techie
Copy link
Contributor

@Naarcha-AWS @bharath-techie We need to wrap up and implement any remaining tech feedback ASAP so that I can read again and approve. Thank you.

No more comments from my end, just the last two.

Naarcha-AWS and others added 6 commits November 5, 2024 10:15
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Copy link
Collaborator

@natebower natebower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Naarcha-AWS LGTM with the final noted changes. Thanks!

| :--- | :--- |
| `ordered_dimensions` | A [list of fields](#ordered-dimensions) based on which metrics will be aggregated in a star-tree index. Required. |
| `metrics` | A [list of metric](#metrics) fields required in order to perform aggregations. Required. |
| `max_leaf_docs` | The maximum number of star-tree documents that a leaf node can point to. After the maximum number of documents is reached, child nodes will be created based on the unique values of the next field in the `ordered_dimension` (if any). Default is `10000`. A lower value will use more storage but result in faster query performance. Inversely, a higher value will use less storage but result in slower query performance. For more information, see [Star-tree indexing structure]({{site.url}}{{site.baseurl}}/search-plugins/star-tree-index/#star-tree-index-structure). |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"unique values" => "unique value"?

_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
- When the status equals `200`. The query traverses through a star node in the `port` dimension because `port` is not present as part of the query.
- When the port equals `5600`. The query traverses through a star node in the `status` dimension because `status` is not present as part of the query.
- **Blue**: In a `terms` query that searches for the average request size aggregation, the `port` equals `8443` and the status equals `200`. Because the query contains values in both the `status` and `port` dimensions, the query traverses status node `200` and returns the aggregations from child node `8443`.
- **Green**: In a `term` query that searches for the number of requests aggregation, the `status` equals `200`. Because the query only contains a value from the `status` dimension, the query traverses the `200` node's child star node, which contains the aggregated value of all the `port` child nodes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"requests aggregation" => "request aggregations" or "aggregation requests"?

_search-plugins/star-tree-index.md Outdated Show resolved Hide resolved
@Naarcha-AWS Naarcha-AWS merged commit 1e453d9 into main Nov 5, 2024
6 checks passed
epugh pushed a commit to o19s/documentation-website that referenced this pull request Nov 23, 2024
* Additional star tree feedback.

Signed-off-by: Archer <[email protected]>

* Add tech feedback.

Signed-off-by: Archer <[email protected]>

* Fix broken link

Signed-off-by: Archer <[email protected]>

* Update star-tree.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Update star-tree-index.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Update star-tree.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Update star-tree.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Update star-tree-index.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

---------

Signed-off-by: Archer <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Eric Pugh <[email protected]>
@Naarcha-AWS Naarcha-AWS deleted the star-tree-additional-feedback branch December 19, 2024 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Editorial review PR: Editorial review in progress v2.18.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants