Add OpenSearch performance 2.17 blog #3470
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws Editorial review complete. Please see my comments and changes and let me know if you have any questions. Thanks!
|sort_numeric_asc_with_match |16 |2 |1.75 |2 |2 |2 |2 |1.75 |2 |2 | | ||
|sort_numeric_desc |17 |8 |6 |6 |5.5 |4.75 |5 |4.75 |4.25 |4.5 | | ||
|sort_numeric_desc_with_match |18 |2 |2 |2 |2 |2 |2 |1.75 |2 |2 | | ||
|Terms Aggregation |cardinality-agg-high |19 |3075.75 |2432.25 |2506.25 |2246 |2284.5 |2202.25 |2323.75 |2337.25 |2408.75 | |
|Terms Aggregation |cardinality-agg-high |19 |3075.75 |2432.25 |2506.25 |2246 |2284.5 |2202.25 |2323.75 |2337.25 |2408.75 | | |
|Terms aggregation |cardinality-agg-high |19 |3075.75 |2432.25 |2506.25 |2246 |2284.5 |2202.25 |2323.75 |2337.25 |2408.75 | |
|sort_numeric_desc |17 |8 |6 |6 |5.5 |4.75 |5 |4.75 |4.25 |4.5 | | ||
|sort_numeric_desc_with_match |18 |2 |2 |2 |2 |2 |2 |1.75 |2 |2 | | ||
|Terms Aggregation |cardinality-agg-high |19 |3075.75 |2432.25 |2506.25 |2246 |2284.5 |2202.25 |2323.75 |2337.25 |2408.75 | | ||
|cardinality-agg-low |20 |2925.5 |2295.5 |2383 |2126 |2245.25 |2159 |3 |3 |3 | |
Line 253: Same comment re: Term/Terms
|keyword-terms |23 |4695.25 |3478.75 |3557.5 |3220 |29.5 |26 |25.75 |26.25 |26.25 | | ||
|keyword-terms-low-cardinality |24 |4699.5 |3383 |3477.25 |3249.75 |25 |22 |21.75 |21.75 |21.75 | | ||
|multi_terms-keyword |25 |0* |0* |854.75 |817.25 |796.5 |748 |768.5 |746.75 |770 | | ||
|Range Queries |keyword-in-range |26 |101.5 |100 |18 |22 |23.25 |26 |27.25 |18 |17.75 | |
|Range Queries |keyword-in-range |26 |101.5 |100 |18 |22 |23.25 |26 |27.25 |18 |17.75 | | |
|Range queries |keyword-in-range |26 |101.5 |100 |18 |22 |23.25 |26 |27.25 |18 |17.75 | |
|keyword-terms-low-cardinality |24 |4699.5 |3383 |3477.25 |3249.75 |25 |22 |21.75 |21.75 |21.75 | | ||
|multi_terms-keyword |25 |0* |0* |854.75 |817.25 |796.5 |748 |768.5 |746.75 |770 | | ||
|Range Queries |keyword-in-range |26 |101.5 |100 |18 |22 |23.25 |26 |27.25 |18 |17.75 | | ||
|range |27 |85 |77 |14.5 |18.25 |20.25 |22.75 |24.25 |13.75 |14.25 | |
"Range" (capitalized)?
|range-agg-1 |32 |4641.25 |3810.75 |3745.75 |3578.75 |3477.5 |3328.75 |3318.75 |2 |2.25 | | ||
|range-agg-2 |33 |4568 |3717.25 |3669.75 |3492.75 |3403.5 |3243.5 |3235 |2 |2.25 | | ||
|range-numeric |34 |2 |2 |2 |2 |2 |2 |2 |2 |2 | | ||
|Date Histogram |composite-date_histogram-daily |35 |4828.75 |4055.5 |4051.25 |9 |3 |2.5 |3 |2.75 |2.75 | |
|Date Histogram |composite-date_histogram-daily |35 |4828.75 |4055.5 |4051.25 |9 |3 |2.5 |3 |2.75 |2.75 | | |
|Date histogram |composite-date_histogram-daily |35 |4828.75 |4055.5 |4051.25 |9 |3 |2.5 |3 |2.75 |2.75 | |
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws Final comments/changes.
@@ -178,7 +294,7 @@ In 2025, we will continue to invest in the following key initiatives aimed at pe
* **Autotuning k-NN indexes:** OpenSearch's vector database offers a toolkit of algorithms tailored for diverse workloads. In 2025, our goal is to enhance the out-of-the-box experience by autotuning hyperparameters and settings based on access patterns and hardware resources.
* **Cold-warm tiering:** In version 2.18, we added support for enabling vector search on remote snapshots. We will continue focusing on decoupling index read/write operations to extend vector indexes to different storage systems in order to reduce storage and compute costs.
* **Memory footprint reduction:** We will continue to aggressively reduce the memory footprint of vector indexes. One of our goals is to support the ability to partially load HNSW indexes into native engines. This complements our disk-optimized search and helps further reduce the operating costs of OpenSearch clusters.
* **Reduced disk storage with "derived source":** Currently, vector data is stored both in a doc-values-like format and in the stored `_source` field. The stored `_source` field can contribute more than 60% of the overall vector storage requirement. We plan to create a custom stored field format that will inject the vector fields into the source from the doc-values-like format. In addition to storage savings, this will have the secondary effects of improved indexing throughput, lighter shards, and even faster search.
* **Reduced disk storage using derived source:** Currently, vector data is stored both in a doc-values-like format and in the stored `_source` field. The stored `_source` field can contribute more than 60% of the overall vector storage requirement. We plan to create a custom stored field format that will inject the vector fields into the source from the doc-values-like format, creating a dervied source field. In addition to storage savings, this approach will improve indexing throughput, reduce shard size, and even accelerate search.
Bullet title: Should "a" precede "derived source"? Penultimate sentence: "dervied" => "derived"
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws LGTM
Description
Add OpenSearch performance 2.17 blog
Issues Resolved
Closes #3465
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.