Add skipping index benchmark test #291
Signed-off-by: Chen Dai <[email protected]>
OpenJDK 64-Bit Server VM 11.0.20+0 on Mac OS X 14.3.1
Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Skipping Index Read 1000000 Rows with Cardinality 64:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------------------
Partition Read                                                     54             65           9          0.0    54473389.0       1.0X
MinMax Read                                                        57             65           8          0.0    56855820.0       1.0X
ValueSet Read (Default Size 100)                                   50             61          11          0.0    49529808.0       1.1X
ValueSet Read (Unlimited Size)                                     43             54           8          0.0    43301469.0       1.3X
BloomFilter Read (1M NDV)                                        2648           2733          60          0.0  2647662965.0       0.0X
BloomFilter Read (Optimal NDV)                                   2450           2484          24          0.0  2450135369.0       0.0X
Adaptive BloomFilter Read (Default 10 Candidates)                2441           2458          18          0.0  2441226280.0       0.0X
Adaptive BloomFilter Read (5 Candidates)                         2451           2476          26          0.0  2450510244.0       0.0X
Adaptive BloomFilter Read (15 Candidates)                        2397           2461          44          0.0  2397133383.0       0.0X

OpenJDK 64-Bit Server VM 11.0.20+0 on Mac OS X 14.3.1
Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Skipping Index Read 1000000 Rows with Cardinality 2048:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------
Partition Read                                                       31             35           5          0.0    31101827.0       1.0X
MinMax Read                                                          33             40           6          0.0    33385163.0       0.9X
ValueSet Read (Default Size 100)                                     30             37           6          0.0    30479810.0       1.0X
ValueSet Read (Unlimited Size)                                       31             37           6          0.0    31004587.0       1.0X
BloomFilter Read (1M NDV)                                          2477           2537          51          0.0  2477281890.0       0.0X
BloomFilter Read (Optimal NDV)                                     2408           2461          45          0.0  2408002056.0       0.0X
Adaptive BloomFilter Read (Default 10 Candidates)                  2367           2413          43          0.0  2366950203.0       0.0X
Adaptive BloomFilter Read (5 Candidates)                           2399           2429          26          0.0  2399147197.0       0.0X
Adaptive BloomFilter Read (15 Candidates)                          2382           2421          34          0.0  2381512783.0       0.0X

OpenJDK 64-Bit Server VM 11.0.20+0 on Mac OS X 14.3.1
Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Skipping Index Read 1000000 Rows with Cardinality 65536:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
---------------------------------------------------------------------------------------------------------------------------------------
Partition Read                                                        26             30           5          0.0    25781731.0       1.0X
MinMax Read                                                           30             34           7          0.0    29514335.0       0.9X
ValueSet Read (Default Size 100)                                      27             34           6          0.0    27338628.0       0.9X
ValueSet Read (Unlimited Size)                                        39             45           6          0.0    39315292.0       0.7X
BloomFilter Read (1M NDV)                                           2374           2433          55          0.0  2373982609.0       0.0X
BloomFilter Read (Optimal NDV)                                      2354           2415          60          0.0  2354204521.0       0.0X
Adaptive BloomFilter Read (Default 10 Candidates)                   2322           2407          51          0.0  2321669934.0       0.0X
Adaptive BloomFilter Read (5 Candidates)                            2413           2465          44          0.0  2413487418.0       0.0X
Adaptive BloomFilter Read (15 Candidates)                           2351           2401          36          0.0  2351322414.0       0.0X
Is this the intended result? Can we include some explanations for why the BF read takes so long?
I think it's because BF is translated to Painless script filtering, which deserializes the BF from bytes and then does the membership check. Reading the other data structures is simply translated to an OpenSearch match query.
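To illustrate the difference described in this reply, here is a minimal Python sketch. It is not Flint, Spark, or OpenSearch code: `TinyBloomFilter`, `build_index_doc`, `match_by_min_max`, and `match_by_bloom_filter` are hypothetical names, and the SHA-256 based double hashing is an assumption chosen for brevity. It only shows why a bloom filter check has to rebuild the filter from stored bytes for every index document, while a min-max check compares stored values directly.

```python
import hashlib
import struct


class TinyBloomFilter:
    """Toy bloom filter (hypothetical; not the Flint or Spark implementation)."""

    def __init__(self, num_bits, num_hashes, bits=None):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(bits) if bits is not None else bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive k bit positions from two 64-bit halves of one SHA-256 digest.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def put(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value))

    def to_bytes(self):
        # Header (bit count, hash count) followed by the raw bit array.
        return struct.pack(">II", self.num_bits, self.num_hashes) + bytes(self.bits)

    @classmethod
    def from_bytes(cls, data):
        num_bits, num_hashes = struct.unpack(">II", data[:8])
        return cls(num_bits, num_hashes, data[8:])


def build_index_doc(file_name, values):
    # One skipping index document per source file (illustrative layout only).
    bf = TinyBloomFilter(num_bits=1024, num_hashes=3)
    for v in values:
        bf.put(v)
    return {"file": file_name, "min": min(values), "max": max(values), "bf": bf.to_bytes()}


def match_by_min_max(docs, value):
    # Plain comparison on stored fields; no per-document decoding.
    return [d["file"] for d in docs if d["min"] <= value <= d["max"]]


def match_by_bloom_filter(docs, value):
    # Every document pays for deserialization before the membership check.
    return [d["file"] for d in docs if TinyBloomFilter.from_bytes(d["bf"]).might_contain(value)]


docs = [build_index_doc("file-a", [1, 5, 9]), build_index_doc("file-b", [100, 150, 200])]
```

In this sketch the bloom filter path does strictly more work per index document than the min-max path, which is consistent with the explanation above but says nothing about the actual magnitude measured in the benchmark.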
Skipping Index Write 1000000 Rows with Cardinality 65536:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------------------------------
Partition Write                                                      1304           1304           0          0.8        1304.1       1.0X
MinMax Write                                                         1287           1287           0          0.8        1286.8       1.0X
Why is BF latency lower than MinMax?
@dai-chen do we want to move this to 0.4?
Moved to 0.4. Will try to address the comments this week. Thanks!
Will reopen and address the comments.
Description
This PR introduces a benchmark test focused on the skipping index data structures. It sits between end-to-end benchmarking and low-level microbenchmarking, and aims to provide insight into the read and write performance of the different skipping data structures when OpenSearch is used as the index store.
The reasons behind this addition instead of E2E or microbenchmarking are:
End-to-End Benchmarking:
Microbenchmarking:
flint-core, making it straightforward to incorporate into microbenchmarking.
TODO
Test Cases
Please find details in the Javadoc on FlintSkippingIndexBenchmark. The test is based on the Spark benchmark test framework and is triggered manually.
Test Results
https://github.com/dai-chen/opensearch-spark/blob/add-skipping-index-benchmark-rebased/docs/benchmark-skipping-index.txt
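As a reading aid for the result tables, the derived columns can be reproduced from the best time. This is a sketch assuming the usual definitions in Spark's benchmark output (rate as rows per microsecond, per-row time as best time divided by row count, relative as the baseline's best time divided by this case's best time). `derived_columns` is a hypothetical helper, not part of the PR. The inputs are taken from the tables in this PR, and the row count of 1 for the read case is only inferred from the Per Row figure matching the best time in nanoseconds; the source does not confirm it.

```python
def derived_columns(best_ns, num_rows, baseline_best_ns):
    """Return (Rate(M/s), Per Row(ns), Relative) formatted like the result tables."""
    rate = num_rows / (best_ns / 1000.0)       # rows per microsecond = million rows per second
    per_row = best_ns / num_rows               # nanoseconds spent per row
    relative = baseline_best_ns / best_ns      # above 1.0 means faster than the baseline case
    return f"{rate:.1f}", f"{per_row:.1f}", f"{relative:.1f}X"


# MinMax Write vs. Partition Write (cardinality 65536), normalized over 1,000,000 rows.
write_row = derived_columns(1_286_800_000, 1_000_000, 1_304_100_000)

# ValueSet Read (Unlimited Size) vs. Partition Read (cardinality 64).
# Assumption: a row count of 1, inferred from Per Row(ns) equaling the best time.
read_row = derived_columns(43_301_469, 1, 54_473_389)

# BloomFilter Read (1M NDV) vs. Partition Read (cardinality 64), same assumption.
bf_row = derived_columns(2_647_662_965, 1, 54_473_389)
```

Under these assumptions a Relative value of 0.0X for the bloom filter cases is a rounding of roughly 0.02, meaning those reads take about 40 to 90 times longer than the Partition Read baseline in the same table.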
Test Data (Generated)
The schema and size of the skipping index written in the test:
Skipping index data written:
Issues Resolved
opensearch-project/sql#1399
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.