support shard level split on read path #402

penghuo · 2024-06-27T23:44:05Z

Description

A partition of an OpenSearchTable is backed by an OpenSearch index. Each index is split into a configurable number of shards, which are then distributed across the cluster. This change splits the read path at shard level, so each shard is read as a separate split.
Updated the doc: https://github.com/penghuo/penghuo-opensearch-spark/blob/issue396/docs/opensearch-table.md#inputpartition.
Performance testing is tracked in Performance Test - Evaluate OpenSearch read performance #403.

Test

Tested with EMR-S. Tasks are scheduled based on the split count. For instance, if an index has 5 shards, 5 tasks are scheduled.

24/06/29 00:24:43 INFO DAGScheduler: Got map stage job 0 (main at NativeMethodAccessorImpl.java:0) with 5 output partitions
24/06/29 00:24:43 INFO DAGScheduler: Submitting 5 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at main at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4))
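
The one-task-per-shard behaviour in the log above can be pictured with a minimal sketch. `ShardSplitPlanner` and `ShardSplit` are hypothetical names, not the connector's actual classes; the sketch only assumes that each split identifies one shard of one index and that shard ids start at 0, as the partition ids in the log suggest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner, not the connector's real class: one split per shard.
final class ShardSplitPlanner {

    // Hypothetical split descriptor identifying a single shard of one index.
    static final class ShardSplit {
        final String indexName;
        final int shardId;

        ShardSplit(String indexName, int shardId) {
            this.indexName = indexName;
            this.shardId = shardId;
        }
    }

    // Returns one split per shard; the engine can then schedule one task per split.
    static List<ShardSplit> plan(String indexName, int numberOfShards) {
        if (numberOfShards < 1) {
            throw new IllegalArgumentException("numberOfShards must be >= 1");
        }
        List<ShardSplit> splits = new ArrayList<>();
        for (int shardId = 0; shardId < numberOfShards; shardId++) {
            splits.add(new ShardSplit(indexName, shardId));
        }
        return splits;
    }
}
```

Calling `ShardSplitPlanner.plan("logs-181998", 5)` yields 5 splits with shard ids 0 to 4, which lines up with the 5 output partitions reported by the DAGScheduler.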

Tested with an AWS OpenSearch Multi-AZ with Standby domain. It supports the preference=_shards: parameter.
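
As a rough illustration of what a shard-targeted read could look like, the sketch below builds a search path that pins a request to one shard. It assumes the parameter meant here is OpenSearch's search `preference` with a `_shards:<id>` value. The `ShardPreference` helper is hypothetical, and how the connector actually issues the request is not shown in this PR.

```java
// Hypothetical helper: builds a search path restricted to a single shard.
final class ShardPreference {

    // Value for the search `preference` parameter that targets one shard.
    static String preference(int shardId) {
        if (shardId < 0) {
            throw new IllegalArgumentException("shardId must be >= 0");
        }
        return "_shards:" + shardId;
    }

    // Path and query string for a search against one shard of the given index.
    static String searchPath(String indexName, int shardId) {
        return "/" + indexName + "/_search?preference=" + preference(shardId);
    }
}
```

For example, `ShardPreference.searchPath("logs-181998", 2)` produces `/logs-181998/_search?preference=_shards:2`.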

Benchmark

Single Index with 5 shards

The table below presents the p90 query latency for the partitioned and non-partitioned test cases. Across all queries, partitioning lowers p90 latency by a factor of roughly 2.2x to 2.6x compared with the non-partitioned runs.

Query	p90 ms (without partition)	p90 ms (with partition)
SELECT COUNT(*) FROM dev.default.`logs-181998`	42296	17031
SELECT COUNT(*) FROM dev.default.`logs-181998` WHERE status <> 0;	44142	16667
SELECT COUNT(*), AVG(size) FROM dev.default.`logs-181998`;	44584	18775
SELECT AVG(CAST(size AS BIGINT)) FROM dev.default.`logs-181998`;	43575	19474
SELECT MIN(`@timestamp`), MAX(`@timestamp`) FROM dev.default.`logs-181998`;	43533	19130
SELECT status, COUNT(*) FROM dev.default.`logs-181998` WHERE status <> 0 GROUP BY status ORDER BY COUNT(*) DESC;	43952	19661

Multiple Indices, each index has 5 shards

The table below presents the p90 query latency for the partitioned and non-partitioned test cases when querying an index wildcard pattern. Across all queries, partitioning lowers p90 latency by a factor of roughly 4.1x to 4.9x compared with the non-partitioned runs.

Query	p90 ms (without partition)	p90 ms (with partition)
SELECT COUNT(*) FROM dev.default.`logs-1*`	228049	52123
SELECT COUNT(*) FROM dev.default.`logs-1*` WHERE status <> 0;	215548	43631
SELECT COUNT(*), AVG(size) FROM dev.default.`logs-1*`;	222332	51583
SELECT AVG(CAST(size AS BIGINT)) FROM dev.default.`logs-1*`;	215273	51805
SELECT MIN(`@timestamp`), MAX(`@timestamp`) FROM dev.default.`logs-1*`;	224576	45914
SELECT status, COUNT(*) FROM dev.default.`logs-1*` WHERE status <> 0 GROUP BY status ORDER BY COUNT(*) DESC;	189594	46758
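
The speedup factors implied by the two benchmark tables can be recomputed with a small helper. The sketch below is only arithmetic over the p90 values listed above; the `P90Speedup` class is illustrative and not part of the PR.

```java
// Illustrative arithmetic over the p90 values from the two benchmark tables.
final class P90Speedup {

    // Rows are {p90 ms without partition, p90 ms with partition}.
    static final long[][] SINGLE_INDEX = {
        {42296, 17031}, {44142, 16667}, {44584, 18775},
        {43575, 19474}, {43533, 19130}, {43952, 19661}
    };

    static final long[][] MULTI_INDEX = {
        {228049, 52123}, {215548, 43631}, {222332, 51583},
        {215273, 51805}, {224576, 45914}, {189594, 46758}
    };

    // Ratio of non-partitioned to partitioned latency; above 1 means partitioning is faster.
    static double speedup(long withoutMs, long withMs) {
        return (double) withoutMs / withMs;
    }

    // Returns {min, max} speedup across all rows.
    static double[] range(long[][] rows) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (long[] row : rows) {
            double s = speedup(row[0], row[1]);
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        return new double[] {min, max};
    }
}
```

`P90Speedup.range(P90Speedup.SINGLE_INDEX)` and `P90Speedup.range(P90Speedup.MULTI_INDEX)` give the smallest and largest per-query speedup for each test case.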

Issues Resolved

#396

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Peng Huo <[email protected]>

dai-chen · 2024-07-05T18:23:42Z

flint-core/src/main/scala/org/opensearch/flint/core/FlintClient.java

+   * @param query DSL query. DSL query is null means match_all
+   * @return {@link FlintReader}.
+   */
+  FlintReader createReader(String indexName, String shardId, String query);


Is the shardId concept bound to the FlintOpenSearchClient implementation, or is it generic?

It is bound to OpenSearch.

We can abstract this task/split info later.
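
One possible shape for such an abstraction is sketched below, purely as an illustration of the idea in this thread. None of these types exist in the PR. `ReadSplit`, `OpenSearchShardSplit`, and `describeForOpenSearch` are hypothetical names, and the real design may look quite different.

```java
// Hypothetical shape for a backend-neutral split; none of these types exist in the PR.
final class SplitAbstractionSketch {

    // Generic view: what a backend-neutral caller would need to know about a split.
    interface ReadSplit {
        String indexName();
    }

    // OpenSearch-specific split that additionally carries the shard id.
    static final class OpenSearchShardSplit implements ReadSplit {
        private final String indexName;
        private final String shardId;

        OpenSearchShardSplit(String indexName, String shardId) {
            this.indexName = indexName;
            this.shardId = shardId;
        }

        @Override
        public String indexName() {
            return indexName;
        }

        String shardId() {
            return shardId;
        }
    }

    // Stand-in for OpenSearch-side code, the only place that unwraps the shard id.
    static String describeForOpenSearch(ReadSplit split) {
        if (split instanceof OpenSearchShardSplit) {
            return split.indexName() + " shard " + ((OpenSearchShardSplit) split).shardId();
        }
        return split.indexName() + " (no shard routing)";
    }
}
```

With a type like this, a `createReader` signature could accept a split object instead of a raw `shardId` string, keeping the shard concept inside the OpenSearch implementation.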

dai-chen · 2024-07-05T18:35:36Z

flint-spark-integration/src/main/scala/org/apache/spark/opensearch/table/OpenSearchTable.scala

+ * @param metadata
+ *   Metadata of the table.
+ */
+case class OpenSearchTable(tableName: String, metadata: Map[String, FlintMetadata]) {


Is FlintMetadata used here only temporarily, to fetch index settings? Or will there be a hard dependency between the OS table and the Flint index in the future?

It is used only for index settings and mappings; it is not bound to a real index.
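
To make the role of the per-index metadata concrete, here is a minimal sketch of deriving splits for a table that spans several indices. It assumes the settings held for each index expose the standard `index.number_of_shards` value as a string. `MultiIndexSplitSketch`, the plain map standing in for `Map[String, FlintMetadata]`, and the index names in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: derive shard-level splits from per-index settings.
final class MultiIndexSplitSketch {

    static final String NUMBER_OF_SHARDS = "index.number_of_shards";

    // settingsByIndex stands in for the table's index-name-to-metadata map.
    static List<String> planSplits(Map<String, Map<String, String>> settingsByIndex) {
        List<String> splits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : settingsByIndex.entrySet()) {
            String raw = entry.getValue().get(NUMBER_OF_SHARDS);
            if (raw == null) {
                throw new IllegalArgumentException("missing " + NUMBER_OF_SHARDS + " for " + entry.getKey());
            }
            int shards = Integer.parseInt(raw);
            // One split per shard, labelled as index[shardId].
            for (int shardId = 0; shardId < shards; shardId++) {
                splits.add(entry.getKey() + "[" + shardId + "]");
            }
        }
        return splits;
    }
}
```

Under these assumptions, a wildcard table resolving to two indices with 5 shards each (say, `logs-10` and `logs-11`) would produce 10 splits.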

support shard level split on read path

514d51a

Signed-off-by: Peng Huo <[email protected]>

penghuo added the enhancement and 0.5 labels Jun 27, 2024

penghuo self-assigned this Jun 27, 2024

penghuo added 4 commits June 28, 2024 07:25

fix IT

1ee02fd

Signed-off-by: Peng Huo <[email protected]>

revert unnecessary change

a14d5d3

Signed-off-by: Peng Huo <[email protected]>

Add OpenSearchTable IT

68b761f

Signed-off-by: Peng Huo <[email protected]>

update doc

0523127

Signed-off-by: Peng Huo <[email protected]>

penghuo marked this pull request as ready for review July 2, 2024 17:25

penghuo requested review from dai-chen, rupal-bq, vmmusings, seankao-az, anirudha, kaituo and YANG-DB as code owners July 2, 2024 17:25

dai-chen reviewed Jul 5, 2024

dai-chen approved these changes Jul 9, 2024

penghuo merged commit 087a9df into opensearch-project:main Jul 9, 2024
4 checks passed

penghuo mentioned this pull request Jul 15, 2024

Support read OpenSearch using PIT #430

Open

3 tasks
