
[Segment Replication] Create a separate thread pool for Segment Replication events #8669

Closed

Conversation

Rishikesh1159
Member

Description

This PR creates a new threadpool for segment replication.
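For context, a minimal sketch of the shape of the change, assuming the new pool follows the existing ThreadPool.Names convention (the constant and its string value here are illustrative, not copied from the PR):

// Sketch: a new named pool constant alongside GENERIC, REMOTE_REFRESH, etc.
// in org.opensearch.threadpool.ThreadPool.Names (string value is an assumption).
public static final String SEGMENT_REPLICATION = "segment_replication";

The pool itself is registered in the ThreadPool constructor; the relevant diff appears further down in this conversation.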

Related Issues

Resolves #8118

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.multipart.RemoteStoreMultipartIT.testStaleCommitDeletionWithoutInvokeFlush
      1 org.opensearch.remotestore.multipart.RemoteStoreMultipartIT.testStaleCommitDeletionWithInvokeFlush
      1 org.opensearch.remotestore.RemoteStoreIT.testStaleCommitDeletionWithInvokeFlush

@codecov

codecov bot commented Jul 12, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (7dddb31) 70.95% compared to head (3433c26) 70.85%.
Report is 779 commits behind head on main.

❗ Current head 3433c26 differs from the pull request's most recent head f0a5b00. Consider uploading reports for commit f0a5b00 to get more accurate results.

Files                                                   Patch %   Lines
...etry/tracing/exporter/OTelSpanExporterFactory.java   71.42%    5 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8669      +/-   ##
============================================
- Coverage     70.95%   70.85%   -0.11%     
+ Complexity    57056    57030      -26     
============================================
  Files          4759     4761       +2     
  Lines        269691   269723      +32     
  Branches      39454    39455       +1     
============================================
- Hits         191359   191109     -250     
- Misses        62184    62535     +351     
+ Partials      16148    16079      -69     


@@ -272,7 +272,7 @@ public void onFailure(Exception e) {

 @Override
 protected String getThreadPool() {
-    return ThreadPool.Names.GENERIC;
+    return ThreadPool.Names.SEGMENT_REPLICATION;
Member

This component is used during ingest to determine if pressure should be applied, not to perform segrep activities. I don't think we should use the pool here.

Member Author

Yes, makes sense, will update it.

Signed-off-by: Rishikesh1159 <[email protected]>
@kotwanikunal
Member

Should we first test the difference with and without the custom ThreadPool? @mch2 / @Rishikesh1159

@Rishikesh1159
Member Author

Should we first test the difference with and without the custom ThreadPool? @mch2 / @Rishikesh1159

@kotwanikunal do you mean running some OS benchmarks with and without the custom threadpool?

@kotwanikunal
Member

Should we first test the difference with and without the custom ThreadPool? @mch2 / @Rishikesh1159

@kotwanikunal do you mean running some OS benchmarks with and without the custom threadpool?

Yes.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@@ -267,6 +270,10 @@ public ThreadPool(
     Names.REMOTE_REFRESH,
     new ScalingExecutorBuilder(Names.REMOTE_REFRESH, 1, halfProcMaxAt10, TimeValue.timeValueMinutes(5))
 );
+builders.put(
+    Names.SEGMENT_REPLICATION,
+    new ScalingExecutorBuilder(Names.SEGMENT_REPLICATION, 1, halfProcMaxAt10, TimeValue.timeValueMinutes(5))
Contributor

How are we deciding on these values? Should we be trying out a few different values to see what works best with the most common segment replication setups?

Member

@dreamer-89 dreamer-89 Jul 13, 2023

+1
I think bounding the SEGMENT_REPLICATION threadpool to a maximum of 10 threads is not correct, and it is problematic for clusters with a high number of indices. I think a benchmark with multiple indices will surface the problem in segment replication stats (higher replication lag) and in thread pool stats.

I think for segment replication the thread queue should not be bounded, or at least should be bounded at a fairly large value.
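For reference on where the cap comes from: the diff above sizes the pool with halfProcMaxAt10, which OpenSearch's ThreadPool derives from the allocated processor count. A simplified sketch of that derivation (paraphrased from memory, not the exact source):

class ScalingPoolSizing {
    // A pool built with ScalingExecutorBuilder(name, 1, halfProcMaxAt10,
    // TimeValue.timeValueMinutes(5)) scales from core=1 up to this max and
    // reclaims idle threads after the 5-minute keep-alive.

    // clamp value into [min, max]
    static int boundedBy(int value, int min, int max) {
        return Math.min(max, Math.max(min, value));
    }

    // half the allocated processors, rounded up, clamped to [1, 10]
    static int halfAllocatedProcessorsMaxTen(int allocatedProcessors) {
        return boundedBy((allocatedProcessors + 1) / 2, 1, 10);
    }
}

So an 8-core node gets a max of 4 segrep threads, and nodes with roughly 20 or more cores hit the cap of 10. Note the cap bounds threads, not the queue: as far as I know, a scaling executor's work queue is unbounded, so excess segrep tasks queue up rather than being rejected.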

Member

@Rishikesh1159: Identifying the ideal maximum thread count is tricky and will depend on cluster load and usage. I think we need to set this to a number sufficient to handle the busiest traffic/ingestion patterns without bottlenecking on threads in the pool. One way to get more insight into this number is to run an experiment (a sketch for reading the thread usage follows this list):

  1. Set the max thread count to a very high value (512?)
  2. Create X indices (start with 1?)
  3. Simulate traffic on all X indices
  4. Repeat 1-3 for higher values of X

The max/avg thread usage counts will give us an idea of this number.
Please note, we need ingestion happening on all indices simultaneously to trigger parallel rounds of segment replication and increased thread usage.
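A minimal sketch of how thread usage could be read during such an experiment, assuming access to a node's ThreadPool instance and the SEGMENT_REPLICATION name this PR adds (the same counters are exposed over REST via the _cat/thread_pool and node stats APIs):

import org.opensearch.threadpool.ThreadPool;
import org.opensearch.threadpool.ThreadPoolStats;

public final class SegrepPoolProbe {
    // Poll the live stats for the segrep pool; "largest" is the high-water
    // mark of pool threads, i.e. the max-usage signal the experiment is after.
    static void printSegrepPoolStats(ThreadPool threadPool) {
        for (ThreadPoolStats.Stats stats : threadPool.stats()) {
            if (ThreadPool.Names.SEGMENT_REPLICATION.equals(stats.getName())) {
                System.out.printf(
                    "active=%d largest=%d queue=%d rejected=%d%n",
                    stats.getActive(), stats.getLargest(), stats.getQueue(), stats.getRejected()
                );
            }
        }
    }
}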


@dreamer-89
Member

Gradle Check (Jenkins) Run Completed with:

I think you need to update your local branch with the latest main, which identifies 2.10.0 as the new bwc minor version.

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':distribution:bwc:minor:buildBwcLinuxTar'.
> Building 2.9.0 didn't generate expected file /var/jenkins/workspace/gradle-check/search/distribution/bwc/minor/build/bwc/checkout-2.x/distribution/archives/linux-tar/build/distributions/opensearch-min-2.9.0-SNAPSHOT-linux-x64.tar.gz

@Rishikesh1159
Member Author

Rishikesh1159 commented Jul 19, 2023

Thanks for the review @kotwanikunal @Poojita-Raj @dreamer-89 @mch2. I ran OS benchmarks to see whether there is any increase or decrease in performance when using a separate threadpool. Here are the results:

Using Segrep Threadpool:

Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
P0          -               29170              -
P50         1205            30973              104,121
P90         1751            -                  833,121
P99         2550            -                  833,121
P100        4239            35415              833,121

Using GENERIC threadpool:

Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
P0          -               29,768             -
P50         1222            31160              774,913
P90         1751            -                  962,163
P99         2404            -                  962,163
P100        3526            36,143             962,163

In summary, there is around a 20% increase in indexing latency for the P100 metric. As @dreamer-89 and @Poojita-Raj suggested, I feel a maximum of 10 threads for the segrep threadpool might not be enough when a single node hosts a high number of replicas and indices; capped at a max of 10 threads, we might see performance degradation. The GENERIC threadpool usually has a max of around 512 threads (shared by many other tasks), which might be better in some cases than a dedicated segrep threadpool. One thing we can do is increase the max number of threads in the segrep threadpool, but we don't know exactly what the optimal number is, as it varies with the workload being executed. We definitely need to do more benchmark runs with different workloads to gather data before we decide on the optimal max thread count for the segrep threadpool.

To be clear, the main objective of creating a separate threadpool for segrep is monitoring segrep-related tasks, not any performance gain.

Please share if you have any other thoughts.

@Rishikesh1159
Member Author

Rishikesh1159 commented Jul 26, 2023

I have done a few more benchmark runs. Here are my observations.

-> As I increased the number of shards to 20, replicas to 3, and nodes to 4, the degradation in P100 index latency with the separate segment replication threadpool was even worse than before. So we can say that as shard and replica counts increase, the P100 index latency gets worse. Below are the results:

  • Using separate segrep threadpool, 20 shards, 3 replicas, 4 nodes (max of 10 threads in pool):

    Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
    P0          -               33687              -
    P50         1,080           35,677             56914
    P90         1445            -                  107,319
    P99         2,029           -                  107,319
    P100        5,106           42,487             107,319

  • Using GENERIC threadpool (no separate segrep threadpool), 20 shards, 3 replicas, 4 nodes:

    Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
    P0          -               34039              -
    P50         1,071.6         36287              70617
    P90         1411            -                  93,227
    P99         1944            -                  93,227
    P100        3,665.8         41927              93,227

-> I increased the max number of threads in the segrep threadpool to 20 and then 50, and I no longer see the P100 index latency degradation. Here are the results:

  • Using separate segrep threadpool, 20 shards, 3 replicas, 4 nodes (max of 20 threads in pool):

    Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
    P0          -               34,772             -
    P50         1,075           36,125             60940
    P90         1382            -                  115,963
    P99         1,979           -                  115,963
    P100        3668            41227              115,963

  • Using separate segrep threadpool, 20 shards, 3 replicas, 4 nodes (max of 50 threads in pool):

    Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
    P0          -               35,734             -
    P50         1,043           36825              110498
    P90         1349            -                  154,722
    P99         1,935           -                  154,722
    P100        3407            41,882             154,722

-> Next, I changed the execution of the replication runner from the GENERIC threadpool to the segrep threadpool. I changed here and here. All 5 transport layer calls related to segment replication also use the segrep threadpool, and the max number of threads in the pool is set to only 10. Here are the results (a sketch of this dispatch change follows the table):

  • Using separate segrep threadpool for transport calls and replication runner, 20 shards, 3 replicas, 4 nodes (max of 10 threads in pool):

    Operation   Index Latency   Index Throughput   index-stats-wait-until-merges-finish latency
    P0          -               34,966             -
    P50         1,067           36,762             95531
    P90         1,379           -                  159,398
    P99         1896            -                  159,398
    P100        3497            42115              159,398
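For reference, a sketch of the runner dispatch change described above (illustrative only; replicationRunner stands in for the actual replication Runnable, and the real call sites are the two linked locations):

import java.util.concurrent.ExecutorService;
import org.opensearch.threadpool.ThreadPool;

// Submit replication work to the dedicated pool instead of the shared
// GENERIC pool, so segrep work is isolated and visible under its own name.
void startReplication(ThreadPool threadPool, Runnable replicationRunner) {
    // previously, roughly: threadPool.generic().execute(replicationRunner);
    ExecutorService executor = threadPool.executor(ThreadPool.Names.SEGMENT_REPLICATION);
    executor.execute(replicationRunner);
}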

Summary:

As we increase the number of shards and replicas in the cluster, the P100 index latency gets worse if we use a separate segrep threadpool (transport layer calls only) with a max of 10 threads. But if we increase the max number of threads in the separate segrep threadpool, we no longer see the index latency degradation. Also, if we move the replication runners onto the separate segrep threadpool as well, we don't see any degradation in index latency.

I will update this thread as I get new findings from benchmark runs. Thanks @dreamer-89 for your suggestion. I will try a few more rounds of benchmarks with the configuration you suggested.

Collaborator

@Bukhtawar Bukhtawar left a comment

+1 to understanding how these threads are going to be used. Are they compute-bound? If so, we might need to run experiments to decide on the precise size.

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity. Remove stalled label or comment or this will be closed in 7 days.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled (Issues that have stalled) label Aug 26, 2023
@opensearch-trigger-bot
Contributor

This PR was closed because it has been stalled for 7 days with no activity.

@kotwanikunal
Member

Apologies. This PR was auto closed without reaching a resolution from the maintainers.
Re-opening to move it forward.
Thanks for your contributions to OpenSearch!

@kotwanikunal kotwanikunal reopened this Sep 14, 2023
@github-actions github-actions bot added the distributed framework, enhancement (Enhancement or improvement to existing feature or request), and Indexing:Replication (Issues and PRs related to core replication framework eg segrep) labels Sep 14, 2023
@kotwanikunal kotwanikunal removed the stalled (Issues that have stalled) label Sep 14, 2023
@github-actions
Contributor

Compatibility status:

Checks if related components are compatible with change f0a5b00

Compatible components: []

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added and then removed the stalled (Issues that have stalled) label Jan 13, 2024
@mch2
Member

mch2 commented Jan 29, 2024

Closing this as it has stalled.

@mch2 mch2 closed this Jan 29, 2024
Labels
enhancement (Enhancement or improvement to existing feature or request), Indexing:Replication (Issues and PRs related to core replication framework eg segrep), skip-changelog
Development

Successfully merging this pull request may close these issues.

[Segment Replication] Create separate thread pool for Segrep
7 participants