
Fix buffer_drain_concurrency not doing anything. #14545

Closed · wants to merge 1 commit

Conversation

@arthurschreiber (Contributor) commented Nov 18, 2023

Description

As described in #11684, the --buffer_drain_concurrency CLI argument to vtgate does not actually do anything.

This pull request implements the logic required to make this flag actually do something. 😬 When the buffer is drained, we now spawn as many goroutines as specified by --buffer_drain_concurrency to drain the buffer in parallel.

I don't think introducing a new flag as described in #11684 actually makes sense. Instead, I propose we note in the v19 changelog that this flag now works, and not backport the change to any earlier releases.

Related Issue(s)

#11684

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • New or modified tests pass consistently both locally and on CI
  • Documentation was added or is not required

Deployment Notes

vitess-bot commented Nov 18, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes); new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test; enhancements and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator.
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Nov 18, 2023
@github-actions github-actions bot added this to the v19.0.0 milestone Nov 18, 2023
@arthurschreiber arthurschreiber marked this pull request as ready for review November 18, 2023 13:56
@arthurschreiber arthurschreiber added Component: Query Serving Type: Bug and removed NeedsIssue A linked issue is missing for this Pull Request NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work labels Nov 18, 2023
@arthurschreiber arthurschreiber self-assigned this Nov 18, 2023
Comment on lines +574 to +596
entryChan := make(chan *entry, len(q))

parallelism := min(bufferDrainConcurrency, len(q))

var wg sync.WaitGroup
wg.Add(parallelism)

for i := 0; i < parallelism; i++ {
	go func() {
		for e := range entryChan {
			sb.unblockAndWait(e, err, true /* releaseSlot */, true /* blockingWait */)
		}

		wg.Done()
	}()
}

for _, e := range q {
	entryChan <- e
}

close(entryChan)
wg.Wait()
arthurschreiber (PR author):
Would it make sense to only pump the entries through the channel if bufferDrainConcurrency is set to a value higher than 1? Or is the overhead negligible?

Collaborator:

No, I think this is an important optimization. We need a legacy codepath for when bufferDrainConcurrency <= 1.

// TODO(mberlin): Parallelize the drain by pumping the data through a channel.

// Parallelize the drain by pumping the data through a channel.
entryChan := make(chan *entry, len(q))
arthurschreiber (PR author):

Do I need to make the channel buffered? 🤔 I think it should be fine to not have it buffered, but I'm not 100% sure.

Collaborator:

I think this should not be buffered, definitely not buffered as len(q) because that's a lot of underlying memory being allocated!

@arthurschreiber arthurschreiber requested a review from vmg November 20, 2023 11:58
@arthurschreiber (PR author):

@vmg Do you mind taking a look? 🙇‍♂️

@vmg (Collaborator) left a review:

Quite an oversight! I think this looks good overall. We should definitely add a serial path for the case where there's no concurrency.
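The serial path vmg asks for could look roughly like the following. A hedged sketch only: `drainSerial`, `entry`, and `unblock` are hypothetical stand-ins, and in the real code the fallback would branch on `bufferDrainConcurrency <= 1` before any channel or WaitGroup is set up.

```go
package main

import "fmt"

// entry is a simplified stand-in for the vtgate buffer entry.
type entry struct{ id int }

// drainSerial is the legacy single-goroutine path for concurrency <= 1:
// no channel, no WaitGroup, no goroutine scheduling overhead.
// unblock is a stand-in for sb.unblockAndWait.
func drainSerial(q []*entry, unblock func(*entry)) {
	for _, e := range q {
		unblock(e)
	}
}

func main() {
	q := []*entry{{1}, {2}, {3}}
	n := 0
	drainSerial(q, func(e *entry) { n++ })
	fmt.Println(n) // 3
}
```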

sb.unblockAndWait(e, err, true /* releaseSlot */, true /* blockingWait */)
}

wg.Done()
Collaborator:
defer wg.Done() at the start of the goroutine to prevent a deadlock on panic.
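The reason for the `defer` is that `wg.Done()` placed at the end of the goroutine body is skipped if anything above it panics, leaving `wg.Wait()` blocked forever. A small self-contained demonstration (the panic is recovered here purely for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// runWorker shows that with defer wg.Done(), the WaitGroup is decremented
// even when the goroutine body panics, so Wait() returns instead of
// deadlocking. Returns true if the panic was observed.
func runWorker() bool {
	var wg sync.WaitGroup
	wg.Add(1)
	recovered := false
	go func() {
		defer wg.Done() // runs even if the work below panics
		defer func() {
			if r := recover(); r != nil {
				recovered = true
			}
		}()
		panic("worker failed")
	}()
	wg.Wait() // returns instead of hanging forever
	return recovered
}

func main() {
	fmt.Println(runWorker()) // true
}
```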

for _, e := range q {
	entryChan <- e
Collaborator:

To elaborate: entryChan doesn't need to be buffered because the sending goroutine (i.e. this one) doesn't have anything else to do besides sending the entries through the channel. If it's not blocked on the channel send, it'll block on wg.Wait() a couple lines below. So buffering is not an optimization here.

github-actions bot:
This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. label Dec 28, 2023
github-actions bot commented Jan 4, 2024

This PR was closed because it has been stale for 7 days with no activity.

Labels
Component: Query Serving · NeedsWebsiteDocsUpdate · Stale · Type: Bug