shards: respect scheduler and use smarter synchronization for List #750
Conversation
I see that the new implementation has much better concurrency, but shouldn't we still have similar chan-related operations?
Previously List would never call proc.Yield, which broke the co-operative scheduler in the case of slow List calls. Additionally, we used naive concurrency (a large buffered channel), which shows up in the profiler as 30% of CPU spent on chan_ related operations under List.

This commit follows how Search used to respect proc.Yield. See sched.go in 90ed7bf. We did not copy Search's implementation since it is more complicated than we need: it supports streaming, while List is still batch only.

We needed to use errgroup to ensure we drained all channels in the case of an error. Previously we did not need to do this since the channels had a buffer size of len(shards), which guaranteed nothing would ever block. Now channels are never larger than the number of workers (<= GOMAXPROCS).

Test Plan: go test covers the no-error cases. For the error cases we manually tested by running zoekt-webserver and adding a random context cancellation. We observed the error being reported and no List goroutines running.

Co-authored-by: William Bezuidenhout <[email protected]>
57837ba to 753239b
I'm not sure if Go has poorer performance with very large buffered channels vs smaller ones. The fact that we don't see chan_ operations for normal search in the flamegraphs makes me think that might be the case, but this is totally unconfirmed. A more likely explanation is that, now that we are doing selectRepoSet (like for normal search), we just send much less work over the channels. So essentially the use of smaller buffered channels doesn't actually help by itself. However, the most important part of this change is the proc.Yield, and while I was introducing that I thought we might as well use a similar pattern for concurrency control.
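As an aside, whether very large buffered channels are actually more expensive than small ones could be checked with a standalone micro-benchmark along these lines. This is only a sketch, not a measurement from this PR: the buffer sizes and the single-producer/single-consumer shape are arbitrary assumptions.

```go
package chanbench

import "testing"

// benchChan pushes `items` values through a channel with the given buffer
// size while a single consumer drains it, so the benchmark mostly measures
// the cost of channel operations for large vs. small buffers.
func benchChan(b *testing.B, buffer, items int) {
	for i := 0; i < b.N; i++ {
		ch := make(chan int, buffer)
		done := make(chan struct{})
		go func() {
			for range ch {
			}
			close(done)
		}()
		for j := 0; j < items; j++ {
			ch <- j
		}
		close(ch)
		<-done
	}
}

// Hypothetical sizes: "large" mimics a buffer of len(shards), "small" mimics
// a buffer bounded by a worker count.
func BenchmarkChanLargeBuffer(b *testing.B) { benchChan(b, 10000, 10000) }
func BenchmarkChanSmallBuffer(b *testing.B) { benchChan(b, 8, 10000) }
```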
Manual testing results:

zoekt on k/concurrency [$!] ❄️ zoekt-env took 37s
❯ git diff
diff --git a/shards/shards.go b/shards/shards.go
index aa384371..1f3011fa 100644
--- a/shards/shards.go
+++ b/shards/shards.go
@@ -951,7 +951,7 @@ func (ss *shardedSearcher) List(ctx context.Context, r query.Q, opts *zoekt.List
g, ctx := errgroup.WithContext(ctx)
// Bound work by number of CPUs.
- workers := min(runtime.GOMAXPROCS(0), len(shards))
+ workers := min(runtime.GOMAXPROCS(0), len(shards), 2)
var (
feeder = make(chan zoekt.Searcher, workers)
@@ -976,7 +976,12 @@ func (ss *shardedSearcher) List(ctx context.Context, r query.Q, opts *zoekt.List
// send results down all. If an error is encountered we cancel the errgroup.
for range workers {
g.Go(func() error {
+ i := 0
for s := range feeder {
+ i++
+ if i > 1 {
+ return fmt.Errorf("boom")
+ }
result, err := listOneShard(ctx, s, r, opts)
if err != nil {
return err
zoekt on k/concurrency [$!] ❄️ zoekt-env
❯ go run ./cmd/zoekt-webserver -pprof
2024/03/26 17:25:07 loading 5 shard(s): github.com%2Fgolang%2Fgo_v16.00000.zoekt, github.com%2Fsourcegraph%2Fcody_v16.00000.zoekt, github.com%2Fsourcegraph%2Fsourcegraph_v16.00000.zoekt, github.com%2Fsourcegraph%2Fsymf-private_v16.00000.zoekt, github.com%2Fsourcegraph%2Fzoekt_v16.00000.zoekt
… List (#750)" This reverts commit 931974d. After monitoring production we are seeing increased latencies in List calls. Additionally we are seeing many more scheduler transitions into interactive queued. The only realistic cause of this was this commit, so we are reverting for now until further investigation. Test Plan: go test
… List (#750)" (#752) This reverts commit 931974d. After monitoring production we are seeing increased latencies in List calls. Additionally we are seeing many more scheduler transitions into interactive queued. The only realistic cause of this was this commit, so we are reverting for now until further investigation. Test Plan: go test
… List (#750) Previously List would never call proc.Yield, which broke the co-operative scheduler in the case of slow List calls. Additionally, we used naive concurrency (a large buffered channel), which shows up in the profiler as 30% of CPU spent on chan_ related operations under List. This commit follows how Search used to respect proc.Yield. See sched.go in 90ed7bf. We did not copy Search's implementation since it is more complicated than we need: it supports streaming, while List is still batch only. We needed to use errgroup to ensure we drained all channels in the case of an error. Previously we did not need to do this since the channels had a buffer size of len(shards), which guaranteed nothing would ever block. Now channels are never larger than the number of workers (<= GOMAXPROCS). Test Plan: go test covers the no-error cases. For the error cases we manually tested by running zoekt-webserver and adding a random context cancellation. We observed the error being reported and no List goroutines running. Co-authored-by: William Bezuidenhout <[email protected]> (cherry picked from commit 931974d)
… List (#750)" (#752) This reverts commit 931974d. After monitoring production we are seeing increased latencies in List calls. Additionally we are seeing many more scheduler transitions into interactive queued. The only realistic cause of this was this commit, so we are reverting for now until further investigation. Test Plan: go test (cherry picked from commit 8cf8887)
Previously List would never call proc.Yield, which broke the co-operative scheduler in the case of slow List calls. Additionally, we used naive concurrency (a large buffered channel), which shows up in the profiler as 30% of CPU spent on chan_ related operations under List.
This commit follows how Search used to respect proc.Yield. See sched.go in 90ed7bf. We did not copy Search's implementation since it is more complicated than we need: it supports streaming, while List is still batch only.
We needed to use errgroup to ensure we drained all channels in the case of an error. Previously we did not need to do this since the channels had a buffer size of len(shards), which guaranteed nothing would ever block. Now channels are never larger than the number of workers (<= GOMAXPROCS).
Test Plan: go test covers the no-error cases. For the error cases we manually tested by running zoekt-webserver and adding a random context cancellation. We observed the error being reported and no List goroutines running.
Stacked on #749
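For readers following along, here is a minimal sketch of the bounded-worker pattern described above. It is not the zoekt implementation: shard, listResult, listOneShard, and listAll are simplified stand-ins, and the real scheduler integration (proc.Yield via sched.go) is only indicated by comments.

```go
package listsketch

import (
	"context"
	"runtime"
	"sync"

	"golang.org/x/sync/errgroup"
)

// shard and listResult are simplified stand-ins for zoekt.Searcher and the
// per-shard List result; they are not the real zoekt types.
type shard struct{ name string }

type listResult struct{ repos []string }

// listOneShard is a placeholder for the real per-shard List call.
func listOneShard(ctx context.Context, s shard) (listResult, error) {
	return listResult{repos: []string{s.name}}, ctx.Err()
}

// listAll sketches the pattern from the PR description: channels sized by the
// worker count rather than len(shards), with errgroup cancellation used to
// unblock and drain everything if a worker fails.
func listAll(ctx context.Context, shards []shard) ([]listResult, error) {
	g, ctx := errgroup.WithContext(ctx)

	// Bound work by the number of CPUs.
	workers := min(runtime.GOMAXPROCS(0), len(shards))

	feeder := make(chan shard, workers)
	all := make(chan listResult, workers)

	// Feed shards to the workers; stop early if the errgroup is cancelled.
	g.Go(func() error {
		defer close(feeder)
		for _, s := range shards {
			select {
			case feeder <- s:
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		return nil
	})

	// Workers drain feeder and send results down all. The WaitGroup tells us
	// when all can be closed.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		g.Go(func() error {
			defer wg.Done()
			for s := range feeder {
				// In the real code this is where the co-operative scheduler
				// (proc.Yield in sched.go) would be respected per shard.
				result, err := listOneShard(ctx, s)
				if err != nil {
					return err
				}
				select {
				case all <- result:
				case <-ctx.Done():
					return ctx.Err()
				}
			}
			return nil
		})
	}
	go func() {
		wg.Wait()
		close(all)
	}()

	// Drain all completely even on error; with small buffers the senders
	// would otherwise block. g.Wait then reports the first error, if any.
	var results []listResult
	for r := range all {
		results = append(results, r)
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return results, nil
}
```

The key property is that, because the buffers are bounded by the worker count instead of len(shards), the collector has to keep draining the results channel even when a worker errors, with the errgroup's context cancellation unblocking the remaining senders.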