Indexing: properly block on shard building (#689)
When indexing, we build shards in parallel based on the `parallelism` flag.
Each shard handles ~100MB of document contents, which should limit the memory
usage to roughly `100MB * parallelism`.

Looking at the size of the buffered document contents in memory profiles, we
see much higher usage than this. The issue seems to be that we continue to
buffer up documents even if all threads are busy building shards. This can be a
real problem if shards take a long time to build (say, because ctags is slow):
we could end up buffering far more than `100MB * parallelism` of content in
memory at once.

This change fixes the throttling logic by taking a throttle slot before
starting the shard-building goroutine rather than inside it, so we block
indexing when all threads are busy building shards.
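
As an illustration of why the position of the throttle send matters, here is a minimal standalone sketch. It is not the zoekt code: `peakPending`, `jobs`, and the sleep standing in for shard building are hypothetical names and values chosen for the example. It uses a buffered channel as a semaphore, and counts how many batches have been handed off but not yet built.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakPending hands `jobs` batches to worker goroutines and returns the
// largest number of batches that were handed off but not yet finished.
// Hypothetical helper for illustration only.
func peakPending(jobs, parallelism int, acquireBeforeSpawn bool) int {
	throttle := make(chan int, parallelism) // buffered channel as a semaphore
	var wg sync.WaitGroup
	var pending int64 // batches currently held in memory
	var peak int64    // only touched by the producer loop below

	for i := 0; i < jobs; i++ {
		if acquireBeforeSpawn {
			// Producer blocks here when all slots are taken.
			throttle <- 1
		}
		if n := atomic.AddInt64(&pending, 1); n > peak {
			peak = n
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			if !acquireBeforeSpawn {
				// Only the worker blocks; the producer keeps handing off.
				throttle <- 1
			}
			time.Sleep(2 * time.Millisecond) // stand-in for building a shard
			atomic.AddInt64(&pending, -1)    // batch is no longer buffered
			<-throttle                       // release the slot last
		}()
	}
	wg.Wait()
	return int(peak)
}

func main() {
	fmt.Println("acquire inside goroutine:", peakPending(32, 4, false))
	fmt.Println("acquire before goroutine:", peakPending(32, 4, true))
}
```

When the slot is acquired before the goroutine starts, the peak can never exceed `parallelism`, because a batch is only counted while its slot is held. When the slot is acquired inside the goroutine, the producer loop is never blocked, so the peak is limited only by `jobs` and depends on scheduling.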
jtibshirani committed Nov 16, 2023
1 parent 943fc74 commit 61edc95
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion build/builder.go
@@ -854,8 +854,8 @@ func (b *Builder) flush() error {
 
 	if b.opts.Parallelism > 1 && b.opts.MemProfile == "" {
 		b.building.Add(1)
+		b.throttle <- 1
 		go func() {
-			b.throttle <- 1
 			done, err := b.buildShard(todo, shard)
 			<-b.throttle
