[BugFix] fix compression context pool slow down after long running (backport #53172) #53231
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Why I'm doing:
In current implementation, we use
moodycamel::concurrentqueue
to reuse compression context, each time we start compress one block, we will try to dequeue ctx from pool, and then return ctx to pool after compression finish.And now we use implicit enqueue method, which causes an automatically-allocated thread-local producer sub-queue to be allocated, and it won't destroy after thread finish:
So after long running, sub-queue will keep growing without bound and slow down consumer.
And in doc (https://github.com/cameron314/concurrentqueue?tab=readme-ov-file#basic-use), author recommend to use explicit producer tokens instead.
What I'm doing:
Use explicit producer tokens to avoid the overhead of too many sub-ququeues.
This pull request introduces improvements to the compression context pool and adds new tests to ensure the robustness of the multi-threaded context retrieval. The most important changes include the addition of a producer token to optimize the context pool and the introduction of a new multi-threaded test.
Improvements to compression context pool:
be/src/util/compression/compression_context_pool.h
: Added a thread-localProducerToken
to theadd
method to reduce the overhead of multiple sub-queues when enqueuing contexts.Enhancements to testing:
be/test/util/block_compression_test.cpp
: Included thecompression_context_pool_singletons.h
header to support new tests.be/test/util/block_compression_test.cpp
: Added a new testtest_multi_thread_get_ctx
to verify the behavior of multi-threaded context retrieval from the LZ4F context pool.What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request [BugFix] fix compression context pool slow down after long running #53172 done by Mergify.
Why I'm doing:
In current implementation, we use
moodycamel::concurrentqueue
to reuse compression context, each time we start compress one block, we will try to dequeue ctx from pool, and then return ctx to pool after compression finish.And now we use implicit enqueue method, which causes an automatically-allocated thread-local producer sub-queue to be allocated, and it won't destroy after thread finish:
So after long running, sub-queue will keep growing without bound and slow down consumer.
And in doc (https://github.com/cameron314/concurrentqueue?tab=readme-ov-file#basic-use), author recommend to use explicit producer tokens instead.
What I'm doing:
Use explicit producer tokens to avoid the overhead of too many sub-ququeues.
This pull request introduces improvements to the compression context pool and adds new tests to ensure the robustness of the multi-threaded context retrieval. The most important changes include the addition of a producer token to optimize the context pool and the introduction of a new multi-threaded test.
Improvements to compression context pool:
be/src/util/compression/compression_context_pool.h
: Added a thread-localProducerToken
to theadd
method to reduce the overhead of multiple sub-queues when enqueuing contexts.Enhancements to testing:
be/test/util/block_compression_test.cpp
: Included thecompression_context_pool_singletons.h
header to support new tests.be/test/util/block_compression_test.cpp
: Added a new testtest_multi_thread_get_ctx
to verify the behavior of multi-threaded context retrieval from the LZ4F context pool.What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist: