[BugFix] fix compression context pool slow down after long running (backport #53172) #53231

mergify · 2024-11-27T05:58:30Z

Why I'm doing:

In the current implementation, we use moodycamel::ConcurrentQueue to reuse compression contexts. Each time we start compressing a block, we try to dequeue a context from the pool, and we return the context to the pool after compression finishes.
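The get-and-return cycle described above can be sketched as follows. This is a minimal illustration only: ToyContext, ToyContextPool, and contexts_created_for are hypothetical names, and a mutex-protected vector stands in for the lock-free queue used by the real pool.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compression context.
struct ToyContext {
    int uses = 0;
    bool dirty = false;
};

// Sketch of a reuse pool: get() hands out a pooled context when one is
// available, and the returned handle gives the context back to the pool
// when it goes out of scope. The pool must outlive every handle.
class ToyContextPool {
public:
    using Ref = std::unique_ptr<ToyContext, std::function<void(ToyContext*)>>;

    Ref get() {
        ToyContext* ctx = nullptr;
        {
            std::lock_guard<std::mutex> guard(_mu);
            if (!_free.empty()) {
                ctx = _free.back().release();
                _free.pop_back();
            } else {
                ctx = new ToyContext();
                ++_created;
            }
        }
        return Ref(ctx, [this](ToyContext* c) { add(std::unique_ptr<ToyContext>(c)); });
    }

    std::size_t created() const {
        std::lock_guard<std::mutex> guard(_mu);
        return _created;
    }

private:
    // Reset the context, then make it available for the next get().
    void add(std::unique_ptr<ToyContext> ctx) {
        ctx->dirty = false;
        std::lock_guard<std::mutex> guard(_mu);
        _free.push_back(std::move(ctx));
    }

    mutable std::mutex _mu;
    std::vector<std::unique_ptr<ToyContext>> _free;
    std::size_t _created = 0;
};

// Compresses `blocks` blocks one after another and reports how many
// contexts had to be created in total.
std::size_t contexts_created_for(int blocks) {
    ToyContextPool pool;
    for (int i = 0; i < blocks; ++i) {
        auto ctx = pool.get(); // dequeue from the pool, or create on a miss
        ctx->dirty = true;     // pretend to compress one block
        ++ctx->uses;
    }                          // handle destroyed: context returns to the pool
    return pool.created();
}
```

With sequential use, the same context is handed out again for every block, so only one context is ever created regardless of the number of blocks.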

Currently we use the implicit enqueue method, which causes a thread-local producer sub-queue to be allocated automatically for each producing thread, and that sub-queue is not destroyed after the thread finishes:

void add(InternalRef ptr) {
    DCHECK(ptr);
    Status status = _resetter(ptr.get());
    // If the reset fails, drop this context instead of returning it to the pool.
    if (!status.ok()) {
        return;
    }

    _ctx_resources.enqueue(std::move(ptr));
}

So after running for a long time, the number of sub-queues keeps growing without bound, which slows down consumers.

The documentation (https://github.com/cameron314/concurrentqueue?tab=readme-ov-file#basic-use) also recommends using explicit producer tokens instead.

What I'm doing:

Use explicit producer tokens to avoid the overhead of too many sub-queues.
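To make the difference concrete, here is a deliberately simplified, single-threaded model of the two enqueue paths. It is not how moodycamel::ConcurrentQueue is implemented internally: ToyQueue, its Token, and both helper functions are hypothetical names, and each loop iteration stands in for one short-lived worker thread. The model assumes that a sub-queue created by the implicit path is never handed back, while a sub-queue owned by a token becomes reusable once the token is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model only; all names are hypothetical.
class ToyQueue {
public:
    // RAII handle loosely modelled on an explicit producer token: destroying
    // it marks the owned sub-queue as reusable by a later token.
    class Token {
    public:
        explicit Token(ToyQueue& q) : _q(q), _slot(q.acquire_slot()) {}
        ~Token() { _q._subs[_slot].reusable = true; }
        Token(const Token&) = delete;
        Token& operator=(const Token&) = delete;

    private:
        friend class ToyQueue;
        ToyQueue& _q;
        std::size_t _slot;
    };

    // Implicit path: a worker seen for the first time gets a fresh sub-queue
    // that is never handed back in this model.
    void enqueue_implicit_from_new_worker(int v) {
        _subs.emplace_back();
        _subs.back().items.push_back(v);
    }

    // Explicit path: the item goes to the sub-queue owned by the token.
    void enqueue(const Token& t, int v) { _subs[t._slot].items.push_back(v); }

    // A consumer scans the sub-queues until it finds one with an item.
    bool try_dequeue(int& out) {
        for (auto& s : _subs) {
            if (!s.items.empty()) {
                out = s.items.front();
                s.items.pop_front();
                return true;
            }
        }
        return false;
    }

    std::size_t sub_queue_count() const { return _subs.size(); }

private:
    struct SubQueue {
        std::deque<int> items;
        bool reusable = false;
    };

    // Reuse a released sub-queue if there is one, otherwise create a new one.
    std::size_t acquire_slot() {
        for (std::size_t i = 0; i < _subs.size(); ++i) {
            if (_subs[i].reusable) {
                _subs[i].reusable = false;
                return i;
            }
        }
        _subs.emplace_back();
        return _subs.size() - 1;
    }

    std::vector<SubQueue> _subs;
};

// Each iteration simulates one short-lived worker returning a context.
std::size_t sub_queues_after_implicit(int workers) {
    ToyQueue q;
    int ctx = 0;
    for (int w = 0; w < workers; ++w) {
        q.enqueue_implicit_from_new_worker(w);
        q.try_dequeue(ctx); // the context is taken out again for reuse
    }
    return q.sub_queue_count();
}

std::size_t sub_queues_after_explicit(int workers) {
    ToyQueue q;
    int ctx = 0;
    for (int w = 0; w < workers; ++w) {
        ToyQueue::Token token(q); // lives as long as the simulated worker
        q.enqueue(token, w);
        q.try_dequeue(ctx);
    }
    return q.sub_queue_count();
}
```

In this model the implicit path ends up with one sub-queue per worker, all of which a consumer may have to scan, while the token path keeps reusing a single sub-queue. In the real library the corresponding calls are constructing a moodycamel::ProducerToken from the queue and passing it as the first argument to enqueue.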

This pull request introduces improvements to the compression context pool and adds new tests to ensure the robustness of the multi-threaded context retrieval. The most important changes include the addition of a producer token to optimize the context pool and the introduction of a new multi-threaded test.

Improvements to compression context pool:

be/src/util/compression/compression_context_pool.h: Added a thread-local ProducerToken to the add method to reduce the overhead of multiple sub-queues when enqueuing contexts.

Enhancements to testing:

be/test/util/block_compression_test.cpp: Included the compression_context_pool_singletons.h header to support new tests.
be/test/util/block_compression_test.cpp: Added a new test test_multi_thread_get_ctx to verify the behavior of multi-threaded context retrieval from the LZ4F context pool.

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

I have checked the version labels which the pr will be auto-backported to the target branch
- 3.4
- 3.3
- 3.2
- 3.1
- 3.0
  This is an automatic backport of pull request [BugFix] fix compression context pool slow down after long running #53172 done by Mergify.

[BugFix] fix compression context pool slow down after long running (#…

658c84a

…53172) Signed-off-by: luohaha <[email protected]> (cherry picked from commit b141be8)

mergify bot mentioned this pull request Nov 27, 2024

[BugFix] fix compression context pool slow down after long running #53172

Merged

24 tasks

github-actions bot assigned luohaha Nov 27, 2024

github-actions bot added the automerge label Nov 27, 2024

wanpengfei-git enabled auto-merge (squash) November 27, 2024 05:59

luohaha approved these changes Nov 27, 2024

wanpengfei-git merged commit 82da570 into branch-3.3 Nov 27, 2024
35 of 36 checks passed

wanpengfei-git deleted the mergify/bp/branch-3.3/pr-53172 branch November 27, 2024 06:24

github-actions bot added the version:3.3.7 label Nov 27, 2024
