Column: optimize filter #9670

Open
wants to merge 4 commits into
base: master

Conversation

Lloyd-Pottiger
Contributor

@Lloyd-Pottiger Lloyd-Pottiger commented Nov 25, 2024

What problem does this PR solve?

Issue Number: ref #6233

Problem Summary:

What is changed and how it works?

Rewrite the `IColumn::filter` implementation with AVX2, which improves performance by up to ~10x.
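The diff itself is not shown in this conversation, so the following is only a minimal sketch of the general technique that vectorized filters commonly use, not the code in this PR. It processes the filter 32 bytes at a time (the width of one AVX2 register), turns each block into a 32-bit mask, and takes a fast path when the block is entirely dropped or entirely kept. The mask is built with a scalar loop here so the sketch compiles anywhere; an AVX2 build would typically replace that loop with a vector compare followed by `_mm256_movemask_epi8`. The name `filterByMask` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the PR's code: keeps data[i] wherever filter[i] != 0.
// Assumes data.size() == filter.size().
std::vector<int64_t> filterByMask(const std::vector<int64_t> & data, const std::vector<uint8_t> & filter)
{
    assert(data.size() == filter.size());
    constexpr size_t kBlock = 32; // one AVX2 register holds 32 filter bytes
    std::vector<int64_t> out;
    out.reserve(data.size());

    size_t i = 0;
    for (; i + kBlock <= data.size(); i += kBlock)
    {
        // Scalar stand-in for a vector compare plus _mm256_movemask_epi8:
        // bit j of mask is set when filter[i + j] is non-zero.
        uint32_t mask = 0;
        for (size_t j = 0; j < kBlock; ++j)
            mask |= static_cast<uint32_t>(filter[i + j] != 0) << j;

        if (mask == 0)
            continue; // whole block filtered out, nothing to copy
        if (mask == 0xFFFFFFFFu)
        {
            // Whole block kept: copy all 32 values at once.
            out.insert(out.end(), data.begin() + i, data.begin() + i + kBlock);
            continue;
        }
        // Mixed block: walk the mask until no set bits remain.
        for (size_t j = 0; mask != 0; ++j, mask >>= 1)
            if (mask & 1u)
                out.push_back(data[i + j]);
    }
    // Tail shorter than one block is handled element by element.
    for (; i < data.size(); ++i)
        if (filter[i])
            out.push_back(data[i]);
    return out;
}
```

The two fast paths are where a mask-based approach tends to win: an all-zero block costs one comparison, and an all-one block becomes a single bulk copy.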

Perf test result (times in ns, smaller is better):

$ ./dbms/bench_dbms --benchmark_filter="columnFilter*"          
2024-11-27T15:16:36+08:00
Running ./dbms/bench_dbms
Run on (72 X 3299.98 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x36)
  L1 Instruction 32 KiB (x36)
  L2 Unified 1024 KiB (x36)
  L3 Unified 25344 KiB (x2)
Load Average: 11.32, 18.49, 38.43
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
columnFilter/sse2_00        3866 ns         3849 ns       182665
columnFilter/avx2_00        2614 ns         2603 ns       270405
columnFilter/sse2_01       22352 ns        22252 ns        32578
columnFilter/avx2_01        4663 ns         4642 ns       159532
columnFilter/sse2_10      147419 ns       146777 ns         4694
columnFilter/avx2_10       15518 ns        15453 ns        45420
columnFilter/sse2_20      215868 ns       214854 ns         3274
columnFilter/avx2_20       25262 ns        25147 ns        29981
columnFilter/sse2_30      284431 ns       283065 ns         2463
columnFilter/avx2_30       30250 ns        30120 ns        24217
columnFilter/sse2_40      345727 ns       344208 ns         2040
columnFilter/avx2_40       35211 ns        35066 ns        19974
columnFilter/sse2_50      388276 ns       386500 ns         1850
columnFilter/avx2_50       41809 ns        41632 ns        17209
columnFilter/sse2_60      360463 ns       358898 ns         1962
columnFilter/avx2_60       47454 ns        47242 ns        14610
columnFilter/sse2_70      306273 ns       304934 ns         2284
columnFilter/avx2_70       52609 ns        52386 ns        13615
columnFilter/sse2_80      249978 ns       248908 ns         2825
columnFilter/avx2_80       61784 ns        61495 ns        12060
columnFilter/sse2_90      180275 ns       179494 ns         3867
columnFilter/avx2_90       70565 ns        70232 ns        11137
columnFilter/sse2_99       48001 ns        47784 ns        15408
columnFilter/avx2_99       47023 ns        46831 ns        15164
columnFilter/sse2_100     153413 ns       152821 ns         4436
columnFilter/avx2_100     148977 ns       148396 ns         4720
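To relate the table to the "up to ~10x" claim, the per-case speedup can be computed as the SSE2 CPU time divided by the AVX2 CPU time. The sketch below only does that arithmetic on the CPU column copied from the output above; the `Row` struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative container for one benchmark case; not part of the PR.
struct Row
{
    std::string label; // suffix of the benchmark name, e.g. "40"
    double sse2_ns;    // CPU time of columnFilter/sse2_<label>
    double avx2_ns;    // CPU time of columnFilter/avx2_<label>
};

// CPU times (ns) copied from the benchmark output above.
inline std::vector<Row> benchmarkRows()
{
    return {
        {"00", 3849.0, 2603.0},     {"01", 22252.0, 4642.0},    {"10", 146777.0, 15453.0},
        {"20", 214854.0, 25147.0},  {"30", 283065.0, 30120.0},  {"40", 344208.0, 35066.0},
        {"50", 386500.0, 41632.0},  {"60", 358898.0, 47242.0},  {"70", 304934.0, 52386.0},
        {"80", 248908.0, 61495.0},  {"90", 179494.0, 70232.0},  {"99", 47784.0, 46831.0},
        {"100", 152821.0, 148396.0},
    };
}

// Speedup of the AVX2 variant over the SSE2 variant for one case.
inline double speedup(const Row & r)
{
    return r.sse2_ns / r.avx2_ns;
}

// Returns the case with the largest speedup.
inline Row bestRow(const std::vector<Row> & rows)
{
    Row best = rows.front();
    for (const Row & r : rows)
        if (speedup(r) > speedup(best))
            best = r;
    return best;
}
```

On these numbers the largest ratio is the `_40` case at roughly 9.8x, while the `_99` and `_100` cases are close to 1x, so the gain depends strongly on the case being measured.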

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Nov 25, 2024
Contributor

ti-chi-bot bot commented Nov 25, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from lloyd-pottiger, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 25, 2024
@Lloyd-Pottiger Lloyd-Pottiger force-pushed the optimize-filter branch 2 times, most recently from 1e4d2a2 to 3a82055 Compare November 26, 2024 03:53
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
@purelind
Collaborator

/retest

@JinheLin
Contributor

JinheLin commented Nov 27, 2024

Did you compare the performance at different filtration rates, for example 1%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 99%?

Also, please set the column size to DEFAULT_BLOCK_SIZE.
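One way to produce the inputs for such a comparison is sketched below: a helper that builds a 0/1 filter with an exact share of ones at shuffled positions. Everything here is an assumption made for illustration: the helper name `makeFilter`, the seed, and the value `65536` used as a stand-in for `DEFAULT_BLOCK_SIZE`, whose real value is not stated in this thread. The thread also does not say whether "filtration rate" counts kept or dropped rows, so `percent` below simply means the share of entries set to 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Assumed stand-in for DEFAULT_BLOCK_SIZE; the real value is not shown in this thread.
constexpr size_t kAssumedBlockSize = 65536;

// Hypothetical helper: returns a 0/1 filter of `rows` entries in which exactly
// rows * percent / 100 entries are 1, placed at shuffled positions.
std::vector<uint8_t> makeFilter(size_t rows, unsigned percent, uint32_t seed = 42)
{
    assert(percent <= 100);
    std::vector<uint8_t> filter(rows, 0);
    const size_t ones = rows * percent / 100;
    // Set the first `ones` entries, then shuffle so they are spread across the column.
    std::fill(filter.begin(), filter.begin() + ones, uint8_t{1});
    std::mt19937 rng(seed);
    std::shuffle(filter.begin(), filter.end(), rng);
    return filter;
}
```

Fixing the count of ones rather than drawing each entry independently keeps the rate exact for every run, which makes results at 1% or 99% easier to compare.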

@Lloyd-Pottiger
Contributor Author

> Did you compare the performance at different filtration rates, for example 1%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 99%?
>
> Also, please set the column size to DEFAULT_BLOCK_SIZE.

$ ./dbms/src/Columns/tests/column_vector_perftest int64 filter 10000 100 30
Test int64-filter rows=10000 columns=100 seconds=30
FilterV1: 514449    
FilterV2: 1245839

100 / 10000 is the filtration rate.

@JinheLin
Contributor

Add micro-benchmark for column filter: Lloyd-Pottiger#18

./dbms/bench_dbms --benchmark_filter="columnFilter*"
2024-11-27T14:37:00+08:00
Running ./dbms/bench_dbms
Run on (72 X 3300.01 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x36)
  L1 Instruction 32 KiB (x36)
  L2 Unified 1024 KiB (x36)
  L3 Unified 25344 KiB (x2)
Load Average: 10.74, 19.38, 23.60
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
columnFilter/sse2_00        2697 ns         2686 ns       246181
columnFilter/sse2_01       16080 ns        16017 ns        41333
columnFilter/sse2_10      127919 ns       127345 ns         5714
columnFilter/sse2_20      202684 ns       201722 ns         3510
columnFilter/sse2_30      258492 ns       257367 ns         2754
columnFilter/sse2_40      309530 ns       308144 ns         2289
columnFilter/sse2_50      331585 ns       330077 ns         2063
columnFilter/sse2_60      320293 ns       318848 ns         2250
columnFilter/sse2_70      271073 ns       269877 ns         2633
columnFilter/sse2_80      210172 ns       209248 ns         3346
columnFilter/sse2_90      151390 ns       150685 ns         4818
columnFilter/sse2_99       42376 ns        42196 ns        16193
columnFilter/sse2_100     158190 ns       157494 ns         4296
columnFilter/avx2_00        2585 ns         2574 ns       274291
columnFilter/avx2_01        4273 ns         4257 ns       162088
columnFilter/avx2_10       14650 ns        14586 ns        49541
columnFilter/avx2_20       21996 ns        21904 ns        32461
columnFilter/avx2_30       27256 ns        27141 ns        25264
columnFilter/avx2_40       33599 ns        33456 ns        20461
columnFilter/avx2_50       39686 ns        39513 ns        18076
columnFilter/avx2_60       45817 ns        45624 ns        15675
columnFilter/avx2_70       54466 ns        54213 ns        11688
columnFilter/avx2_80       59390 ns        59112 ns        12559
columnFilter/avx2_90       65427 ns        65135 ns        10859
columnFilter/avx2_99       52000 ns        51764 ns        10000
columnFilter/avx2_100     157735 ns       157017 ns         4695

@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 27, 2024
Signed-off-by: Lloyd-Pottiger <[email protected]>