feat(leap): add performance article #760

vaeng · 2024-01-08T12:17:27Z

~~To make the links work, this should be merged after #759~~ ✔️
~~To make the image work, exercism/images#24 should be merged beforehand~~. ✔️

siebenschlaefer · 2024-01-09T23:28:10Z

Hej!

When I ran the tests on my computer I got significantly different results:

- I compiled with GCC 13.2.0 and `-O3`.
- I ran `cpupower frequency-set --governor performance` to avoid any interference from the processor.
- I ran `benchmark_leap --benchmark_repetitions=100`.

On my computer (with an AMD Ryzen 1920X) I got these results (slightly shortened):

```text
Benchmark                                     Time             CPU   Iterations
-------------------------------------------------------------------------------
BM_Leap_boolean_chain_mean                 1925 ns         1924 ns          100
BM_Leap_boolean_chain_median               1952 ns         1952 ns          100
BM_Leap_boolean_chain_stddev               37.9 ns         37.9 ns          100
BM_Leap_boolean_chain_cv                   1.97 %          1.97 %           100
BM_Leap_boolean_chain_inverse_mean         1213 ns         1213 ns          100
BM_Leap_boolean_chain_inverse_median       1210 ns         1210 ns          100
BM_Leap_boolean_chain_inverse_stddev       20.2 ns         20.2 ns          100
BM_Leap_boolean_chain_inverse_cv           1.67 %          1.67 %           100
BM_Leap_ternary_mean                        976 ns          976 ns          100
BM_Leap_ternary_median                      968 ns          968 ns          100
BM_Leap_ternary_stddev                     19.6 ns         19.6 ns          100
BM_Leap_ternary_cv                         2.01 %          2.01 %           100
BM_Leap_chrono_mean                         949 ns          949 ns          100
BM_Leap_chrono_median                       947 ns          947 ns          100
BM_Leap_chrono_stddev                      7.14 ns         7.11 ns          100
BM_Leap_chrono_cv                          0.75 %          0.75 %           100
BM_Leap_boost_mean                         1685 ns         1685 ns          100
BM_Leap_boost_median                       1683 ns         1683 ns          100
BM_Leap_boost_stddev                       9.90 ns         9.93 ns          100
BM_Leap_boost_cv                           0.59 %          0.59 %           100
BM_empty_read_mean                        0.000 ns        0.000 ns          100
BM_empty_read_median                      0.000 ns        0.000 ns          100
BM_empty_read_stddev                      0.000 ns        0.000 ns          100
BM_empty_read_cv                           1.78 %          1.20 %           100
```

It looks like "boolean_chain" is among the slowest approaches, "boolean_chain_inverse" is significantly faster but "ternary" beats both. That's surprising. Or am I misinterpreting the numbers?
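For anyone reading along without the article open, the three hand-written variants presumably look something like the sketch below. The function names mirror the benchmark labels, but the bodies are assumptions, since the article's code is not quoted in this thread.

```cpp
// Hypothetical reconstructions of the three hand-written variants.
// Names follow the benchmark labels; the bodies are assumptions.

// Common case first: most years are rejected by the `% 4` test alone.
constexpr bool leap_boolean_chain(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

// Rare case first: `% 400` is tested before anything else.
constexpr bool leap_boolean_chain_inverse(int year) {
    return year % 400 == 0 || (year % 100 != 0 && year % 4 == 0);
}

// Ternary: century years are decided by `% 400`, all others by `% 4`.
constexpr bool leap_ternary(int year) {
    return year % 100 == 0 ? year % 400 == 0 : year % 4 == 0;
}

// All three variants must agree on the usual edge cases.
static_assert(leap_boolean_chain(2000) && leap_boolean_chain_inverse(2000) && leap_ternary(2000),
              "2000 is a leap year");
static_assert(!leap_boolean_chain(1900) && !leap_boolean_chain_inverse(1900) && !leap_ternary(1900),
              "1900 is not a leap year");
static_assert(leap_boolean_chain(2024) && leap_boolean_chain_inverse(2024) && leap_ternary(2024),
              "2024 is a leap year");
static_assert(!leap_boolean_chain(2023) && !leap_boolean_chain_inverse(2023) && !leap_ternary(2023),
              "2023 is not a leap year");
```

The variants are logically equivalent, so any timing difference comes from how the compiler and CPU handle the different evaluation orders.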

vaeng · 2024-01-10T09:14:39Z

> BM_Leap_boolean_chain_median 1952 ns
> BM_Leap_boolean_chain_inverse_mean 1213 ns

That is strange. I used `g++ leap_benchmark.cpp -std=c++20 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread -o leap_benchmark`, compiled with g++ version 14.10.0. So the article's values are without `-O3`.

I can't switch my CPU governor to performance because the VM does not support it, but that should not change things so drastically. I reran the tests many times and always got those results, even with `-O3`.

Another go:

```text
❯ gcc leap_benchmark.cpp -std=gnu++2a -O3 -isystem benchmark/include  -Lbenchmark/build/src -lbenchmark -lpthread -o leap_benchmark -lstdc++ -lm
❯ ./leap_benchmark --benchmark_report_aggregates_only=true --benchmark_repetitions=10
BM_Leap_boolean_chain_mean                  937 ns          937 ns           10
BM_Leap_boolean_chain_median                937 ns          937 ns           10
BM_Leap_boolean_chain_stddev               4.00 ns         4.01 ns           10
BM_Leap_boolean_chain_cv                   0.43 %          0.43 %            10
BM_Leap_boolean_chain_inverse_mean         1334 ns         1334 ns           10
BM_Leap_boolean_chain_inverse_median       1333 ns         1333 ns           10
BM_Leap_boolean_chain_inverse_stddev       3.36 ns         3.32 ns           10
BM_Leap_boolean_chain_inverse_cv           0.25 %          0.25 %            10
BM_Leap_ternary_mean                       1060 ns         1060 ns           10
BM_Leap_ternary_median                     1062 ns         1061 ns           10
BM_Leap_ternary_stddev                     2.95 ns         2.90 ns           10
BM_Leap_ternary_cv                         0.28 %          0.27 %            10
BM_Leap_chrono_mean                        1144 ns         1144 ns           10
BM_Leap_chrono_median                      1143 ns         1143 ns           10
BM_Leap_chrono_stddev                      4.75 ns         4.75 ns           10
BM_Leap_chrono_cv                          0.42 %          0.42 %            10
BM_Leap_boost_mean                         1018 ns         1018 ns           10
BM_Leap_boost_median                       1018 ns         1018 ns           10
BM_Leap_boost_stddev                       2.51 ns         2.45 ns           10
BM_Leap_boost_cv                           0.25 %          0.24 %            10
BM_empty_read_mean                        0.614 ns        0.614 ns           10
BM_empty_read_median                      0.614 ns        0.614 ns           10
BM_empty_read_stddev                      0.001 ns        0.001 ns           10
BM_empty_read_cv                           0.13 %          0.12 %            10
```

clechasseur · 2024-01-11T12:35:43Z

Finally managed to run benchmarks on my old, slow computer 😉

```text
$ ./leap_benchmark --benchmark_report_aggregates_only=true --benchmark_repetitions=10
2024-01-11T07:30:31-05:00
Running ./leap_benchmark
Run on (4 X 2712 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 256 KiB (x2)
  L3 Unified 3072 KiB (x1)
Load Average: 0.85, 0.63, 0.62
-------------------------------------------------------------------------------
Benchmark                                     Time             CPU   Iterations
-------------------------------------------------------------------------------
BM_Leap_boolean_chain_mean                17235 ns        17235 ns           10
BM_Leap_boolean_chain_median              17157 ns        17157 ns           10
BM_Leap_boolean_chain_stddev                193 ns          193 ns           10
BM_Leap_boolean_chain_cv                   1.12 %          1.12 %            10
BM_Leap_boolean_chain_inverse_mean        35652 ns        35652 ns           10
BM_Leap_boolean_chain_inverse_median      34359 ns        34359 ns           10
BM_Leap_boolean_chain_inverse_stddev       2747 ns         2747 ns           10
BM_Leap_boolean_chain_inverse_cv           7.70 %          7.70 %            10
BM_Leap_ternary_mean                      21083 ns        21084 ns           10
BM_Leap_ternary_median                    20425 ns        20425 ns           10
BM_Leap_ternary_stddev                     1350 ns         1350 ns           10
BM_Leap_ternary_cv                         6.40 %          6.40 %            10
BM_Leap_chrono_mean                       20505 ns        20506 ns           10
BM_Leap_chrono_median                     19862 ns        19863 ns           10
BM_Leap_chrono_stddev                      1577 ns         1577 ns           10
BM_Leap_chrono_cv                          7.69 %          7.69 %            10
BM_Leap_boost_mean                        36426 ns        36427 ns           10
BM_Leap_boost_median                      35927 ns        35927 ns           10
BM_Leap_boost_stddev                        824 ns          824 ns           10
BM_Leap_boost_cv                           2.26 %          2.26 %            10
BM_empty_read_mean                         1834 ns         1834 ns           10
BM_empty_read_median                       1833 ns         1833 ns           10
BM_empty_read_stddev                       5.73 ns         5.71 ns           10
BM_empty_read_cv                           0.31 %          0.31 %            10
```

Looks like on my computer, boost is much slower than chrono, which differs from @vaeng's results. Aside from that it seems to be similar (slower but proportional).

vaeng · 2024-01-11T21:43:27Z

@clechasseur at least chain and inverse chain are in a "sensible" order.

What do you and @siebenschlaefer think? Should I add or edit some of the article, or let it stand as it is for my machine?

vaeng · 2024-01-12T11:46:39Z

@ErikSchierboom have you had diverging benchmarks for an approach article before?

clechasseur · 2024-01-12T12:12:36Z

When I read the different approach documents for the Leap exercise, I wondered whether results could vary depending on how the CPU handles pipelining. The "inverse chain" is an example; on paper, it looks worse because the edge case is checked first. But what if instructions are reordered in some way? I am wondering whether @siebenschlaefer's results can be explained by the fact that they're using an AMD processor (I am on Intel).
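The "on paper" argument can be made concrete by counting how many years are settled by the very first check in each ordering. This is a minimal sketch, assuming the chain tests `year % 4` first and the inverse chain tests `year % 400` first; the article's code is not quoted here, so both orderings and the helper names are assumptions.

```cpp
// Hypothetical helpers: count the years in [first, last] whose result is
// already known after the first check of each (assumed) ordering.

// Chain: `year % 4 != 0` short-circuits straight to "not a leap year".
constexpr int decided_by_first_check_chain(int first, int last) {
    int count = 0;
    for (int year = first; year <= last; ++year) {
        if (year % 4 != 0) ++count;
    }
    return count;
}

// Inverse chain: `year % 400 == 0` short-circuits straight to "leap year".
constexpr int decided_by_first_check_inverse(int first, int last) {
    int count = 0;
    for (int year = first; year <= last; ++year) {
        if (year % 400 == 0) ++count;
    }
    return count;
}

// One full 400-year cycle of the Gregorian calendar.
static_assert(decided_by_first_check_chain(1, 400) == 300, "300 of 400 years stop at the first check");
static_assert(decided_by_first_check_inverse(1, 400) == 1, "only 1 of 400 years stops at the first check");
```

Over one 400-year cycle the first check settles 300 years in the chain but only 1 in the inverse chain. That only counts source-level short-circuits; it says nothing about what the optimizer or the CPU's branch prediction does with either form.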

I will also admit to not being a fan of such very low-level optimizations in general. Optimizing a method to determine if a year is a leap year seems a bit extreme: is that really the performance bottleneck of your program? Let's say that it's true that the ternary approach is 27% slower; it's 27% of such an infinitesimal amount of time, does it really matter? But I will leave it to you to determine whether the article should be posted: in itself, it seems fine.
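To put a relative figure next to its absolute size, here is a small sketch that computes both from two mean timings. The helper names are hypothetical, and the inputs are the `BM_Leap_boolean_chain_mean` (937 ns) and `BM_Leap_ternary_mean` (1060 ns) values from the `-O3` run posted earlier in this thread, not the article's numbers.

```cpp
// Hypothetical helpers for comparing two benchmark means.

// Relative slowdown of `other_ns` versus `baseline_ns`, in percent.
constexpr double slowdown_percent(double baseline_ns, double other_ns) {
    return (other_ns - baseline_ns) / baseline_ns * 100.0;
}

// Absolute difference in nanoseconds per benchmark iteration.
constexpr double slowdown_ns(double baseline_ns, double other_ns) {
    return other_ns - baseline_ns;
}

// Means taken from the -O3 run posted earlier in this thread.
constexpr double chain_mean_ns = 937.0;
constexpr double ternary_mean_ns = 1060.0;

// 123 ns per benchmark iteration, which is roughly 13 %.
static_assert(slowdown_ns(chain_mean_ns, ternary_mean_ns) == 123.0, "absolute difference");
static_assert(slowdown_percent(chain_mean_ns, ternary_mean_ns) > 13.1 &&
              slowdown_percent(chain_mean_ns, ternary_mean_ns) < 13.2, "relative difference");
```

A double-digit percentage here still corresponds to a difference on the order of a hundred nanoseconds per benchmark iteration.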

vaeng · 2024-01-12T12:55:56Z

> AMD processor (I am on Intel)

My results are AMD as well.

> Let's say that it's true that the ternary approach is 27% slower; it's 27% of such an infinitesimal amount of time, does it really matter?

I should include that in the article.

ErikSchierboom · 2024-01-12T13:34:48Z

> @ErikSchierboom have you had diverging benchmarks for an approach article before?

Nope, but I've been the only one doing them on my articles :D

vaeng · 2024-01-13T09:01:52Z

I think the wording in the article is a good summary of our discussion here @siebenschlaefer @ErikSchierboom.
I'd like to merge the article to see the new invert mechanism :)

iHiD · 2024-01-13T17:30:11Z

Merged. Everyone feel free to do follow-ups if appropriate, but it's good for us to test this -invertible functionality before Tues :)

feat(leap): add performance article

39ee3e8

vaeng added the x:type/docs Work on Documentation label Jan 8, 2024

vaeng requested a review from siebenschlaefer January 8, 2024 12:17

vaeng self-assigned this Jan 8, 2024

fix: update svg url

9f3b344

vaeng requested a review from ErikSchierboom January 12, 2024 11:46

vaeng added 2 commits January 13, 2024 09:53

feat: include some warnings

8a4939e

Merge branch 'feat(leap)--add-performance-article' of https://github.…

2e08778

…com/exercism/cpp into feat(leap)--add-performance-article

iHiD merged commit 77b2998 into main Jan 13, 2024
8 checks passed

iHiD deleted the feat(leap)--add-performance-article branch January 13, 2024 17:29
