Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up improvements to Date_Time_Formatter #7875

Merged
merged 38 commits into from
Sep 28, 2023

Conversation

radeusgd
Copy link
Member

@radeusgd radeusgd commented Sep 22, 2023

Pull Request Description

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@radeusgd radeusgd force-pushed the wip/radeusgd/date-time-formats-followup branch from a0e8034 to 9bbaab9 Compare September 22, 2023 17:13
@radeusgd radeusgd added the CI: No changelog needed Do not require a changelog entry for this PR. label Sep 22, 2023
@radeusgd radeusgd self-assigned this Sep 22, 2023
@radeusgd radeusgd marked this pull request as ready for review September 22, 2023 17:14
@radeusgd radeusgd changed the title use Value_Type in Data_Formatter Follow up improvements to Date_Time_Formatter Sep 22, 2023
Copy link
Member

@jdunkerley jdunkerley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one slight question but looks good.

@radeusgd
Copy link
Member Author

radeusgd commented Sep 27, 2023

Added a benchmark (to Exploratory Benchmarks, so it will not be run by CI by default, I don't think it's important enough to run it, but IMO worth keeping in repo).

Raw results
Found 4 cases to execute
Benchmarking 'Format_Vector_Of_Dates.Naive' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={2 iterations, 5 seconds each}]
Warmup duration:    11752.7002 ms
Warmup invocations: 3
Warmup avg time:    3910.385 ms
Measurement duration:    12369.5478 ms
Measurement invocations: 5
Measurement avg time:    2473.712 ms
Benchmark 'Format_Vector_Of_Dates.Naive' finished in 24129.273 ms
Benchmarking 'Format_Vector_Of_Dates.Prepared_Formatter' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={2 iterations, 5 seconds each}]
Warmup duration:    10190.8434 ms
Warmup invocations: 1681
Warmup avg time:    5.949 ms
Measurement duration:    10242.1637 ms
Measurement invocations: 2431
Measurement avg time:    4.115 ms
Benchmark 'Format_Vector_Of_Dates.Prepared_Formatter' finished in 20434.452 ms
Benchmarking 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={2 iterations, 5 seconds each}]
Warmup duration:    10040.4148 ms
Warmup invocations: 793
Warmup avg time:    12.621 ms
Measurement duration:    10045.1645 ms
Measurement invocations: 930
Measurement avg time:    10.761 ms
Benchmark 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' finished in 20086.788 ms
Benchmarking 'Format_Vector_Of_Dates.Column_Format' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={2 iterations, 5 seconds each}]
Warmup duration:    10026.1589 ms
Warmup invocations: 549
Warmup avg time:    18.221 ms
Measurement duration:    10044.7375 ms
Measurement invocations: 637
Measurement avg time:    15.713 ms
Benchmark 'Format_Vector_Of_Dates.Column_Format' finished in 20072.428 ms

Re-creating the formatter at each invocation (default if you call something like vector.map _.format "str") is about 600x slower (!) than using a prepared formatter instance.

In comparison, using Column.format (which creates a formatter under the hood), is only ~4-5 times slower.

We should indeed try adding an LRU cache for the formats.

Results with shorter warmup (more CI ready)
Found 4 cases to execute
Benchmarking 'Format_Vector_Of_Dates.Naive' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    6927.1965 ms
Warmup invocations: 1
Warmup avg time:    6910.172 ms
Measurement duration:    8217.4258 ms
Measurement invocations: 3
Measurement avg time:    2738.048 ms
Benchmark 'Format_Vector_Of_Dates.Naive' finished in 15151.292 ms
Benchmarking 'Format_Vector_Of_Dates.Prepared_Formatter' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4048.7626 ms
Warmup invocations: 358
Warmup avg time:    11.188 ms
Measurement duration:    8127.921 ms
Measurement invocations: 969
Measurement avg time:    8.264 ms
Benchmark 'Format_Vector_Of_Dates.Prepared_Formatter' finished in 12178.561 ms
Benchmarking 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4008.1436 ms
Warmup invocations: 114
Warmup avg time:    35.117 ms
Measurement duration:    8023.6116 ms
Measurement invocations: 562
Measurement avg time:    14.244 ms
Benchmark 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' finished in 12033.032 ms
Benchmarking 'Format_Vector_Of_Dates.Column_Format' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4016.2714 ms
Warmup invocations: 172
Warmup avg time:    23.333 ms
Measurement duration:    8015.4114 ms
Measurement invocations: 413
Measurement avg time:    19.373 ms
Benchmark 'Format_Vector_Of_Dates.Column_Format' finished in 12033.088 ms

@radeusgd
Copy link
Member Author

radeusgd commented Sep 27, 2023

Benchmarks after implementing the cache

Raw results
Found 4 cases to execute
Benchmarking 'Format_Vector_Of_Dates.Naive' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4051.5825 ms
Warmup invocations: 372
Warmup avg time:    10.753 ms
Measurement duration:    8104.8076 ms
Measurement invocations: 1285
Measurement avg time:    6.229 ms
Benchmark 'Format_Vector_Of_Dates.Naive' finished in 12166.056 ms
Benchmarking 'Format_Vector_Of_Dates.Prepared_Formatter' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4051.4442 ms
Warmup invocations: 831
Warmup avg time:    4.814 ms
Measurement duration:    8175.2936 ms
Measurement invocations: 2022
Measurement avg time:    3.957 ms
Benchmark 'Format_Vector_Of_Dates.Prepared_Formatter' finished in 12228.495 ms
Benchmarking 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4008.8002 ms
Warmup invocations: 259
Warmup avg time:    15.449 ms
Measurement duration:    8026.8088 ms
Measurement invocations: 682
Measurement avg time:    11.734 ms
Benchmark 'Format_Vector_Of_Dates.Java_Formatter_recreated_on_each' finished in 12036.888 ms
Benchmarking 'Format_Vector_Of_Dates.Column_Format' with configuration: [warmup={1 iterations, 4 seconds each}, measurement={2 iterations, 4 seconds each}]
Warmup duration:    4013.7933 ms
Warmup invocations: 214
Warmup avg time:    18.735 ms
Measurement duration:    8061.3893 ms
Measurement invocations: 832
Measurement avg time:    9.617 ms
Benchmark 'Format_Vector_Of_Dates.Column_Format' finished in 12076.766 ms
Approach Avg Measurement Time
Naïve (without cache) 2738.048 ms
Naïve (with cache) 6.229 ms
Formatter precomputed 3.957 ms
Java ofPattern re-created on each call 11.734 ms
Column.format 9.617 ms

Adding the cache was definitely worth it - on this example we are about 400x faster than before!

After implementing the cache all timings are on par between ~4-11ms.

@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Sep 27, 2023
@radeusgd radeusgd mentioned this pull request Sep 27, 2023
5 tasks
@radeusgd radeusgd added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Sep 27, 2023
@mergify mergify bot merged commit 8d92616 into develop Sep 28, 2023
23 checks passed
@mergify mergify bot deleted the wip/radeusgd/date-time-formats-followup branch September 28, 2023 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built. CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Additional improvements to date-time format Make Data_Formatter.parse use Value_Type
3 participants