diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index d8eaef37..d94f2286 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2024-01-04T11:57:28","documenter_version":"1.2.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2024-01-17T11:59:15","documenter_version":"1.2.1"}} \ No newline at end of file diff --git a/dev/index.html b/dev/index.html index 8858ae54..7c8de244 100644 --- a/dev/index.html +++ b/dev/index.html @@ -32,4 +32,4 @@ 0.024 ns (0 allocations: 0 bytes) 3

As a rule of thumb, if a benchmark reports that it took less than a nanosecond to run, this hoisting probably occurred. You can avoid it by referencing and dereferencing the interpolated variables:

julia> @btime $(Ref(a))[] + $(Ref(b))[]
   1.277 ns (0 allocations: 0 bytes)
3

As described in the Manual, the BenchmarkTools package supports many other features, both for additional output and for more fine-grained control over the benchmarking process.

diff --git a/dev/internals/index.html b/dev/internals/index.html index d391d482..ee3f131f 100644 --- a/dev/internals/index.html +++ b/dev/internals/index.html @@ -1,8 +1,8 @@ Internals · BenchmarkTools.jl

Internals

Base.isemptyMethod
isempty(group::BenchmarkGroup)

Return true if group is empty. This will first run clear_empty! on group to recursively remove any empty subgroups.

source
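
For illustration, a minimal sketch of this behavior (the "empty" key is arbitrary):

using BenchmarkTools

g = BenchmarkGroup()
g["empty"]     # accessing a missing key creates an empty subgroup
isempty(g)     # true: clear_empty! prunes the subgroup before the check
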
BenchmarkTools._withprogressMethod
_withprogress(
     name::AbstractString,
     group::BenchmarkGroup;
     kwargs...,
 ) do progressid, nleaves, ndone
     ...
end

Execute the do block with the following arguments:

  • progressid: logging ID to be used for @logmsg.
  • nleaves: total number of benchmarks counted at the root benchmark group.
  • ndone: number of completed benchmarks.

They are either extracted from kwargs (for sub-groups) or newly created (for the root benchmark group).

source
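
Since this is an internal helper, the following is only a hedged sketch of the calling convention; the log record shown is illustrative, not the one BenchmarkTools itself emits:

using BenchmarkTools, Logging

suite = BenchmarkGroup()
suite["sin"] = @benchmarkable sin(1.0)

BenchmarkTools._withprogress("benchmarking", suite) do progressid, nleaves, ndone
    # Report fractional progress against the logging ID supplied by _withprogress.
    @logmsg Logging.Info "benchmarking" _id = progressid progress = ndone / nleaves
end
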
BenchmarkTools.loadMethod
BenchmarkTools.load(filename)

Load serialized benchmarking objects (e.g. results or parameters) from a JSON file.

source
BenchmarkTools.quasiquote!Method
quasiquote!(expr::Expr, vars::Vector{Symbol}, vals::Vector{Expr})

Replace every interpolated value in expr with a placeholder variable and store the resulting variable / value pairings in vars and vals.

source
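
A rough sketch of what this internal helper does; the placeholder below is illustrative (real placeholders are gensyms):

using BenchmarkTools

vars = Symbol[]
vals = Expr[]
ex = Meta.parse("sin(\$x)")   # expression containing the interpolation node $x
BenchmarkTools.quasiquote!(ex, vars, vals)
# ex now roughly calls sin on a placeholder variable; vars holds that
# placeholder and vals holds the interpolated expression it stands for.
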
BenchmarkTools.saveMethod
BenchmarkTools.save(filename, args...)

Save serialized benchmarking objects (e.g. results or parameters) to a JSON file.

source
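
For example, a save/load round trip (the filename is arbitrary; load returns a vector of the stored objects):

using BenchmarkTools

t = run(@benchmarkable sum($(rand(1000))))
BenchmarkTools.save("results.json", t)
loaded = BenchmarkTools.load("results.json")[1]
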
diff --git a/dev/linuxtips/index.html b/dev/linuxtips/index.html index e896bcd4..f10e742b 100644 --- a/dev/linuxtips/index.html +++ b/dev/linuxtips/index.html @@ -75,4 +75,4 @@ MCE: 0 0 Machine check exceptions MCP: 61112 61112 Machine check polls ERR: 0 MIS: 0

Some interrupts, like non-maskable interrupts (NMI), can't be redirected, but you can change the SMP affinities of the rest by writing processor indices to /proc/irq/n/smp_affinity_list, where n is the IRQ number. Here's an example that sets IRQ 22's SMP affinity to processors 0, 1, and 2:

➜ echo 0-2 | sudo tee /proc/irq/22/smp_affinity_list

The optimal way to configure SMP affinities depends a lot on your benchmarks and benchmarking process. For example, if you're running a lot of network-bound benchmarks, it can sometimes be more beneficial to evenly balance ethernet driver interrupts (usually named something like eth0-*) than to restrict them to specific processors.

A smoke test for determining the impact of IRQs on benchmark results is to see what happens when you turn on/off an IRQ load balancer like irqbalance. If this has a noticeable effect on your results, it might be worth playing around with SMP affinities to figure out which IRQs should be directed away from your shielded processors.

Performance monitoring interrupts (PMIs) and perf

Performance monitoring interrupts (PMIs) are sent by the kernel's perf subsystem, which is used to set and manage hardware performance counters monitored by other parts of the kernel. Unless perf is a dependency of your benchmarking process, it may be useful to lower perf's sample rate so that PMIs don't interfere with your experiments. One way to do this is to set the kernel.perf_cpu_time_max_percent parameter to 1:

➜ sudo sysctl kernel.perf_cpu_time_max_percent=1

This tells the kernel to inform perf that it should lower its sample rate such that sampling consumes less than 1% of CPU time. After changing this parameter, you may see messages in the system log like:

[ 3835.065463] perf samples too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate

These messages are nothing to be concerned about - it's simply the kernel reporting that it's lowering perf's max sample rate in order to respect the perf_cpu_time_max_percent property we just set.

Additional resources

diff --git a/dev/manual/index.html b/dev/manual/index.html index 8c606acd..fb9a0f8d 100644 --- a/dev/manual/index.html +++ b/dev/manual/index.html @@ -594,4 +594,4 @@ plot(t)

This will show the timing results of the trial as a violin plot. You can use all the keyword arguments from Plots.jl, for instance st=:box or yaxis=:log10.
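
For example, a minimal sketch combining both keywords:

using BenchmarkTools, BenchmarkPlots, StatsPlots

t = run(@benchmarkable sum($(rand(1000))))
plot(t; st=:box, yaxis=:log10)   # box plot of the timings on a log axis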

If a BenchmarkGroup contains (only) Trials, its results can be visualized simply by

using BenchmarkPlots, StatsPlots
 t = run(g)
plot(t)

This will display each Trial as a violin plot.

Miscellaneous tips and info

diff --git a/dev/reference/index.html b/dev/reference/index.html index b6a3551b..0fb1f63c 100644 --- a/dev/reference/index.html +++ b/dev/reference/index.html @@ -1,5 +1,5 @@ Reference · BenchmarkTools.jl

References

BenchmarkTools.clear_empty!Method
clear_empty!(group::BenchmarkGroup)

Recursively remove any empty subgroups from group.

Use this to prune a BenchmarkGroup after accessing incorrect fields: for example, g = BenchmarkGroup(); g[1] without storing anything to g[1] will create an empty subgroup g[1].

source
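
Continuing the docstring's example:

using BenchmarkTools

g = BenchmarkGroup()
g[1]                             # accidentally creates an empty subgroup
BenchmarkTools.clear_empty!(g)   # prunes the empty subgroup g[1]
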
BenchmarkTools.tune!Function
tune!(b::Benchmark, p::Parameters = b.params; verbose::Bool = false, pad = "", kwargs...)

Tune a Benchmark instance.

If the number of evals in the parameters p has been set manually, this function does nothing.

source
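
For example:

using BenchmarkTools

b = @benchmarkable sum($(rand(1000)))
tune!(b)    # estimate a good number of evals per sample
run(b)
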
BenchmarkTools.tune!Method
tune!(group::BenchmarkGroup; verbose::Bool = false, pad = "", kwargs...)

Tune a BenchmarkGroup instance. For most benchmarks, tune! needs to perform many evaluations to determine the proper parameters for any given benchmark - often more evaluations than are performed when running a trial. In fact, the majority of total benchmarking time is usually spent tuning parameters, rather than actually running trials.

source
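
A typical suite workflow might look like this (the group layout is arbitrary):

using BenchmarkTools

suite = BenchmarkGroup()
suite["sin"] = @benchmarkable sin($(rand()))
suite["cos"] = @benchmarkable cos($(rand()))
tune!(suite)                          # tunes every leaf benchmark in the group
results = run(suite; verbose = true)
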
BenchmarkTools.@ballocatedMacro
@ballocated expression [other parameters...]

Similar to the @allocated macro included with Julia, this returns the number of bytes allocated when executing a given expression. It uses the @benchmark macro, however, and accepts all of the same additional parameters as @benchmark. The returned allocations correspond to the trial with the minimum elapsed time measured during the benchmark.

source
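
For example:

using BenchmarkTools

@ballocated rand(1000)           # bytes allocated by the minimum-time trial
@ballocated sum($(rand(1000)))   # 0, since summing a vector doesn't allocate
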
BenchmarkTools.@belapsedMacro
@belapsed expression [other parameters...]

Similar to the @elapsed macro included with Julia, this returns the elapsed time (in seconds) to execute a given expression. It uses the @benchmark macro, however, and accepts all of the same additional parameters as @benchmark. The returned time is the minimum elapsed time measured during the benchmark.

source
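
For example:

using BenchmarkTools

@belapsed sum($(rand(1000)))   # minimum elapsed time, in seconds
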
BenchmarkTools.@benchmarkMacro
@benchmark <expr to benchmark> [setup=<setup expr>]

Run a benchmark on a given expression.

Example

The simplest usage of this macro is to put it in front of what you want to benchmark.

julia> @benchmark sin(1)
 BenchmarkTools.Trial:
   memory estimate:  0 bytes
   allocs estimate:  0
@@ -37,6 +37,6 @@
   maximum time:     276.033 ns (0.00% GC)
   --------------
   samples:          10000
  evals/sample:     935
source
BenchmarkTools.@benchmarkableMacro
@benchmarkable <expr to benchmark> [setup=<setup expr>]

Create a Benchmark instance for the given expression. @benchmarkable has the same syntax as @benchmark. See also @benchmark.

source
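
For example, using setup to give each evaluation fresh input (evals=1 ensures every sort! call sees unsorted data):

using BenchmarkTools

b = @benchmarkable sort!(x) setup = (x = rand(1000)) evals = 1
run(b)
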
BenchmarkTools.@benchmarksetMacro
@benchmarkset "title" begin ... end

Create a benchmark set, or multiple benchmark sets if a for loop is provided.

Examples

@benchmarkset "suite" for k in 1:5
     @case "case $k" rand($k, $k)
end
source
BenchmarkTools.@bprofileMacro
@bprofile expression [other parameters...]

Run @benchmark while profiling. This is similar to

@profile @benchmark expression [other parameters...]

but the profiling is applied only to the main execution (after compilation and tuning). The profile buffer is cleared prior to execution.

View the profile results with Profile.print(...). See the profiling section of the Julia manual for more information.

source
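
For example (mincount is a standard Profile.print filter, shown here only as one way to trim the output):

using BenchmarkTools, Profile

@bprofile sum($(rand(10_000)))
Profile.print(; mincount = 10)   # show frames hit at least 10 times
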
BenchmarkTools.@btimeMacro
@btime expression [other parameters...]

Similar to the @time macro included with Julia, this executes an expression, printing the time it took to execute and the memory allocated before returning the value of the expression.

Unlike @time, it uses the @benchmark macro, and accepts all of the same additional parameters as @benchmark. The printed time is the minimum elapsed time measured during the benchmark.

source
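
For example:

using BenchmarkTools

x = rand(1000)
@btime sum($x)   # interpolate globals with $ so they are treated as locals
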
Base.runFunction
run(b::Benchmark[, p::Parameters = b.params]; kwargs...)

Run the benchmark defined by @benchmarkable.

source
run(group::BenchmarkGroup[, args...]; verbose::Bool = false, pad = "", kwargs...)

Run the benchmark group, using the group's stored benchmark parameters by default.

source
BenchmarkTools.saveFunction
BenchmarkTools.save(filename, args...)

Save serialized benchmarking objects (e.g. results or parameters) to a JSON file.

source
BenchmarkTools.loadFunction
BenchmarkTools.load(filename)

Load serialized benchmarking objects (e.g. results or parameters) from a JSON file.

source