From 243bdede3d686b1eaa634db190f2bab3e5f4de72 Mon Sep 17 00:00:00 2001 From: Ian Butterworth Date: Sat, 14 Sep 2024 12:22:31 -0400 Subject: [PATCH 1/5] Add a docs section about loading/precomp/ttfx time tuning (#55569) --- doc/src/manual/performance-tips.md | 119 +++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/doc/src/manual/performance-tips.md b/doc/src/manual/performance-tips.md index 38e27476f0af8..436d58f54754a 100644 --- a/doc/src/manual/performance-tips.md +++ b/doc/src/manual/performance-tips.md @@ -1394,6 +1394,125 @@ Prominent examples include [MKL.jl](https://github.com/JuliaLinearAlgebra/MKL.jl These are external packages, so we will not discuss them in detail here. Please refer to their respective documentations (especially because they have different behaviors than OpenBLAS with respect to multithreading). +## Execution latency, package loading and package precompiling time + +### Reducing time to first plot etc. + +The first time a julia method is called it (and any methods it calls, or ones that can be statically determined) will be +compiled. The [`@time`](@ref) macro family illustrates this. + +``` +julia> foo() = rand(2,2) * rand(2,2) +foo (generic function with 1 method) + +julia> @time @eval foo(); + 0.252395 seconds (1.12 M allocations: 56.178 MiB, 2.93% gc time, 98.12% compilation time) + +julia> @time @eval foo(); + 0.000156 seconds (63 allocations: 2.453 KiB) +``` + +Note that `@time @eval` is better for measuring compilation time because without [`@eval`](@ref), some compilation may +already be done before timing starts. + +When developing a package, you may be able to improve the experience of your users with *precompilation* +so that when they use the package, the code they use is already compiled. To precompile package code effectively, it's +recommended to use [`PrecompileTools.jl`](https://julialang.github.io/PrecompileTools.jl/stable/) to run a +"precompile workload" during precompilation time that is representative of typical package usage, which will cache the +native compiled code into the package `pkgimage` cache, greatly reducing "time to first execution" (often referred to as +TTFX) for such usage. + +Note that [`PrecompileTools.jl`](https://julialang.github.io/PrecompileTools.jl/stable/) workloads can be +disabled and sometimes configured via Preferences if you do not want to spend the extra time precompiling, which +may be the case during development of a package. + +### Reducing package loading time + +Keeping the time taken to load the package down is usually helpful. +General good practice for package developers includes: + +1. Reduce your dependencies to those you really need. Consider using [package extensions](@ref) to support interoperability with other packages without bloating your essential dependencies. +3. Avoid use of [`__init__()`](@ref) functions unless there is no alternative, especially those which might trigger a lot + of compilation, or just take a long time to execute. +4. Where possible, fix [invalidations](https://julialang.org/blog/2020/08/invalidations/) among your dependencies and from your package code. + +The tool [`@time_imports`](@ref) can be useful in the REPL to review the above factors. + +```julia-repl +julia> @time @time_imports using Plots + 0.5 ms Printf + 16.4 ms Dates + 0.7 ms Statistics + ┌ 23.8 ms SuiteSparse_jll.__init__() 86.11% compilation time (100% recompilation) + 90.1 ms SuiteSparse_jll 91.57% compilation time (82% recompilation) + 0.9 ms Serialization + ┌ 39.8 ms SparseArrays.CHOLMOD.__init__() 99.47% compilation time (100% recompilation) + 166.9 ms SparseArrays 23.74% compilation time (100% recompilation) + 0.4 ms Statistics → SparseArraysExt + 0.5 ms TOML + 8.0 ms Preferences + 0.3 ms PrecompileTools + 0.2 ms Reexport +... many deps omitted for example ... + 1.4 ms Tar + ┌ 73.8 ms p7zip_jll.__init__() 99.93% compilation time (100% recompilation) + 79.4 ms p7zip_jll 92.91% compilation time (100% recompilation) + ┌ 27.7 ms GR.GRPreferences.__init__() 99.77% compilation time (100% recompilation) + 43.0 ms GR 64.26% compilation time (100% recompilation) + ┌ 2.1 ms Plots.__init__() 91.80% compilation time (100% recompilation) + 300.9 ms Plots 0.65% compilation time (100% recompilation) + 1.795602 seconds (3.33 M allocations: 190.153 MiB, 7.91% gc time, 39.45% compilation time: 97% of which was recompilation) + +``` + +Notice that in this example there are multiple packages loaded, some with `__init__()` functions, some of which cause +compilation of which some is recompilation. Recompilation is caused by earlier packages invalidating methods, then in +these cases when the following packages run their `__init__()` function some hit recompilation before the code can be run. + +Further, note the `Statistics` extension `SparseArraysExt` has been activated because `SparseArrays` is in the dependency +tree. i.e. see `0.4 ms Statistics → SparseArraysExt`. + +This report gives a good opportunity to review whether the cost of dependency load time is worth the functionality it brings. +Also the `Pkg` utility `why` can be used to report why a an indirect dependency exists. + +``` +(CustomPackage) pkg> why FFMPEG_jll + Plots → FFMPEG → FFMPEG_jll + Plots → GR → GR_jll → FFMPEG_jll +``` + +or to see the indirect dependencies that a package brings in, you can `pkg> rm` the package, see the deps that are removed +from the manifest, then revert the change with `pkg> undo`. + +If loading time is dominated by slow `__init__()` methods having compilation, one verbose way to identify what is being +compiled is to use the julia args `--trace-compile=stderr --trace-compile-timing` which will report a [`precompile`](@ref) +statement each time a method is compiled, along with how long compilation took. For instance, the full setup would be: + +``` +$ julia --startup-file=no --trace-compile=stderr --trace-compile-timing +julia> @time @time_imports using CustomPackage +... +``` + +Note the `--startup-file=no` which helps isolate the test from packages you may have in your `startup.jl`. + +More analysis of the reasons for recompilation can be achieved with the +[`SnoopCompile`](https://github.com/timholy/SnoopCompile.jl) package. + +### Reducing precompilation time + +If package precompilation is taking a long time, one option is to set the following internal and then precompile. +``` +julia> Base.PRECOMPILE_TRACE_COMPILE[] = "stderr" + +pkg> precompile +``` + +This has the effect of setting `--trace-compile=stderr --trace-compile-timing` in the precompilation processes themselves, +so will show which methods are precompiled and how long they took to precompile. + +There are also profiling options such as [using the external profiler Tracy to profile the precompilation process](@ref Profiling-package-precompilation-with-Tracy). + ## Miscellaneous From 346f38bceabf3dab1d3912fe822a663735c91d4a Mon Sep 17 00:00:00 2001 From: Lilith Orion Hafner Date: Sat, 14 Sep 2024 14:40:10 -0500 Subject: [PATCH 2/5] Add compat entry for `Base.donotdelete` (#55773) --- base/docs/basedocs.jl | 3 +++ 1 file changed, 3 insertions(+) diff --git a/base/docs/basedocs.jl b/base/docs/basedocs.jl index e03d0db78f29f..0fc253bd73d1c 100644 --- a/base/docs/basedocs.jl +++ b/base/docs/basedocs.jl @@ -3712,6 +3712,9 @@ unused and delete the entire benchmark code). which the value of the arguments of this intrinsic were available (in a register, in memory, etc.). +!!! compat "Julia 1.8" + This method was added in Julia 1.8. + # Examples ```julia From f4fb87b0f9c3d8e5bcc9901701acd81534789293 Mon Sep 17 00:00:00 2001 From: Ian Butterworth Date: Sun, 15 Sep 2024 07:12:12 -0400 Subject: [PATCH 3/5] REPL: precompile in its own module because Main is closed. Add check for unexpected errors. (#55759) --- stdlib/REPL/src/REPL.jl | 4 ++-- stdlib/REPL/src/precompile.jl | 27 ++++++++++++++++++++++----- 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/stdlib/REPL/src/REPL.jl b/stdlib/REPL/src/REPL.jl index ddf2f55d0b9f7..44fe0446240c6 100644 --- a/stdlib/REPL/src/REPL.jl +++ b/stdlib/REPL/src/REPL.jl @@ -758,11 +758,11 @@ setmodifiers!(c::REPLCompletionProvider, m::LineEdit.Modifiers) = c.modifiers = Set `mod` as the default contextual module in the REPL, both for evaluating expressions and printing them. """ -function activate(mod::Module=Main) +function activate(mod::Module=Main; interactive_utils::Bool=true) mistate = (Base.active_repl::LineEditREPL).mistate mistate === nothing && return nothing mistate.active_module = mod - Base.load_InteractiveUtils(mod) + interactive_utils && Base.load_InteractiveUtils(mod) return nothing end diff --git a/stdlib/REPL/src/precompile.jl b/stdlib/REPL/src/precompile.jl index 82a1a0bb78ee8..c42def9078759 100644 --- a/stdlib/REPL/src/precompile.jl +++ b/stdlib/REPL/src/precompile.jl @@ -14,6 +14,19 @@ finally end let + # these are intentionally triggered + allowed_errors = [ + "BoundsError: attempt to access 0-element Vector{Any} at index [1]", + "MethodError: no method matching f(::$Int, ::$Int)", + "Padding of type", # reinterpret docstring has ERROR examples + ] + function check_errors(out) + str = String(out) + if occursin("ERROR:", str) && !any(occursin(e, str) for e in allowed_errors) + @error "Unexpected error (Review REPL precompilation with debug_output on):\n$str" + exit(1) + end + end ## Debugging options # View the code sent to the repl by setting this to `stdout` debug_output = devnull # or stdout @@ -25,6 +38,8 @@ let DOWN_ARROW = "\e[B" repl_script = """ + import REPL + REPL.activate(REPL.Precompile; interactive_utils=false) # Main is closed so we can't evaluate in it 2+2 print("") printstyled("a", "b") @@ -47,6 +62,7 @@ let [][1] Base.Iterators.minimum cd("complete_path\t\t$CTRL_C + REPL.activate(; interactive_utils=false) println("done") """ @@ -113,10 +129,10 @@ let end schedule(repltask) # wait for the definitive prompt before start writing to the TTY - readuntil(output_copy, JULIA_PROMPT) + check_errors(readuntil(output_copy, JULIA_PROMPT)) write(debug_output, "\n#### REPL STARTED ####\n") sleep(0.1) - readavailable(output_copy) + check_errors(readavailable(output_copy)) # Input our script precompile_lines = split(repl_script::String, '\n'; keepempty=false) curr = 0 @@ -124,16 +140,16 @@ let sleep(0.1) curr += 1 # consume any other output - bytesavailable(output_copy) > 0 && readavailable(output_copy) + bytesavailable(output_copy) > 0 && check_errors(readavailable(output_copy)) # push our input write(debug_output, "\n#### inputting statement: ####\n$(repr(l))\n####\n") # If the line ends with a CTRL_C, don't write an extra newline, which would # cause a second empty prompt. Our code below expects one new prompt per # input line and can race out of sync with the unexpected second line. endswith(l, CTRL_C) ? write(ptm, l) : write(ptm, l, "\n") - readuntil(output_copy, "\n") + check_errors(readuntil(output_copy, "\n")) # wait for the next prompt-like to appear - readuntil(output_copy, "\n") + check_errors(readuntil(output_copy, "\n")) strbuf = "" while !eof(output_copy) strbuf *= String(readavailable(output_copy)) @@ -143,6 +159,7 @@ let occursin(HELP_PROMPT, strbuf) && break sleep(0.1) end + check_errors(strbuf) end write(debug_output, "\n#### COMPLETED - Closing REPL ####\n") write(ptm, "$CTRL_D") From 4633607ce9b9f077f32f89f09a136e04389bbac2 Mon Sep 17 00:00:00 2001 From: Lilith Orion Hafner Date: Sun, 15 Sep 2024 07:11:22 -0500 Subject: [PATCH 4/5] Try to put back previously flakey addmul tests (#55775) Partial revert of #50071, inspired by conversation in https://github.com/JuliaLang/julia/issues/49966#issuecomment-2350935477 Ran the tests 100 times to make sure we're not putting back something that's still flaky. Closes #49966 --- stdlib/LinearAlgebra/test/addmul.jl | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/stdlib/LinearAlgebra/test/addmul.jl b/stdlib/LinearAlgebra/test/addmul.jl index 3fff8289242f7..72fdf687bf5c3 100644 --- a/stdlib/LinearAlgebra/test/addmul.jl +++ b/stdlib/LinearAlgebra/test/addmul.jl @@ -164,8 +164,7 @@ end Bc = Matrix(B) returned_mat = mul!(C, A, B, α, β) @test returned_mat === C - # This test is skipped because it is flakey, but should be fixed and put back (see #49966) - @test_skip collect(returned_mat) ≈ α * Ac * Bc + β * Cc rtol=rtol + @test collect(returned_mat) ≈ α * Ac * Bc + β * Cc rtol=rtol y = C[:, 1] x = B[:, 1] @@ -190,8 +189,7 @@ end returned_mat = mul!(C, Af, Bf, α, β) @test returned_mat === C - # This test is skipped because it is flakey, but should be fixed and put back (see #49966) - @test_skip collect(returned_mat) ≈ α * Ac * Bc + β * Cc rtol=rtol + @test collect(returned_mat) ≈ α * Ac * Bc + β * Cc rtol=rtol end end end @@ -203,8 +201,7 @@ end Bc = Matrix(B) returned_mat = mul!(C, A, B, α, zero(eltype(C))) @test returned_mat === C - # This test is skipped because it is flakey, but should be fixed and put back (see #49966) - @test_skip collect(returned_mat) ≈ α * Ac * Bc rtol=rtol + @test collect(returned_mat) ≈ α * Ac * Bc rtol=rtol end end From a993cd8f81a6bc02a88ee5cf036f4e29c36d5580 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mos=C3=A8=20Giordano?= <765740+giordano@users.noreply.github.com> Date: Sun, 15 Sep 2024 21:24:47 +0100 Subject: [PATCH 5/5] Print results of `runtests` with `printstyled` (#55780) This ensures escape characters are used only if `stdout` can accept them. --- test/runtests.jl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/test/runtests.jl b/test/runtests.jl index c46472ac93fa8..e48e896f4069e 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -438,9 +438,9 @@ cd(@__DIR__) do # o_ts.verbose = true # set to true to show all timings when successful Test.print_test_results(o_ts, 1) if !o_ts.anynonpass - println(" \033[32;1mSUCCESS\033[0m") + printstyled(" SUCCESS\n"; bold=true, color=:green) else - println(" \033[31;1mFAILURE\033[0m\n") + printstyled(" FAILURE\n\n"; bold=true, color=:red) skipped > 0 && println("$skipped test", skipped > 1 ? "s were" : " was", " skipped due to failure.") println("The global RNG seed was 0x$(string(seed, base = 16)).\n")