Merge pull request #509 from SpeedyWeather/mk/benchmark

Benchmark overview
SpeedyWeather · Apr 2, 2024 · 4f948bb · 4f948bb
2 parents a461379 + 95d18d3
commit 4f948bb

Showing 5 changed files with 362 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -177,6 +177,15 @@ and all dependencies automatically. For more information see the
 Please use the current minor version of Julia,
 compatibilities with older versions are not guaranteed.
 
+## Benchmarks
+
+The primitive equations at 400 km resolution with 8 vertical layers are simulated by
+SpeedyWeather.jl at about 500 simulated years per day, i.e. one year takes about
+3 min single-threaded on a CPU. Multi-threading typically increases the speed by 2-4x.
+
+For an overview of the simulation speeds a user can expect under different model setups, see
+[Benchmarks](https://github.com/SpeedyWeather/SpeedyWeather.jl/blob/main/benchmark).
+
 ## Copyright and license
 
 Copyright (c) 2020 Milan Klöwer for SpeedyWeather.jl  

diff --git a/benchmark/README.md b/benchmark/README.md
@@ -0,0 +1,121 @@
+# Benchmarks
+
+created for SpeedyWeather.jl v0.9.0 on Tue, 02 Apr 2024 14:05:51. 
+
+All simulations have been benchmarked over several seconds (wallclock time) without output. Benchmarking excludes initialization: it starts just before the main time loop and finishes right after. The benchmarking results here are not very robust; timings that vary by ±50% are not uncommon. Proper benchmarking for performance optimization uses the minimum or median of many executions, whereas we run a simulation for several time steps, which effectively measures the mean and is therefore susceptible to outliers that slow down the simulation. However, this is what a user will experience in most situations anyway, so the following gives a rough idea of how fast a SpeedyWeather simulation will run and how much memory it requires.
+
+### Machine details
+
+All benchmark simulations were single-threaded on a CPU:
+```julia
+julia> versioninfo()
+Julia Version 1.10.2
+Commit bd47eca2c8a (2024-03-01 10:14 UTC)
+Build Info:
+  Official https://julialang.org/ release
+Platform Info:
+  OS: macOS (x86_64-apple-darwin22.4.0)
+  CPU: 8 × Intel(R) Core(TM) i5-1030NG7 CPU @ 1.10GHz
+  WORD_SIZE: 64
+  LIBM: libopenlibm
+  LLVM: libLLVM-15.0.7 (ORCJIT, icelake-client)
+Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
+Environment:
+  LD_LIBRARY_PATH = /Users/milan/.julia/conda/3/lib:
+```
+
+### Explanation
+
+Abbreviations in the tables below are as follows; omitted columns use defaults.
+- NF: Number format, default: Float32
+- T: Spectral resolution, maximum degree of spherical harmonics, default: T31
+- L: Number of vertical layers, default: 8 (for 3D models)
+- Grid: Horizontal grid, default: OctahedralGaussianGrid
+- Rings: Grid-point resolution, number of latitude rings pole to pole
+- Dynamics: With dynamics?, default: true
+- Physics: With physical parameterizations?, default: true (for primitive equation models)
+- Δt: Time step [s].
+- SYPD: Speed of simulation, simulated years per wallclock day.
+- Memory: Memory footprint of simulation, variables and constants.
+
+### Running the benchmarks
+
+The benchmark suite here can be reproduced by executing:
+
+```> julia manual_benchmarking.jl```
+
+inside the `SpeedyWeather.jl/benchmark` folder. It will create this `README.md`, which can be pushed to the repository for updates or comparison.
+## Models, default setups
+
+| Model | T | L | Physics | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - |
+| BarotropicModel | 31 | 1 | false | 1800 | 19349 | 1.22 MB |
+| ShallowWaterModel | 31 | 1 | false | 1800 | 10195 | 1.24 MB |
+| PrimitiveDryModel | 31 | 8 | true | 1800 | 530 | 3.95 MB |
+| PrimitiveWetModel | 31 | 8 | true | 1800 | 399 | 4.28 MB |
+
+## Grids
+
+| Model | T | L | Grid | Rings | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - | - |
+| PrimitiveWetModel | 63 | 8 | FullGaussianGrid | 96 | 900 | 35 | 22.50 MB |
+| PrimitiveWetModel | 63 | 8 | FullClenshawGrid | 95 | 900 | 34 | 22.29 MB |
+| PrimitiveWetModel | 63 | 8 | OctahedralGaussianGrid | 96 | 900 | 45 | 15.34 MB |
+| PrimitiveWetModel | 63 | 8 | OctahedralClenshawGrid | 95 | 900 | 54 | 15.12 MB |
+| PrimitiveWetModel | 63 | 8 | HEALPixGrid | 95 | 900 | 77 | 11.46 MB |
+| PrimitiveWetModel | 63 | 8 | OctaHEALPixGrid | 95 | 900 | 56 | 13.67 MB |
+
+## Primitive wet model, resolution
+
+| Model | T | L | Rings | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - |
+| PrimitiveWetModel | 31 | 8 | 48 | 1800 | 378 | 4.28 MB |
+| PrimitiveWetModel | 42 | 8 | 64 | 1350 | 161 | 7.22 MB |
+| PrimitiveWetModel | 63 | 8 | 96 | 900 | 47 | 15.34 MB |
+| PrimitiveWetModel | 85 | 8 | 128 | 675 | 20 | 26.73 MB |
+| PrimitiveWetModel | 127 | 8 | 192 | 450 | 6 | 59.10 MB |
+| PrimitiveWetModel | 170 | 8 | 256 | 338 | 2 | 105.37 MB |
+
+## PrimitiveWetModel: Physics or dynamics only
+
+| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - | - |
+| PrimitiveWetModel | 31 | 8 | true | true | 1800 | 322 | 4.28 MB |
+| PrimitiveWetModel | 31 | 8 | true | false | 1800 | 587 | 4.28 MB |
+| PrimitiveWetModel | 31 | 8 | false | true | 1800 | 527 | 4.28 MB |
+
+## Shallow water model, resolution
+
+| Model | T | L | Rings | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - |
+| ShallowWaterModel | 31 | 1 | 48 | 1800 | 9402 | 1.24 MB |
+| ShallowWaterModel | 42 | 1 | 64 | 1350 | 4034 | 2.13 MB |
+| ShallowWaterModel | 63 | 1 | 96 | 900 | 1132 | 4.71 MB |
+| ShallowWaterModel | 85 | 1 | 128 | 675 | 415 | 8.45 MB |
+| ShallowWaterModel | 127 | 1 | 192 | 450 | 105 | 19.60 MB |
+| ShallowWaterModel | 170 | 1 | 256 | 338 | 39 | 36.41 MB |
+| ShallowWaterModel | 255 | 1 | 384 | 225 | 9 | 89.44 MB |
+
+## Primitive Equation, Float32 vs Float64
+
+| Model | NF | T | L | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - |
+| PrimitiveWetModel | Float32 | 31 | 8 | 1800 | 317 | 4.28 MB |
+| PrimitiveWetModel | Float64 | 31 | 8 | 1800 | 343 | 8.03 MB |
+
+## PrimitiveDryModel: Physics or dynamics only
+
+| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
+| - | - | - | - | - | - | - | - |
+| PrimitiveDryModel | 31 | 8 | true | true | 1800 | 462 | 3.95 MB |
+| PrimitiveDryModel | 31 | 8 | true | false | 1800 | 657 | 3.95 MB |
+| PrimitiveDryModel | 31 | 8 | false | true | 1800 | 683 | 3.95 MB |
+
+## Number of vertical layers
+
+| Model | T | L | Δt | SYPD | Memory |
+| - | - | - | - | - | - |
+| PrimitiveWetModel | 31 | 4 | 1800 | 583 | 2.92 MB |
+| PrimitiveWetModel | 31 | 8 | 1800 | 345 | 4.28 MB |
+| PrimitiveWetModel | 31 | 12 | 1800 | 217 | 5.65 MB |
+| PrimitiveWetModel | 31 | 16 | 1800 | 186 | 7.03 MB |
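
Note on the SYPD column: since SYPD is defined in the Explanation list as simulated years per wallclock day, it converts directly into wallclock time per simulated year. A minimal sketch of that conversion follows; the helper name `minutes_per_simulated_year` is hypothetical and not part of SpeedyWeather.jl.

```julia
# Hypothetical helper, not part of SpeedyWeather.jl:
# wallclock minutes needed for one simulated year at a given SYPD.
minutes_per_simulated_year(sypd) = 24 * 60 / sypd

# about 500 SYPD, the figure quoted in the main README
println(minutes_per_simulated_year(500))    # 2.88, i.e. roughly 3 min
```

The same helper can be applied to any SYPD value in the tables to get a feel for how long a one-year run would take on comparable hardware.
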
diff --git a/benchmark/benchmark_suite.jl b/benchmark/benchmark_suite.jl
@@ -0,0 +1,63 @@
+Base.@kwdef mutable struct BenchmarkSuite
+    title::String
+    nruns::Int = 1
+    model::Vector = fill(PrimitiveWetModel, nruns)
+    NF::Vector = fill(SpeedyWeather.DEFAULT_NF, nruns)
+    trunc::Vector{Int} = fill(SpeedyWeather.DEFAULT_TRUNC, nruns)
+    nlev::Vector{Int} = default_nlev(model)
+    Grid::Vector = fill(SpeedyWeather.DEFAULT_GRID, nruns)
+    nlat::Vector{Int} = fill(0, nruns)
+    dynamics::Vector{Bool} = fill(true, nruns)
+    physics::Vector{Bool} = fill(true, nruns)
+    SYPD::Vector{Float64} = fill(0.0, nruns)
+    Δt::Vector{Float64} = fill(0.0, nruns)
+    memory::Vector{Int} = fill(0, nruns)
+end
+
+default_nlev(::Type{<:Barotropic}) = 1
+default_nlev(::Type{<:ShallowWater}) = 1
+default_nlev(::Type{<:PrimitiveEquation}) = 8
+default_nlev(models) = [default_nlev(model) for model in models]
+
+# number of time steps to benchmark, scaled inversely with trunc^3 and nlev^2
+# (but at least 10) so that every simulation only takes seconds
+n_timesteps(trunc, nlev) = max(10, round(Int, 4e8/trunc^3/nlev^2))
+
+function run_benchmark_suite!(suite::BenchmarkSuite)
+    for i in 1:suite.nruns
+
+        # unpack 
+        Model = suite.model[i]
+        NF = suite.NF[i]
+        trunc = suite.trunc[i]
+        nlev = suite.nlev[i]
+        Grid = suite.Grid[i]
+        dynamics = suite.dynamics[i]
+        physics = suite.physics[i]
+
+        spectral_grid = SpectralGrid(;NF, trunc, Grid, nlev)
+        suite.nlat[i] = spectral_grid.nlat
+
+        model = Model(;spectral_grid)
+        if Model <: PrimitiveEquation
+            model.physics = physics
+            model.dynamics = dynamics
+        else
+            suite.dynamics[i] = true
+            suite.physics[i] = false
+        end
+
+        simulation = initialize!(model)
+        suite.memory[i] = Base.summarysize(simulation)
+
+        nsteps = n_timesteps(trunc, nlev)
+        period = Second(round(Int,model.time_stepping.Δt_sec * (nsteps+1)))
+        run!(simulation; period)
+
+        time_elapsed = model.feedback.progress_meter.tlast - model.feedback.progress_meter.tinit
+        sypd = model.time_stepping.Δt_sec*nsteps / (time_elapsed * 365.25)  # [simulated s / wallclock s] / [days per year]
+
+        suite.Δt[i] = model.time_stepping.Δt_sec
+        suite.SYPD[i] = sypd
+    end
+end
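
Note on `n_timesteps` and the SYPD formula: the sketch below copies the step-count formula from `benchmark_suite.jl` and wraps the SYPD calculation in a standalone function so both can be tried without running a model. The function name `sypd` is hypothetical, and the sketch assumes that `Δt_sec` and `time_elapsed` are both given in seconds, which the formula in `run_benchmark_suite!` implies.

```julia
# Same formula as in benchmark_suite.jl
n_timesteps(trunc, nlev) = max(10, round(Int, 4e8/trunc^3/nlev^2))

# Hypothetical standalone version of the SYPD calculation,
# assuming Δt_sec and time_elapsed are both in seconds.
# Simulated years per wallclock day
#   = (Δt_sec*nsteps / (365.25*86400)) / (time_elapsed / 86400),
# so the 86400 s per day cancel.
sypd(Δt_sec, nsteps, time_elapsed) = Δt_sec*nsteps / (time_elapsed * 365.25)

n_timesteps(31, 8)     # 210 steps for the default T31 with 8 layers
n_timesteps(170, 8)    # 10, the lower bound applies at high resolution

# 48 steps of 1800 s are one simulated day; if that takes 1/365.25 of a
# wallclock day, the simulation runs at about 1 SYPD
sypd(1800, 48, 86400/365.25)
```

At T170 with 8 layers only the minimum of 10 steps is timed, so individual slow steps weigh more heavily there than at T31.
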
diff --git a/benchmark/define_benchmarks.jl b/benchmark/define_benchmarks.jl
@@ -0,0 +1,65 @@
+# dictionary of all benchmark suites, keyed by an arbitrary Symbol
+benchmarks = Dict{Symbol,BenchmarkSuite}()
+
+# Models
+benchmarks[:benchmark100] = BenchmarkSuite(
+    title = "Models, default setups",
+    nruns = 4,
+    model = [BarotropicModel, ShallowWaterModel, PrimitiveDryModel, PrimitiveWetModel],
+    )
+
+# ShallowWaterModel, resolution
+benchmarks[:benchmark200] = BenchmarkSuite(
+    title = "Shallow water model, resolution",
+    nruns = 7,
+    model = fill(ShallowWaterModel, 7),
+    trunc = [31, 42, 63, 85, 127, 170, 255],
+    )
+
+## PRIMITIVE WET MODEL, RESOLUTION
+benchmarks[:benchmark201] = BenchmarkSuite(
+    title = "Primitive wet model, resolution",
+    nruns = 6,
+    model = fill(PrimitiveWetModel, 6),
+    trunc = [31, 42, 63, 85, 127, 170],
+    )
+
+## NUMBER FORMATS
+benchmarks[:benchmark300] = BenchmarkSuite(
+    title = "Primitive Equation, Float32 vs Float64",
+    nruns = 2,
+    NF = [Float32, Float64],
+    )
+
+## GRIDS
+benchmarks[:benchmark400] = BenchmarkSuite(
+    title = "Grids",
+    nruns = 6,
+    trunc = fill(63, 6),
+    Grid = [FullGaussianGrid, FullClenshawGrid, OctahedralGaussianGrid, OctahedralClenshawGrid,
+            HEALPixGrid, OctaHEALPixGrid],
+    )
+
+## NLEV
+benchmarks[:benchmark500] = BenchmarkSuite(
+    title = "Number of vertical layers",
+    nruns = 4,
+    nlev = [4, 8, 12, 16],
+    )
+
+## PHYSICS/DYNAMICS
+benchmarks[:benchmark600] = BenchmarkSuite(
+    title = "PrimitiveDryModel: Physics or dynamics only",
+    nruns = 3,
+    model = [PrimitiveDryModel, PrimitiveDryModel, PrimitiveDryModel],
+    physics = [true, false, true],
+    dynamics = [true, true, false],
+    )
+
+## PHYSICS/DYNAMICS
+benchmarks[:benchmark601] = BenchmarkSuite(
+    title = "PrimitiveWetModel: Physics or dynamics only",
+    nruns = 3,
+    physics = [true, false, true],
+    dynamics = [true, true, false],
+    )
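
Note on the suite definitions: every suite relies on `Base.@kwdef` defaults that depend on the earlier field `nruns`, so only the fields that vary need to be passed. The sketch below shows that pattern on a reduced struct. `MiniSuite` and `lengths_match` are hypothetical names for illustration, and the literal defaults 31 and 8 stand in for the SpeedyWeather constants. The `BenchmarkSuite` in this PR does not check lengths itself, so a check like this is an optional extra, not existing behaviour.

```julia
# Hypothetical reduced version of BenchmarkSuite, for illustration only
Base.@kwdef struct MiniSuite
    nruns::Int = 1
    trunc::Vector{Int} = fill(31, nruns)    # default depends on nruns
    nlev::Vector{Int} = fill(8, nruns)      # default depends on nruns
end

# only trunc is passed, nlev is filled to length nruns automatically
suite = MiniSuite(nruns=3, trunc=[31, 42, 63])
println(suite.nlev)     # [8, 8, 8]

# Hypothetical consistency check: every vector has one entry per run
lengths_match(s::MiniSuite) =
    all(length(getfield(s, f)) == s.nruns for f in (:trunc, :nlev))

println(lengths_match(suite))                               # true
println(lengths_match(MiniSuite(nruns=3, trunc=[31, 42])))  # false
```
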
diff --git a/benchmark/manual_benchmarking.jl b/benchmark/manual_benchmarking.jl
@@ -0,0 +1,104 @@
+using SpeedyWeather, Dates, Printf
+import SpeedyWeather.SpeedyTransforms: prettymemory
+
+include("benchmark_suite.jl")
+include("define_benchmarks.jl")
+
+## RUN BENCHMARKS
+for key in keys(benchmarks)
+    suite = benchmarks[key]
+    run_benchmark_suite!(suite)
+end
+
+## WRITE into README.md
+md = open("README.md", "w")
+write(md, "# Benchmarks\n")
+
+version = pkgversion(SpeedyWeather)
+write(md, "\ncreated for SpeedyWeather.jl v$version on $(Dates.format(Dates.now(), Dates.RFC1123Format)). \n\n")
+write(md, "All simulations have been benchmarked over several seconds (wallclock time) without output. ")
+write(md, "Benchmarking excludes initialization: it starts just before the main time loop and finishes right after. ")
+write(md, "The benchmarking results here are not very robust; timings that vary by ±50% are not uncommon. ")
+write(md, "Proper benchmarking for performance optimization uses the minimum or median of many executions, ")
+write(md, "whereas we run a simulation for several time steps, which effectively measures the mean ")
+write(md, "and is therefore susceptible to outliers that slow down the simulation. However, this is what a user will experience ")
+write(md, "in most situations anyway, so the following gives a rough idea of how fast a ")
+write(md, "SpeedyWeather simulation will run and how much memory it requires.\n\n")
+
+write(md, "### Machine details\n\n")
+write(md, "All benchmark simulations were single-threaded on a CPU:\n")
+
+write(md, "```julia\njulia> versioninfo()\n")
+versioninfo(md)
+write(md,"```\n\n")
+
+write(md, "### Explanation\n\n")
+write(md, "Abbreviations in the tables below are as follows; omitted columns use defaults.\n")
+write(md, "- NF: Number format, default: $(SpeedyWeather.DEFAULT_NF)\n")
+write(md, "- T: Spectral resolution, maximum degree of spherical harmonics, default: T$(SpeedyWeather.DEFAULT_TRUNC)\n")
+write(md, "- L: Number of vertical layers, default: $(SpeedyWeather.DEFAULT_NLEV) (for 3D models)\n")
+write(md, "- Grid: Horizontal grid, default: $(SpeedyWeather.DEFAULT_GRID)\n")
+write(md, "- Rings: Grid-point resolution, number of latitude rings pole to pole\n")
+write(md, "- Dynamics: With dynamics?, default: true\n")
+write(md, "- Physics: With physical parameterizations?, default: true (for primitive equation models)\n")
+write(md, "- Δt: Time step [s].\n")
+write(md, "- SYPD: Speed of simulation, simulated years per wallclock day.\n")
+write(md, "- Memory: Memory footprint of simulation, variables and constants.\n\n")
+
+write(md, "### Running the benchmarks\n\n")
+write(md, "The benchmark suite here can be reproduced by executing:\n\n")
+write(md, "```> julia manual_benchmarking.jl```\n\n")
+write(md, "inside the `SpeedyWeather.jl/benchmark` folder. It will create this `README.md`, which can be pushed ")
+write(md, "to the repository for updates or comparison.")
+
+# Write benchmark suites into markdown
+for key in keys(benchmarks)
+
+    suite = benchmarks[key]
+
+    write(md, "\n## $(suite.title)\n\n")
+
+    print_NF = any(suite.NF .!= suite.NF[1])
+    print_Grid = any(suite.Grid .!= suite.Grid[1])
+    print_nlat = any(suite.nlat .!= suite.nlat[1]) 
+    print_dynamics = any(suite.dynamics .!= suite.dynamics[1])
+    print_physics = any(suite.physics .!= suite.physics[1])
+
+    column_header = "| Model "
+    column_header *= print_NF ? "| NF " : ""
+    column_header *= "| T "
+    column_header *= "| L "
+    column_header *= print_Grid ? "| Grid " : ""
+    column_header *= print_nlat ? "| Rings " : ""
+    column_header *= print_dynamics ? "| Dynamics " : ""
+    column_header *= print_physics ? "| Physics " : ""
+    column_header *= "| Δt | SYPD | Memory |"
+
+    ncolumns = count(==('|'), column_header) - 1
+    second_row = repeat("| - ", ncolumns) * "|"
+
+    write(md,"$column_header\n")
+    write(md,"$second_row\n")
+
+    for i in 1:suite.nruns
+
+        row = "| $(suite.model[i]) "
+        row *= print_NF ? "| $(suite.NF[i]) " : ""
+        row *= "| $(suite.trunc[i]) "
+        row *= "| $(suite.nlev[i]) "
+        row *= print_Grid ? "| $(suite.Grid[i]) " : ""
+        row *= print_nlat ? "| $(suite.nlat[i]) " : ""
+        row *= print_dynamics ? "| $(suite.dynamics[i]) " : ""
+        row *= print_physics  ? "| $(suite.physics[i]) " : ""
+
+        Δt = round(Int,suite.Δt[i])
+        sypd = suite.SYPD[i]
+        SYPD = isfinite(sypd) ? round(Int, sypd) : 0
+        memory = prettymemory(suite.memory[i])
+        row *= "| $Δt | $SYPD | $memory |"
+
+        write(md,"$row\n")
+    end
+end
+
+close(md)
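
Note on section order: the sections in the generated `benchmark/README.md` do not follow the order of `define_benchmarks.jl`, because both loops iterate over `keys(benchmarks)` and Julia's `Dict` does not guarantee an iteration order. If a stable order is wanted, one option is to sort the keys first. The sketch below uses a small stand-in dictionary with titles only; the variable name `suites` is hypothetical.

```julia
# Stand-in for the benchmarks dictionary, titles only (illustration)
suites = Dict(
    :benchmark200 => "Shallow water model, resolution",
    :benchmark100 => "Models, default setups",
    :benchmark400 => "Grids",
)

# Symbols are ordered by name, so sorting the keys gives a
# reproducible section order regardless of Dict internals
for key in sort(collect(keys(suites)))
    println("## ", suites[key])
end
```

With keys like `:benchmark100`, `:benchmark200`, ... the numeric suffix then also controls where a section appears in the README.
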