Fix compilation time #258

Closed
rafaqz opened this issue Mar 29, 2022 · 7 comments

@rafaqz
Owner

rafaqz commented Mar 29, 2022

Loading a raster takes a few seconds on the first run. Some of this is due to the backends, some to DimensionalData.jl, and some to code here. We should fix all of these as much as possible.

@felixcremer
Contributor

With a fully precompiled environment, loading Rasters, ArchGDAL and NCDatasets takes 3.6 seconds, and opening a NetCDF file takes another 3.8 seconds.

The largest import timings I see are DimensionalData at about 840 ms and ImageCore at 658 ms.
I think this is an acceptable timing.
The more problematic part is the precompilation time, which was over 400 seconds in this environment. Admittedly GLMakie is in it as well, which increases the timings a lot.

`@time_imports` timings

julia> @time @time_imports using ArchGDAL, Rasters, NCDatasets
      4.4 ms  CEnum
     18.3 ms  Preferences
      0.8 ms  JLLWrappers
     13.2 ms  GEOS_jll 78.91% compilation time
      0.4 ms  Zlib_jll
      0.9 ms  SQLite_jll
      0.9 ms  JpegTurbo_jll
      0.6 ms  LERC_jll
      0.9 ms  XZ_jll
      0.7 ms  Zstd_jll
      0.7 ms  Libtiff_jll
      2.8 ms  PROJ_jll
      1.9 ms  OpenSSL_jll
      1.6 ms  Kerberos_krb5_jll
      3.0 ms  ICU_jll
      0.8 ms  LibPQ_jll
      0.8 ms  LittleCMS_jll
      0.7 ms  libpng_jll
      0.8 ms  OpenJpeg_jll
      0.6 ms  Expat_jll
      1.0 ms  libgeotiff_jll
      0.7 ms  Bzip2_jll
      0.5 ms  MPIPreferences
     10.8 ms  CompilerSupportLibraries_jll 90.02% compilation time
      1.0 ms  libaec_jll
      2.2 ms  MPICH_jll
      4.7 ms  HDF5_jll
      0.7 ms  Libiconv_jll
      1.0 ms  XML2_jll
      1.0 ms  NetCDF_jll
      8.7 ms  boost_jll
      1.3 ms  Lz4_jll
      1.7 ms  Thrift_jll
      0.7 ms  LZO_jll
      0.7 ms  snappy_jll
      8.6 ms  Arrow_jll
      5.8 ms  GDAL_jll
     43.7 ms  GDAL
     13.3 ms  GeoFormatTypes
      4.0 ms  Extents
     16.4 ms  GeoInterface
      0.5 ms  PrecompileTools
     19.5 ms  RecipesBase
      0.6 ms  GeoInterfaceRecipes
      0.3 ms  DataValueInterfaces
      1.3 ms  DataAPI
      0.3 ms  IteratorInterfaceExtensions
      0.3 ms  TableTraits
     44.8 ms  Tables
      0.4 ms  Reexport
      1.3 ms  Statistics
     57.4 ms  FixedPointNumbers
    116.0 ms  ColorTypes
    108.7 ms  Colors
      7.9 ms  IrrationalConstants
      2.0 ms  DocStringExtensions
      0.8 ms  LogExpFunctions
      0.4 ms  OpenLibm_jll
      1.0 ms  OpenSpecFun_jll
     17.9 ms  SpecialFunctions
      0.6 ms  TensorCore
    179.3 ms  ColorVectorSpace
      0.7 ms  Adapt
     62.5 ms  OffsetArrays
      2.9 ms  PaddedViews
      3.9 ms  MappedArrays
      3.3 ms  StackViews
      1.3 ms  MosaicViews
      0.5 ms  NaNMath
      1.4 ms  Graphics
     11.0 ms  AbstractFFTs
      0.6 ms  AbstractFFTs → AbstractFFTsTestExt
    658.5 ms  ImageCore
     34.5 ms  DiskArrays
    192.1 ms  ArchGDAL
      0.4 ms  SuiteSparse
      0.5 ms  Requires
      4.2 ms  ArrayInterface
      2.0 ms  ConstructionBase
      7.1 ms  InvertedIndices
     53.2 ms  IntervalSets
      0.6 ms  ConstructionBase → ConstructionBaseIntervalSetsExt
      0.3 ms  IntervalSets → IntervalSetsStatisticsExt
    837.2 ms  DimensionalData
    228.1 ms  FillArrays
      8.4 ms  FillArrays → FillArraysSparseArraysExt
      0.5 ms  FillArrays → FillArraysStatisticsExt
      1.1 ms  FieldMetadata
      0.7 ms  Flatten
      5.8 ms  ProgressMeter
     23.9 ms  Missings
     18.3 ms  MacroTools
      5.9 ms  StaticArraysCore
      0.6 ms  ArrayInterface → ArrayInterfaceStaticArraysCoreExt
     32.0 ms  Setfield
     46.5 ms  Rasters
      2.0 ms  Rasters → RastersArchGDALExt
     18.1 ms  CFTime
      0.6 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
    131.0 ms  DataStructures
      2.7 ms  CommonDataModel
    137.2 ms  NCDatasets
      2.1 ms  Rasters → RastersNCDatasetsExt
  3.650150 seconds (3.53 M allocations: 206.035 MiB, 5.00% gc time, 3.97% compilation time)
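
To pull the largest contributors out of a log like the one above without reading it by eye, the output can be parsed and sorted. The following is a minimal sketch using only Base Julia; `parse_time_imports` and `top_contributors` are illustrative names invented here (not part of Rasters or any other package), and the sample text is a short excerpt of the timings above.

```julia
# Parse `@time_imports` output and rank packages by load time.
# NOTE: `parse_time_imports` and `top_contributors` are hypothetical helpers
# written for this example only.

function parse_time_imports(text::AbstractString)
    entries = Tuple{String,Float64}[]
    for raw in split(text, '\n')
        line = strip(raw)
        # Matches e.g. "13.2 ms  GEOS_jll 78.91% compilation time";
        # the trailing compilation note is optional and is dropped from the name.
        m = match(r"^([0-9.]+) ms\s+(.+?)(?:\s+[0-9.]+% compilation time.*)?$", line)
        m === nothing && continue  # skips blank lines and the final summary line
        push!(entries, (String(m.captures[2]), parse(Float64, m.captures[1])))
    end
    return entries
end

# Return the `n` slowest entries, slowest first.
function top_contributors(entries, n::Integer)
    sorted = sort(entries; by = last, rev = true)
    return first(sorted, n)
end

sample = """
     57.4 ms  FixedPointNumbers
    116.0 ms  ColorTypes
     13.2 ms  GEOS_jll 78.91% compilation time
    658.5 ms  ImageCore
    837.2 ms  DimensionalData
  3.650150 seconds (3.53 M allocations: 206.035 MiB, 5.00% gc time, 3.97% compilation time)
"""

entries = parse_time_imports(sample)
for (name, ms) in top_contributors(entries, 3)
    println(rpad(name, 20), ms, " ms")
end
```

Pasting the full log into `sample` ranks every package in the environment the same way.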

@rafaqz
Owner Author

rafaqz commented Oct 25, 2023

Try it without ArchGDAL and NCDatasets.jl

@felixcremer
Contributor

Just `using Rasters` takes 1.5 seconds, and then loading only NCDatasets, which is needed to actually load data, takes another 0.42 seconds.
For Rasters the largest contributors are

703.7 ms  DimensionalData
 57.3 ms  FixedPointNumbers
110.4 ms  ColorTypes
 92.3 ms  OffsetArrays
140.7 ms  FillArrays

and the rest are below 50 ms per package.

Details

```julia
julia> @time @time_imports using Rasters
      1.2 ms  Statistics
      0.6 ms  Adapt
      0.3 ms  SuiteSparse
      0.6 ms  Requires
      3.9 ms  ArrayInterface
      1.5 ms  ConstructionBase
      2.9 ms  Extents
      3.4 ms  InvertedIndices
      0.3 ms  IteratorInterfaceExtensions
     17.1 ms  Preferences
      0.5 ms  PrecompileTools
     18.9 ms  RecipesBase
      0.4 ms  TableTraits
      0.2 ms  DataValueInterfaces
      1.3 ms  DataAPI
     42.7 ms  Tables
     44.1 ms  IntervalSets
      0.4 ms  ConstructionBase → ConstructionBaseIntervalSetsExt
      0.2 ms  IntervalSets → IntervalSetsStatisticsExt
    703.7 ms  DimensionalData
     57.3 ms  FixedPointNumbers
    110.4 ms  ColorTypes
     92.3 ms  OffsetArrays
     26.8 ms  DiskArrays
    140.7 ms  FillArrays
      6.9 ms  FillArrays → FillArraysSparseArraysExt
      0.4 ms  FillArrays → FillArraysStatisticsExt
      0.8 ms  FieldMetadata
      0.6 ms  Flatten
     17.5 ms  GeoInterface
      4.6 ms  ProgressMeter
     15.1 ms  Missings
      0.3 ms  Reexport
     15.1 ms  MacroTools
      5.0 ms  StaticArraysCore
      0.6 ms  ArrayInterface → ArrayInterfaceStaticArraysCoreExt
     26.6 ms  Setfield
     14.3 ms  GeoFormatTypes
     37.9 ms  Rasters
  1.516850 seconds (1.78 M allocations: 95.463 MiB, 3.61% gc time, 1.75% compilation time)

julia> @time @time_imports using NCDatasets
     15.6 ms  CFTime
      0.5 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
    102.6 ms  DataStructures
      0.6 ms  JLLWrappers
     10.5 ms  Bzip2_jll 91.98% compilation time
      0.7 ms  MPIPreferences
      9.3 ms  CompilerSupportLibraries_jll 88.99% compilation time
      2.1 ms  OpenSSL_jll
      0.3 ms  Zlib_jll
      0.9 ms  libaec_jll
      2.6 ms  MPICH_jll
      4.9 ms  HDF5_jll
      0.8 ms  Libiconv_jll
      1.1 ms  XML2_jll
      0.7 ms  Zstd_jll
      1.0 ms  NetCDF_jll
      2.4 ms  CommonDataModel
    120.8 ms  NCDatasets
      1.8 ms  Rasters → RastersNCDatasetsExt
  0.422905 seconds (500.42 k allocations: 33.224 MiB, 5.96% gc time, 30.38% compilation time)

julia> @time Raster("/home/fcremer/Daten/image_vs_heatmap_corner.nc")
[ Info: No name or key keyword provided, using first valid layer with name :unnamed
  3.235648 seconds (4.67 M allocations: 313.527 MiB, 3.91% gc time, 99.62% compilation time: 6% of which was recompilation)
```
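
The 99.62% compilation share reported for the `Raster(...)` call means almost all of those 3.2 seconds is first-call latency, not file reading. A quick way to see that split for any function is to time the same call twice in one session. This is a minimal sketch with a toy function; `mean_square` is a hypothetical stand-in for an expensive first call, not anything from Rasters.

```julia
# Compare first-call latency (includes JIT compilation) with a second call.
# NOTE: `mean_square` is a hypothetical stand-in for a call such as Raster(path).
mean_square(xs) = sum(abs2, xs) / length(xs)

data = rand(1_000)

first_call = @elapsed mean_square(data)   # compiles mean_square for Vector{Float64}, then runs it
second_call = @elapsed mean_square(data)  # reuses the already compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

If the second call is far faster than the first, the cost is compilation, which is the part that precompilation and package extensions can reduce.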

@felixcremer
Contributor

`using DimensionalData` takes 0.91 seconds, and DimensionalData itself is the main contributor. The import timings are:

julia> @time @time_imports using DimensionalData
      1.3 ms  Statistics
      0.7 ms  Adapt
      0.4 ms  SuiteSparse
      0.4 ms  Requires
      3.5 ms  ArrayInterface
      1.4 ms  ConstructionBase
      3.0 ms  Extents
      3.6 ms  InvertedIndices
      0.3 ms  IteratorInterfaceExtensions
     16.9 ms  Preferences
      0.4 ms  PrecompileTools
     18.6 ms  RecipesBase
      0.4 ms  TableTraits
      0.2 ms  DataValueInterfaces
      1.2 ms  DataAPI
     42.6 ms  Tables
     44.0 ms  IntervalSets
      0.5 ms  ConstructionBase → ConstructionBaseIntervalSetsExt
      0.3 ms  IntervalSets → IntervalSetsStatisticsExt
    707.2 ms  DimensionalData
  0.915788 seconds (1.32 M allocations: 65.461 MiB, 3.02% gc time, 2.96% compilation time)

@rafaqz
Owner Author

rafaqz commented Oct 25, 2023

I think much of the DD load time is spent loading precompiled code. We could maybe cull some less-used parts?

  • ColorTypes and FixedPointNumbers are needed for the plot recipes; we could get rid of them if we used an extension on Plots.jl.
  • OffsetArrays.jl is only used for the coverage algorithm; I could maybe rewrite it to not need that.
  • FillArrays is pretty convenient for initialising arrays without allocations.

@rafaqz
Owner Author

rafaqz commented Oct 25, 2023

ArchGDAL depending on ImageCore is a pretty big chunk of your total time. I'm not sure why we need that.

It pulls in:

ColorVectorSpace = "0.10"
Colors = "0.12"
FixedPointNumbers = "0.8"
MappedArrays = "0.2, 0.3, 0.4"
MosaicViews = "0.3.3"
OffsetArrays = "0.8, 0.9, 0.10, 0.11, 1.0.1"
PaddedViews = "0.5.8"
PrecompileTools = "1"
Reexport = "0.2, 1.0"

@rafaqz
Owner Author

rafaqz commented Feb 1, 2024

Compile time is pretty good here now with extensions. Closing.

rafaqz closed this as completed Feb 1, 2024