diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 64d9d06..79ea714 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-08-25T18:09:19","documenter_version":"1.6.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-08-25T18:19:46","documenter_version":"1.6.0"}}
\ No newline at end of file
diff --git a/dev/Contact.html b/dev/Contact.html
index 658588f..ce05f13 100644
--- a/dev/Contact.html
+++ b/dev/Contact.html
@@ -1,2 +1,2 @@
-Contact Developer · TrixiEnzyme.jl

Contact Developer

If you have questions, suggestions, or are interested in contributing, feel free to reach out to our developer, Junyi, via junyixu0@gmail.com.

+Contact Developer · TrixiEnzyme.jl

Contact Developer

If you have questions, suggestions, or are interested in contributing, feel free to reach out to our developer, Junyi, via junyixu0@gmail.com.

diff --git a/dev/GSoC.html b/dev/GSoC.html
index 80766ce..2988959 100644
--- a/dev/GSoC.html
+++ b/dev/GSoC.html
@@ -1,2 +1,2 @@
-GSoC · TrixiEnzyme.jl

Final Report: GSoC '24

  • Student Name: Junyi(@junyixu).
  • Organization: Trixi Framework community.
  • Mentors: Michael(@sloede) and Hendrik(@ranocha)
  • Project: Integrating the Modern CFD Package Trixi.jl with Compiler-Based Auto-Diff via Enzyme.jl
  • Project Link: https://github.com/junyixu/TrixiEnzyme.jl

Project Overview

Trixi.jl is a numerical simulation framework for conservation laws written in Julia. The integration of Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl offers the following benefits: facilitates rapid forward mode AD, enables reverse mode AD, supports cross-language AD, and critically, supports mutating operations and caching, on which Trixi.jl relies, to enhance the performance of both simulation runs and AD. The final deliverable will include as many of Trixi's advanced features as possible, such as adaptive mesh refinement, shock capturing, etc., showcasing the benefits of differentiable programming in Julia's ecosystem.

  • Forward Mode Automatic Differentiation (AD) for Discontinuous Galerkin Collocation Spectral Element Method (DGSEM): Implement forward mode automatic differentiation to enhance the calculation of derivatives in DG methods, improving computational efficiency and accuracy for various applications.
  • Reverse Mode Automatic Differentiation for DG.
  • Improve Performance:
    • Extract Parameters Passed to Enzyme: Implement a systematic approach to extract and manage parameters passed to Enzyme, ensuring optimal configuration and efficiency in the execution of AD tasks.
    • batchsize for Jacobians:
      • Optimize for Memory Bandwidth: Fine-tune the batch size in Jacobian computations to optimize the use of memory bandwidth, thus improving the overall performance and speed of the computations.
      • Automatically Pick batchsize
  • Explore Enzyme Custom Rules: Investigate and implement custom rules within the Enzyme AD framework to handle specific cases and operations that are not optimally managed by the default settings, enhancing the flexibility and capability of the AD processes.

Please note that the last step was planned but remains incomplete due to time constraints; it will be completed in the future if possible.

Constraints and Future Work

  • Make Reverse Mode AD Work with Polyester.jl: Address compatibility issues and integrate reverse mode AD with Polyester.jl for multithreading capabilities, aiming to enhance performance and scalability of the AD operations across different computing environments.
  • Integrate Enzyme with GPU Kernels: Extend the functionality of Enzyme by integrating it with GPU kernels, allowing AD operations to leverage the parallel processing power of GPUs.

Acknowledgments

The entire project, along with this blog website, is developed and maintained by Junyi (@junyixu). The whole project is under the guidance of two outstanding professors, Michael (@sloede) and Hendrik (@ranocha), from the Trixi Framework community.

The project also received support from other Julia contributors, including Benedict from the Trixi Framework community.

+GSoC · TrixiEnzyme.jl

Final Report: GSoC '24

  • Student Name: Junyi(@junyixu).
  • Organization: Trixi Framework community.
  • Mentors: Michael(@sloede) and Hendrik(@ranocha)
  • Project: Integrating the Modern CFD Package Trixi.jl with Compiler-Based Auto-Diff via Enzyme.jl
  • Project Link: https://github.com/junyixu/TrixiEnzyme.jl

Project Overview

Trixi.jl is a numerical simulation framework for conservation laws written in Julia. The integration of Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl offers the following benefits: facilitates rapid forward mode AD, enables reverse mode AD, supports cross-language AD, and critically, supports mutating operations and caching, on which Trixi.jl relies, to enhance the performance of both simulation runs and AD. The final deliverable will include as many of Trixi's advanced features as possible, such as adaptive mesh refinement, shock capturing, etc., showcasing the benefits of differentiable programming in Julia's ecosystem.

  • Forward Mode Automatic Differentiation (AD) for Discontinuous Galerkin Collocation Spectral Element Method (DGSEM): Implement forward mode automatic differentiation to enhance the calculation of derivatives in DG methods, improving computational efficiency and accuracy for various applications.
  • Reverse Mode Automatic Differentiation for DG.
  • Improve Performance:
    • Extract Parameters Passed to Enzyme: Implement a systematic approach to extract and manage parameters passed to Enzyme, ensuring optimal configuration and efficiency in the execution of AD tasks.
    • batchsize for Jacobians:
      • Optimize for Memory Bandwidth: Fine-tune the batch size in Jacobian computations to optimize the use of memory bandwidth, thus improving the overall performance and speed of the computations.
      • Automatically Pick batchsize
  • Explore Enzyme Custom Rules: Investigate and implement custom rules within the Enzyme AD framework to handle specific cases and operations that are not optimally managed by the default settings, enhancing the flexibility and capability of the AD processes.

Please note that the last step was planned but remains incomplete due to time constraints; it will be completed in the future if possible.

Constraints and Future Work

  • Make Reverse Mode AD Work with Polyester.jl: Address compatibility issues and integrate reverse mode AD with Polyester.jl for multithreading capabilities, aiming to enhance performance and scalability of the AD operations across different computing environments.
  • Integrate Enzyme with GPU Kernels: Extend the functionality of Enzyme by integrating it with GPU kernels, allowing AD operations to leverage the parallel processing power of GPUs.

Acknowledgments

The entire project, along with this blog website, is developed and maintained by Junyi (@junyixu). The whole project is under the guidance of two outstanding professors, Michael (@sloede) and Hendrik (@ranocha), from the Trixi Framework community.

The project also received support from other Julia contributors, including Benedict from the Trixi Framework community.

diff --git a/dev/api.html b/dev/api.html
index 03055e0..77a9228 100644
--- a/dev/api.html
+++ b/dev/api.html
@@ -1,5 +1,5 @@
-API reference · TrixiEnzyme.jl

The TrixiEnzyme Module

TrixiEnzymeModule
TrixiEnzyme

TrixiEnzyme.jl is a component package of the Trixi.jl ecosystem and integrates Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl for hyperbolic partial differential equations (PDEs). The integration of Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl offers the following benefits: facilitates rapid forward mode AD, enables reverse mode AD, supports cross-language AD, and critically, supports mutating operations and caching, on which Trixi.jl relies, to enhance the performance of both simulation runs and AD. The final deliverable will include as many of Trixi's advanced features as possible, such as adaptive mesh refinement, shock capturing, etc., showcasing the benefits of differentiable programming in Julia's ecosystem.

source

Module Index

Detailed API

TrixiEnzyme.enzyme_rhs!Method
enzyme_rhs!(du_ode::AbstractVector, u_ode::AbstractVector, mesh, equations, initial_condition, boundary_conditions, source_terms, solver, boundaries, _node_coordinates, cell_ids, node_coordinates, inverse_jacobian, _neighbor_ids, neighbor_ids, orientation, surface_flux_values, u)

The best thing to do for a user would be to separate out the things that you need to track through, make them arguments to the function, and then simply Duplicate on those.

source
TrixiEnzyme.jacobian_enzyme_forwardFunction
jacobian_enzyme_forward(semi::SemidiscretizationHyperbolic)

Uses the right-hand side operator of the semidiscretization semi and forward mode automatic differentiation to compute the Jacobian J of the semidiscretization semi at state u0_ode.


jacobian_enzyme_forward(f!::F, x::AbstractArray; N = pick_batchsize(x)) where F <: Function

Uses the function f! and forward mode automatic differentiation to compute the Jacobian J

Examples

julia> x = -1:0.5:1;
+API reference · TrixiEnzyme.jl

The TrixiEnzyme Module

TrixiEnzymeModule
TrixiEnzyme

TrixiEnzyme.jl is a component package of the Trixi.jl ecosystem and integrates Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl for hyperbolic partial differential equations (PDEs). The integration of Trixi.jl with Compiler-Based (LLVM level) automatic differentiation via Enzyme.jl offers the following benefits: facilitates rapid forward mode AD, enables reverse mode AD, supports cross-language AD, and critically, supports mutating operations and caching, on which Trixi.jl relies, to enhance the performance of both simulation runs and AD. The final deliverable will include as many of Trixi's advanced features as possible, such as adaptive mesh refinement, shock capturing, etc., showcasing the benefits of differentiable programming in Julia's ecosystem.

source

Module Index

Detailed API

TrixiEnzyme.enzyme_rhs!Method
enzyme_rhs!(du_ode::AbstractVector, u_ode::AbstractVector, mesh, equations, initial_condition, boundary_conditions, source_terms, solver, boundaries, _node_coordinates, cell_ids, node_coordinates, inverse_jacobian, _neighbor_ids, neighbor_ids, orientation, surface_flux_values, u)

The best thing to do for a user would be to separate out the things that you need to track through, make them arguments to the function, and then simply Duplicate on those.

source
TrixiEnzyme.jacobian_enzyme_forwardFunction
jacobian_enzyme_forward(semi::SemidiscretizationHyperbolic)

Uses the right-hand side operator of the semidiscretization semi and forward mode automatic differentiation to compute the Jacobian J of the semidiscretization semi at state u0_ode.


jacobian_enzyme_forward(f!::F, x::AbstractArray; N = pick_batchsize(x)) where F <: Function

Uses the function f! and forward mode automatic differentiation to compute the Jacobian J

Examples

julia> x = -1:0.5:1;
 julia> batch_size = 2;
 julia> jacobian_enzyme_forward(TrixiEnzyme.upwind!, x, N=batch_size)
 5×5 Matrix{Float64}:
@@ -7,9 +7,9 @@
   0.2  -0.2  -0.0  -0.0  -0.0
  -0.0   0.2  -0.2  -0.0  -0.0
  -0.0  -0.0   0.2  -0.2  -0.0
- -0.0  -0.0  -0.0   0.2  -0.2
source
TrixiEnzyme.jacobian_enzyme_forward_closureMethod
jacobian_enzyme_forward_closure(semi::SemidiscretizationHyperbolic; N = pick_batchsize(semi))

Same as jacobian_enzyme_forward but with closure

Notes

I resolved issues related to type instability caused by closures, which is a known limitation of Enzyme.

I utilized closures here because they simplify the reuse of memory buffers and temporary variables without the need for explicit storage. let blocks create a new hard scope and optionally introduce new local bindings.

source
TrixiEnzyme.jacobian_enzyme_reverse_closureMethod
jacobian_enzyme_reverse_closure(semi::SemidiscretizationHyperbolic)

Same as jacobian_enzyme_reverse but with closure

Warning

Enzyme.jl does not play well with Polyester.jl and there are no plans to fix this soon.

source
TrixiEnzyme.jacobian_enzyme_forward_closureMethod
jacobian_enzyme_forward_closure(semi::SemidiscretizationHyperbolic; N = pick_batchsize(semi))

Same as jacobian_enzyme_forward but with closure

Notes

I resolved issues related to type instability caused by closures, which is a known limitation of Enzyme.

I utilized closures here because they simplify the reuse of memory buffers and temporary variables without the need for explicit storage. let blocks create a new hard scope and optionally introduce new local bindings.

source
TrixiEnzyme.jacobian_enzyme_reverse_closureMethod
jacobian_enzyme_reverse_closure(semi::SemidiscretizationHyperbolic)

Same as jacobian_enzyme_reverse but with closure

Warning

Enzyme.jl does not play well with Polyester.jl and there are no plans to fix this soon.

source
TrixiEnzyme.pick_batchsizeFunction
pick_batchsize(x)
 pick_batchsize(semi)

Return a reasonable batch size for batched differentiation.

Arguments

  • x: AbstractArray
  • semi: SemidiscretizationHyperbolic in Trixi.jl

Notes

Inspired by https://github.com/EnzymeAD/Enzyme.jl/pull/1545/files

Warning

This function is experimental, and not part of the public API.

Examples

julia> pick_batchsize(rand(3))
 3
 
 julia> pick_batchsize(rand(20))
-11
source
+11
source
diff --git a/dev/examples.html b/dev/examples.html
index a7ce60f..5b90865 100644
--- a/dev/examples.html
+++ b/dev/examples.html
@@ -26,4 +26,4 @@
 J2 = jacobian_enzyme_forward(semi;N=1)
 J3 = jacobian_enzyme_reverse(semi;N=1)
-J1 == J2
+J1 == J2
diff --git a/dev/index.html b/dev/index.html
index 457fdd9..e54d8c9 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -16,4 +16,4 @@
 julia> @time jacobian_enzyme_forward(TrixiEnzyme.upwind!, x, N=11);
   0.000332 seconds (307 allocations: 410.453 KiB)

If you do not explicitly provide a chunk size, TrixiEnzyme will try to guess one for you based on your input vector:

julia> x = -1:0.01:1;
 julia> @time jacobian_enzyme_forward(TrixiEnzyme.upwind!, x);
-  0.000327 seconds (307 allocations: 410.453 KiB)

Benchmark for a 401x401 Jacobian of TrixiEnzyme.upwind! (lower is better): [figure: upwind benchmark]. Enzyme(@batch) means applying Polyester.@batch to middlebatches.

+ 0.000327 seconds (307 allocations: 410.453 KiB)

Benchmark for a 401x401 Jacobian of TrixiEnzyme.upwind! (lower is better): [figure: upwind benchmark]. Enzyme(@batch) means applying Polyester.@batch to middlebatches.

diff --git a/dev/notes.html b/dev/notes.html
index 7f952b9..b8ae69b 100644
--- a/dev/notes.html
+++ b/dev/notes.html
@@ -1,3 +1,3 @@
 Notes · TrixiEnzyme.jl

Proof of reverse mode AD

Take the polar coordinate system $x = r \cos \theta$, $y = r \sin \theta$ as an example and apply the chain rule to its computational graph (figure: compute_graph). We use $t$ to index the layers of the tree, with values 1, 2, 3.

\[\frac{\partial y_1}{\partial x_1} = \frac{\partial y_1}{\partial v_4} \frac{\partial v_4}{\partial x_1} \implies \frac{\partial y_1}{\partial x_1} = \sum_{i(t)} \frac{\partial y_1}{\partial v_{t,i}} {\color{red}\frac{\partial v_{t,i}}{\partial x_1}}\]

\[\frac{\partial v_4}{\partial x_1} = \frac{\partial v_4}{\partial v_{-1}} \frac{\partial v_{-1}}{\partial x_1} + \frac{\partial v_4}{\partial v_1} \frac{\partial v_1}{\partial x_1} \implies {\color{red}\frac{\partial v_{t,i}}{\partial x_1}} = \sum_{j(t)} \frac{\partial v_{t,i}}{\partial v_{t-1, j}} \frac{\partial v_{t-1, j}}{\partial x_1}\]

\[\frac{\partial y_1}{\partial x_1} = \sum_{i(t)} \frac{\partial y_1}{\partial v_{t,i}}
-\sum_{j(i)} \frac{\partial v_{t,i}}{\partial v_{t-1,j}} \frac{\partial v_{t-1,j}}{\partial x_1} = \sum_{j} \bar{v}_{t-1,j} \frac{\partial v_{t-1,j}}{\partial x_1}\]

where we define adjoint as

\[\bar{v}_{t-1, j} = \sum_i \bar{v}_{t, i} \frac{\partial v_{t,i}}{\partial v_{t-1,j}}.\]

For example (in this computing graph):

\[\bar{v}_0 = \bar{v}_{1,2} = \frac{\partial v_{2,1}}{\partial v_{1,2}} \bar{v}_{2,1} + \frac{\partial v_{2,2}}{\partial v_{1,2}} \bar{v}_{2,2}.\]

+\sum_{j(i)} \frac{\partial v_{t,i}}{\partial v_{t-1,j}} \frac{\partial v_{t-1,j}}{\partial x_1} = \sum_{j} \bar{v}_{t-1,j} \frac{\partial v_{t-1,j}}{\partial x_1}\]

where we define adjoint as

\[\bar{v}_{t-1, j} = \sum_i \bar{v}_{t, i} \frac{\partial v_{t,i}}{\partial v_{t-1,j}}.\]

For example (in this computing graph):

\[\bar{v}_0 = \bar{v}_{1,2} = \frac{\partial v_{2,1}}{\partial v_{1,2}} \bar{v}_{2,1} + \frac{\partial v_{2,2}}{\partial v_{1,2}} \bar{v}_{2,2}.\]
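
As a small worked instance of the adjoint recursion above, consider the same polar map with a labeling that is our own assumption and need not match the node names in the figure: inputs $r$ and $\theta$, intermediate nodes $w_1 = \cos\theta$ and $w_2 = \sin\theta$, and outputs $x = r w_1$, $y = r w_2$. Seeding the outputs with arbitrary weights $\bar{x} = a$ and $\bar{y} = b$ and sweeping backwards gives

\[\bar{w}_1 = \bar{x} \frac{\partial x}{\partial w_1} = a r, \qquad \bar{w}_2 = \bar{y} \frac{\partial y}{\partial w_2} = b r\]

\[\bar{r} = \bar{x} \frac{\partial x}{\partial r} + \bar{y} \frac{\partial y}{\partial r} = a \cos\theta + b \sin\theta\]

\[\bar{\theta} = \bar{w}_1 \frac{\partial w_1}{\partial \theta} + \bar{w}_2 \frac{\partial w_2}{\partial \theta} = -a r \sin\theta + b r \cos\theta\]

The adjoint of $r$ is a sum of two terms because $r$ feeds both outputs, which is the sum over $i$ in the definition of the adjoint. Setting $(a, b) = (1, 0)$ or $(0, 1)$ recovers the rows of the Jacobian of $(x, y)$ with respect to $(r, \theta)$, so one backward sweep is needed per output.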