Physics-Informed Neural Network for Predicting Charged Particle's 2D Phase Flow in an Electro-Magnetic Potential
(Final Project of APMA2070 @ Brown)
Implemented in JAX (0.4.6). Data generated by a Störmer-Verlet integrator (1500 training points + 300 testing points)
(CUDA 11.7.1, CuDNN 8.6.0 if using GPU/TPU)
Dependencies: JAX, numpy, matplotlib (for plotting), JAXopt (for the L-BFGS optimizer), pickle (for saving and loading models)
- Run locally (vanilla PINN for the inverse problem):

  `python3 train_PINN.py --inverseprob True --savefig True --savemodel True --lamda 5.0,5.0,1.0 --lbfgs 1 --adam 7000`
- Run locally (PINN with adversarial training for the inverse problem):

  `python3 train_AdPINN.py --inverseprob True --savefig True --savemodel True --lamda 5.0,5.0,1.0 --lbfgs 1 --adam 7000`
- Run locally (PINN that predicts the full phase flow instead of just the spatial locations):

  `python3 train_pfPINN.py --inverseprob True --adtrain False --savefig True --savemodel True --lamda 5.0,5.0,1.0 --lbfgs 1 --adam 7000`
- Run on Oscar with the SLURM script `pinn.sh`
The reference paper on Poisson Neural Networks (PNNs) primarily uses SympNets, PNNs, and PINNs to predict future states (within a plane, a special case) of a charged particle in an electro-magnetic field. Only the PINN part is provided in this repo; refer to Pengzhan Jin's Learner module (PyTorch) for SympNet and PNN, which was modified here to include more optimizers (e.g. L-BFGS) and loading utilities. The particle's motion is governed by the Lorentz force

$$m\ddot{X} = q\left(E + \dot{X} \times B\right),$$

where $X$ is the particle's position, $E$ is the electric field, and $B$ is the magnetic field.
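As a concrete illustration, the residual of this equation can be written per unit charge using the mass-to-charge ratio $mq$. Below is a minimal sketch; `E_field` and `B_z` are hypothetical placeholders, not the actual fields used in this repo:

```python
import jax.numpy as jnp

def E_field(x):
    # Hypothetical placeholder field, singular at the origin (see the
    # discussion of the singularity below); not the repo's actual field.
    return x / jnp.sum(x ** 2)

B_z = 1.0  # hypothetical constant out-of-plane magnetic field strength

def lorentz_residual(x, v, a, mq):
    # Residual of m*a = q*(E + v x B), divided through by q:
    # mq*a - (E + v x B). In the plane, v x (0, 0, B_z) = (v_y*B_z, -v_x*B_z).
    v_cross_B = jnp.array([v[1] * B_z, -v[0] * B_z])
    return mq * a - (E_field(x) + v_cross_B)
```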
For the inverse problem (identifying the value of the unknown coefficient $mq$), pass `--inverseprob True` to set $mq$ as a trainable `jax.Array`.
Since a PINN is essentially an approximation to a function governed by a system of PDEs or ODEs, and we are expected to predict the charged particle's phase flow (spatial location as a function of time), here we will use a fully connected neural network (justified by the Universal Approximation Theorem) that takes time $t$ as its input, subject to the initial conditions

$$\begin{align}\dot{X}(0) &= V_0 \\ X(0) &= X_0\end{align}$$

and the physics equation (the Lorentz force).
Here, we will use an MLP with 8 hidden layers of 30 neurons each, with an input size of 1 and an output size of 4.
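A minimal sketch of such a network in plain JAX (the repo's actual implementation may differ in initialization and activation):

```python
import jax
import jax.numpy as jnp

def init_mlp(key, sizes=(1, *[30] * 8, 4)):
    # Xavier-style initialization; biases could instead be drawn non-zero
    # (see the singularity discussion below).
    keys = jax.random.split(key, len(sizes) - 1)
    return [
        (jax.random.normal(k, (m, n)) * jnp.sqrt(2.0 / (m + n)), jnp.zeros(n))
        for k, (m, n) in zip(keys, zip(sizes[:-1], sizes[1:]))
    ]

def mlp(params, t):
    # t: time input of shape (1,) -> 4 outputs (e.g. x, y, v_x, v_y).
    h = jnp.atleast_1d(t)
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b
```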
A PINN is essentially a deep-learning-based method: it does not strictly abide by the physics laws, so conventional analytical and numerical techniques for solving the ODE (dynamics) may not apply. Some issues and challenges in applying a PINN to this specific problem include:
- Periodicity in the Phase Flow? The given trajectory somewhat resembles a Lissajous curve, but it is not one. Thus periodic featurization (e.g. defining the input as a periodic function to help with the training/approximation) may not work.
- Computing Elementwise Derivatives (up to the Second Order). Different implementations (auto-diff function wrappers) are included in the `utils.auto_diff` module. Based on preliminary evaluations of performance and cost, `vmap` + reshape was chosen over a handcrafted Hessian-vector product using forward-over-reverse mode (see the first sketch after this list).
- Excluding the Singularity at the Origin. The electric field is not completely 'source-less': its magnitude, along with the potential value, goes to infinity at the origin. Unlike numerical methods (e.g. ODE solvers, integrators, Picard iteration), a PINN cannot handle this abnormality (since `nan` propagates: `nan` $\cdot\ n =$ `nan`, much as $0 \cdot n = 0$). The essence of the problem is that the Dirac delta function (and possibly the Green's function?) cannot be defined so that it has all the desired properties while remaining consistent; compare the differences between the odd-dimensional and even-dimensional Huygens-Fresnel principle. Some possible solutions:
  - Initialize the network's biases as non-zeros to avoid initial spatial values at the origin.
  - Value clipping: replace zero states ($X = [0, 0]$, i.e. rows with all zeros) in the input with $\epsilon = 10^{-7}$ (must be differentiable for backpropagation; better be jittable for performance; see the clipping sketch after this list). However, this still does not prevent the predictions of $X$ from approaching zero, which suggests that a stronger constraint or penalty term is needed to stop the PINN from predicting trajectories that pass through the origin.
  - Mollification: a standard technique to 'smooth out' harsh boundaries (here, the spike at the singularity) by convolving the boundary with a special smooth function.
- Determining the Learning Rates for the Adversarial Training Process (if enabled). For the adversarial training in the inverse problem, we convert the original minimization problem into a min-max problem using two optimizers: one updates the model's parameters by minimizing the total loss, while the other updates $mq$ (the mass-to-charge ratio) by maximizing the residual losses ($L_{f1}$ and $L_{f2}$). The question then becomes choosing the respective learning rates, since the adversarial training process is difficult to converge (see the min-max sketch after this list).
- Determining the Weights for the Sum of the Loss Terms. This PINN has 4 different loss terms:
  - $\mathcal{L}_{pf} \cdot \lambda_0$: MSE($flow_{pred}$, $flow_{true}$)
  - $\mathcal{L}_{f1} \cdot \lambda_1$: residual loss based on $X_p$
  - $\mathcal{L}_{f2} \cdot \lambda_2$: residual loss based on $V_p$
  - $\mathcal{L}_{approx} \cdot \lambda_0$: MSE($V_p$, $X_p'$)

  It is difficult to balance the influence of these terms: the results so far are not desirable, and the model itself is difficult to fine-tune (the min-max sketch after this list also shows the weighted sum).
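Sketch for the elementwise-derivative challenge: the `vmap` + reshape approach, reusing the `mlp` sketch above. The repo's actual `utils.auto_diff` wrappers may differ:

```python
import jax

def phase_derivatives(params, ts):
    # ts: (N, 1) array of collocation times; mlp is the sketch above.
    f = lambda t: mlp(params, t)                   # R^1 -> R^4
    d1 = jax.vmap(jax.jacfwd(f))(ts)               # (N, 4, 1)
    d2 = jax.vmap(jax.jacfwd(jax.jacfwd(f)))(ts)   # (N, 4, 1, 1)
    n = ts.shape[0]
    # Flatten the trailing singleton axes: elementwise dX/dt and d2X/dt2.
    return d1.reshape(n, 4), d2.reshape(n, 4)
```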
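Sketch for the value-clipping idea: a differentiable, jittable replacement of all-zero input rows, assuming 2D positions stacked as an `(N, 2)` array:

```python
import jax
import jax.numpy as jnp

EPS = 1e-7

@jax.jit
def clip_origin(X):
    # X: (N, 2) predicted positions. Rows that are exactly [0, 0] are
    # replaced by EPS so the singular field is never evaluated at the
    # origin; jnp.where keeps the op jittable and differentiable.
    zero_rows = jnp.all(X == 0.0, axis=-1, keepdims=True)
    return jnp.where(zero_rows, EPS, X)
```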
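Sketch for the min-max adversarial training with the weighted loss sum, using `optax` for illustration (the repo may use different optimizer utilities); `L_pf`, `L_approx`, `L_f1`, `L_f2` are hypothetical stand-ins for the four loss terms, and the learning rates are illustrative:

```python
import jax
import optax

lam0, lam1, lam2 = 5.0, 5.0, 1.0  # weights, matching --lamda 5.0,5.0,1.0

# Hypothetical stand-ins for the four loss terms listed above.
L_pf = L_approx = lambda params, batch: 0.0
L_f1 = L_f2 = lambda params, mq, batch: 0.0

def total_loss(params, mq, batch):
    return (lam0 * (L_pf(params, batch) + L_approx(params, batch))
            + lam1 * L_f1(params, mq, batch)
            + lam2 * L_f2(params, mq, batch))

def residual_loss(mq, params, batch):
    return lam1 * L_f1(params, mq, batch) + lam2 * L_f2(params, mq, batch)

opt_min = optax.adam(1e-3)  # descends the total loss (network parameters)
opt_max = optax.adam(1e-4)  # ascends the residual loss (the coefficient mq)

@jax.jit
def train_step(params, mq, s_min, s_max, batch):
    # Minimization step on the network parameters.
    g = jax.grad(total_loss)(params, mq, batch)
    u, s_min = opt_min.update(g, s_min)
    params = optax.apply_updates(params, u)
    # Maximization step on mq: negate the gradient to ascend the residual.
    g_mq = jax.grad(residual_loss)(mq, params, batch)
    u, s_max = opt_max.update(-g_mq, s_max)
    mq = optax.apply_updates(mq, u)
    return params, mq, s_min, s_max

# Optimizer states are initialized once, e.g.:
# s_min, s_max = opt_min.init(params), opt_max.init(mq)
```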
Possible improvements and future work:
- Increase the network's depth or width.
- Incorporate self-adaptive weights for the different loss terms (in Point 4; could the original SA-PINN be improved?).
- Compare the adversarial training with Actor-Critic (PyTorch); more resources and references inside the link.
- Handle the issues addressed above (2-4).
Comments and Discussion on Adversarial Training vs. Actor-Critic Scheme for Inverse Problems using PINNs
Based on the current designs, in the inverse problem the model must learn its parameters along with the unknown coefficient $mq$:
- Update the model's trainable variables by minimizing the total loss.
- Update the unknown coefficient $mq$ by maximizing the residual losses (e.g. $L_{f1}$ and $L_{f2}$).
The original Actor-Critic model was designed to improve stability in training by providing a baseline that reduces the variance of the discounted reward function (which is substituted by an advantage function):
- Consider the unknown coefficient $mq$ or the phase flow (data) as the 'actor' (though there is no extra network/parametrization to predict this scalar value in this case).
- Consider the physics equation (the Lorentz force in this case) as the 'critic'.
The actual parameter-update scheme is still to be determined. A naive design would be:
- Update the model's trainable parameters by minimizing the total loss, while freezing the unknown coefficient $mq$ or freezing the predicted phase flow.
- Update the unknown coefficient $mq$ by minimizing the residual loss, where the 'baseline' in this case is implied by the given physics equation (the Lorentz force); see the sketch below.
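A minimal sketch of this naive alternation, reusing the hypothetical `total_loss` and `residual_loss` from the min-max sketch above; note that both phases now minimize:

```python
import jax
import optax

opt_net = optax.adam(1e-3)  # network-parameter phase (illustrative rate)
opt_mq = optax.adam(1e-4)   # mq phase (illustrative rate)

@jax.jit
def naive_ac_step(params, mq, s_net, s_mq, batch):
    # Phase 1: minimize the total loss w.r.t. the network parameters only;
    # differentiating w.r.t. params alone implicitly freezes mq.
    g = jax.grad(total_loss, argnums=0)(params, mq, batch)
    u, s_net = opt_net.update(g, s_net)
    params = optax.apply_updates(params, u)
    # Phase 2: minimize (not maximize) the residual loss w.r.t. mq; the
    # Lorentz-force residual itself serves as the implied 'baseline'.
    g_mq = jax.grad(residual_loss)(mq, params, batch)
    u, s_mq = opt_mq.update(g_mq, s_mq)
    mq = optax.apply_updates(mq, u)
    return params, mq, s_net, s_mq
```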
However, this naive attempt might not rule out the 'vanishing gradient' problem or the mode collapse indicated in the original SA-PINN paper, where the model always picks the unknown coefficient $mq$ that minimizes the residual loss, yielding extremely small gradients or simply suboptimal $mq$ values.
Results: Physics-Informed Neural Network (3000 Adam + 1 L-BFGS) vs. Poisson Neural Network (5000 Adam + 1 L-BFGS)

| PINN Predictions and Losses | PNN Predictions and Losses |
| --- | --- |
| (prediction and loss figures) | (prediction and loss figures) |