[Wait for #2591][Mixed] Support MSELoss - Mixed Precision #2604

Open: wants to merge 13 commits into base: main

Commits on May 23, 2024

  1. [ Weight ] Add Var32 Tensor in Weight.

    We add a Var32 tensor when the weight variable is not full
    precision (FP32). This enables the weight update to run in full
    precision, and only the apply-gradient process uses this tensor.
    Therefore, the lifespan of this tensor should be "ApplyGradient".
    (A sketch of the idea follows this entry.)
    
    . Modify TensorPool to generate the Weight considering mixed precision.
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 9efb873
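    
    As a rough illustration of the var32 idea (a hypothetical sketch, not
    NNTrainer's actual Weight class), an FP16 weight keeps an FP32 master
    copy and applies the gradient in full precision before casting back:
    
    ```cpp
    #include <cstddef>
    #include <vector>
    
    using fp16 = _Float16; // GCC/Clang extension standing in for an FP16 type
    
    struct WeightSketch {
      std::vector<fp16> var;    // FP16 variable used in forward/backward
      std::vector<float> var32; // FP32 master copy; lifespan ~ "ApplyGradient"
    
      explicit WeightSketch(std::size_t n) : var(n), var32(n) {}
    
      // Apply an FP16 gradient with a full-precision weight update.
      void applyGradient(const std::vector<fp16> &grad, float lr) {
        for (std::size_t i = 0; i < var.size(); ++i) {
          var32[i] -= lr * static_cast<float>(grad[i]); // FP32 update
          var[i] = static_cast<fp16>(var32[i]);         // refresh the FP16 copy
        }
      }
    };
    ```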
  2. [ Mixed ] Create weight with var32 tensor

    This PR creates the FP32 variable tensor when we create the Weight and
    the Optimizer Weight.
    
    . Update the manager to create the Weight with a var32 tensor
    requested from the weight pool.
    . Update the weight requests with the Weight Spec and the var, grad and
    var32 tensors which were already created.
    . Add cloning a Tensor with a specific type in tensor.h (sketched
    below).
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 717dba6
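    
    A hedged sketch of what "clone Tensor with a specific type" boils down
    to: widening an FP16 buffer into a fresh FP32 buffer, as the manager
    would when materializing var32. cloneAsFloat32 is an illustrative free
    function, not the tensor.h signature:
    
    ```cpp
    #include <cstddef>
    #include <vector>
    
    using fp16 = _Float16; // GCC/Clang extension
    
    // Illustrative stand-in for cloning a tensor into a wider data type.
    std::vector<float> cloneAsFloat32(const std::vector<fp16> &src) {
      std::vector<float> dst(src.size());
      for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = static_cast<float>(src[i]); // widen element-wise
      return dst;
    }
    ```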
  3. [ Layers ] Update Layers to support FP16

    This PR enables FP16 support for the layers below:
    
    . input layer
    . mse loss layer
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 6dc8a05
  4. [ Test ] Mixed Precision Test Case

    This PR includes the mixed precision test case.
    
    . Input - FC - MSE
     : "batch_size=2", "model_tensor_type=FP16-FP16", "loss_scale=128"
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: b86d833
  5. [ Optimizer ] Update Optimizer / Adam to support Mixed training

    This commit modifies the apply-gradient step in the optimizer.
    We do not need to save the optimizer variables in the weight type; only
    the optimizer needs them, and we should update the weight with full
    precision to maintain accuracy. Therefore, the var32 tensors for the
    optimizer variables are removed. (A sketch of the resulting Adam step
    follows this entry.)
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 47038cb
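    
    To make the intent concrete, here is a minimal Adam step under the
    scheme this commit describes: m and v stay in FP32 inside the
    optimizer, and the update lands on the FP32 master weights.
    AdamStateSketch and step are assumed names, not NNTrainer's Adam API:
    
    ```cpp
    #include <cmath>
    #include <cstddef>
    #include <vector>
    
    // Illustrative Adam state; not NNTrainer's Adam class.
    struct AdamStateSketch {
      std::vector<float> m, v; // optimizer variables in full precision
      float beta1 = 0.9f, beta2 = 0.999f, eps = 1e-8f;
    
      explicit AdamStateSketch(std::size_t n) : m(n, 0.f), v(n, 0.f) {}
    
      // t is the 1-based step count; var32/grad32 are FP32 buffers.
      void step(std::vector<float> &var32, const std::vector<float> &grad32,
                float lr, int t) {
        for (std::size_t i = 0; i < var32.size(); ++i) {
          m[i] = beta1 * m[i] + (1.f - beta1) * grad32[i];
          v[i] = beta2 * v[i] + (1.f - beta2) * grad32[i] * grad32[i];
          float mhat = m[i] / (1.f - std::pow(beta1, float(t)));
          float vhat = v[i] / (1.f - std::pow(beta2, float(t)));
          var32[i] -= lr * mhat / (std::sqrt(vhat) + eps); // FP32 update
        }
      }
    };
    ```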
  6. [ Tensor ] add is_NaN check in Tensor

    This PR adds an is_NaN function to check whether a tensor contains NaN
    values. This is for checking for NaN during mixed precision training
    (see the sketch after this entry).
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 0e7884f
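    
    A minimal sketch of such a check (hasNaN below is an assumed name):
    std::isnan on each element promoted to float is enough, since NaN is
    the only value for which it returns true:
    
    ```cpp
    #include <cmath>
    #include <vector>
    
    using fp16 = _Float16; // GCC/Clang extension
    
    // Illustrative NaN scan over an FP16 buffer.
    bool hasNaN(const std::vector<fp16> &data) {
      for (fp16 x : data)
        if (std::isnan(static_cast<float>(x)))
          return true;
      return false;
    }
    ```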
  7. [ Context ] Add loss scale in Context & using mse loss

    This PR adds a loss scale parameter to the run context and uses it to
    update the MSE loss.
    
    . Add a loss scale parameter to the RunLayerContext constructor.
    . Add an applyLossScale function to update the returned derivative in
    the loss layer (sketched below).
    . Change the MSE loss layer to apply the loss scale to the returned
    derivative.
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 36de26e
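    
    As a sketch of applying the loss scale to the returned derivative (the
    function shape is assumed, not the actual applyLossScale signature):
    the loss layer multiplies the derivative it hands back by the scale so
    small FP16 gradients do not underflow during backpropagation:
    
    ```cpp
    #include <vector>
    
    // Illustrative: scale the derivative the loss layer returns.
    void applyLossScaleSketch(std::vector<float> &ret_deriv, float scale) {
      for (float &d : ret_deriv)
        d *= scale; // e.g. scale = 128, as in the test case above
    }
    ```
    
    In standard loss scaling, the matching unscale (dividing by the same
    factor) happens before the gradients reach the weight update, so the
    scale never changes the effective step size.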
  8. [ Mixed Precision ] Enable Mixed Precision

    This PR enables mixed precision training. For now, only FP16-FP32
    is considered. Additional test cases will be added.
    
    . Add getSortedLayerIdx to set the graph order for forwarding.
    . Change clip_weights to lazy_apply_weights to cover both cases.
    . Add forwarding_op to re-run forwarding from the layer whose gradient
    has NaN.
    . Add a while loop to re-run backwarding after resetting the loss scale
    (the retry loop is sketched after this entry).
    . Add setLossScale in RunLayerContext.
    . Add a gradient check when mixed precision is enabled.
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: c3f54ec
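    
    The bullets above describe what is essentially the standard dynamic
    loss-scaling retry loop; a hedged sketch with assumed names:
    
    ```cpp
    #include <functional>
    
    // fwd_bwd_valid runs forwarding/backwarding at the given scale and
    // reports whether all gradients came out finite (both names assumed).
    void trainStepSketch(float &loss_scale,
                         const std::function<bool(float)> &fwd_bwd_valid) {
      while (!fwd_bwd_valid(loss_scale))
        loss_scale /= 2.0f; // NaN in a gradient: reset the scale, re-run
      // Gradients are valid: lazily apply them to the FP32 master weights.
    }
    ```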
  9. [ Tensor ] Add infinity check in Tensor

    This PR adds an infinity value check for Tensor data.
    . Rename hasNaN to isValid.
    . Add an infinity check in the isValid function; it now checks for
    both NaN and Inf (see the sketch after this entry).
    . Modify blas_avx and blas_neon to perform the same check.
    . Modify the graph and model to check is_valid rather than has_nan.
    . Add a unit test for the isValid function.
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 3f3a1bd
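    
    With the rename, one call covers both failure modes: std::isfinite is
    false for NaN and for +/-Inf. A scalar sketch (the AVX/NEON paths
    would test the same predicate element-wise; isValidSketch is an
    assumed name):
    
    ```cpp
    #include <cmath>
    #include <vector>
    
    using fp16 = _Float16; // GCC/Clang extension
    
    // Illustrative scalar fallback for the validity scan.
    bool isValidSketch(const std::vector<fp16> &data) {
      for (fp16 x : data)
        if (!std::isfinite(static_cast<float>(x)))
          return false; // NaN or +/-Inf found
      return true;
    }
    ```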
  10. [ MSE ] Fix for better MSE loss precision

    This PR changes the loss computation to use full precision rather than
    half precision to maintain accuracy (sketched after this entry).
    
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: eb2061f
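    
    A minimal sketch of the fix (assumed names): cast each FP16 element up
    to FP32 and accumulate the squared error in a full-precision
    accumulator, so the reduction does not lose low-order bits:
    
    ```cpp
    #include <cstddef>
    #include <vector>
    
    using fp16 = _Float16; // GCC/Clang extension
    
    // Illustrative: FP16 inputs, FP32 arithmetic and accumulation.
    float mseFullPrecision(const std::vector<fp16> &pred,
                           const std::vector<fp16> &label) {
      float acc = 0.0f;
      for (std::size_t i = 0; i < pred.size(); ++i) {
        float d = static_cast<float>(pred[i]) - static_cast<float>(label[i]);
        acc += d * d;
      }
      return acc / static_cast<float>(pred.size()); // assumes non-empty input
    }
    ```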
  11. [ TEST ] Add Torch Mixed Precision Model Test

    This PR enables the mixed precision unit test with a Torch model.
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: 0a4561e
  12. [ TEST ] add torch input and output test data for mixed precision

    This PR adds Torch mixed precision golden data generation and the
    input and output data for the test.
    
    . Some fixes to the tests.
    
    Resolves:
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: jijoong.moon <[email protected]>
    jijoongmoon authored and DonghakPark committed May 23, 2024
    Commit: ca3375e
  13. [Mixed] Support MSELoss - Mixed Precision

    Enable mixed precision in the MSE loss layer.
    
    **Self evaluation:**
    1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
    2. Run test:	 [X]Passed [ ]Failed [ ]Skipped
    
    Signed-off-by: Donghak PARK <[email protected]>
    DonghakPark committed May 23, 2024
    Commit: 2f6b205