[ RMSProp ] add rms prop optimizer #2587
base: main
Conversation
This PR implements the RMSProp optimizer. A unit test for RMSProp is still required.
. Implementation of the RMSProp class
. Separate the optimizer properties into optimizer common properties, plus some fixes
. Add an enum for RMSProp
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2587. Please follow the 1 commit / 1 PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before the review process by reviewers can start. If you are a new member joining this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
cibot: @jijoongmoon, the build check could not be completed because one of the checkers did not finish. To find out the reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2587-202405171534300.80631399154663-f060dd3678e0f993ad16229252b1145c6b39f2e7/.
Tensor denom = wv.apply<float>(sqrtFloat<float>);
denom.add_i(epsilon);

wv.divide(denom, x_grad);
It seems to compute
W = W - alpha * wv / (sqrt(wv) + epsilon)
which should be
W = W - alpha * x_grad / (sqrt(wv) + epsilon)
so that it works properly.
Suggested change:
- wv.divide(denom, x_grad);
+ x_grad.divide_i(denom);
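For illustration, here is a minimal plain-C++ sketch of the element-wise update being discussed (this is not the nntrainer Tensor API; the names mirror the diff, with wv as the running average of squared gradients and x_grad as the current gradient):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One RMSProp step on raw float buffers, for illustration only.
// wv is the running average of squared gradients, x_grad the current gradient.
void rmsprop_step(std::vector<float> &weight, const std::vector<float> &x_grad,
                  std::vector<float> &wv, float rho, float epsilon, float lr) {
  for (std::size_t i = 0; i < weight.size(); ++i) {
    wv[i] = rho * wv[i] + (1.0f - rho) * x_grad[i] * x_grad[i];
    const float denom = std::sqrt(wv[i]) + epsilon;
    // The point of the review: the *gradient* is divided by denom, not wv itself.
    weight[i] -= lr * x_grad[i] / denom;
  }
}
```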
It seems eunju is right.
Furthermore, unlike the Adam case, torch's RMSprop seems to work in the same way as the other frameworks do.
Please refer to https://pytorch.org/docs/stable/generated/torch.optim.RMSprop.html
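For reference, the update described in that PyTorch documentation (plain RMSprop, without momentum or centering) is, in the usual notation:

```latex
v_t = \alpha \, v_{t-1} + (1 - \alpha) \, g_t^2
\theta_t = \theta_{t-1} - \mathrm{lr} \cdot \frac{g_t}{\sqrt{v_t} + \epsilon}
```

so the current gradient g_t, not the running average v_t, appears in the numerator, consistent with the suggestion above.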
I think the CI-related error is a rebase issue. In my local environment, when I built with 2584 as the base, it passed build & test without any problems. Other than that, LGTM!!
if (opt && istrequal(opt->getType(), "rmsprop")) {
  std::string rmsprop = "rmsprop";
  model_file.write(rmsprop.c_str(), 7);
  for (auto iter = model_graph.cbegin(); iter != model_graph.cend();
- The code that saves the RMSProp optimizer variables looks like it could be integrated with the Adam optimizer's save code.
- Loading the RMSProp optimizer variables still needs to be handled.
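A possible shape for such a shared save/load path, sketched with standard iostreams only (everything except model_file.write and the type string is hypothetical, not the existing nntrainer code):

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Sketch: save/load the optimizer type with a length prefix, so Adam and
// RMSProp can share one code path instead of hard-coding the length (7).
void save_opt_type(std::ofstream &model_file, const std::string &opt_type) {
  const uint32_t len = static_cast<uint32_t>(opt_type.size());
  model_file.write(reinterpret_cast<const char *>(&len), sizeof(len));
  model_file.write(opt_type.c_str(), len);
}

std::string load_opt_type(std::ifstream &model_file) {
  uint32_t len = 0;
  model_file.read(reinterpret_cast<char *>(&len), sizeof(len));
  std::string opt_type(len, '\0');
  model_file.read(&opt_type[0], len);
  return opt_type;
}
```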
@@ -85,6 +85,7 @@ typedef enum {
typedef enum {
ML_TRAIN_OPTIMIZER_TYPE_ADAM = 0, /**< Adam Optimizer */
ML_TRAIN_OPTIMIZER_TYPE_SGD = 1, /**< Stochastic Gradient Descent Optimizer */
ML_TRAIN_OPTIMIZER_TYPE_RMSPROP = 2, /** rmsprop */
Please match the format of the other optimizer descriptions:
rmsprop -> RMSProp Optimizer
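That is, something along these lines (note the `<` in the Doxygen marker as well, matching the neighboring entries; the enum tag name and the omitted members are assumptions for illustration):

```cpp
typedef enum {
  ML_TRAIN_OPTIMIZER_TYPE_ADAM = 0,    /**< Adam Optimizer */
  ML_TRAIN_OPTIMIZER_TYPE_SGD = 1,     /**< Stochastic Gradient Descent Optimizer */
  ML_TRAIN_OPTIMIZER_TYPE_RMSPROP = 2, /**< RMSProp Optimizer */
  /* ... remaining members unchanged ... */
} ml_train_optimizer_type_e; /* tag name assumed for illustration */
```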
class RMSProp : public Optimizer {
public:
  /**
   * @brief Construct a new Adam object
Adam -> RMSProp
  ? context.getGradient()
  : empty_tensor;

if (x_grad.empty())
It seems that this can be merged into the ternary operator above.
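A rough, self-contained illustration of the idea (stand-in types only; to_fp32 and the flag are hypothetical placeholders, not nntrainer calls, since only a fragment is visible in the diff):

```cpp
#include <vector>

// Stand-in type to illustrate the review suggestion; not the nntrainer API.
using FakeTensor = std::vector<float>;

// Hypothetical conversion used as the fallback branch.
FakeTensor to_fp32(const FakeTensor &t) { return t; }

FakeTensor pick_grad(const FakeTensor &grad, bool grad_is_fp32) {
  // Instead of selecting grad or an empty tensor and then patching it up
  // with an empty() check afterwards, the fallback lives in the ternary:
  return grad_is_fp32 ? grad : to_fp32(grad);
}
```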
};

x_grad = wv.apply<float>(sqrtEps, x_grad);
context.applyGradient(context.getLearningRate(), x_grad);
RunOptimizerContext does not have a function with the prototype applyGradient(double, Tensor), as the CI reported.