[Sparse Bug] Test and sparse_remote_update cannot co-exist, crash trainer if necessary #891

Merged
merged 4 commits into from
Dec 16, 2016
Changes from 3 commits
6 changes: 6 additions & 0 deletions paddle/trainer/Tester.cpp
@@ -46,6 +46,12 @@ Tester::Tester(const std::shared_ptr<TrainerConfigHelper>& config,
      gradientMachine_(gradientMachine),
      parameterUpdater_(parameterUpdater),
      testDataProvider_(testDataProvider) {
  if (config_->getOptConfig().use_sparse_remote_updater()) {
    LOG(FATAL) << "It's prohibited to set sparse_remote_update "
               << "in some layers if testing will be under going "
               << "in the middle of training. You can do testing "
               << "within separate process.";
Collaborator

It's prohibited to set sparse_remote_update when doing train and test jobs in the same process. You could run paddle --job=test in a separate process.

Collaborator

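You could use Grammarly to check the grammar. However, if the error is raised here, won't it also be raised when testing runs in a separate process?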

Contributor Author

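The test process requires that the sparse configuration not be used, so an error during test means the configuration is wrong. This change should therefore be fine.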

Contributor Author

It's prohibited to set sparse_remote_update when doing train and test jobs in the same process. You could run paddle --job=test in a separate process.

Followed your comment.
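For readers following the thread, here is a minimal, self-contained sketch of the guard with the wording suggested in review. It is not the PR's actual code: `OptConfigSketch`, `checkTesterConfig`, and `ensureTesterConfig` are hypothetical names introduced for illustration, and an exception stands in for `LOG(FATAL)` so the check can be exercised without aborting the process.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the optimization config; only the single
// flag read in the diff is modeled here.
struct OptConfigSketch {
  bool use_sparse_remote_updater = false;
};

// Returns the error message suggested in review, or an empty string
// when the configuration is acceptable for an in-process Tester.
inline std::string checkTesterConfig(const OptConfigSketch& opt) {
  if (opt.use_sparse_remote_updater) {
    return "It's prohibited to set sparse_remote_update when doing train "
           "and test jobs in the same process. You could run paddle "
           "--job=test in a separate process.";
  }
  return "";
}

// Assumption: throwing replaces LOG(FATAL) so the sketch stays testable.
inline void ensureTesterConfig(const OptConfigSketch& opt) {
  const std::string error = checkTesterConfig(opt);
  if (!error.empty()) {
    throw std::runtime_error(error);
  }
}
```

Calling `ensureTesterConfig` at the top of a constructor would mirror where the diff places its check: the configuration is rejected before any evaluator or parameter client is created.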

  }
  testEvaluator_.reset(gradientMachine_->makeEvaluator());
  if (intconfig_->distributeTest) {
    testParameterClient_.reset(new ParameterClient2(true));