Separating the learning (optimisation) from the model #5

Open
patrikhuber opened this issue Jun 3, 2015 · 4 comments

@patrikhuber
Owner

I think it would be better to separate the optimisation from the actual learned model (the regressors).

Reason: what's not so nice right now is that we store the SupervisedDescentOptimiser to disk with all its type information about the solver (e.g. LinearRegressor<PartialPivLUSolver>), and we store the regularisation as well.
Both the solver and the regularisation are only relevant at training time, so there's no need to store them, and we shouldn't have to know the solver's type when loading a model.
It would also mean that a user who just wants to run the landmark detection (and not train a model) wouldn't need Eigen, since Eigen is only used in LinearRegressor::learn() and not in predict().

Regarding the regulariser, we could just choose to exclude it from the serialisation, but I don't think it's very intuitive to only serialise half of a class. I think there must be a better solution that solves the other shortcomings as well. Maybe we can even just make SupervisedDescentOptimiser::train() a free function and get rid of the class.

A related project, tiny-cnn, doesn't separate the model from the optimisation, but I kind of feel like we should.
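
Roughly, what I have in mind is something like the sketch below (all names such as SupervisedDescentModel and the exact train() signature are just placeholders, not a proposal for the final API):

```cpp
// Sketch only - names and signatures are placeholders, not a final API.
#include <opencv2/core/core.hpp>
#include <functional>
#include <utility>
#include <vector>

// The "model" part: nothing but the learned regressors.
// predict() works purely on cv::Mat - no Eigen, no solver type needed.
class SupervisedDescentModel
{
public:
    explicit SupervisedDescentModel(std::vector<cv::Mat> regressors)
        : regressors(std::move(regressors))
    {
    }

    // h extracts features for the current parameters (e.g. HoG around the landmarks).
    cv::Mat predict(cv::Mat parameters, std::function<cv::Mat(const cv::Mat&)> h) const
    {
        for (const auto& regressor : regressors)
        {
            parameters = parameters - regressor * h(parameters); // cascaded update (details elided)
        }
        return parameters;
    }

private:
    std::vector<cv::Mat> regressors; // the only thing that would get serialised
};

// The optimisation part as a free function: solver and regulariser only matter
// at training time and never end up in the stored model.
template <class Solver, class Regulariser>
SupervisedDescentModel train(const cv::Mat& training_features, const cv::Mat& ground_truth,
                             Solver solver,           // e.g. a partial-pivoting LU solver (Eigen)
                             Regulariser regulariser) // e.g. lambda * I, not serialised
{
    std::vector<cv::Mat> regressors;
    // ... run the cascaded learning on training_features / ground_truth here,
    // using solver and regulariser - this is the only place that needs Eigen ...
    return SupervisedDescentModel(std::move(regressors));
}
```

Serialising a model would then mean writing out only that vector of matrices, and loading it wouldn't require knowing which solver or regularisation was used during training.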

@songminglong

In my opinion, you just need to store the regressor's matrix (that is, x).
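
For illustration only (assuming the learned regressors are kept in a std::vector<cv::Mat>; the function name is made up), that could be as simple as:

```cpp
#include <opencv2/core/core.hpp>
#include <cstddef>
#include <string>
#include <vector>

// Illustration: write nothing but the learned regressor matrices to disk.
void save_regressors(const std::vector<cv::Mat>& regressors, const std::string& filename)
{
    cv::FileStorage fs(filename, cv::FileStorage::WRITE);
    fs << "num_regressors" << static_cast<int>(regressors.size());
    for (std::size_t i = 0; i < regressors.size(); ++i)
    {
        fs << ("x_" + std::to_string(i)) << regressors[i]; // one matrix per cascade level
    }
    fs.release();
}
```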

@Thanh-Binh

I totally agree with Xiaohu that we should separate them and store only the regressors.

@andyhx

andyhx commented Jul 8, 2015

@patrikhuber, when training, is it the case that more images are always better? I find that the result from training on 300 images is better than on 800. Could training on a large number of images cause overfitting? Roughly how many training images give the best result, and what were your test results? Is your test set from the ibug website or from somewhere else? Thanks so much.

@patrikhuber
Owner Author

@andyhx: I'll be glad to answer your question, but could you please open a separate ticket for it? This is very off-topic here. Thanks.
