
L2 regularization for constant optimization #23

Open
Smantii opened this issue Oct 18, 2024 · 6 comments

@Smantii
Contributor

Smantii commented Oct 18, 2024

Hi,

Is there a way to penalize the magnitude of the constants (via, e.g., L2 regularization)? I am trying to fit a SymbolicRegressor on some noisy data and I sometimes get very large values for some constants.
I looked inside the library and it seems possible to choose adamax and amsgrad as the constant optimizer:

def __init_sgd_update_rule(self):
    if self.sgd_update_rule == 'constant':
        return op.ConstantUpdateRule(0, self.sgd_learning_rate)
    elif self.sgd_update_rule == 'momentum':
        return op.MomentumUpdateRule(0, self.sgd_learning_rate, self.sgd_beta)
    elif self.sgd_update_rule == 'rmsprop':
        return op.RmsPropUpdateRule(0, self.sgd_learning_rate, self.sgd_beta, self.sgd_epsilon)
    elif self.sgd_update_rule == 'adamax':
        return op.AdaMaxUpdateRule(0, self.sgd_learning_rate, self.sgd_beta, self.sgd_beta2)
    elif self.sgd_update_rule == 'amsgrad':
        return op.AmsGradUpdateRule(0, self.sgd_learning_rate, self.sgd_epsilon, self.sgd_beta, self.sgd_beta2)

    raise ValueError('Unknown update rule {}'.format(self.sgd_update_rule))

In the PyTorch implementation of these optimizers there is a `weight_decay` parameter for $L^2$ regularization, and I was wondering whether the same is possible in pyoperon.
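
For reference, here is a minimal sketch of what `weight_decay` amounts to for a plain SGD step, i.e. an $L^2$ penalty folded into the gradient; this is generic NumPy code, not pyoperon or PyTorch API:

```python
import numpy as np

def sgd_step_with_weight_decay(theta, grad, lr=1e-2, weight_decay=1e-4):
    """One plain SGD update with an L2 penalty folded into the gradient.

    Adding weight_decay * theta to the gradient is equivalent to adding
    (weight_decay / 2) * ||theta||^2 to the loss, which is how the
    (non-decoupled) weight_decay option is usually implemented.
    """
    grad = grad + weight_decay * theta
    return theta - lr * grad

# toy example: constants of an expression and the gradient of the fitting loss
theta = np.array([10.0, -250.0, 0.3])
grad = np.array([0.1, 0.05, -0.2])
theta = sgd_step_with_weight_decay(theta, grad)
```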

@foolnotion
Member

Hi,

Regularization is currently not supported, but we can look into adding it. Is there an official paper or a link to the pytorch implementation?

@Smantii
Contributor Author

Smantii commented Oct 19, 2024

Hi,

here are some links to the pytorch docs:

* [adamax](https://pytorch.org/docs/stable/generated/torch.optim.Adamax.html)
* [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)

@gkronber
Member

Note that regularization can be misleading or unwanted for expressions with nonlinear parameters that GP might produce, e.g. $f(x) / \theta$ or $f(x) \exp(-\theta)$.

Regularizing the raw parameters may give an evolutionary advantage to nonlinear transformations of parameters that allow 'virtually' large parameters.
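
To make the concern concrete: with division, a tiny raw parameter evades an $L^2$ penalty while producing a huge effective coefficient,

$$
g(x) = \frac{f(x)}{\theta}, \qquad \theta = 10^{-6} \;\Rightarrow\; \frac{1}{\theta} = 10^{6}, \qquad \theta^{2} = 10^{-12},
$$

so the penalty on the raw parameter is negligible even though $f(x)$ is effectively scaled by $10^{6}$.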

@Smantii
Contributor Author

Smantii commented Oct 19, 2024

You are right, but this is only a problem for specific primitives, right? In other words, if the primitive set is, e.g.,

allowed_symbols = "add,sub,mul,sin,cos,constant,variable"

there is no such problem.
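
For illustration, a minimal sketch of such a configuration, assuming pyoperon's scikit-learn wrapper accepts `allowed_symbols` as a constructor argument (names may differ between versions):

```python
from pyoperon.sklearn import SymbolicRegressor

# Sketch only: without div or exp in the primitive set, constructs like
# f(x) / theta or f(x) * exp(-theta) cannot be built, so penalizing the
# raw constants is harder to game.
reg = SymbolicRegressor(
    allowed_symbols="add,sub,mul,sin,cos,constant,variable",
)
```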

@gkronber
Member

Yes, in this case this is less of a problem.

@foolnotion
Member

By the way, here are the implementations available in Operon:
https://github.com/heal-research/operon/blob/main/include/operon/optimizer/solvers/sgd.hpp

Currently not all are exposed in the Python wrapper, but it's trivial to add them.
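
For context, the wrapper snippet quoted earlier suggests the update rule is selected through constructor arguments roughly as below; this is a sketch only, and the exact argument names (and how the SGD optimizer is enabled) may differ between versions:

```python
from pyoperon.sklearn import SymbolicRegressor

# Sketch only: sgd_update_rule, sgd_learning_rate, sgd_beta, sgd_beta2 appear
# in the wrapper code quoted above and are assumed here to be constructor
# arguments of SymbolicRegressor.
reg = SymbolicRegressor(
    sgd_update_rule="amsgrad",
    sgd_learning_rate=0.01,
    sgd_beta=0.9,
    sgd_beta2=0.999,
)
```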
