
L2 regularization for constant optimization #23

Open
Smantii opened this issue Oct 18, 2024 · 6 comments

@Smantii
Contributor

Smantii commented Oct 18, 2024

Hi,

Is there a way to penalize the magnitude of the constants (via, e.g., L2 regularization)? I am trying to fit a SymbolicRegressor on some noisy data and I sometimes get very large values for some constants.
I looked inside the library and it seems possible to choose adamax and amsgrad as the constant optimizer:

def __init_sgd_update_rule(self):
    if self.sgd_update_rule == 'constant':
        return op.ConstantUpdateRule(0, self.sgd_learning_rate)
    elif self.sgd_update_rule == 'momentum':
        return op.MomentumUpdateRule(0, self.sgd_learning_rate, self.sgd_beta)
    elif self.sgd_update_rule == 'rmsprop':
        return op.RmsPropUpdateRule(0, self.sgd_learning_rate, self.sgd_beta, self.sgd_epsilon)
    elif self.sgd_update_rule == 'adamax':
        return op.AdaMaxUpdateRule(0, self.sgd_learning_rate, self.sgd_beta, self.sgd_beta2)
    elif self.sgd_update_rule == 'amsgrad':
        return op.AmsGradUpdateRule(0, self.sgd_learning_rate, self.sgd_epsilon, self.sgd_beta, self.sgd_beta2)

    raise ValueError('Unknown update rule {}'.format(self.sgd_update_rule))

In the PyTorch implementation of these optimizers there is a `weight_decay` parameter for $L^2$ regularization, and I was wondering whether the same is possible in pyoperon.
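
For reference, here is a minimal sketch of what `weight_decay` amounts to for a plain SGD step, i.e. an $L^2$ penalty folded into the gradient; this is generic NumPy code, not pyoperon or PyTorch API:

```python
import numpy as np

def sgd_step_with_weight_decay(theta, grad, lr=1e-2, weight_decay=1e-4):
    """One plain SGD update with an L2 penalty folded into the gradient.

    Adding weight_decay * theta to the gradient is equivalent to adding
    (weight_decay / 2) * ||theta||^2 to the loss, which is how the
    (non-decoupled) weight_decay option is usually implemented.
    """
    grad = grad + weight_decay * theta
    return theta - lr * grad

# toy example: constants of an expression and the gradient of the fitting loss
theta = np.array([10.0, -250.0, 0.3])
grad = np.array([0.1, 0.05, -0.2])
theta = sgd_step_with_weight_decay(theta, grad)
```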

@foolnotion
Member

Hi,

Regularization is currently not supported, but we can look into adding it. Is there an official paper or a link to the pytorch implementation?

@Smantii
Contributor Author

Smantii commented Oct 19, 2024

Hi,

here are some links to the pytorch docs:

* [adamax](https://pytorch.org/docs/stable/generated/torch.optim.Adamax.html)
* [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)

@gkronber
Member

Note that regularization can be misleading or unwanted for expressions with nonlinear parameters that GP might produce, e.g. $f(x) / \theta$ or $f(x) \exp(-\theta)$.

Regularizing the raw parameters may give an evolutionary advantage to nonlinear transformations of parameters that allow 'virtually' large parameters.
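
To make the concern concrete: with division, a tiny raw parameter evades an $L^2$ penalty while producing a huge effective coefficient,

$$
g(x) = \frac{f(x)}{\theta}, \qquad \theta = 10^{-6} \;\Rightarrow\; \frac{1}{\theta} = 10^{6}, \qquad \theta^{2} = 10^{-12},
$$

so the penalty on the raw parameter is negligible even though $f(x)$ is effectively scaled by $10^{6}$.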

@Smantii
Contributor Author

Smantii commented Oct 19, 2024

You are right, but this is only a problem for specific primitives, right? In other words, if the primitive set is, e.g.,

allowed_symbols = "add,sub,mul,sin,cos,constant,variable"

there is no such problem.
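
For illustration, a minimal sketch of such a configuration, assuming pyoperon's scikit-learn wrapper accepts `allowed_symbols` as a constructor argument (names may differ between versions):

```python
from pyoperon.sklearn import SymbolicRegressor

# Sketch only: without div or exp in the primitive set, constructs like
# f(x) / theta or f(x) * exp(-theta) cannot be built, so penalizing the
# raw constants is harder to game.
reg = SymbolicRegressor(
    allowed_symbols="add,sub,mul,sin,cos,constant,variable",
)
```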

@gkronber
Member

Yes, in this case this is less of a problem.

@foolnotion
Member

By the way, here are the implementations available in Operon:
https://github.com/heal-research/operon/blob/main/include/operon/optimizer/solvers/sgd.hpp

Currently not all are exposed in the Python wrapper, but it's trivial to add them.
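
For context, the wrapper snippet quoted earlier suggests the update rule is selected through constructor arguments roughly as below; this is a sketch only, and the exact argument names (and how the SGD optimizer is enabled) may differ between versions:

```python
from pyoperon.sklearn import SymbolicRegressor

# Sketch only: sgd_update_rule, sgd_learning_rate, sgd_beta, sgd_beta2 appear
# in the wrapper code quoted above and are assumed here to be constructor
# arguments of SymbolicRegressor.
reg = SymbolicRegressor(
    sgd_update_rule="amsgrad",
    sgd_learning_rate=0.01,
    sgd_beta=0.9,
    sgd_beta2=0.999,
)
```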
