larger training set deviance for smaller values of reg_lambda: bug in convergence criterion? #65
The last plot just looks like a zoomed version of the first plot to me. Did you try other metrics like RMSE or so?
pavanramkumar changed the title from "convergence bug?" to "larger training set deviance for smaller values of reg_lambda: bug?" on May 11, 2016
pavanramkumar changed the title from "larger training set deviance for smaller values of reg_lambda: bug?" to "larger training set deviance for smaller values of reg_lambda: bug in convergence criterion?" on May 11, 2016
renaming a few issue titles for clarity
I think this is related to #226. You just need to use a smaller
@pavanramkumar @hugoguh this seems to be fixed now with latest code. Just run this:

import numpy as np
import scipy.sparse as sps
from sklearn.preprocessing import StandardScaler
from pyglmnet import GLM, simulate_glm

np.random.seed(42)

# regularization path
reg_lambda = np.exp(np.linspace(-10, -3, num=100))
n_samples, n_features = 10000, 100

# coefficients
beta0 = np.random.normal(0.0, 1.0, 1)[0]
beta = sps.rand(n_features, 1, 0.1)
beta = np.array(beta.todense())[:, 0]

# training data
Xr = np.random.normal(0.0, 1.0, [n_samples, n_features])
yr = simulate_glm('poisson', beta0, beta, Xr)

# testing data
Xt = np.random.normal(0.0, 1.0, [n_samples, n_features])
yt = simulate_glm('poisson', beta0, beta, Xt)

# fit a Generalized Linear Model for each reg_lambda and record the training deviance
scaler = StandardScaler().fit(Xr)
dev_t = list()
for rl in reg_lambda:
    glm = GLM(distr='poisson', verbose=True, alpha=0.95, reg_lambda=rl)
    glm.fit(scaler.transform(Xr), yr)
    dev_t.append(glm.score(scaler.transform(Xr), yr))

# plot training deviance against log(reg_lambda)
import matplotlib.pyplot as plt
upto = 60
plt.plot(np.log(reg_lambda[0:upto]), dev_t[0:upto], '-o', c='k')
plt.xlabel('log(Lambda)')
plt.ylabel('Poisson Deviance')
plt.show()

I get: [plot]

can you close if you agree?
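As a quick follow-up to the snippet above (and to the RMSE question in the first comment), here is a small continuation sketch. It reuses reg_lambda, dev_t, glm, scaler, Xr, and yr from that snippet; the monotonicity check and the RMSE computation are illustrative additions, not part of the original report, and they assume glm.predict returns the predicted conditional mean:

# illustrative continuation of the snippet above
dev_t = np.asarray(dev_t)

# reg_lambda is sorted in increasing order, so the training deviance should be
# non-decreasing along dev_t if the fits behave as expected
print('deviance non-decreasing in reg_lambda:', np.all(np.diff(dev_t) >= -1e-8))

# RMSE of the last fitted model (largest reg_lambda) on the training data,
# as an alternative metric to the deviance
yr_hat = glm.predict(scaler.transform(Xr))
print('training RMSE:', np.sqrt(np.mean((yr - yr_hat) ** 2)))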
I noticed it while working on cvpyglmnet.
Notice that there is a weird behavior in the training performance: the training set deviance is supposed to always go down as reg_lambda approaches zero (or as log(Lambda) becomes more negative). Here's some code for you to replicate it:
It doesn't always happen; for instance, try using np.random.seed(0). The code does the following (a short sketch of the average-deviance point follows this list):
- fit and compute the deviance
- use the average deviance: just a scalar difference, but a better measure when cross-validating because different folds might have different numbers of elements (use the commented line for the total deviance)
- now plot
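A minimal sketch of that average-vs-total deviance point, using the standard Poisson deviance formula. The y and mu_hat values below are made up for illustration and are not outputs of the code in this issue:

import numpy as np

def poisson_deviance(y, mu):
    # elementwise y * log(y / mu), with the convention 0 * log(0) = 0
    with np.errstate(divide='ignore', invalid='ignore'):
        term = np.where(y > 0, y * np.log(y / mu), 0.0)
    return 2.0 * np.sum(term - (y - mu))

# made-up observations and predicted Poisson means
y = np.array([0., 1., 3., 2.])
mu_hat = np.array([0.5, 1.2, 2.5, 2.0])

total_dev = poisson_deviance(y, mu_hat)
avg_dev = total_dev / len(y)  # same quantity up to a scalar, but comparable across folds of different sizes
print(total_dev, avg_dev)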
Here's the output: [plot]
Play around with the range of reg_lambdas and also try with other random.seed values (different simulated Xr and yr). Any idea where it is coming from? At first I thought it was a warm-start effect, but @pavanramkumar says it starts by fitting the larger reg_lambdas. This might give a hint: it seems not to depend on the actual value of reg_lambda. When it happens, it seems to always happen towards the end; see what happens if I use a slightly different range of reg_lambdas when instantiating the model: [plot]
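To make the warm-start point above concrete, here is a minimal NumPy sketch of a warm-started regularization path for an L2-penalized Poisson GLM, fit from the largest penalty to the smallest. This is only a conceptual illustration, not pyglmnet's solver; the gradient-descent step size, tolerance, and update-based stopping rule are assumptions made for the sketch:

import numpy as np

def fit_poisson_path(X, y, reg_lambdas, lr=1e-3, max_iter=1000, tol=1e-6):
    # fit an L2-penalized Poisson GLM for each penalty, warm-starting each fit
    # from the previous solution, going from the largest penalty to the smallest
    n, p = X.shape
    beta0, beta = 0.0, np.zeros(p)  # carried across penalties (the warm start)
    path = []
    for rl in sorted(reg_lambdas, reverse=True):
        for _ in range(max_iter):
            mu = np.exp(beta0 + X @ beta)          # Poisson mean with log link
            g0 = np.mean(mu - y)                   # gradient w.r.t. the intercept
            g = X.T @ (mu - y) / n + rl * beta     # gradient w.r.t. the weights
            beta0 -= lr * g0
            beta -= lr * g
            # simple convergence criterion: stop once the update becomes tiny
            if lr * max(abs(g0), np.max(np.abs(g))) < tol:
                break
        path.append((rl, beta0, beta.copy()))
    return path

With warm starts like this, the fits for the smallest penalties begin from an already nearly converged solution, so an update-size stopping rule can trigger almost immediately; whether that interacts with the reported deviance is the kind of question the issue title raises.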