larger training set deviance for smaller values of reg_lambda: bug in convergence criterion? #65

Open
hugoguh opened this issue May 9, 2016 · 4 comments · May be fixed by #226

@hugoguh
Collaborator

hugoguh commented May 9, 2016

I noticed it while working on cvpyglmnet:

[Screenshot, 2016-05-09 at 2:01:10 pm]

Notice the odd behavior in the training performance: the training set deviance should always go down as reg_lambda approaches zero (that is, as log(Lambda) becomes more negative), but here it goes back up.
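
One way to make this expectation checkable is to scan the fitted path for places where a smaller reg_lambda gives a larger training deviance. The helper below is only a minimal sketch: the function name and the tolerance are illustrative and are not part of pyglmnet.

```python
import numpy as np


def find_deviance_increases(reg_lambda, deviance, tol=1e-8):
    """Return indices (into the input arrays) where the training deviance
    goes up even though the penalty got weaker."""
    reg_lambda = np.asarray(reg_lambda, dtype=float)
    deviance = np.asarray(deviance, dtype=float)
    # Sort from the strongest penalty to the weakest one.
    order = np.argsort(reg_lambda)[::-1]
    dev_sorted = deviance[order]
    # A positive step means the deviance increased as reg_lambda decreased.
    steps = np.diff(dev_sorted)
    bad = np.where(steps > tol)[0] + 1
    return order[bad]


# Toy path: the smallest reg_lambda has a suspiciously larger deviance.
toy_lambda = np.array([1.0, 0.1, 0.01, 0.001])
toy_dev = np.array([5.0, 3.0, 2.0, 2.5])
violations = find_deviance_increases(toy_lambda, toy_dev)
```

With the snippet further down, something like find_deviance_increases(model.reg_lambda, dev_t) would list the offending fits.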

Here’s some code to replicate it.
It doesn’t always happen; for comparison, try np.random.seed(0).

import numpy as np
import scipy.sparse as sps
from sklearn.preprocessing import StandardScaler
from pyglmnet import GLM

np.random.seed(42)

# create an instance of the GLM class
reg_lambda = np.exp(np.linspace(-10, -3, num=100))
model = GLM(distr='poisson', verbose=False, alpha=0.95, reg_lambda=reg_lambda)

n_samples, n_features = 10000, 100

# coefficients
beta0 = np.random.normal(0.0, 1.0, 1)
beta = sps.rand(n_features, 1, 0.1)
beta = np.array(beta.todense())

# training data
Xr = np.random.normal(0.0, 1.0, [n_samples, n_features])
yr = model.simulate(beta0, beta, Xr)

# testing data
Xt = np.random.normal(0.0, 1.0, [n_samples, n_features])
yt = model.simulate(beta0, beta, Xt)

# standardize features (scaler is fit on the training data)
scaler = StandardScaler().fit(Xr)

fit and compute deviance
This uses the average deviance, which differs from the deviance only by a scalar factor but is a better measure when cross-validating, because different folds might have different numbers of elements. Use the commented line for the plain deviance.

yrhat = model.fit_predict(scaler.transform(Xr), yr)
#dev_t = [model.deviance(yr, i) for i in yrhat]
dev_t = [model.deviance(yr, i)/float(np.shape(yrhat)[1]) for i in yrhat]
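
For reference, here is a minimal sketch of the textbook Poisson deviance and its per-sample average, written with NumPy only. It is not guaranteed to match what pyglmnet's `deviance` method computes internally, and the function names are hypothetical.

```python
import numpy as np


def poisson_deviance(y, mu):
    """Textbook Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # The term y * log(y / mu) is taken to be 0 when y == 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))


def mean_poisson_deviance(y, mu):
    """Deviance divided by the number of samples, comparable across folds."""
    return poisson_deviance(y, mu) / np.asarray(y).size


y_obs = np.array([0.0, 1.0, 3.0])
mu_hat = np.array([0.5, 1.0, 3.0])
total_dev = poisson_deviance(y_obs, mu_hat)
avg_dev = mean_poisson_deviance(y_obs, mu_hat)
```

Dividing by the sample count changes the scale of the curve but not its shape, so any non-monotonic bump shows up in both versions.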

now plot

%matplotlib inline
import matplotlib.pyplot as plt
upto = 60
plt.plot(np.log(model.reg_lambda[0:upto]), dev_t[0:upto], '-o', c='k')
plt.xlabel('log(Lambda)')
plt.ylabel('Poisson Deviance')

Here’s the output:
[Screenshot, 2016-05-09 at 1:48:14 pm]

Play around with the range of reg_lambda values, and also try other values of np.random.seed (which give different simulated Xr and yr).

Any idea where it is coming from? At first I thought it was a warm start effect, but @pavanramkumar says the fit starts with the larger reg_lambdas.
This might give a hint: it does not seem to depend on the actual value of reg_lambda. When it happens, it always seems to happen towards the end. See what happens if I use a slightly different range of reg_lambdas when instantiating the model:

reg_lambda = np.exp(np.linspace(-8, -3, num=100))

[Screenshot, 2016-05-09 at 2:03:17 pm]
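
For context on the warm start hypothesis: a regularization path is commonly fit from the largest reg_lambda down to the smallest, with each fit initialized from the previous solution. The sketch below illustrates that pattern on ridge-penalized least squares with plain gradient descent. It is not pyglmnet's solver or its Poisson loss, and every name in it is hypothetical. When each fit converges, the training error never increases as the penalty weakens.

```python
import numpy as np


def fit_ridge_gd(X, y, reg_lambda, w_init, learning_rate=0.5, n_iter=500):
    """Minimize 0.5 * mean((X @ w - y) ** 2) + 0.5 * reg_lambda * ||w||^2."""
    n_samples = X.shape[0]
    w = w_init.copy()
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n_samples + reg_lambda * w
        w -= learning_rate * grad
    return w


rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.normal(size=200)

# Fit from the strongest penalty to the weakest, reusing the previous weights.
reg_lambdas = np.exp(np.linspace(0, -6, num=10))
w = np.zeros(X.shape[1])
train_mse = []
for rl in reg_lambdas:
    w = fit_ridge_gd(X, y, rl, w_init=w)  # warm start from the last fit
    train_mse.append(np.mean((X @ w - y) ** 2))
```

If a path like this shows the training error rising at the small-penalty end, the individual fits most likely have not converged, which points at the optimizer settings rather than at the warm start itself.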

@jasmainak added the bug label May 10, 2016
@jasmainak
Member

The last plot just looks like a zoomed version of the first plot to me. Did you try other metrics, such as RMSE?
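
A minimal sketch of how RMSE could be computed along the path, assuming predictions are stored one row per reg_lambda as with yrhat above. The arrays here are made-up stand-ins, not output from the model.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))


# Hypothetical stand-in for yrhat: one row of predictions per reg_lambda.
y_obs = np.array([0.0, 2.0, 4.0])
y_hat_path = np.array([[1.0, 1.0, 1.0],
                       [0.0, 2.0, 4.0]])
rmse_path = [rmse(y_obs, row) for row in y_hat_path]
```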

@pavanramkumar changed the title from "convergence bug?" to "larger training set deviance for smaller values of reg_lambda: bug?" May 11, 2016
@pavanramkumar changed the title from "larger training set deviance for smaller values of reg_lambda: bug?" to "larger training set deviance for smaller values of reg_lambda: bug in convergence criterion?" May 11, 2016
@pavanramkumar
Collaborator

renaming a few issue titles for clarity

@jasmainak
Member

I think this is related to #226. You just need to use a smaller learning_rate.
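
As a generic illustration of why the step size matters, here is a toy quadratic minimized by fixed-step gradient descent. This is not pyglmnet's solver or its Poisson loss, and all names are hypothetical. With a fixed iteration budget, a learning_rate that is too large for the curvature of the problem ends at a higher loss than a smaller one.

```python
import numpy as np


def gradient_descent(curvature, w0, learning_rate, n_iter=100):
    """Minimize 0.5 * sum(curvature * w ** 2) with fixed-step gradient descent
    and return the final loss."""
    w = np.array(w0, dtype=float)
    for _ in range(n_iter):
        w -= learning_rate * curvature * w  # the gradient is curvature * w
    return 0.5 * np.sum(curvature * w ** 2)


curvature = np.array([1.0, 10.0])  # steep in the second direction
w0 = [1.0, 1.0]

# The iteration is stable only when learning_rate < 2 / max(curvature) = 0.2.
loss_large_step = gradient_descent(curvature, w0, learning_rate=0.25)
loss_small_step = gradient_descent(curvature, w0, learning_rate=0.05)
```

The same effect could explain a deviance curve that rises for small reg_lambda: a weaker penalty leaves the solver with a harder problem, so a step size that works for large reg_lambda may overshoot there.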

@jasmainak linked a pull request Nov 9, 2017 that will close this issue
@jasmainak
Member

@pavanramkumar @hugoguh this seems to be fixed now with the latest code. Just run this:

import numpy as np
import scipy.sparse as sps
from sklearn.preprocessing import StandardScaler
from pyglmnet import GLM, simulate_glm

np.random.seed(42)

# grid of regularization strengths
reg_lambda = np.exp(np.linspace(-10, -3, num=100))

n_samples, n_features = 10000, 100

# coefficients
beta0 = np.random.normal(0.0, 1.0, 1)[0]
beta = sps.rand(n_features, 1, 0.1)
beta = np.array(beta.todense())[:, 0]

# training data
Xr = np.random.normal(0.0, 1.0, [n_samples, n_features])
yr = simulate_glm('poisson', beta0, beta, Xr)

# testing data
Xt = np.random.normal(0.0, 1.0, [n_samples, n_features])
yt = simulate_glm('poisson', beta0, beta, Xt)

# standardize features once, then fit one GLM per reg_lambda
scaler = StandardScaler().fit(Xr)
Xr_scaled = scaler.transform(Xr)

dev_t = list()
for rl in reg_lambda:
    glm = GLM(distr='poisson', verbose=True, alpha=0.95, reg_lambda=rl)
    glm.fit(Xr_scaled, yr)
    dev_t.append(glm.score(Xr_scaled, yr))


import matplotlib.pyplot as plt
upto = 60
plt.plot(np.log(reg_lambda[0:upto]), dev_t[0:upto], '-o', c='k')
plt.xlabel('log(Lambda)')
plt.ylabel('Poisson Deviance')

I get:

[Screenshot, 2020-08-14 at 4:58:26 PM]

Can you close this if you agree?
