L1-constrained regression using Frank-Wolfe #43

Open: wants to merge 10 commits into master

This repository has been archived by the owner on Dec 6, 2023. It is now read-only.
Conversation

@mblondel (Member) commented Dec 3, 2015

I implemented the FW method for L1-constrained regression.
The method is greedy in nature: at most one non-zero coefficient is added at each iteration. I added an option to stop once a model size limit is reached.
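For readers skimming the thread, here is a rough sketch of what such an FW update for L1-constrained least squares looks like; this is not the code in this PR, and the names (frank_wolfe_l1_ls, max_nonzeros) are made up for illustration:

```python
import numpy as np

def frank_wolfe_l1_ls(X, y, radius, n_iter=100, max_nonzeros=None):
    # Sketch: minimize 0.5 * ||X w - y||^2 subject to ||w||_1 <= radius.
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    Xw = np.zeros(n_samples)
    for _ in range(n_iter):
        grad = X.T @ (Xw - y)
        j = np.argmax(np.abs(grad))        # linear minimization oracle: one coordinate per iteration
        s_j = -np.sign(grad[j]) * radius   # best vertex of the L1 ball: s = s_j * e_j
        Xd = s_j * X[:, j] - Xw            # X @ (s - w)
        denom = Xd @ Xd
        if denom == 0.0:
            break
        # Exact line search for the quadratic loss, clipped to [0, 1].
        gamma = np.clip(-((Xw - y) @ Xd) / denom, 0.0, 1.0)
        w *= 1.0 - gamma                   # w <- (1 - gamma) * w + gamma * s
        w[j] += gamma * s_j
        Xw += gamma * Xd
        if max_nonzeros is not None and np.count_nonzero(w) >= max_nonzeros:
            break                          # optional model size limit
    return w
```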

Regularization paths of constrained (FW) and penalized (coordinate descent) formulations on the diabetes dataset:
[figure: fw]

I still need to add docstrings and tests.

CC @fabianp @zermelozf @vene

@fabianp (Member) commented Dec 3, 2015

Cool! I'll be offline until Monday, then I'll definitely take a look at it.

if sp.issparse(Xs):
    Xs_sq = np.dot(Xs.data, Xs.data)
else:
    Xs_sq = np.dot(Xs, Xs)
Contributor commented:

Since the sign doesn't affect this, you could precompute all column squared norms outside of the loop, right? (But I guess it's a tradeoff for high-dimensional X.)

Member Author (@mblondel) replied:

Good idea, I'll do that. An O(n_features) memory cache is not a big deal.
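For illustration, the precomputation being discussed could look like the sketch below (column_squared_norms is a hypothetical helper, not part of this PR):

```python
import numpy as np
import scipy.sparse as sp

def column_squared_norms(X):
    # Squared L2 norm of every column, computed once outside the main loop.
    if sp.issparse(X):
        return np.asarray(X.multiply(X).sum(axis=0)).ravel()
    return (X ** 2).sum(axis=0)
```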

@fabianp (Member) commented Dec 11, 2015

Since it is specific to least squares I would call it _frank_wolfe_ls (and also because I would love a FW for arbitrary loss functions :-).

Just FYI, yesterday I saw this paper (http://arxiv.org/pdf/1511.05932.pdf), in which they prove that a small modification to FW yields an algorithm with a linear convergence rate, although I don't know whether this always has a major effect in practice.

@mblondel (Member, Author) commented:

> Since it is specific to least squares I would call it _frank_wolfe_ls (and also because I would love a FW for arbitrary loss functions :-)

This shouldn't be too hard. The only difficulty I see is computing the step size. For an arbitrary loss it's not possible to compute the exact step size in closed form, but a few iterations of a one-dimensional Newton method should work.
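A minimal sketch of such a one-dimensional Newton search, assuming elementwise loss_grad / loss_hess callables for a twice-differentiable loss (all names here are hypothetical, not the PR's API):

```python
import numpy as np

def fw_step_size_newton(loss_grad, loss_hess, y, Xw, Xd, n_iter=5):
    # Approximately minimize g(gamma) = sum_i loss(y_i, Xw_i + gamma * Xd_i) over gamma in [0, 1],
    # where Xw = X @ w and Xd = X @ (s - w) is the current Frank-Wolfe direction.
    gamma = 0.5
    for _ in range(n_iter):
        pred = Xw + gamma * Xd
        g1 = loss_grad(y, pred) @ Xd          # g'(gamma)
        g2 = loss_hess(y, pred) @ (Xd ** 2)   # g''(gamma)
        if g2 <= 1e-12:
            break
        gamma = float(np.clip(gamma - g1 / g2, 0.0, 1.0))
    return gamma
```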

@fabianp (Member) commented Dec 11, 2015

What about using scipy's line_search for arbitrary loss functions and the exact one for the quadratic one?

@mblondel (Member, Author) replied:

Not sure, does it work for constrained optimization?

@fabianp (Member) commented Dec 14, 2015

You are right, probably not.

@vene (Contributor) commented Mar 24, 2016

I was just reading about hybrid conditional gradient smoothing, and it seems like it could lead to an efficient way to extend this with an additional group lasso penalty (see eq. 5.3 and alg. 5).

@mblondel (Member, Author) commented Jan 17, 2017

@vene @fabianp For Frank-Wolfe, the regularization path matches that of the penalized version trained by coordinate descent, as expected. However, for FISTA with penalty=l1-ball it doesn't: either FISTA is not supposed to handle constrained problems, or there is a problem with my implementation. I am also observing issues on constrained problems with this implementation.

@fabianp (Member) commented Jan 17, 2017

In theory FISTA should work on constrained problems, using the projection as the proximal operator. Have you tried simple ISTA (i.e. projected gradient descent), which is trivial to implement, to debug?
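For reference, a bare-bones version of that debugging route (projected gradient onto the L1 ball via a sort-based simplex projection); these helpers are written from scratch here and may differ from the project_simplex used in this PR:

```python
import numpy as np

def project_simplex(v, z=1.0):
    # Euclidean projection onto {w : w >= 0, sum(w) = z}.
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - z
    ind = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * ind > cssv)[0][-1]
    theta = cssv[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def project_l1_ball(v, z=1.0):
    # Euclidean projection onto the L1 ball of radius z.
    if np.abs(v).sum() <= z:
        return v
    return np.sign(v) * project_simplex(np.abs(v), z)

def ista_l1_ball(X, y, radius, n_iter=500):
    # Plain ISTA = projected gradient descent for least squares on the L1 ball.
    w = np.zeros(X.shape[1])
    L = np.linalg.norm(X, ord=2) ** 2  # Lipschitz constant of the least-squares gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = project_l1_ball(w - grad / L, radius)
    return w
```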

@vene (Contributor) commented Jan 18, 2017

Bach et al. (p. 12 here) say that "proximal methods apply" and don't seem to suggest the need for any special treatment.

@mblondel (Member, Author) commented:

There was an issue in project_simplex, see dfb8586.

With a small tweak to the plot code, FISTA is now looking fine.
