L1-constrained regression using Frank-Wolfe #43

Open: wants to merge 10 commits into master

This repository has been archived by the owner on Dec 6, 2023. It is now read-only.
Conversation

@mblondel (Member) commented Dec 3, 2015

I implemented the FW method for L1-constrained regression.
The method is greedy in nature: at most one non-zero coefficient is added at each iteration. I added an option to stop once a model size limit is reached.
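For readers skimming the thread, here is a rough sketch of what such an FW update for L1-constrained least squares looks like; this is not the code in this PR, and the names (frank_wolfe_l1_ls, max_nonzeros) are made up for illustration:

```python
import numpy as np

def frank_wolfe_l1_ls(X, y, radius, n_iter=100, max_nonzeros=None):
    # Sketch: minimize 0.5 * ||X w - y||^2 subject to ||w||_1 <= radius.
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    Xw = np.zeros(n_samples)
    for _ in range(n_iter):
        grad = X.T @ (Xw - y)
        j = np.argmax(np.abs(grad))        # linear minimization oracle: one coordinate per iteration
        s_j = -np.sign(grad[j]) * radius   # best vertex of the L1 ball: s = s_j * e_j
        Xd = s_j * X[:, j] - Xw            # X @ (s - w)
        denom = Xd @ Xd
        if denom == 0.0:
            break
        # Exact line search for the quadratic loss, clipped to [0, 1].
        gamma = np.clip(-((Xw - y) @ Xd) / denom, 0.0, 1.0)
        w *= 1.0 - gamma                   # w <- (1 - gamma) * w + gamma * s
        w[j] += gamma * s_j
        Xw += gamma * Xd
        if max_nonzeros is not None and np.count_nonzero(w) >= max_nonzeros:
            break                          # optional model size limit
    return w
```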

Regularization paths of constrained (FW) and penalized (coordinate descent) formulations on the diabetes dataset:
[figure: fw]

I still need to add docstrings and tests.

CC @fabianp @zermelozf @vene

@fabianp (Member) commented Dec 3, 2015

Cool! I'll be offline until Monday, then I'll definitely take a look at it.

if sp.issparse(Xs):
    Xs_sq = np.dot(Xs.data, Xs.data)
else:
    Xs_sq = np.dot(Xs, Xs)
Contributor commented:

Since the sign doesn't affect this, you could precompute all column squared norms outside of the loop, right? (But I guess it's a tradeoff for high-dimensional X.)

Member Author (@mblondel) replied:

Good idea, I'll do that. An O(n_features) memory cache is not a big deal.
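For illustration, the precomputation being discussed could look like the sketch below (column_squared_norms is a hypothetical helper, not part of this PR):

```python
import numpy as np
import scipy.sparse as sp

def column_squared_norms(X):
    # Squared L2 norm of every column, computed once outside the main loop.
    if sp.issparse(X):
        return np.asarray(X.multiply(X).sum(axis=0)).ravel()
    return (X ** 2).sum(axis=0)
```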

@fabianp (Member) commented Dec 11, 2015

Since it is specific to least squares I would call it _frank_wolfe_ls (and also because I would love a FW for arbitrary loss functions :-).

Just FYI, yesterday I saw this paper (http://arxiv.org/pdf/1511.05932.pdf), in which they prove that a small modification to FW yields an algorithm with a linear convergence rate, although I don't know whether this always has a major effect in practice.

@mblondel (Member, Author) commented:

> Since it is specific to least squares I would call it _frank_wolfe_ls (and also because I would love a FW for arbitrary loss functions :-)

This shouldn't be too hard. The only difficulty I see is computing the step size. For an arbitrary loss it's not possible to compute the exact step size in closed form, but a few iterations of a one-dimensional Newton method should work.
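A minimal sketch of such a one-dimensional Newton search, assuming elementwise loss_grad / loss_hess callables for a twice-differentiable loss (all names here are hypothetical, not the PR's API):

```python
import numpy as np

def fw_step_size_newton(loss_grad, loss_hess, y, Xw, Xd, n_iter=5):
    # Approximately minimize g(gamma) = sum_i loss(y_i, Xw_i + gamma * Xd_i) over gamma in [0, 1],
    # where Xw = X @ w and Xd = X @ (s - w) is the current Frank-Wolfe direction.
    gamma = 0.5
    for _ in range(n_iter):
        pred = Xw + gamma * Xd
        g1 = loss_grad(y, pred) @ Xd          # g'(gamma)
        g2 = loss_hess(y, pred) @ (Xd ** 2)   # g''(gamma)
        if g2 <= 1e-12:
            break
        gamma = float(np.clip(gamma - g1 / g2, 0.0, 1.0))
    return gamma
```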

@fabianp (Member) commented Dec 11, 2015

What about using scipy's line_search for arbitrary loss functions and the exact one for the quadratic one?

@mblondel (Member, Author) replied:

Not sure, does it work for constrained optimization?

@fabianp (Member) commented Dec 14, 2015

You are right, probably not.

@vene (Contributor) commented Mar 24, 2016

I was just reading about hybrid conditional gradient smoothing, and it seems like it could lead to an efficient way to extend this with an additional group lasso penalty (see eq. 5.3 and alg. 5).

@mblondel (Member, Author) commented Jan 17, 2017

@vene @fabianp For Frank-Wolfe, the regularization path matches that of the penalized version trained by coordinate descent, as expected. However, for FISTA with penalty=l1-ball it doesn't: either FISTA is not supposed to handle constrained problems, or there is a problem with my implementation. I am also observing issues on constrained problems with this implementation.

@fabianp (Member) commented Jan 17, 2017

In theory FISTA should work on constrained problems, using the projection as the proximal operator. Have you tried simple ISTA (i.e. projected gradient descent), which is trivial to implement, to debug?
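For reference, a bare-bones version of that debugging route (projected gradient onto the L1 ball via a sort-based simplex projection); these helpers are written from scratch here and may differ from the project_simplex used in this PR:

```python
import numpy as np

def project_simplex(v, z=1.0):
    # Euclidean projection onto {w : w >= 0, sum(w) = z}.
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - z
    ind = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * ind > cssv)[0][-1]
    theta = cssv[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def project_l1_ball(v, z=1.0):
    # Euclidean projection onto the L1 ball of radius z.
    if np.abs(v).sum() <= z:
        return v
    return np.sign(v) * project_simplex(np.abs(v), z)

def ista_l1_ball(X, y, radius, n_iter=500):
    # Plain ISTA = projected gradient descent for least squares on the L1 ball.
    w = np.zeros(X.shape[1])
    L = np.linalg.norm(X, ord=2) ** 2  # Lipschitz constant of the least-squares gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = project_l1_ball(w - grad / L, radius)
    return w
```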

@vene (Contributor) commented Jan 18, 2017

Bach et al. (p. 12 here) say that "proximal methods apply" and don't seem to suggest the need for any special treatment.

@mblondel (Member, Author) commented:

There was an issue in project_simplex, see dfb8586.

With a small tweak to the plot code, FISTA is now looking fine.
