Speeding up the Highly Adaptive Lasso with recursive screening via MARS #105
base: devel
I'm getting the error ``Error: Unable to resolve action |
Looks interesting, Lars --- I'd be curious to learn more about this. Is your intention to create a separate For the question about the failing builds, this is because the GitHub Actions file is out of date, as |
Hi @nhejazi, thank you for the help! The current implementation has separate functions, mainly because If the functions are merged, we could add options |
I have incorporated |
thanks, @Larsvanderlaan, for these changes. thinking through it some more, i think it may be a good idea to keep the constructors for the two separate, as in retaining both also, a few comments on the construction of the PR:
|
Hi @nhejazi, I should note that the SAL implementation here is totally different from the algorithm in the preprint. Essentially I use MARS to learn a I now personally think it should be a part of the Also, thank you very much for the other comments. I agree that more informative commit names are a better route to go. |
That's very helpful clarification, @Larsvanderlaan. I had taken the mention of SAL to reference the pre-print, and I see from your comment that it's similar, but the fact that it is a screening-based variant of the already well-studied HAL (and uses the same underlying functionality implemented in the package) makes a convincing case for it being included within the existing Also, I think following the style set in the constructor should be good enough to avoid argument creep, since by adding these arguments as a list-argument in |
Hi @Larsvanderlaan, I just merged your other PR (fix to quantile binning) into devel, which leads to a merge conflict with this PR. Could you please address this so we can also merge this PR into devel? Also, @nhejazi, do you agree this PR is ready for merging once the conflict is addressed? |
Implements a variant of the selectively adaptive lasso (SAL) that uses MARS (earth) to learn important variables and important variable interactions. MARS is used only to select variables and interaction subgroups, not to select specific spline basis functions: all basis functions specified by the parameters max_degree and num_knots are generated for the variables and variable subgroups found by MARS. As a result, SAL tends to be much more expressive than MARS.
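The basis expansion described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the package's R implementation: the function names (`quantile_knots`, `group_basis`, `screened_design`) are hypothetical, the selected groups are supplied by hand in place of an actual MARS fit, and the tensor-product knot grid is a simplifying assumption about how knots are placed for interactions.

```python
from itertools import product


def quantile_knots(values, num_knots):
    """Pick up to num_knots knots at evenly spaced order statistics."""
    ordered = sorted(set(values))
    if len(ordered) <= num_knots:
        return ordered
    step = len(ordered) / num_knots
    return [ordered[int(i * step)] for i in range(num_knots)]


def group_basis(rows, group, num_knots):
    """All zero-order (indicator) basis columns for one variable group.

    Each column is prod_j 1(x_j >= knot_j) over the variables in the group.
    """
    per_var_knots = [
        quantile_knots([row[j] for row in rows], num_knots) for j in group
    ]
    columns = []
    for knot in product(*per_var_knots):
        columns.append(
            [int(all(row[j] >= k for j, k in zip(group, knot))) for row in rows]
        )
    return columns


def screened_design(rows, selected_groups, max_degree, num_knots):
    """Generate every basis column, but only for the screened groups."""
    columns = []
    for group in selected_groups:
        if len(group) <= max_degree:  # skip interactions above max_degree
            columns.extend(group_basis(rows, group, num_knots))
    return columns


# Hypothetical screening result: variable 0 alone, plus the (0, 2) interaction.
rows = [(0.1, 5.0, 1.0), (0.4, 3.0, 2.0), (0.9, 1.0, 3.0), (0.6, 2.0, 4.0)]
design = screened_design(rows, [(0,), (0, 2)], max_degree=2, num_knots=2)
print(len(design))
```

The point of the sketch is that screening only restricts which groups are expanded. Within a selected group, no basis function is dropped before the lasso step.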
The earth model is pruned using its own internal cross-validation method rather than the default GCV approach.
The earth parameters controlling the number of basis functions generated and searched are also set high (or to their maximum).
Nested cross-validation is implemented to account for the outcome-dependent variable selection. SAL handles both large n and large p well and, when there are many noisy variables, can give both substantial speedups and better performance than HAL.
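To illustrate why the screening step has to sit inside the cross-validation loop, here is a minimal sketch in plain Python. It is not the PR's code: a largest-absolute-covariance screener stands in for MARS, a one-variable least-squares line stands in for the lasso fit, and all function names are hypothetical. The part that matters is that `screen` only ever sees the training rows of each fold.

```python
from statistics import mean


def screen(xs, ys):
    """Stand-in screener (NOT MARS): column with the largest |covariance| with y."""
    y_bar = mean(ys)

    def abs_cov(j):
        col = [x[j] for x in xs]
        c_bar = mean(col)
        return abs(sum((c - c_bar) * (y - y_bar) for c, y in zip(col, ys)))

    return max(range(len(xs[0])), key=abs_cov)


def fit_line(col, ys):
    """Stand-in learner: simple least-squares line, returns (intercept, slope)."""
    c_bar, y_bar = mean(col), mean(ys)
    sxx = sum((c - c_bar) ** 2 for c in col)
    sxy = sum((c - c_bar) * (y - y_bar) for c, y in zip(col, ys))
    slope = sxy / sxx if sxx else 0.0
    return y_bar - slope * c_bar, slope


def nested_cv_mse(xs, ys, n_folds):
    """Cross-validated MSE with screening redone inside every fold."""
    n = len(ys)
    fold_errors = []
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        # Screening uses training rows only, so held-out outcomes
        # never influence which variable is selected.
        j = screen(train_x, train_y)
        intercept, slope = fit_line([x[j] for x in train_x], train_y)
        fold_errors.append(
            mean((ys[i] - (intercept + slope * xs[i][j])) ** 2 for i in test_idx)
        )
    return mean(fold_errors)
```

Screening once on the full data and then cross-validating only the final fit would let the held-out outcomes leak into the variable selection, which tends to make the cross-validated error look better than it is.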
-- I changed the default of the num_knots argument so that it varies as a function of sample size.
-- I made a minor change to how basis functions are generated with the num_knots argument. Previously, an edge basis function was generated that could sometimes lead to instability in the CV.
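The description does not say which edge basis function was involved. One plausible reading, assumed here purely for illustration, is an indicator basis with its knot at the sample maximum. Such a column is nonzero for a single observation, so it becomes constant in any training fold that leaves that observation out. The helper names below are hypothetical.

```python
def indicator_column(values, knot):
    """Zero-order basis column 1(x >= knot)."""
    return [int(v >= knot) for v in values]


def constant_in_fold(values, knot, train_idx):
    """True if the basis column takes a single value on the training rows."""
    column = indicator_column(values, knot)
    return len({column[i] for i in train_idx}) == 1


values = [0.2, 0.5, 0.9, 0.7]
train_idx = [0, 1, 3]  # training fold that leaves out the maximum (index 2)

# Knot at the sample maximum: column is [0, 0, 1, 0], all zero in this fold.
print(constant_in_fold(values, max(values), train_idx))
# Interior knot: column is [0, 0, 1, 1], still varies within the fold.
print(constant_in_fold(values, 0.7, train_idx))
```

Under this assumption, a column like the first one carries no information in some folds, which is the kind of degeneracy that could make the cross-validated fit unstable.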
-- The formula bug fixes of #101 are also incorporated here.
Things still to do:
-- Write some tests.
-- Add a bit more documentation.
-- Make minor changes to ensure fit_hal and fit_sal have near-identical functionality.