
Inclusion of more regressors and classifiers in the Model Selection #1186

Closed
ankitrajixr opened this issue Mar 7, 2021 · 2 comments

Comments

@ankitrajixr

Hi team,

Thank you for such a helpful library. While using TPOT, we found that certain regressors and classifiers are not included in the model selection of the machine learning pipeline.
It would be great to add regressors such as GaussianProcessRegressor and VotingRegressor, and classifiers such as VotingClassifier and AdaBoostClassifier.

How to recreate it?

  1. Create a TPOT instance.
  2. Call TPOT's fit() function with training data.

Current result

The above-mentioned regressors and classifiers are not included in the model selection.

@JDRomano2
Contributor

Hi @ankitrajixr, these estimators may have previously been found not to play well with other parts of TPOT, which may be why they are not included by default.

However, you can add any scikit-learn classifier or regressor to TPOT by simply including it in a custom configuration dictionary. Please see:
https://epistasislab.github.io/tpot/using/#customizing-tpots-operators-and-parameters

If you can use them and they perform well, we can look into adding them to the built-in configuration dictionaries. I'd recommend giving it a try and letting us know (on this thread) how they perform.
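For reference, a minimal sketch of the custom configuration dictionary format the linked docs describe — the AdaBoostClassifier entry and its parameter grid below are illustrative choices, not TPOT's built-in configuration:

```python
# Illustrative sketch: a custom TPOT config dict adding one of the requested
# classifiers, using the documented {'module.ClassName': {param: values}} format.
custom_config = {
    'sklearn.ensemble.AdaBoostClassifier': {
        'n_estimators': [50, 100, 500],       # candidate values TPOT may search over
        'learning_rate': [1e-2, 1e-1, 0.5, 1.0],
    },
}

# It would then be passed to TPOT via the config_dict argument, e.g.:
# from tpot import TPOTClassifier
# tpot = TPOTClassifier(config_dict=custom_config, generations=5, population_size=20)
# tpot.fit(X_train, y_train)
```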

ankitrajixr added a commit to ankitrajixr/tpot that referenced this issue Mar 17, 2021
@ankitrajixr
Author

ankitrajixr commented Mar 17, 2021

Thank you for your response, @JDRomano2. I tried a custom TPOT config dictionary; below is the code snippet:

```python
from sklearn.gaussian_process.kernels import (
    RBF, RationalQuadratic, ExpSineSquared, ConstantKernel, DotProduct, Matern)

# Custom config dict keyed by the estimator's full import path,
# as described in the TPOT docs on customizing operators.
tpot_config = {
    'sklearn.gaussian_process.GaussianProcessRegressor': {
        'kernel': [1.0 * RBF(length_scale=0.5, length_scale_bounds=(1e-05, 100000.0)),
                   1.0 * RationalQuadratic(length_scale=0.5, alpha=0.1),
                   1.0 * ExpSineSquared(length_scale=0.5, periodicity=3.0,
                                        length_scale_bounds=(1e-05, 100000.0),
                                        periodicity_bounds=(1.0, 10.0)),
                   ConstantKernel(0.1, (0.01, 10.0))
                   * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2),
                   1.0 ** 2 * Matern(length_scale=0.5,
                                     length_scale_bounds=(1e-05, 100000.0), nu=0.5)],
        'alpha': [5e-9, 1e-3, 1e-2, 1e-1, 1., 10., 100.],
        'normalize_y': [True, False],
        'optimizer': ['fmin_l_bfgs_b'],
    }
}
```

The above configuration works fine for smaller datasets.
