Different pipelines for different variables #89

Open
Peter9192 opened this issue Aug 24, 2022 · 0 comments
Our pipeline workflow differs from the pipeline workflow of scikit-learn.

For example, a typical scikit-learn pipeline looks like this:

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
pipe.fit(X_train, y_train)

Note that X_train here is a plain np.ndarray with shape (samples, features). We do not have that: our X is generally a collection of resampled xr.Datasets, one per precursor variable.
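
To make the shape mismatch concrete, here is a minimal sketch of how one gridded variable could be flattened into the (samples, features) layout that scikit-learn expects. The variable name `sst`, the dimension names and the random values are assumptions for illustration only, and the resampling step itself is not shown.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Hypothetical precursor field: 10 time steps on a 3 x 4 lat/lon grid.
sst = xr.DataArray(
    rng.normal(size=(10, 3, 4)),
    dims=("time", "lat", "lon"),
    name="sst",
)

# Collapse the spatial dimensions into a single "features" dimension,
# keeping time as the sample dimension.
stacked = sst.stack(features=("lat", "lon")).transpose("time", "features")

# Plain ndarray with shape (samples, features), as in the scikit-learn example.
X_sst = stacked.values
```

Each variable would need its own flattening like this, which is part of why a single scikit-learn style X does not fall out naturally from a set of xr.Datasets.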

For us, a realistic pipeline would look more like this (pseudocode):
RF = sklearn.ensemble.RandomForestRegressor(...)  # or RandomForestClassifier
Pipeline([RGDR(y).fit(sst_precursor), RGDR(y).fit(z200_precursor), EOF.fit(OLR_precursor), 'merger_of_features', 'feature_selection', RF])

Originally posted by @semvijverberg in #71 (comment)
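
For discussion, the per-variable idea above can be approximated with plain scikit-learn, assuming each precursor has already been flattened to (samples, gridpoints) and concatenated column-wise. This is only a sketch: `PCA` stands in for both RGDR and EOF, `ColumnTransformer` plays the role of 'merger_of_features', and the array sizes, column slices and synthetic data are all made up.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples = 60

# Hypothetical precursor fields, already flattened to (samples, gridpoints).
sst = rng.normal(size=(n_samples, 20))
z200 = rng.normal(size=(n_samples, 15))
olr = rng.normal(size=(n_samples, 10))
y = sst[:, 0] + 0.1 * rng.normal(size=n_samples)

# One wide array; each variable occupies its own block of columns.
X = np.hstack([sst, z200, olr])

# A different (sub)pipeline per variable, selected by column slice.
# The outputs are concatenated, which acts as the feature merger.
per_variable = ColumnTransformer([
    ("sst", PCA(n_components=3), slice(0, 20)),
    ("z200", PCA(n_components=3), slice(20, 35)),
    ("olr", Pipeline([("scale", StandardScaler()),
                      ("pca", PCA(n_components=2))]), slice(35, 45)),
])

pipe = Pipeline([
    ("per_variable", per_variable),              # 3 + 3 + 2 = 8 merged features
    ("select", SelectKBest(f_regression, k=4)),  # feature selection
    ("rf", RandomForestRegressor(n_estimators=20, random_state=0)),
])

pipe.fit(X, y)
predictions = pipe.predict(X)
merged = pipe.named_steps["per_variable"].transform(X)
selected = pipe[:-1].transform(X)
```

This keeps everything inside one fit/predict call, but it relies on tracking column positions by hand and loses the coordinate information that the xr.Datasets carry, so it may not be a good fit for our use case.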
