Different pipelines for different variables #89

Peter9192 · 2022-08-24T08:52:52Z

Our pipeline workflow differs from the pipeline workflow of scikit-learn.

For example, a typical scikit-learn pipeline looks like this:

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
pipe.fit(X_train, y_train)

Note that their X_train is a plain np.ndarray with shape (samples, features). We do not have that: our X is generally a collection of resampled xr.Datasets, one per precursor variable.
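To make the shape mismatch concrete, here is a minimal sketch of turning one gridded field into the (samples, features) layout that scikit-learn expects. The array `sst` and its grid size are made up for illustration, and plain NumPy stands in for an xr.Dataset to keep the example self-contained. With xarray one would probably stack the spatial dimensions instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gridded precursor field: 20 samples on a 4 x 5 lat/lon grid.
sst = rng.normal(size=(20, 4, 5))

# Flatten the spatial dimensions so that every grid cell becomes one feature.
X_sst = sst.reshape(sst.shape[0], -1)

print(X_sst.shape)  # (20, 20)
```

Each variable would need its own flattening step like this, which is part of why a single scikit-learn-style `X` does not map directly onto our data.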

For us, a realistic pipeline would look like:
RF = sklearn.models.RF(...)
Pipeline([RGDR(y).fit(sst_precursor), RGDR(y).fit(z200_precursor), EOF.fit(OLR_precursor), 'merger_of_features', 'feature_selection', RF])

Originally posted by @semvijverberg in #71 (comment)
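For comparison, scikit-learn itself can express "a different transformer per group of columns, then merge" with `ColumnTransformer`. The sketch below is only an analogy under several assumptions: the data are random and already flattened to NumPy arrays, `StandardScaler` and `PCA` are stand-ins for RGDR and EOF (which are not scikit-learn classes), and `SelectKBest` and `RandomForestRegressor` are arbitrary choices for the feature selection and RF steps. It does not address the xr.Dataset input discussed above.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples = 60

# Hypothetical, already-flattened precursor fields: (samples, grid cells).
sst = rng.normal(size=(n_samples, 12))
z200 = rng.normal(size=(n_samples, 8))
olr = rng.normal(size=(n_samples, 10))
y = sst[:, 0] + 0.5 * olr[:, 1] + rng.normal(scale=0.1, size=n_samples)

# Concatenate along the feature axis and track which columns hold which variable.
X = np.hstack([sst, z200, olr])
sst_cols, z200_cols, olr_cols = slice(0, 12), slice(12, 20), slice(20, 30)

# One transformer per variable; the outputs are concatenated column-wise.
per_variable = ColumnTransformer([
    ("sst", StandardScaler(), sst_cols),      # stand-in for RGDR on sst
    ("z200", StandardScaler(), z200_cols),    # stand-in for RGDR on z200
    ("olr", PCA(n_components=3), olr_cols),   # stand-in for EOF on OLR
])

pipe = Pipeline([
    ("per_variable", per_variable),
    ("feature_selection", SelectKBest(f_regression, k=5)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
])

pipe.fit(X, y)
pred = pipe.predict(X)
print(pred.shape)  # (60,)
```

The merge step comes for free here because all inputs share one 2-D array. With separate xr.Datasets per variable, that shared array does not exist up front, which is the gap this issue is about.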
