Slep005: Some comments on fit params (scikit-learn#3)
glemaitre authored Sep 9, 2019
2 parents 35c140d + e306795 commit 79123fb
Showing 1 changed file with 10 additions and 4 deletions.
14 changes: 10 additions & 4 deletions slep005/proposal.rst
@@ -177,10 +177,15 @@ transformation within a ``FeatureUnion`` or ``ColumnTransformer``. Thus the
.. rubric:: Handling ``fit`` parameters

Sample props or weights cannot be routed to steps downstream of a resampler in
-a ``Pipeline``, unless they too are resampled. It's very unclear how this would
-work with ``Pipeline``'s current prefix-based fit parameter routing.
-
-TODO: propose solutions
+a Pipeline, unless they too are resampled. To support this, a resampler
+would need to be passed all props that are required downstream, and
+``fit_resample`` should return resampled versions of them. Note that these
+must be distinct from parameters that affect the resampler's fitting.
+That is, consider the signature ``fit_resample(X, y=None, props=None, sample_weight=None)``.
+The ``sample_weight`` passed in should affect the resampling, but does not
+itself need to be resampled. A Pipeline would pass ``props`` including the fit
+parameters required downstream, which would be resampled and returned by
+``fit_resample``.
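
The distinction drawn in this change between ``props`` and ``sample_weight`` can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: ``WeightedUnderSampler`` is a hypothetical class, not part of scikit-learn or imbalanced-learn, and returning the resampled props as a third tuple element is only one possible reading of "return resampled versions of them".

```python
import numpy as np


class WeightedUnderSampler:
    """Hypothetical resampler, for illustration only (not a real library class)."""

    def __init__(self, n_samples, random_state=0):
        self.n_samples = n_samples
        self.random_state = random_state

    def fit_resample(self, X, y=None, props=None, sample_weight=None):
        X = np.asarray(X)
        rng = np.random.default_rng(self.random_state)

        # sample_weight affects WHICH rows are drawn, but is not itself
        # resampled or returned.
        p = None
        if sample_weight is not None:
            w = np.asarray(sample_weight, dtype=float)
            p = w / w.sum()
        idx = rng.choice(X.shape[0], size=self.n_samples, replace=False, p=p)

        # Props needed downstream are indexed with the same rows as X and y,
        # so they stay aligned with the resampled data.
        props_out = {k: np.asarray(v)[idx] for k, v in (props or {}).items()}
        y_out = None if y is None else np.asarray(y)[idx]
        # Assumed return shape: (X, y, props).
        return X[idx], y_out, props_out


X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2
downstream_weight = np.linspace(0.1, 1.0, 10)  # meant for a later estimator

sampler = WeightedUnderSampler(n_samples=4)
X_res, y_res, props_res = sampler.fit_resample(
    X,
    y,
    props={"sample_weight": downstream_weight},  # resampled and returned
    sample_weight=np.ones(10),  # only steers the resampling
)
```

In this sketch a pipeline would hand ``props_res["sample_weight"]`` to the downstream estimator's ``fit``, since it now has one entry per retained row.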

Example Usage:
~~~~~~~~~~~~~~
@@ -195,6 +200,7 @@ Example Usage:
 est = make_pipeline(
     NaNRejector(), RandomUnderSampler(), StandardScaler(), SGDClassifier()
 )
+est.fit(X, y, sgdclassifier__sample_weight=my_weight)
Alternative implementation