recipes 0.2.0
New Steps
- step_nnmf_sparse() uses a different implementation of non-negative matrix factorization that is much faster and enables regularized estimation. (#790)
- step_dummy_extract() creates multiple variables from a character variable by extracting elements using regular expressions and counting those elements.
- step_filter_missing() can filter columns based on proportion of missingness (#270).
- step_percentile() replaces the value of a variable with its percentile from the training set. (#765)
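
As a rough illustration of two of the new steps, the sketch below chains step_filter_missing() and step_percentile() on a small made-up data frame. The data, the column names, and the threshold value of 0.5 are assumptions chosen for the example and are not taken from the release notes.

```r
library(recipes)

# Hypothetical toy data: x2 is 75% missing, x1 is complete.
train <- data.frame(
  y  = c(1, 2, 3, 4),
  x1 = c(10, 20, 30, 40),
  x2 = c(NA, NA, NA, 1)
)

rec <- recipe(y ~ ., data = train) %>%
  # Drop predictors whose proportion of missing values exceeds the threshold.
  step_filter_missing(all_predictors(), threshold = 0.5) %>%
  # Replace each remaining numeric predictor with its training-set percentile.
  step_percentile(all_numeric_predictors())

prepped <- prep(rec, training = train)
baked <- bake(prepped, new_data = NULL)

print(baked)
```

In this sketch x2 should be removed by the missingness filter before the percentile step runs, so only x1 is transformed.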
Improvements and Other Changes
- All recipe steps now officially support empty selections to be more aligned with dplyr and other packages that use tidyselect (#603, #531). For example, if a previous step removed all of the columns needed for a later step, the recipe does not fail when it is estimated (with the exception of step_mutate()). The documentation in ?selections has been updated with advice for writing selectors when filtering steps are used. (#813)
- Fixed bug in step_harmonic() printing and changed defaults to role = "predictor" and keep_original_cols = FALSE (#822).
- Improved the efficiency of computations for the Box-Cox transformation (#820).
- When a feature extraction step (e.g., step_pca(), step_ica(), etc.) has zero components specified, the tidy() method now lists the selected columns in the terms column.
- Deprecation has started for step_nnmf() in favor of step_nnmf_sparse(). (#790)
- Steps now have a dedicated subsection detailing what happens when tidy() is applied. (#876)
- step_ica() now runs fastICA() using a specific set of random numbers so that initialization is reproducible.
- tidy.recipe() now returns a zero-row tibble instead of an error when applied to an empty recipe. (#867)
- step_zv() now has a group argument. The same filter is applied but looks for zero-variance within 1 or more columns that define groups. (#711)
- detect_step() is no longer restricted to steps created in recipes (#869).
- New extract_parameter_set_dials() and extract_parameter_dials() methods to extract parameter sets and single parameters from recipe objects.
- step_other() now allows setting threshold = 0, which will result in no othering. (#904)
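
The empty-selection and empty-recipe behaviors described above can be sketched as follows. The choice of mtcars and step_dummy() is an assumption made for illustration; any step whose selector matches no columns should behave similarly.

```r
library(recipes)

# mtcars contains only numeric columns, so all_nominal_predictors()
# selects nothing and step_dummy() receives an empty selection.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_dummy(all_nominal_predictors())

# With empty selections supported, estimating the recipe should not fail.
prepped <- prep(rec, training = mtcars)
baked <- bake(prepped, new_data = NULL)

# tidy() on a recipe with no steps should return a zero-row tibble
# rather than raising an error.
empty_tidy <- tidy(recipe(mpg ~ ., data = mtcars))

print(dim(baked))
print(empty_tidy)
```

Because the step has nothing to act on, the baked data is expected to keep the same columns as the input.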
Breaking Changes
- step_ica() now indirectly uses the fastICA package since that package has increased its R version requirement. Recipe objects from previous versions will error when applied to new data. (#823)
- step_kpca*() now directly use the kernlab package. Recipe objects from previous versions will error when applied to new data.
Developer
- The print methods have been internally changed to use print_step() instead of printer(). This is done for a smoother transition to using cli in the next version. (#871)