Releases: SchlossLab/mikropml
mikropml 1.6.1
- Fix roxygen package doc syntax (r-lib/roxygen2#1491, @kelly-sovacool).
mikropml 1.6.0
- New functions:
  - `bootstrap_performance()` allows you to calculate confidence intervals for the model performance from a single train/test split by bootstrapping the test set (#329, @kelly-sovacool).
  - `calc_balanced_precision()` allows you to calculate balanced precision and balanced area under the precision-recall curve (#333, @kelly-sovacool).
- Improved output from `get_feature_importance()` (#326, @kelly-sovacool).
  - Renamed the column `names` to `feat` to represent each feature or group of correlated features.
  - New columns `lower` and `upper` report the bounds of the empirical 95% confidence interval from the permutation test. See `vignette('parallel')` for an example of plotting feature importance with confidence intervals.
- Minor documentation improvements (#323, #332, @kelly-sovacool).
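
The idea of bootstrapping a test set to get a confidence interval can be sketched in base R. This is an illustration of the general percentile-bootstrap technique, not the code behind `bootstrap_performance()`; the labels, the predictions, and the helper `boot_accuracy_ci()` are all hypothetical.

```r
# Hypothetical held-out test set: true labels and model predictions.
set.seed(2019)
truth <- sample(c("cancer", "normal"), 200, replace = TRUE)
pred <- truth
flip <- sample(200, 40) # corrupt 40 predictions so accuracy is exactly 0.8
pred[flip] <- ifelse(truth[flip] == "cancer", "normal", "cancer")

# Percentile bootstrap: resample test rows with replacement, recompute the
# metric each time, and use empirical quantiles as the interval bounds.
# boot_accuracy_ci() is a hypothetical helper, not a mikropml function.
boot_accuracy_ci <- function(truth, pred, times = 1000, alpha = 0.05) {
  n <- length(truth)
  stats <- vapply(seq_len(times), function(i) {
    idx <- sample.int(n, n, replace = TRUE)
    mean(truth[idx] == pred[idx])
  }, numeric(1))
  quantile(stats, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

ci <- boot_accuracy_ci(truth, pred)
```

Because only the test set is resampled, no model is retrained, which is why a single train/test split is enough to get an interval.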
Full Changelog: v1.5.0...v1.6.0
mikropml 1.5.0
- New example showing how to plot feature importances in the `parallel` vignette (#310, @kelly-sovacool).
- You can now use `parRF`, a parallel implementation of the `rf` method, with the same default hyperparameters as `rf` set automatically (#306, @kelly-sovacool).
- New functions to calculate and plot ROC and PRC curves (#321, @kelly-sovacool):
  - `calc_model_sensspec()` - calculate sensitivity, specificity, and precision for a model.
  - `calc_mean_roc()` & `plot_mean_roc()` - calculate & plot specificity and mean sensitivity for multiple models.
  - `calc_mean_prc()` & `plot_mean_prc()` - calculate & plot recall and mean precision for multiple models.
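
For readers unfamiliar with the quantities behind ROC and PRC curves, here is a minimal base R sketch of computing sensitivity, specificity, and precision at each decision threshold. It does not call `calc_model_sensspec()` and makes no claim about its output format; the probabilities, labels, and the helper `sensspec_at()` are hypothetical.

```r
# Hypothetical predicted probabilities for the positive class and true labels.
probs <- c(0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.05)
truth <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

# For one threshold, call a sample positive when its probability is at or
# above the threshold, then tabulate the confusion-matrix counts.
sensspec_at <- function(threshold, probs, truth) {
  called <- probs >= threshold
  tp <- sum(called & truth)
  fp <- sum(called & !truth)
  fn <- sum(!called & truth)
  tn <- sum(!called & !truth)
  data.frame(
    threshold = threshold,
    sensitivity = tp / (tp + fn), # also called recall
    specificity = tn / (tn + fp),
    precision = tp / (tp + fp)
  )
}

# One row per observed probability, from the lowest threshold to the highest.
curve <- do.call(rbind, lapply(sort(unique(probs)), sensspec_at,
  probs = probs, truth = truth
))
```

Plotting sensitivity against specificity gives an ROC curve, and plotting precision against recall (sensitivity) gives a PRC.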
Full Changelog: v1.4.0...v1.5.0
mikropml 1.4.0
- Extra arguments given to `run_ml()` are now forwarded to `caret::train()` (#304, @kelly-sovacool).
  - Users can now pass any model-specific arguments (e.g. `weights`) to `caret::train()`, allowing greater flexibility.
- Improved tests (#298, #300, #303, @kelly-sovacool).
- Minor documentation improvements.
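
Argument forwarding of this kind is usually done with R's `...`. The sketch below shows the pattern with two hypothetical stand-in functions, `outer_run()` and `inner_train()`; it is not the mikropml or caret source.

```r
# Hypothetical inner trainer standing in for caret::train().
inner_train <- function(y, weights = rep(1, length(y))) {
  list(estimate = stats::weighted.mean(y, weights), weights = weights)
}

# Hypothetical wrapper standing in for run_ml(): any argument it does not
# name itself is captured by `...` and passed along untouched.
outer_run <- function(y, seed = NA, ...) {
  if (!is.na(seed)) set.seed(seed)
  inner_train(y, ...)
}

y <- c(1, 2, 3, 10)
unweighted <- outer_run(y)
weighted <- outer_run(y, weights = c(1, 1, 1, 0)) # ignore the outlier
```

In mikropml itself, this would correspond to a call along the lines of `run_ml(dataset, "glmnet", weights = my_weights)`, where `my_weights` is a hypothetical vector with one weight per training sample.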
Full Changelog: v1.3.0...v1.4.0
mikropml 1.3.0
- mikropml now requires R version 4.1.0 or greater due to an update in the randomForest package (#292).
- New function `compare_models()` compares the performance of two models with a permutation test (#295, @courtneyarmour).
- Fixed a bug where `cv_times` did not affect the reported repeats for cross-validation (#291, @kelly-sovacool).
- Made minor documentation improvements (#293, @kelly-sovacool).
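
As a rough illustration of what a permutation test on model performance involves, the base R sketch below compares two sets of AUC values. It is a generic two-sample permutation test under assumed inputs, not the implementation of `compare_models()`; the AUC values and the helper `perm_p_value()` are hypothetical.

```r
set.seed(2022)
# Hypothetical AUCs from 20 train/test splits of each of two models.
auc_a <- rnorm(20, mean = 0.80, sd = 0.02)
auc_b <- rnorm(20, mean = 0.70, sd = 0.02)

# perm_p_value() is a hypothetical helper: shuffle the pooled values,
# re-split them into two groups, and see how often the shuffled difference
# in means is at least as large as the observed one.
perm_p_value <- function(a, b, nperm = 5000) {
  observed <- abs(mean(a) - mean(b))
  pooled <- c(a, b)
  first <- seq_len(length(a))
  null_diffs <- replicate(nperm, {
    shuffled <- sample(pooled)
    abs(mean(shuffled[first]) - mean(shuffled[-first]))
  })
  mean(null_diffs >= observed)
}

p_different <- perm_p_value(auc_a, auc_b)
p_same <- perm_p_value(auc_a, auc_a) # identical inputs: no evidence of a difference
```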
Full Changelog: v1.2.2...v1.3.0
mikropml 1.2.2
This minor patch fixes a test failure on platforms with no long doubles. The actual package code remains unchanged.
Full Changelog: v1.2.1...v1.2.2
mikropml 1.2.1
- Allow `kfold >= length(groups)` (#285, @kelly-sovacool).
  - When using the groups parameter, groups are kept together in cross-validation partitions when `kfold` <= the number of groups in the training set. Previously, an error was thrown if this condition was not met. Now, if there are not enough groups in the training set for groups to be kept together during CV, groups are allowed to be split up across CV partitions.
- Report p-values for permutation feature importance (#288, @kelly-sovacool).
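
The fallback described above can be sketched in base R. This is a deliberately simplified, non-random fold assignment meant only to show the two branches; it is not how mikropml builds its partitions, and `assign_folds()` and the group labels are hypothetical.

```r
# Hypothetical group label for each of 12 training samples (4 groups).
groups <- rep(c("A", "B", "C", "D"), times = c(4, 3, 3, 2))

# assign_folds() is a hypothetical helper returning one fold id per sample.
assign_folds <- function(groups, kfold) {
  group_ids <- unique(groups)
  if (kfold <= length(group_ids)) {
    # Enough groups: give each whole group to a single fold.
    fold_of_group <- rep_len(seq_len(kfold), length(group_ids))
    names(fold_of_group) <- group_ids
    unname(fold_of_group[groups])
  } else {
    # Too few groups: ignore grouping so that kfold folds can still be made.
    rep_len(seq_len(kfold), length(groups))
  }
}

kept <- assign_folds(groups, kfold = 2) # groups stay together
spread <- assign_folds(groups, kfold = 5) # groups are split across folds
```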
Full Changelog: v1.2.0...v1.2.1
mikropml 1.2.0
- New parameter `cross_val` added to `run_ml()` allows users to define their own custom cross-validation scheme (#278, @kelly-sovacool).
  - Also added a new parameter `calculate_performance`, which controls whether performance metrics are calculated (default: `TRUE`). Users may wish to skip performance calculations when training models with no cross-validation.
- New parameter `group_partitions` added to `run_ml()` allows users to control which groups should go to which partition of the train/test split (#281, @kelly-sovacool).
- Modified the `training_frac` parameter in `run_ml()` (#281, @kelly-sovacool).
  - By default, `training_frac` is a fraction between 0 and 1 that specifies how much of the dataset should be used in the training fraction of the train/test split.
  - Users can instead give `training_frac` a vector of indices that correspond to which rows of the dataset should go in the training fraction of the train/test split. This gives users direct control over exactly which observations are in the training fraction if desired.
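
The two ways of specifying the training set can be illustrated with a small base R sketch. It shows the general logic of accepting either a fraction or a vector of row indices; it is not the mikropml source, and `split_rows()` is a hypothetical helper.

```r
# split_rows() is a hypothetical helper that returns training and test row
# indices for a dataset with n_rows rows.
split_rows <- function(n_rows, training_frac = 0.8) {
  if (length(training_frac) == 1 && training_frac > 0 && training_frac < 1) {
    # A single fraction: draw that share of rows at random.
    train <- sort(sample.int(n_rows, size = round(n_rows * training_frac)))
  } else {
    # A vector of indices: use exactly those rows for training.
    train <- sort(unique(as.integer(training_frac)))
    stopifnot(all(train >= 1), all(train <= n_rows))
  }
  list(train = train, test = setdiff(seq_len(n_rows), train))
}

set.seed(1)
by_fraction <- split_rows(10, training_frac = 0.8)
by_index <- split_rows(10, training_frac = c(1, 2, 3, 7))
```

Passing indices makes the split fully reproducible without relying on a seed, which is useful when the same partition must be reused across methods.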
mikropml 1.1.1
- Fixed bugs related to grouping correlated features (#276, @kelly-sovacool).
  - Also, `group_correlated_features()` is now a user-facing function.
mikropml 1.1.0
- New correlation method option for feature importance (#267, @courtneyarmour).
  - The default is still "spearman", and now you can use other methods supported by `stats::cor` with the `corr_method` parameter: `get_feature_importance(corr_method = "pearson")`
- There are now video tutorials covering mikropml and other skills related to machine learning, created by @pschloss (#270).
- Fixed a bug where `preprocess_data()` converted the outcome column to a character vector (#273, @kelly-sovacool, @ecmaggioncalda).
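
To see why the choice of correlation method can matter when deciding which features count as correlated, the sketch below compares two methods with `stats::cor` directly. The feature values are hypothetical, and the example does not call mikropml.

```r
# Hypothetical feature pair with a monotonic but non-linear relationship.
x <- 1:10
y <- x^3

# Pearson measures linear association; Spearman measures association
# between ranks, so any strictly increasing relationship scores 1.
pearson <- stats::cor(x, y, method = "pearson")
spearman <- stats::cor(x, y, method = "spearman")
```

Here Spearman treats the two features as perfectly correlated while Pearson does not, so a correlation cutoff could group them under one method and keep them separate under the other.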