
caret ml classification


The caret package covers around 160 methods for binary and multi-class classification. Commonly used performance measures are the accuracy (the higher the better), the accuracy standard deviation (the lower the better) and the Kappa value (the higher the better).
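A minimal sketch of how these measures are obtained with caret; the data set, the choice of rf and the .632 bootstrap are illustrative assumptions, not a recommendation:

```r
# Minimal sketch: train one classifier and inspect accuracy, its
# standard deviation and Kappa. Data set, method and resampling
# scheme are illustrative assumptions.
library(caret)
library(mlbench)
data(PimaIndiansDiabetes)

set.seed(42)
fit <- train(diabetes ~ ., data = PimaIndiansDiabetes,
             method = "rf",
             trControl = trainControl(method = "boot632", number = 25))
fit$results[, c("Accuracy", "AccuracySD", "Kappa")]
```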

It is important to understand that many algorithms do not behave deterministically; they may be fast on many data sets but slow on a few, or they may require exponential computation time when grid tuning is used. They may break on zero-variance variables or require extensive feature selection. In short, many algorithms are not ready for prime time and may need extensive preprocessing. It is generally recommended to benchmark and compare algorithms against each other with increasing matrix size. In addition, with each package update libraries may break or stop functioning, so extensive unit testing may be required. The caret package has built-in test methods on GitHub, but they can only cover a small subset of artificial data sets.
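A small sketch of two of the guard rails mentioned above: removing near-zero variance predictors with caret's nearZeroVar() and timing a single fit for benchmarking. The data set and the rpart method are assumptions for illustration:

```r
# Sketch: drop zero/near-zero variance predictors (which break some
# models) and time a single fit for benchmarking. Data set and
# method are illustrative assumptions.
library(caret)
library(mlbench)
data(PimaIndiansDiabetes)
pima <- PimaIndiansDiabetes

# the class column "diabetes" is the last column, so the indices
# returned for the predictor subset line up with the full data frame
nzv <- nearZeroVar(pima[, names(pima) != "diabetes"])
if (length(nzv) > 0) pima <- pima[, -nzv]

# elapsed seconds are a simple basis for comparing algorithms
system.time(
  fit <- train(diabetes ~ ., data = pima, method = "rpart")
)
```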


Classification performance measures

These measures (besides runtime and memory) are listed in the table below. Many of them can be computed directly via caret functions. Depending on the complexity of the data set, all of them should be considered.

| Performance measure | Calculation |
| --- | --- |
| False positives (FP) | input |
| False negatives (FN) | input |
| True positives (TP) | input |
| True negatives (TN) | input |
| Sensitivity/Recall (true positive rate) [%] | TP/(TP+FN) * 100% |
| Specificity (true negative rate) [%] | TN/(FP+TN) * 100% |
| Predictive accuracy [%] | (TP+TN)/(TP+TN+FP+FN) * 100% |
| False positive rate | FP/(FP+TN) |
| False negative rate | FN/(TP+FN) |
| False discovery rate | FP/(FP+TP) |
| Precision (positive predictive value, PPV) | TP/(TP+FP) |
| Negative predictive value (NPV) | TN/(TN+FN) |
| Matthews correlation coefficient (MCC) | (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) |
| Kappa value | (observed accuracy - expected chance accuracy) / (1 - expected chance accuracy) |
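Most of these measures do not have to be computed by hand: caret's confusionMatrix() reports accuracy, Kappa, sensitivity, specificity, PPV and NPV from a vector of predicted classes and a vector of reference classes. The two toy factor vectors below are made-up illustrative data:

```r
# Sketch: confusionMatrix() computes most measures from the table.
# The two toy factor vectors are made-up illustrative data.
library(caret)
pred  <- factor(c("pos", "pos", "neg", "neg", "pos"), levels = c("pos", "neg"))
truth <- factor(c("pos", "neg", "neg", "neg", "pos"), levels = c("pos", "neg"))
confusionMatrix(pred, truth, positive = "pos")
# prints accuracy, Kappa, sensitivity, specificity, PPV, NPV, ...
```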

Automatic comparison of all fast/working caret models

The picture below shows the training comparison of all 116 classification models in caret that worked with the PimaIndiansDiabetes set from library(mlbench). Runtime was around three hours on a four-core system (4.2 GHz). We can see that some models run 200 times slower than rf (random forest) without providing any better accuracy. All models were used with their default settings, so there is likely more room for grid tuning. Also, these values are only training values obtained with the .632 bootstrap and were not yet used for model prediction. The data is presented as a sortable DT table in the browser; a sketch of the looping idea follows after the figure. See all results and download the source code.

[Figure: all-caret-models-diabetes-classification – training accuracy and runtime of all working caret classification models on the PimaIndiansDiabetes data]
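A sketch of the benchmarking idea behind this comparison, not the original source code: loop over a list of caret methods, wrap each fit in try() so broken models do not stop the run, and record training accuracy and runtime. The method list here is a small illustrative subset of the ~116 models:

```r
# Sketch of the benchmarking loop (not the original source code):
# fit each method inside try() so failing models are skipped, and
# record best training accuracy and elapsed seconds per method.
library(caret)
library(mlbench)
data(PimaIndiansDiabetes)

methods <- c("rf", "rpart", "knn", "glm")   # illustrative subset
ctrl <- trainControl(method = "boot632", number = 10)

results <- lapply(methods, function(m) {
  secs <- system.time(
    fit <- try(train(diabetes ~ ., data = PimaIndiansDiabetes,
                     method = m, trControl = ctrl), silent = TRUE)
  )[["elapsed"]]
  if (inherits(fit, "try-error")) return(NULL)   # skip broken models
  data.frame(method = m, accuracy = max(fit$results$Accuracy), seconds = secs)
})
do.call(rbind, results)   # one row per model that worked
```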

The source code is linked at the bottom of the page.


Source code

Links:

  • Kappa value - an explanation of the Kappa value from clinical statistics