
How to Use R4ML

Congratulations! You have installed R4ML and are now ready to do scalable data analysis and machine learning.

  • Here is a very simple example of fitting a linear model with an R script:
  # In this example, we use the iris data set and a linear regression
  # model to predict Sepal Length, using the rest of the features as predictors.

  # load R4ML library
  library(R4ML)

  # start an R4ML session on a local Spark master;
  # point sparkHome at your own Spark installation
  r4ml.session(master = "local[*]", sparkHome = "/path/to/apache/spark")

  # create r4ml dataframe from iris data
  iris.df <- as.r4ml.frame(iris)

  # Species is categorical, so we dummy-code it in a preprocessing step
  # before fitting the regression model.
  # r4ml.ml.preprocess returns a list containing $data and $metadata.
  pp <- r4ml.ml.preprocess(iris.df, transformPath = "/tmp",
                           dummycodeAttrs = c("Species"))

  # convert preprocessed data to r4ml.matrix
  iris.mat <- as.r4ml.matrix(pp$data)

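  # declare the column types: four numeric ("scale") columns followed by
  # the three dummy-coded Species columns ("nominal")
  ml.coltypes(iris.mat) <- c("scale", "scale", "scale", "scale",
                             "nominal", "nominal", "nominal")

  # create the train/test split: 20% for testing, 80% for training
  s <- r4ml.sample(iris.mat, perc = c(0.2, 0.8))
  test <- s[[1]]
  train <- s[[2]]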

  # fit linear regression model.
  iris.model <- r4ml.lm(Sepal_Length ~ . , data = train, intercept = TRUE)

  # we can check the coefficients of the model using the coef function
  coef(iris.model)

  # we can now make predictions using test data set
  preds <- predict(iris.model, test)

  # the predict function calculates several statistics, just like r4ml.lm.
  preds$statistics

  # stop the R4ML session when you are done
  r4ml.session.stop()
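
To sanity-check the steps above without a Spark installation, the same workflow can be approximated in base R. The sketch below is only a local analogue and calls nothing from R4ML. The variable names (`species.dummies`, `iris.local`, and so on), the seed, and the choice to drop one indicator column as a reference level are assumptions made for illustration, so the coefficients it prints will not necessarily match those returned by `r4ml.lm`.

```r
# Base-R analogue of the workflow above, run locally without Spark.
# Nothing here calls R4ML; all names below are illustrative.
data(iris)

# Dummy-code Species: one 0/1 indicator column per level (150 x 3).
species.dummies <- model.matrix(~ Species - 1, data = iris)

# Drop the first indicator (setosa) as the reference level so that the
# remaining indicators are not collinear with the intercept in lm().
iris.local <- cbind(iris[, 1:4], species.dummies[, -1])

# Reproducible split: 20% of the rows for testing, the rest for training.
set.seed(42)
test.idx <- sample(nrow(iris.local), size = round(0.2 * nrow(iris.local)))
test.local <- iris.local[test.idx, ]
train.local <- iris.local[-test.idx, ]

# Fit ordinary least squares with an intercept.
local.model <- lm(Sepal.Length ~ ., data = train.local)
print(coef(local.model))

# Predict on the held-out rows and summarize the error as RMSE.
local.preds <- predict(local.model, newdata = test.local)
rmse <- sqrt(mean((test.local$Sepal.Length - local.preds)^2))
print(rmse)
```

Comparing the sign and rough magnitude of these coefficients with `coef(iris.model)` is a quick way to confirm that the preprocessing and column types were set up as intended.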