remove version from is_installed (#444)
* remove version from is_installed

* update GA versions

* fix NOTES about documentation

* preserve randomness in table generation in tests
egillax authored Apr 26, 2024
1 parent 29c3d0b commit 1a1f620
Showing 102 changed files with 3,521 additions and 2,075 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -11,3 +11,5 @@ compare_versions
docs/*
_pkgdown.yml
^vignettes/articles$
^doc$
^Meta$
12 changes: 6 additions & 6 deletions .github/workflows/R_CMD_check_Hades.yaml
@@ -45,7 +45,7 @@ jobs:
CDM5_SQL_SERVER_USER: ${{ secrets.CDM5_SQL_SERVER_USER }}

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- uses: r-lib/actions/setup-r@v2
with:
@@ -66,7 +66,7 @@ jobs:
done < <(Rscript -e 'writeLines(remotes::system_requirements("ubuntu", "22.04"))')
- name: Setup conda
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3

- uses: r-lib/actions/setup-r-dependencies@v2
with:
@@ -81,7 +81,7 @@ jobs:

- name: Upload source package
if: success() && runner.os == 'macOS' && github.event_name != 'pull_request' && github.ref == 'refs/heads/main'
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: package_tarball
path: check/*.tar.gz
@@ -110,7 +110,7 @@ jobs:

steps:

- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
fetch-depth: 0

@@ -136,7 +136,7 @@ jobs:
draft: false
prerelease: false

- uses: r-lib/actions/setup-r@v1
- uses: r-lib/actions/setup-r@v2
if: ${{ env.new_version != '' }}

- name: Install drat
@@ -152,7 +152,7 @@ jobs:
- name: Download package tarball
if: ${{ env.new_version != '' }}
uses: actions/download-artifact@v2
uses: actions/download-artifact@v4
with:
name: package_tarball

2 changes: 1 addition & 1 deletion .github/workflows/R_CMD_check_main_weekly.yaml
@@ -45,7 +45,7 @@ jobs:
CDM5_SPARK_CONNECTION_STRING: ${{ secrets.CDM5_SPARK_CONNECTION_STRING }}

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- uses: r-lib/actions/setup-r@v2
with:
2 changes: 2 additions & 0 deletions .gitignore
@@ -23,3 +23,5 @@ standalone/build/*
/python_models/*
/mycache/*
/inst/shiny/DiagnosticsExplorer/rsconnect/*
/doc/
/Meta/
24 changes: 12 additions & 12 deletions R/DataSplitting.R
@@ -17,7 +17,7 @@


#' Create the settings for defining how the plpData are split into test/validation/train sets using
#' default splitting functions (either random stratified by outcome, time or subject splitting).
#' default splitting functions (either random stratified by outcome, time or subject splitting)
#'
#' @details
#' Returns an object of class \code{splitSettings} that specifies the splitting function that will be called and the settings
@@ -28,9 +28,9 @@
#' @param nfold (numeric) An integer > 1 specifying the number of folds used in cross validation
#' @param splitSeed (numeric) A seed to use when splitting the data for reproducibility (if not set a random number will be generated)
#' @param type (character) Choice of: \itemize{
#' \item{'stratified'}{ Each data point is randomly assigned into the test or a train fold set but this is done stratified such that the outcome rate is consistent in each partition }
#' \item{'time')}{ Older data are assigned into the training set and newer data are assigned into the test set}
#' \item{'subject'}{ Data are partitioned by subject, if a subject is in the data more than once, all the data points for the subject are assigned either into the test data or into the train data (not both).}
#' \item'stratified' Each data point is randomly assigned into the test or a train fold set but this is done stratified such that the outcome rate is consistent in each partition
#' \item'time' Older data are assigned into the training set and newer data are assigned into the test set
#' \item'subject' Data are partitioned by subject, if a subject is in the data more than once, all the data points for the subject are assigned either into the test data or into the train data (not both).
#' }
#'
#' @return
@@ -87,17 +87,17 @@ createDefaultSplitSetting <- function(testFraction=0.25,
#'
#' @details
#' Returns a list containing the training data (Train) and optionally the test data (Test). Train is an Andromeda object containing
#' \itemize{\item{covariates}{ a table (rowId, covariateId, covariateValue) containing the covariates for each data point in the train data }
#' \item{covariateRef}{ a table with the covariate information}
#' \item{labels)}{ a table (rowId, outcomeCount, ...) for each data point in the train data (outcomeCount is the class label) }
#' \item{folds}{ a table (rowId, index) specifying which training fold each data point is in.}
#' \itemize{\item covariates: a table (rowId, covariateId, covariateValue) containing the covariates for each data point in the train data
#' \item covariateRef: a table with the covariate information
#' \item labels: a table (rowId, outcomeCount, ...) for each data point in the train data (outcomeCount is the class label)
#' \item folds: a table (rowId, index) specifying which training fold each data point is in.
#' }
#' Test is an Andromeda object containing
#' \itemize{\item{covariates}{ a table (rowId, covariateId, covariateValue) containing the covariates for each data point in the test data }
#' \item{covariateRef}{ a table with the covariate information}
#' \item{labels)}{ a table (rowId, outcomeCount, ...) for each data point in the test data (outcomeCount is the class label) }
#' \itemize{\item covariates: a table (rowId, covariateId, covariateValue) containing the covariates for each data point in the test data
#' \item covariateRef: a table with the covariate information
#' \item labels: a table (rowId, outcomeCount, ...) for each data point in the test data (outcomeCount is the class label)
#' }
#'
#'
#'
#'
#' @param plpData An object of type \code{plpData} - the patient level prediction
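For reference, a minimal sketch of how the split settings documented above are created. Only the arguments shown in this file are used, the values are illustrative, and the package is assumed to be PatientLevelPrediction (not named in the diff itself).

library(PatientLevelPrediction)  # assumed package; not named in this diff

# Stratified split: 25% test set, 3-fold cross validation on the remainder,
# with a fixed seed so the partitioning is reproducible.
splitSettings <- createDefaultSplitSetting(
  testFraction = 0.25,
  nfold = 3,
  splitSeed = 42,
  type = "stratified"   # alternatives per the docs: "time", "subject"
)
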
16 changes: 6 additions & 10 deletions R/DiagnosePlp.R
@@ -213,16 +213,12 @@ diagnoseMultiplePlp <- function(
#' and whether to normalise the covariates before training
#' @param modelSettings An object of class \code{modelSettings} created using one of the function:
#' \itemize{
#' \item{setLassoLogisticRegression()}{ A lasso logistic regression model}
#' \item{setGradientBoostingMachine()}{ A gradient boosting machine}
#' \item{setAdaBoost()}{ An ada boost model}
#' \item{setRandomForest()}{ A random forest model}
#' \item{setDecisionTree()}{ A decision tree model}
#' \item{setCovNN())}{ A convolutional neural network model}
#' \item{setCIReNN()}{ A recurrent neural network model}
#' \item{setMLP()}{ A neural network model}
#' \item{setDeepNN()}{ A deep neural network model}
#' \item{setKNN()}{ A KNN model}
#' \item setLassoLogisticRegression() A lasso logistic regression model
#' \item setGradientBoostingMachine() A gradient boosting machine
#' \item setAdaBoost() An ada boost model
#' \item setRandomForest() A random forest model
#' \item setDecisionTree() A decision tree model
#' \item setKNN() A KNN model
#'
#' }
#' @param logSettings An object of \code{logSettings} created using \code{createLogSettings}
2 changes: 1 addition & 1 deletion R/FeatureEngineering.R
@@ -47,7 +47,7 @@ featureEngineer <- function(data, featureEngineeringSettings){
#' Returns an object of class \code{featureEngineeringSettings} that specifies the sampling function that will be called and the settings
#'
#' @param type (character) Choice of: \itemize{
#' \item{'none'}{ No feature engineering - this is the default }
#' \item'none' No feature engineering - this is the default
#' }
#'
#' @return
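A one-line sketch of the corresponding settings constructor; the name createFeatureEngineeringSettings() is not visible in this hunk and is an assumption.

# 'none' is the documented default: no feature engineering is applied.
featureEngineeringSettings <- createFeatureEngineeringSettings(type = "none")
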
9 changes: 4 additions & 5 deletions R/Fit.R
@@ -29,11 +29,10 @@
#' data extracted from the CDM.
#' @param modelSettings An object of class \code{modelSettings} created using one of the function:
#' \itemize{
#' \item{logisticRegressionModel()}{ A lasso logistic regression model}
#' \item{GBMclassifier()}{ A gradient boosting machine}
#' \item{RFclassifier()}{ A random forest model}
#' \item{GLMclassifier ()}{ A generalised linear model}
#' \item{KNNclassifier()}{ A KNN model}
#' \item setLassoLogisticRegression() A lasso logistic regression model
#' \item setGradientBoostingMachine() A gradient boosting machine
#' \item setRandomForest() A random forest model
#' \item setKNN() A KNN model
#' }
#' @param search The search strategy for the hyper-parameter selection (currently not used)
#' @param analysisId The id of the analysis
4 changes: 2 additions & 2 deletions R/HelperFunctions.R
@@ -16,10 +16,10 @@ removeInvalidString <- function(string){


# Borrowed from devtools: https://github.com/hadley/devtools/blob/ba7a5a4abd8258c52cb156e7b26bb4bf47a79f0b/R/utils.r#L44
is_installed <- function (pkg, version = 0) {
is_installed <- function (pkg) {
installed_version <- tryCatch(utils::packageVersion(pkg),
error = function(e) NA)
!is.na(installed_version) && installed_version >= version
!is.na(installed_version)
}

# Borrowed and adapted from devtools: https://github.com/hadley/devtools/blob/ba7a5a4abd8258c52cb156e7b26bb4bf47a79f0b/R/utils.r#L74
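A short usage sketch for the simplified is_installed() helper; the package name and version threshold below are illustrative only.

# is_installed() now only reports whether a package is present. If a minimum
# version still matters, the caller has to compare versions explicitly:
if (is_installed("ggplot2") &&
    utils::packageVersion("ggplot2") >= "3.0.0") {  # example threshold
  message("ggplot2 (>= 3.0.0) is available")
}
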
19 changes: 7 additions & 12 deletions R/LearningCurve.R
@@ -40,25 +40,20 @@
#' \code{trainFractions}. Note, providing \code{trainEvents} will override
#' your input to \code{trainFractions}. The format should be as follows:
#' \itemize{
#' \item{ \code{c(500, 1000, 1500) } - a list of training events}
#' \item \code{c(500, 1000, 1500) } - a list of training events
#' }
#' @param featureEngineeringSettings An object of \code{featureEngineeringSettings} specifying any feature engineering to be learned (using the train data)
#' @param preprocessSettings An object of \code{preprocessSettings}. This setting specifies the minimum fraction of
#' target population who must have a covariate for it to be included in the model training
#' and whether to normalise the covariates before training
#' @param modelSettings An object of class \code{modelSettings} created using one of the function:
#' \itemize{
#' \item{setLassoLogisticRegression()}{ A lasso logistic regression model}
#' \item{setGradientBoostingMachine()}{ A gradient boosting machine}
#' \item{setAdaBoost()}{ An ada boost model}
#' \item{setRandomForest()}{ A random forest model}
#' \item{setDecisionTree()}{ A decision tree model}
#' \item{setCovNN())}{ A convolutional neural network model}
#' \item{setCIReNN()}{ A recurrent neural network model}
#' \item{setMLP()}{ A neural network model}
#' \item{setDeepNN()}{ A deep neural network model}
#' \item{setKNN()}{ A KNN model}
#'
#' \item \code{setLassoLogisticRegression()} A lasso logistic regression model
#' \item \code{setGradientBoostingMachine()} A gradient boosting machine
#' \item \code{setAdaBoost()} An ada boost model
#' \item \code{setRandomForest()} A random forest model
#' \item \code{setDecisionTree()} A decision tree model
#' \item \code{setKNN()} A KNN model
#' }
#' @param logSettings An object of \code{logSettings} created using \code{createLogSettings}
#' specifying how the logging is done
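A hedged sketch of a learning-curve call wiring together the inputs documented above. The surrounding function name is not visible in this excerpt, so createLearningCurve() and the plpData argument are assumptions, and other required arguments are omitted for brevity.

learningCurve <- createLearningCurve(
  plpData       = plpData,                       # assumed to exist already
  trainEvents   = c(500, 1000, 1500),            # overrides trainFractions
  modelSettings = setLassoLogisticRegression(),  # one of the listed options
  logSettings   = createLogSettings(verbosity = "INFO")
  # ... additional arguments (outcome/population settings, etc.) omitted
)
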
12 changes: 6 additions & 6 deletions R/Logging.R
@@ -22,12 +22,12 @@
#'
#' @param verbosity Sets the level of the verbosity. If the log level is at or higher in priority than the logger threshold, a message will print. The levels are:
#' \itemize{
#' \item{DEBUG}{Highest verbosity showing all debug statements}
#' \item{TRACE}{Showing information about start and end of steps}
#' \item{INFO}{Show informative information (Default)}
#' \item{WARN}{Show warning messages}
#' \item{ERROR}{Show error messages}
#' \item{FATAL}{Be silent except for fatal errors}
#' \item DEBUG Highest verbosity showing all debug statements
#' \item TRACE Showing information about start and end of steps
#' \item INFO Show informative information (Default)
#' \item WARN Show warning messages
#' \item ERROR Show error messages
#' \item FATAL Be silent except for fatal errors
#' }
#' @param timeStamp If TRUE a timestamp will be added to each logging statement. Automatically switched on for TRACE level.
#' @param logName A string reference for the logger
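A small sketch of building the log settings with the documented arguments; the values are illustrative.

logSettings <- createLogSettings(
  verbosity = "INFO",        # one of DEBUG, TRACE, INFO (default), WARN, ERROR, FATAL
  timeStamp = TRUE,          # prefix each log message with a timestamp
  logName   = "runPlp Log"   # reference name for the logger
)
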
30 changes: 12 additions & 18 deletions R/RunPlp.R
@@ -49,17 +49,12 @@
#' and whether to normalise the covariates before training
#' @param modelSettings An object of class \code{modelSettings} created using one of the function:
#' \itemize{
#' \item{setLassoLogisticRegression()}{ A lasso logistic regression model}
#' \item{setGradientBoostingMachine()}{ A gradient boosting machine}
#' \item{setAdaBoost()}{ An ada boost model}
#' \item{setRandomForest()}{ A random forest model}
#' \item{setDecisionTree()}{ A decision tree model}
#' \item{setCovNN())}{ A convolutional neural network model}
#' \item{setCIReNN()}{ A recurrent neural network model}
#' \item{setMLP()}{ A neural network model}
#' \item{setDeepNN()}{ A deep neural network model}
#' \item{setKNN()}{ A KNN model}
#'
#' \item setLassoLogisticRegression() A lasso logistic regression model
#' \item setGradientBoostingMachine() A gradient boosting machine
#' \item setAdaBoost() An ada boost model
#' \item setRandomForest() A random forest model
#' \item setDecisionTree() A decision tree model
#' \item setKNN() A KNN model
#' }
#' @param logSettings An object of \code{logSettings} created using \code{createLogSettings}
#' specifying how the logging is done
@@ -71,13 +66,12 @@
#' An object containing the following:
#'
#' \itemize{
#' \item{inputSettings}{A list containing all the settings used to develop the model}
#' \item{model}{ The developed model of class \code{plpModel}}
#' \item{executionSummary}{ A list containing the hardward details, R package details and execution time}
#' \item{performanceEvaluation}{ Various internal performance metrics in sparse format}
#' \item{prediction}{ The plpData cohort table with the predicted risks added as a column (named value)}
#' \item{covariateSummary)}{ A characterization of the features for patients with and without the outcome during the time at risk}
#' \item{analysisRef}{ A list with details about the analysis}
#' \item model The developed model of class \code{plpModel}
#' \item executionSummary A list containing the hardward details, R package details and execution time
#' \item performanceEvaluation Various internal performance metrics in sparse format
#' \item prediction The plpData cohort table with the predicted risks added as a column (named value)
#' \item covariateSummary A characterization of the features for patients with and without the outcome during the time at risk
#' \item analysisRef A list with details about the analysis
#' }
#'
#'
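For context, a minimal sketch of a model-development call using the settings objects documented above. runPlp() and the argument names other than modelSettings and logSettings are assumptions about the full signature, and the remaining required arguments are omitted.

result <- runPlp(
  plpData       = plpData,                       # assumed to exist already
  modelSettings = setLassoLogisticRegression(),  # one of the listed options
  logSettings   = createLogSettings(verbosity = "INFO"),
  saveDirectory = file.path(tempdir(), "plpExample")
  # ... population, split, sampling and preprocessing settings omitted
)
result$model                  # the developed plpModel
result$performanceEvaluation  # internal performance metrics (sparse format)
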
6 changes: 3 additions & 3 deletions R/Sampling.R
@@ -22,9 +22,9 @@
#' Returns an object of class \code{sampleSettings} that specifies the sampling function that will be called and the settings
#'
#' @param type (character) Choice of: \itemize{
#' \item{'none'}{ No sampling is applied - this is the default }
#' \item{'underSample')}{Undersample the non-outcome class to make the data more ballanced}
#' \item{'overSample'}{Oversample the outcome class by adding in each outcome multiple times}
#' \item 'none' No sampling is applied - this is the default
#' \item 'underSample' Undersample the non-outcome class to make the data more ballanced
#' \item 'overSample' Oversample the outcome class by adding in each outcome multiple times
#' }
#' @param numberOutcomestoNonOutcomes (numeric) An numeric specifying the require number of non-outcomes per outcome
#' @param sampleSeed (numeric) A seed to use when splitting the data for reproducibility (if not set a random number will be generated)
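A sketch of the corresponding sampling settings; the constructor name createSampleSettings() is assumed, and only the arguments documented above are used.

sampleSettings <- createSampleSettings(
  type = "underSample",             # default is "none"; "overSample" also allowed
  numberOutcomestoNonOutcomes = 1,  # require one non-outcome per outcome
  sampleSeed = 42                   # fixed seed for reproducibility
)
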
(Diffs for the remaining changed files are not shown.)
