export createConceptCountsTable() #1074

cebarboza · 2023-09-07T09:26:20Z

cebarboza
Sep 7, 2023
Collaborator

At Darwin EU we are trying to run CohortDiagnostics in a more efficient way when running it for multiple studies. We believe it would be a good enhancement to export the function createConceptCountsTable() to generate the concept_counts before executing the diagnostics.

By doing this, we can perform this calculation just once for a specific vocabulary_version, instead of repeating this process for each study.

In our fork darwin-eu-dev/CohortDiagnostics, the user can call createConceptCountsTable() directly. The resulting table is saved in the cohortDatabaseSchema.

CohortDiagnostics::createConceptCountsTable(connectionDetails = connectionDetails,
                                            cdmDatabaseSchema = cdmDatabaseSchema,
                                            conceptCountsDatabaseSchema = cohortDatabaseSchema,
                                            conceptCountsTable = "concept_counts",
                                            removeCurrentTable = TRUE)

Then we use the parameter useExternalConceptCountsTable in executeDiagnostics(). If TRUE, executeDiagnostics() uses the concept_counts table created previously in the cohortDatabaseSchema. The user should specify the name of the external concept counts table, generally concept_counts.

CohortDiagnostics::executeDiagnostics(cohortDefinitionSet =  cohortDefinitionSet,
                                      connectionDetails = connectionDetails,
                                      cohortTable = cohortTable,
                                      cohortDatabaseSchema = cohortDatabaseSchema,
                                      cdmDatabaseSchema = cdmDatabaseSchema,
                                      conceptCountsTable = "concept_counts",
                                      exportFolder = exportFolder,
                                      databaseId = "Eunomia",
                                      minCellCount = 5,
                                      useExternalConceptCountsTable = TRUE) # TRUE reuses the pre-built concept_counts table

We also modified the CreateConceptCountTable.sql file, to add a new column with the vocabulary_version.

https://github.com/darwin-eu-dev/CohortDiagnostics/blob/ca6d9074bb097b9ce60b7bc6bf72e68a84f650fe/inst/sql/sql_server/CreateConceptCountTable.sql#L102C1-L106C2


{@table_is_temp} ? {} : { 
ALTER TABLE @work_database_schema.@concept_counts_table
ADD vocabulary_version VARCHAR(20) NULL;
UPDATE @work_database_schema.@concept_counts_table SET vocabulary_version = (SELECT vocabulary_version FROM @cdm_database_schema.vocabulary WHERE vocabulary_id = 'None');
}

Then, there are checks in place that verify that the vocabulary_version in the concept_counts table matches the vocabulary version of the database the user is running the diagnostics against.

https://github.com/darwin-eu-dev/CohortDiagnostics/blob/ca6d9074bb097b9ce60b7bc6bf72e68a84f650fe/R/RunDiagnostics.R#L679C1-L708C4

 # Defines variables and checks version of external concept counts table -----
  if (useExternalConceptCountsTable == FALSE) {
    conceptCountsTableIsTemp <- TRUE
    if (conceptCountsTable != "#concept_counts") {
      conceptCountsTable <- "#concept_counts"
    }
  } else {
    if (conceptCountsTable == "#concept_counts") {
      stop("Temporary conceptCountsTable name. Please provide a valid external ConceptCountsTable name")
    }
    conceptCountsTableIsTemp <- FALSE
    conceptCountsTable <- conceptCountsTable
    dataSourceInfo <- getCdmDataSourceInformation(connection = connection, cdmDatabaseSchema = cdmDatabaseSchema)
    vocabVersion <- dataSourceInfo$vocabularyVersion
    vocabVersionExternalConceptCountsTable <- DatabaseConnector::renderTranslateQuerySql(
      connection = connection,
      sql = "SELECT DISTINCT vocabulary_version FROM @work_database_schema.@concept_counts_table;",
      work_database_schema = cohortDatabaseSchema,
      concept_counts_table = conceptCountsTable,
      snakeCaseToCamelCase = TRUE,
      tempEmulationSchema = getOption("sqlRenderTempEmulationSchema")
    )
    if (!identical(vocabVersion, vocabVersionExternalConceptCountsTable[1,1])) {
      stop(paste0("External concept counts table (", 
                 vocabVersionExternalConceptCountsTable[1, 1], 
                 ") does not match database (", 
                 vocabVersion, 
                 "). Update concept_counts with createConceptCountsTable()"))
    }
  }
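
The comparison step above can be sketched in isolation as a small base R function. This is only an illustration under stated assumptions: checkConceptCountsVersion() is a hypothetical helper that does not exist in CohortDiagnostics or in our fork, and it takes the versions as plain values instead of querying the database. It also covers two cases the snippet above does not handle explicitly: a table with no recorded version, and a table that mixes several versions.

```r
# Hypothetical helper (not part of CohortDiagnostics): compares the CDM
# vocabulary version with the version(s) found in an external concept counts table.
checkConceptCountsVersion <- function(cdmVocabVersion, tableVocabVersions) {
  # Work on character values; drop missing entries and duplicates
  tableVocabVersions <- as.character(tableVocabVersions)
  tableVocabVersions <- unique(tableVocabVersions[!is.na(tableVocabVersions)])

  if (length(tableVocabVersions) == 0) {
    stop("External concept counts table has no vocabulary_version recorded.")
  }
  if (length(tableVocabVersions) > 1) {
    stop(
      "External concept counts table mixes vocabulary versions: ",
      paste(tableVocabVersions, collapse = ", ")
    )
  }
  if (!identical(as.character(cdmVocabVersion), tableVocabVersions)) {
    stop(
      "External concept counts table (", tableVocabVersions,
      ") does not match database (", cdmVocabVersion, ")."
    )
  }
  invisible(TRUE)
}

# Example call: the version strings are made up for illustration
checkConceptCountsVersion("v5.0 22-JUN-22", c("v5.0 22-JUN-22", "v5.0 22-JUN-22"))
```

Keeping the comparison separate from the query would make it possible to unit test the check without a database connection.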

There's also a vignette that explains how to run these functions: UseExternalConceptTable.Rmd. We have been testing this approach, but we wanted to discuss it before sending a pull request.

azimov · 2023-09-19T19:45:17Z

azimov
Sep 19, 2023
Maintainer

Closing this discussion as it duplicates #1067.
