diff --git a/docs/conventions.html b/docs/conventions.html index 156da4c..230ff5d 100644 --- a/docs/conventions.html +++ b/docs/conventions.html @@ -346,9 +346,16 @@
Git and GitHub offer a collaborative environment for proposing, +discussing, and implementing changes to a reference vocabulary such as +the OMOP Vocabulary.
+However, due to licensing and volume issues, it is not possible to +maintain and develop the entire OMOP vocabulary in a GitHub repository +as flat files.
+To work around this, a group of collaborators can maintain and +contribute to a growing list of edits to the OMOP Vocabulary. We call +this list of edits the “delta vocab”.
+The delta vocab, which is literally a collection of concept and +concept_relationship records exactly as they would be represented in the +OMOP Vocabulary tables, provides a lightweight representation of any +deviations from the official OMOP Vocabulary. From these tables, the +concept_ancestor table is then programmatically generated.
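As a minimal sketch of the idea (not the project's actual scripts), applying a delta can be thought of as an insert-or-overwrite keyed on concept_id. The function name `apply_delta`, the in-memory dictionaries, and the sample concept IDs and names below are all hypothetical; the real tooling operates on database tables.

```python
from typing import Dict, Iterable, Optional

# A concept row reduced to a few illustrative columns (assumption: real rows
# carry every column of the OMOP concept table).
Concept = Dict[str, Optional[object]]


def apply_delta(base: Dict[int, Concept], delta: Iterable[Concept]) -> Dict[int, Concept]:
    """Return a copy of `base` with every delta row inserted or overwritten by concept_id."""
    merged = {cid: dict(row) for cid, row in base.items()}
    for row in delta:
        merged[int(row["concept_id"])] = dict(row)
    return merged


# Hypothetical "official" vocabulary with two standard concepts.
official = {
    1: {"concept_id": 1, "concept_name": "Old term", "standard_concept": "S"},
    2: {"concept_id": 2, "concept_name": "Unchanged term", "standard_concept": "S"},
}

# Hypothetical delta: destandardize concept 1 and add one new standard concept.
delta_concept = [
    {"concept_id": 1, "concept_name": "Old term", "standard_concept": None},
    {"concept_id": 2000000001, "concept_name": "New oncology term", "standard_concept": "S"},
]

dev = apply_delta(official, delta_concept)
```

Only the two delta rows would need to be stored in the repository; the unchanged concept never leaves the licensed source.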
+Maintaining the change between the official OMOP +Vocabulary release and the Oncology Development Vocabulary allows for +rapid development of OHDSI Oncology studies that are untethered from the +official OMOP Vocabulary release cadence. By preserving only the changed +elements, instead of the entire Oncology Development Vocabulary, this +method provides a lightweight, GitHub-friendly solution that also +respects the licensed vocabulary terms by not redistributing them.
+The simplicity of maintaining as little of the vocabulary as possible +and using scripted logic to apply changes to the existing vocabulary +makes this method easy to implement and ideal for the core use case - +establishing standard concepts and remapping newly destandardized +terms.
+Three steps are necessary to deploy the delta vocabularies to your +local database:
+Download source vocab data and tools
Configure your local database
Ingest delta vocabulary files
To create the Oncology Development Vocabulary, you must download the +vocabTools +and deltaVocab +folders from the OHDSI/OncologyWG repository. It may be simplest to +clone the OHDSI/OncologyWG repository and work from there:
+git clone https://github.com/OHDSI/OncologyWG.git
These methods assume you have the latest official release of the OMOP +Vocabulary in two identical schemas in a Postgres database: - +prod: The prod schema contains the +official (“production”) OMOP Vocabulary. This vocabulary will not be +changed but can be used to refresh the dev schema. - +dev: The dev schema begins as an exact +copy of the official OMOP Vocabulary, but will be transformed into the +Oncology Development Vocabulary using the deltaVocab files and the +scripts in vocabTools.
+To enable the scripts in vocabTools, enter your database connection +details into the config.txt file.
+Create two folders in the vocabTools folder: concept and +concept_relationship.
+Move the deltaConcept and deltaConceptRelationship files to the new +concept and concept_relationship folders, respectively.
+Run updateConcept.bat to implement the changes from +deltaConcept to the dev schema in your database.
+Run updateConceptRelationship.bat to implement the +changes from deltaConceptRelationship to the dev schema in your +database.
+Run updateConceptAncestor.bat to rebuild +concept_ancestor based on the new concept and concept_relationship +tables in the dev schema.
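To illustrate what rebuilding concept_ancestor involves, here is a rough sketch of a transitive closure over hierarchical relationships. It is not the logic of updateConceptAncestor.bat, which is not shown here. It assumes only (child, parent) "Is a" pairs are used, that the hierarchy is acyclic, and that each concept is listed as its own ancestor at zero levels of separation; `build_ancestor` and the sample IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

AncestorRow = Tuple[int, int, int, int]  # (ancestor, descendant, min_levels, max_levels)


def build_ancestor(concept_ids: Iterable[int],
                   is_a_pairs: Iterable[Tuple[int, int]]) -> List[AncestorRow]:
    """Expand direct (child, parent) edges into all ancestor/descendant pairs."""
    parents: Dict[int, Set[int]] = defaultdict(set)
    for child, parent in is_a_pairs:
        parents[child].add(parent)

    memo: Dict[int, Dict[int, Tuple[int, int]]] = {}

    def levels(cid: int) -> Dict[int, Tuple[int, int]]:
        """Map each ancestor of cid (including cid itself) to (min, max) path length."""
        if cid in memo:
            return memo[cid]
        result = {cid: (0, 0)}
        for parent in parents.get(cid, ()):
            for anc, (lo, hi) in levels(parent).items():
                new = (lo + 1, hi + 1)
                cur = result.get(anc)
                # Keep the shortest and longest path seen so far to this ancestor.
                result[anc] = new if cur is None else (min(cur[0], new[0]), max(cur[1], new[1]))
        memo[cid] = result
        return result

    rows: List[AncestorRow] = []
    for cid in sorted(concept_ids):
        for anc, (lo, hi) in sorted(levels(cid).items()):
            rows.append((anc, cid, lo, hi))
    return rows


# Hypothetical hierarchy: 2 is a 1, 3 is a 2, and 3 is also directly a 1.
rows = build_ancestor([1, 2, 3], [(2, 1), (3, 2), (3, 1)])
```

Because the table is derived entirely from concept and concept_relationship, it never needs to be stored in the delta itself.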
+Using the delta vocab and helper scripts, a developer with an +official OMOP Vocabulary database can quickly create a full, working +version of the OMOP Vocabulary with all proposed changes implemented, +allowing for advanced testing and use of existing OHDSI tools with a +development version of the vocabulary.
+++See README of the vocabTools +directory for instructions for contributing to the Oncology Delta +Vocabulary
+
A GitHub Project has been created and customized to enable +collaborative and dynamic project management. Notably, this project +exists at the organization level, not the repository level, which +enables extended functionality, including issue triage across multiple +repositories.
+++Orientation and Onramp: GitHub Project +Orientation
+GitHub Project: Oncology Maturity +Sprint
+
We leverage the RMarkdown R Package to create content in Rmd files +and render them as HTML. Through GitHub Pages, these HTML files can be +easily deployed as a project website. There are several ways, varying +in technical complexity, to contribute to this documentation.
+++See here for more +details
+
Provide a semi-automated and extensible framework for generating, +visualizing, and sharing an assessment of an OMOP-shaped database’s +adherence to the OHDSI Oncology Standard (tables, vocabulary) and the +availability and types of oncology data it contains.
+Assessments can be executed against an OMOP-shaped database +to create a characterization and quality report. They are created using +specifications.
+Specifications are JSON files that describe an assessment. +They are composed by compiling analyses together with threshold +values.
+Analyses execute a query and return a row count or +proportion describing the contents in the database. For example, +analysis_id=1234 returns “the number of cancer diagnosis records derived +from Tumor Registry source data”.
+Thresholds provide study-specific context to the results of +analyses. An analysis asks how many cancer diagnoses derived from tumor +registry data are in the database. Using thresholds, an assessment +author can give ranges for “bad”, “questionable”, and “good” analysis +results as they pertain to their study. An example threshold, which +would be encoded as JSON, could express the sentiment “A database with +0-200 diagnoses from tumor registry data would be unfit for this study, +201-500 diagnoses may be suitable, and over 500 diagnoses will be more +than enough.”
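The example ranges above could be encoded and evaluated roughly as follows. This is only a sketch: the JSON layout, the key names (`thresholds`, `label`, `min`, `max`), and the `classify` function are assumptions made for illustration, not the specification schema used by the framework, and Python stands in for the R package.

```python
import json
from typing import List, Optional, TypedDict


class Band(TypedDict):
    label: str
    min: int
    max: Optional[int]  # None means "no upper bound"


# Hypothetical specification fragment for analysis_id=1234.
spec_json = """
{
  "analysis_id": 1234,
  "thresholds": [
    {"label": "bad", "min": 0, "max": 200},
    {"label": "questionable", "min": 201, "max": 500},
    {"label": "good", "min": 501, "max": null}
  ]
}
"""


def classify(count: int, thresholds: List[Band]) -> str:
    """Return the label of the first band whose inclusive range contains count."""
    for band in thresholds:
        upper = band["max"]
        if count >= band["min"] and (upper is None or count <= upper):
            return band["label"]
    raise ValueError(f"no threshold band covers a count of {count}")


spec = json.loads(spec_json)
labels = [classify(n, spec["thresholds"]) for n in (150, 350, 9000)]
```

Keeping the ranges in the specification rather than in the analysis lets the same analysis be reused by studies with different data requirements.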
+The R package provides functionality for the four major processes +involved in the framework:
+++See README of the validationScripts +directory for instructions for contributing to the Oncology Validation +Framework
+
Provide a semi-automated and extensible framework for generating, -visualizing, and sharing an assessment of an OMOP-shaped database’s -adherence to the OHDSI Oncology Standard (tables, vocabulary) and the -availabilty and types of oncology data it contains.
-The star of the framework is an R Package. Along with cataloguing an -extensible set of queries and analyses used for assessing OMOP-shaped -oncology data, the R package provides functionality for the four major -processes involved in the framework:
-Assessments can be executed against an OMOP-shaped database -to create a characterization and quality report. They are created using -specificications.
-Specifications are JSON files that describe an assessment. -They are composed by compiling analyses together with threshhold -values.
-Analyses execute a query and return a row count or -proportion describing the contents in the database. For example, -analysis_id=1234 returns “the number of cancer diagnosis records derived -from Tumor Registry source data”.
-Threshholds provide study specific context to the results of -analyses. An analysis asks how many cancer diagnoses derived from tumor -registry data are in the database. Using threshholds, an assessment -author can give ranges for “bad”, “questionable”, and “good” analysis -results as they pertain to their study. An example threshhold, which -would be encoded as JSON, could express the sentiment “A database with -0-200 diagnoses from tumor registry data would be unfit for this study, -201-500 diagnoses may be suitable, and over 500 diagnoses will be more -enough.”
-This tool is a product of collaboration. See the validation scripts
-README for detailed instructions on creating analyses (TODO) and using
-the R package to author and execute assessment specifications (TODO).
-
Documentation: Delta +Vocabulary
+Maintaining the change between the official OMOP Vocabulary release +and the Oncology Development Vocabulary allows for rapid development of +OHDSI Oncology studies that are untethered from the official OMOP +Vocabulary release cadence. By preserving only the changed elements, +instead of the entire Oncology Development Vocabulary, this method +provides a lightweight, GitHub-friendly solution that also +respects the licensed vocabulary terms by not redistributing them.
A suite of tools to quickly set up an environment for: 1) making -local edits to the OMOP Vocabularies (concept and concept_relationship -tables) 2) generating derived tables (concept_ancestor) 3) validating -the vocabulary changes 4) outputting the table delta in a sharable -format
+Git and GitHub offer a collaborative environment for proposing, -discussing, and implementing changes to a reference vocabulary such as -the OMOP Vocabulary.
-However, due to licensing and volume issues, it is not possible to -maintain and develop the entire OMOP vocabulary in a GitHub repository -as flat files.
-To work around this, a group of collaborators can maintain and -contribute to a growing list of edits to the OMOP Vocabulary. We call -this list of edits the “delta vocab”.
-The delta vocab, which is literally a collection of concept and -concept_relationship records exactly as they would represented in the -OMOP Vocabulary table, provides a lightweight representation of any -deviations from the official OMOP Vocabulary.
-Using the delta vocab and delta helper scripts, a developer with an -official OMOP Vocabulary database can quickly create a full, working -version of the OMOP Vocabulary with all proposed changes implemented, -allowing for advanced testing and use of existing OHDSI tools with a -development version of the vocabulary.
-The simplicity of maintaining as little of the vocabulary as possible -and using scripted logic to apply changes to the existing vocabulary -makes this method easy to implement, but also limited in its -capabilities. As of now, the primary use for the delta vocab method is -for destandardizing and mapping non-standard terms.
-For details on using the delta vocab and the helper scripts, see the -README and function reference (TODO)
-Documentation: Validation Framework
+A semi-automated and extensible framework for generating, +visualizing, and sharing database characterization and adherence to the +OHDSI Oncology Standard Conventions.