Commit

Add validation scripts overview

kzollove committed Jan 11, 2024
1 parent 5352248 commit 30b2b06
Showing 2 changed files with 30 additions and 7 deletions.
28 changes: 22 additions & 6 deletions rmd/tooling.Rmd
@@ -53,26 +53,42 @@ This package identifies oncology regimens. Firstly, it identifies all patients w

---

# **Development & Testing**
# **Database Characterization and Validation**

<br>

## Standards Adherence Validation
## Purpose

<br>
Provide a semi-automated and extensible framework for generating, visualizing, and sharing an assessment of an OMOP-shaped database's adherence to the OHDSI Oncology Standard (tables, vocabulary) and the availability and types of oncology data it contains.

## Overview

The star of the framework is an R package. Along with cataloguing an extensible set of queries and analyses used for assessing OMOP-shaped oncology data, the R package provides functionality for the four major processes involved in the framework:

1) Authoring an assessment specification
2) Executing an assessment specification
3) Generating assessment results
4) Visualizing assessment results
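
A rough sketch of driving the four processes from R follows. The function and package names here are illustrative assumptions only, not the package's documented API; see the validation scripts README for the real interface.

```r
# Hypothetical sketch -- function names below are illustrative
# assumptions, not the package's actual API.
library(DBI)

# 1) Author an assessment specification (serialized to JSON)
spec <- author_specification(
  analyses   = c(1234),  # analysis IDs to include
  thresholds = list("1234" = c(bad = 200, good = 500))
)
write_specification(spec, "my_study_spec.json")

# 2) Execute the specification against an OMOP-shaped database
con     <- DBI::dbConnect(RPostgres::Postgres(), dbname = "omop_cdm")
results <- execute_assessment(con, spec = "my_study_spec.json")

# 3) Generate and 4) visualize assessment results
report <- generate_results(results)
visualize_results(report)
```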

### Approach

<br>
_Assessments_ can be executed against an OMOP-shaped database to create a characterization and quality report. They are created from specifications.

_Specifications_ are JSON files that describe an assessment. They are composed by compiling analyses together with threshold values.

_Analyses_ execute a query and return a row count or proportion describing the contents of the database. For example, analysis_id=1234 returns "the number of cancer diagnosis records derived from Tumor Registry source data".

_Thresholds_ provide study-specific context for the results of analyses. An analysis asks, for example, how many cancer diagnoses derived from tumor registry data are in the database. Using thresholds, an assessment author can give ranges for "bad", "questionable", and "good" analysis results as they pertain to their study. An example threshold, encoded as JSON, could express the sentiment: "A database with 0-200 diagnoses from tumor registry data would be unfit for this study, 201-500 diagnoses may be suitable, and over 500 diagnoses will be more than enough."
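
For illustration, the threshold sentiment above might be encoded as JSON along these lines (the field names are assumptions for this sketch, not the framework's actual schema):

```json
{
  "analysis_id": 1234,
  "description": "Cancer diagnosis records derived from Tumor Registry source data",
  "thresholds": {
    "bad":          { "min": 0,   "max": 200 },
    "questionable": { "min": 201, "max": 500 },
    "good":         { "min": 501, "max": null }
  }
}
```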

### Extensibility

This tool is a product of collaboration. See the validation scripts README for detailed instructions on creating analyses (TODO) and using the R package to author and execute assessment specifications (TODO).
<br>


---
# **Development & Testing**

<br>

## Delta Vocabulary Framework

9 changes: 8 additions & 1 deletion validationScripts/README.md
@@ -13,6 +13,13 @@ The star of the framework is an R Package. Along with cataloguing an extensible
3) Generating assessment results
4) Visualizing assessment results

### Approach

_Assessments_ can be executed against an OMOP-shaped database to create a characterization and quality report. They are created from specifications.

_Specifications_ are JSON files that describe an assessment. They are composed by compiling analyses together with threshold values.

_Analyses_ execute a query and return a row count or proportion describing the contents of the database. For example, analysis_id=1234 returns "the number of cancer diagnosis records derived from Tumor Registry source data".

_Thresholds_ provide study-specific context for the results of analyses. An analysis asks, for example, how many cancer diagnoses derived from tumor registry data are in the database. Using thresholds, an assessment author can give ranges for "bad", "questionable", and "good" analysis results as they pertain to their study. An example threshold, encoded as JSON, could express the sentiment: "A database with 0-200 diagnoses from tumor registry data would be unfit for this study, 201-500 diagnoses may be suitable, and over 500 diagnoses will be more than enough."
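
As a sketch, such a threshold might be encoded in a specification's JSON along these lines (the field names here are assumptions, not the framework's actual schema):

```json
{
  "analysis_id": 1234,
  "description": "Cancer diagnosis records derived from Tumor Registry source data",
  "thresholds": {
    "bad":          { "min": 0,   "max": 200 },
    "questionable": { "min": 201, "max": 500 },
    "good":         { "min": 501, "max": null }
  }
}
```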
