# QC Portal

The [QC Portal](https://qc.allenneuraldynamics.org/qc_portal_app) is a browser application that allows users to view and interact with the AIND QC metadata and to annotate `PENDING` metrics with qualitative evaluations. The portal is currently maintained by Dan Birman in Scientific Computing; reach out with any questions or concerns.

The portal works by pulling the metadata from the Document Database (DocDB) and pulling reference figures from Code Ocean (CO) data assets, or from storage in Kachery Cloud.

The portal allows users to annotate `PENDING` metrics. Logged-in users can modify the value, state, and notes on metrics. When you make changes the **Submit** button will be enabled. Submitting pushes your updates to DocDB along with a timestamp and your name.

For general documentation about the QC metadata, go [here](https://aind-data-schema.readthedocs.io/en/latest/quality_control.html).

## Defining metrics for the QC portal

For AIND users, we expect your metrics to have actionable `value` fields: either the value should be a number that a rule can be applied to (e.g. a threshold), or it should refer to the state of the reference (e.g. "high drift" when linked to a drift map, or "acceptable contrast" when linked to a video).

All metrics should have a `reference` figure attached. Even if you are just calculating numbers, your reference figures can put those numbers in context for viewers.

**Q: How do reference URLs get pulled into the QC Portal?**

Each metric is associated with a reference figure. We support:

- Vector files (svg, pdf)
- Images (png, jpg, etc.)
- Videos (mp4)
- Neuroglancer links (url)
- Rerun files (rrd)

Figures, images, and videos can be any size, but they will fit best on the screen if they are landscape and shaped roughly like a computer screen (for example, 1280×800 or 1900×1200 px).

You can link to your references in one of four ways:

- Provide a relative path to a file in the data asset's S3 bucket, e.g. "figures/my_figure.png". The mount/asset name should not be included.
- Provide a URL to a publicly accessible file, e.g. "https://mywebsite.com/myfile.png".
- Provide a path to any public S3 bucket, e.g. "s3://bucket/myfile.png".
- Provide a kachery-cloud hash, e.g. "sha1://uuid.ext". Note that for these FigURL hashes you **must append the filetype**; the easiest way to do this is to set the `label` field to the filename, as described below.
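For example, a metric with a thresholdable numeric value and a relative-path reference might look like the sketch below (this assumes the QC classes in `aind-data-schema`; exact field names may differ between versions):

```{python}
from datetime import datetime

# Sketch only: uses the quality_control classes from aind-data-schema;
# field names may differ between package versions.
from aind_data_schema.core.quality_control import QCMetric, QCStatus, Status

drift_metric = QCMetric(
    name="Probe A drift (um)",
    value=14.2,  # a number that a rule (e.g. a threshold) can act on
    reference="figures/probe_a_drift_map.png",  # relative path, no mount/asset name
    status_history=[
        QCStatus(
            evaluator="automated pipeline",
            status=Status.PENDING,  # an annotator will confirm against the reference
            timestamp=datetime.now(),
        )
    ],
)
```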
**Q: I saw fancy things like dropdowns in the QC Portal, how do I do that?**

The portal supports a few special cases that allow a bit more flexibility, or that constrain the actions manual annotators can take. Install the [`aind-qcportal-schema`](https://github.com/AllenNeuralDynamics/aind-qcportal-schema/blob/dev/src/aind_qcportal_schema/metric_value.py) package and set the `value` field to the corresponding pydantic object to use these. Current options include:

- Dropdowns (optionally, the options can auto-set the value)
- Checkboxes (again, the options can auto-set the value)
- Rule-based metrics (the rule is automatically run to set the value)
- Multi-asset metrics, where each asset is assigned its own value

There are also some custom rules for the `value` field. If you provide:

- Two strings separated by a semicolon (`;`), they will be displayed in a "Swipe" pane that lets you swipe back and forth between the two items. This is mostly useful for overlay images.
- A dictionary where every value is a list of equal length, it will be displayed as a table where the keys are column headers and the values are rows. If a key "index" is included, its values will be used to name the rows.
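For example, a dropdown that constrains annotators to preset options (and auto-sets the status from the chosen option) might look like the sketch below. This assumes the `DropdownMetric` model in `aind-qcportal-schema`; check `metric_value.py` in that package for the exact fields:

```{python}
from datetime import datetime

# Sketch only: assumes the DropdownMetric model from aind-qcportal-schema
# and the QC classes from aind-data-schema; fields may differ by version.
from aind_data_schema.core.quality_control import QCMetric, QCStatus, Status
from aind_qcportal_schema.metric_value import DropdownMetric

contrast_metric = QCMetric(
    name="Video contrast",
    value=DropdownMetric(
        value="",  # left empty until an annotator picks an option
        options=["Acceptable contrast", "Unacceptable contrast"],
        status=[Status.PASS, Status.FAIL],  # picking option i auto-sets status i
    ),
    reference="figures/video_contrast.png",
    status_history=[
        QCStatus(evaluator="", status=Status.PENDING, timestamp=datetime.now())
    ],
)
```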
## How to upload data from CO Capsules

### Preferred workflow

Use the preferred workflow if you are **generating a data asset**, e.g. when uploading raw data or generating a new derived data asset. Your `quality_control.json` will go in the top level and your figures will go in a folder. Follow the steps below:

1. Develop your QC pipeline, generating metrics and reference figures as needed. Place references in the `results/` folder.
2. Populate your `QCEvaluation` objects with metrics. The `reference` field should contain the path *relative to the results folder*, i.e. the file `results/figures/my_figure.png` should be included as `QCMetric.reference = "figures/my_figure.png"`.
3. Write the standard QC file: `QualityControl.write_standard_file()`

Make sure to follow the standard instructions for building derived assets: copy all metadata files, upgrade the data_description to derived, and name your asset according to the expected conventions. Tag your data asset as `derived` so that it will be picked up by the indexer.

Done! In the preferred workflow no additional permissions are required. Your QC data will appear in the portal within four hours of creation.

### Alternate workflow

Use the alternate workflow if you are **not generating a data asset** and therefore need to push your QC data back to an already existing data asset. You will push your `QCEvaluation` objects directly to DocDB, and you will push your figures to `kachery-cloud`, an external repository that generates permanent links to uploaded files.

Two things need to be set up in your capsule:

1. Run `pip install kachery-cloud` and `pip install aind-data-access-api[docdb]` as part of your environment setup.
2. In your capsule settings, attach the `aind-codeocean-power-user` role. If you don't have access to this role, ask someone in Scientific Computing to attach it for you.

#### (1) Acquire your DocDB _id using your data asset's name

To upload directly to DocDB you'll need to know your asset's DocDB `_id`. You can get it by adding a helper like this to your capsule and calling `query_docdb_id(asset_name)`; the body below is a sketch using the `MetadataDbClient` from `aind-data-access-api`. Note that this *is not the data asset id in Code Ocean*!

```{python}
from aind_data_access_api.document_db import MetadataDbClient

# Sketch only: looks up the metadata record for an asset by name and
# returns its DocDB _id.
API_GATEWAY_HOST = "api.allenneuraldynamics.org"
DATABASE = "metadata_index"
COLLECTION = "data_assets"

docdb_api_client = MetadataDbClient(
    host=API_GATEWAY_HOST,
    database=DATABASE,
    collection=COLLECTION,
)

def query_docdb_id(asset_name: str):
    """Return the DocDB _id for the named data asset, or None if not found."""
    response = docdb_api_client.retrieve_docdb_records(
        filter_query={"name": asset_name},
        projection={"_id": 1},
    )
    if len(response) == 0:
        return None

    docdb_id = response[0]["_id"]
    return docdb_id
```

#### (2) Generate your QC data

Generate your metrics and reference figures. Put your figures in folders in `results/`, e.g. `results/figures/`, and keep track of the filepaths.

#### (3) Push figures to `kachery-cloud`

Your figures should already exist in folders in your `results/`. Then, in your capsule code, pull the Kachery Cloud credentials using a function like this (the body below is a sketch that reads a shared secret from AWS Secrets Manager; the secret name is an assumption, ask Scientific Computing for the real one):

```{python}
import json
import os

import boto3

def get_kachery_secrets():
    """Fetch the kachery-cloud credentials and export them as environment variables."""
    # Sketch only: the secret name is an assumption; ask Scientific Computing
    # for the name of the shared kachery-cloud secret.
    secret_name = "kachery-cloud-credentials"
    region_name = "us-west-2"

    # Pull the secret payload from AWS Secrets Manager
    session = boto3.session.Session()
    client = session.client(service_name="secretsmanager", region_name=region_name)
    response = client.get_secret_value(SecretId=secret_name)
    kachery_secrets = json.loads(response["SecretString"])

    os.environ['KACHERY_ZONE'] = kachery_secrets['KACHERY_ZONE']
    os.environ['KACHERY_CLOUD_CLIENT_ID'] = kachery_secrets['KACHERY_CLOUD_CLIENT_ID']
    os.environ['KACHERY_CLOUD_PRIVATE_KEY'] = kachery_secrets['KACHERY_CLOUD_PRIVATE_KEY']

get_kachery_secrets()
```

The credentials are now stored as environment variables.

Each of your figures should then be uploaded using the `store_file` function:

```{python}
import kachery_cloud as kcl

file_path = "your_file_path.ext"
uri = kcl.store_file(file_path, label=file_path)
```

#### (4) Generate your QCEvaluation objects

Now generate your `QCEvaluation` objects. Make sure to set the `QCMetric.reference` field of each metric to the URI returned for that figure (`QCMetric.reference = uri`). Each URI is a unique hashed string that allows the portal to recover your file. Make sure to include the `label` parameter when storing files, or we won't be able to identify your filetype in the portal.

Store all your `QCEvaluation` objects in a list.
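As an illustration, a single evaluation wrapping one uploaded figure might look like the sketch below (again assuming the QC classes in `aind-data-schema` and the modality models in `aind-data-schema-models`; `uri` is the value returned by `kcl.store_file` above):

```{python}
from datetime import datetime

# Sketch only: field names may differ between aind-data-schema versions.
from aind_data_schema.core.quality_control import (
    QCEvaluation, QCMetric, QCStatus, Stage, Status,
)
from aind_data_schema_models.modalities import Modality

metric = QCMetric(
    name="Drift map",
    value="",  # to be annotated in the portal
    reference=uri,  # the kachery-cloud URI returned by kcl.store_file
    status_history=[
        QCStatus(evaluator="", status=Status.PENDING, timestamp=datetime.now())
    ],
)

evaluations = [
    QCEvaluation(
        name="Drift evaluation",
        modality=Modality.ECEPHYS,
        stage=Stage.PROCESSING,
        metrics=[metric],
    )
]
```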
#### (5) Push metadata to DocDB

Run the following code snippet; you can pass all your evaluations as a list or pass them one at a time:

```{python}
import boto3
import requests
from aws_requests_auth.aws_auth import AWSRequestsAuth

# Sign the request with the capsule's AWS credentials
session = boto3.Session()
credentials = session.get_credentials()
host = "api.allenneuraldynamics.org"

auth = AWSRequestsAuth(
    aws_access_key=credentials.access_key,
    aws_secret_access_key=credentials.secret_key,
    aws_token=credentials.token,
    aws_host="api.allenneuraldynamics.org",
    aws_region='us-west-2',
    aws_service='execute-api',
)

# docdb_id comes from step (1); qc_eval is a QCEvaluation (or list of them) from step (4)
url = f"https://{host}/v1/add_qc_evaluation"
post_request_content = {"data_asset_id": docdb_id,
                        "qc_evaluation": qc_eval.model_dump(mode='json')}
response = requests.post(url=url, auth=auth,
                         json=post_request_content)

if response.status_code != 200:
    print(response.status_code)
    print(response.text)
```

If you get errors, contact Dan for help debugging.

### Reference/Figure recommendations

You can use gifs (<10 MB) or mp4 files (<100 MB). Make sure your mp4 files are playable in a browser.

#### Neuroglancer

You can set the reference directly to a neuroglancer link; it will open embedded in the portal and can be easily switched to fullscreen mode.

#### Rerun

Rerun files (.rrd) can be linked in the reference; they will open in the rerun app embedded in the portal and can be easily switched to fullscreen mode.

#### Other

We're prepared to support basically any kind of browser-displayable object. Reach out with ideas.

## Development