data retrieval API & diagram of to GA4GH Visas
blankdots committed Dec 5, 2023
1 parent 92cd3c9 commit b297568
Showing 4 changed files with 33 additions and 15 deletions.
38 changes: 28 additions & 10 deletions docs/dataout.md
@@ -2,7 +2,7 @@ Data Retrieval API
==================

> NOTE:
> We maintain two Data Out API solutions, for which REST APIs are the
> We maintain two Data Retrieval API solutions, for which REST APIs are the
> same.
SDA-DOA
@@ -84,17 +84,37 @@ and can't expose REST API (but still can receive RabbitMQ messages).
Handling Permissions
--------------------

Data Out API can be run with connection to an AAI or without. If connection to an AAI provider is not possible, the `PASSPORT_PUBLIC_KEY_PATH` and `CRYPT4GH_PRIVATE_KEY_PATH` need to be
Data Retrieval API can be run with connection to an AAI or without. If connection to an AAI provider is not possible, the `PASSPORT_PUBLIC_KEY_PATH` and `CRYPT4GH_PRIVATE_KEY_PATH` need to be
set.

> NOTE:
> By default we use Elixir AAI as JWT for authentication
> By default we use LifeScience AAI as JWT for authentication
> `OPENID_CONFIGURATION_URL` is set to:
> <https://login.elixir-czech.org/oidc/.well-known/openid-configuration>
> <https://proxy.aai.lifescience-ri.eu/.well-known/openid-configuration>
If connected to an AAI provider the current implementation is based on
[GA4GH
Passports](https://github.com/ga4gh/data-security/blob/master/AAI/AAIConnectProfile.md)
[GA4GH Passports](https://github.com/ga4gh/data-security/blob/master/AAI/AAIConnectProfile.md)

```mermaid
sequenceDiagram
actor client
client->>sda-download: request datasets/data
note right of client: send HTTP Authorization Bearer JWT
activate sda-download
client->>sda-download: check datasets/data exists
sda-download-->AAI: request GA4GH Visas permissions (userinfo endpoint)
activate AAI
AAI->>GA4GH Visa Issuer: get GA4GH Visa from Issuer
GA4GH Visa Issuer->>AAI: GA4GH Visas
deactivate AAI
AAI->>sda-download: GA4GH Visas
note right of sda-download: check known GA4GH Visa Issuer
sda-download->GA4GH Visa Issuer: validate GA4GH visas signature
sda-download->>client: return datasets/data
deactivate sda-download
```
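
The permission check in the diagram above can be sketched in Python. This is a minimal illustration, not the `sda-download` implementation: it assumes the userinfo response carries the standard `ga4gh_passport_v1` claim (a list of visa JWTs) and that access is granted through `ControlledAccessGrants` visas. The function names, the issuer URL, and the dataset identifier are hypothetical, and signature validation against the visa issuer is deliberately left out.

```python
import base64
import json


def make_unsigned_token(payload):
    """Hypothetical helper: build an unsigned JWT-shaped string for the demo."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{encode({'alg': 'none'})}.{encode(payload)}."


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padding = "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def permitted_datasets(userinfo, trusted_issuers):
    """Collect dataset identifiers granted by visas from known issuers.

    Assumption: a real service must also validate each visa's signature
    before trusting its content; that step is omitted here.
    """
    datasets = set()
    for visa_token in userinfo.get("ga4gh_passport_v1", []):
        visa = decode_jwt_payload(visa_token)
        if visa.get("iss") not in trusted_issuers:
            continue  # "check known GA4GH Visa Issuer" step in the diagram
        claim = visa.get("ga4gh_visa_v1", {})
        if claim.get("type") == "ControlledAccessGrants" and "value" in claim:
            datasets.add(claim["value"])
    return datasets


if __name__ == "__main__":
    visa = make_unsigned_token({
        "iss": "https://issuer.example.org",  # hypothetical issuer
        "ga4gh_visa_v1": {
            "type": "ControlledAccessGrants",
            "value": "https://example.org/datasets/DATASET1",  # hypothetical
        },
    })
    userinfo = {"ga4gh_passport_v1": [visa]}
    print(permitted_datasets(userinfo, {"https://issuer.example.org"}))
```

Visas from issuers outside the trusted set are skipped rather than rejected with an error, so one unknown visa does not block access granted by another.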

The AAI JWT payload should contain a GA4GH Passport claim in the scope:

@@ -149,10 +169,8 @@ SDA-download
> Source code repository is available at:
> [https://github.com/neicnordic/sda-download](https://github.com/neicnordic/sda-download)
Recommended provisioning method for production is:

- on a `kubernetes cluster` using the [helm
chart](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts).
Recommended provisioning method for production is on a `kubernetes cluster` using the
[helm chart](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts) `sda-svc` which contains the `sda-download`.

`sda-download` focuses on enabling deployment of a stand-alone version
of SDA, with features such as:
4 changes: 2 additions & 2 deletions docs/guides/deploy-k8s.md
@@ -20,11 +20,11 @@ Differences in deployment make concrete examples challenging, here it is explain

For secure deployment of the system one can think it by what can be accessed from where, for all ways of deploying two trust boundaries can be used, external and internal. For an extra layer of security also the storage trust boundary can be separate. The service is provided for customers on the internet therefore an example of deploying the service is using two separate Kubernetes clusters, one for responding to customers and other communication from outside, and another, more secure, storage facing internal cluster.

One thing to consider is where to release the data, that could be closed protected environment with tightly restricted access. If Data out is used to serve unencrypted files the recommendation is to have it available only in an internal cluster.
One thing to consider is where to release the data, that could be closed protected environment with tightly restricted access. If Data Retrieval API is used to serve unencrypted files the recommendation is to have it available only in an internal cluster.

The services could be divided into two trust boundaries
- The services in external cluster are [Inbox](/docs/submission.md#submission-inbox ) and [MQ](/docs/connection.md#local-message-broker)
- The services in internal cluster are [Intercept](/docs/services/intercept.md), [Ingest](/docs/services/ingest.md), [Verify](/docs/services/verify.md), [Mapper](/docs/services/mapper.md), [Finalize](/docs/services/finalize.md), [Backup](/docs/services/backup.md) and [Data out](/docs/dataout.md).
- The services in internal cluster are [Intercept](/docs/services/intercept.md), [Ingest](/docs/services/ingest.md), [Verify](/docs/services/verify.md), [Mapper](/docs/services/mapper.md), [Finalize](/docs/services/finalize.md), [Backup](/docs/services/backup.md) and [Data Retrieval API](/docs/dataout.md).

The innermost trust zone contains the database and the archive, which can be accessed only from the internal cluster.

4 changes: 2 additions & 2 deletions docs/static/doa-api.yml
@@ -1,8 +1,8 @@
openapi: 3.0.0
info:
description: SDA Data Out API Documentation derived from EGA Data API
description: SDA Data Retrieval API Documentation derived from EGA Data API
version: "1.0"
title: SDA Data Out API Documentation
title: SDA Data Retrieval API Documentation
license:
name: Apache 2.0
url: http://www.apache.org/licenses/LICENSE-2.0
2 changes: 1 addition & 1 deletion docs/structure.md
@@ -30,7 +30,7 @@ progress notifications whether the ingestion was successful, or whether there wa
More details about the [Ingestion Workflow](submission.md#ingestion-workflow).

Once a file has been successfully submitted and the ingestion process has been finalised, including receiving an `Accession ID` from Central
EGA. The Data Out API can be utilised to retrieve set file by utilising the `Accession ID`. More details in [Data Retrieval API](dataout.md#data-retrieval-api).
EGA. The Data Retrieval API can be utilised to retrieve set file by utilising the `Accession ID`. More details in [Data Retrieval API](dataout.md#data-retrieval-api).


Inter-communication between services
