[DEV] Automation deployment of 13 PT Clusters #63

Open
nghia71 opened this issue Nov 15, 2023 · 0 comments
nghia71 commented Nov 15, 2023

As a developer, I want to create scripts that automate the deployment of the 13 PT Clusters.

Each cluster consists of:

  • A HAPI FHIR server, accessible from the Internet
  • A Patient Browser, accessible from the Internet
  • A PT Analytics Cluster (pt_cluster), including a Kafka cluster, a Neo4j instance, two NeoDash instances (for dashboard design/view), and a PostgreSQL instance (providing access to relational data from the PT). The cluster is not accessible from the Internet; only its broker, schema-registry, and connect external ports are reachable by the Access Control Gateway of the local Data Governance Gateway.
  • A Data Governance Gateway, providing:
    • A UI, accessible from the Internet, for data governance of both extraction and access control of PT data
    • Access Control Gateway micro services for requests/responses flowing in both the f-to-pt and pt-to-f directions. These need to access the Federal Kafka cluster's broker, schema-registry, and connect external ports.
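A minimal sketch of how a deployment script might represent this per-cluster topology so the same automation can be applied to all 13 PT clusters. The PT codes, service names, and port numbers below are illustrative assumptions, not the actual configuration.

```python
# Hypothetical per-cluster topology for the deployment scripts to iterate over.
# PT codes, service names, and ports are illustrative assumptions.
PT_CODES = ["AB", "BC", "MB", "NB", "NL", "NS", "NT", "NU", "ON", "PE", "QC", "SK", "YT"]

CLUSTER_TEMPLATE = {
    "hapi_fhir":       {"internet_facing": True},
    "patient_browser": {"internet_facing": True},
    "pt_cluster": {
        "internet_facing": False,
        "services": ["kafka-broker", "schema-registry", "connect", "neo4j",
                     "neodash-design", "neodash-view", "postgresql"],
        # Only these ports are reachable, and only by the local Access Control Gateway.
        "exposed_to_gateway": {"broker": 9092, "schema-registry": 8081, "connect": 8083},
    },
    "data_governance_gateway": {
        "ui": {"internet_facing": True},
        "access_control_services": ["f-to-pt", "pt-to-f"],
    },
}

def clusters():
    """Yield (pt_code, topology) pairs, one per PT cluster."""
    for code in PT_CODES:
        yield code, CLUSTER_TEMPLATE
```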

The deployment process is as follows:

  • Use Synthea to generate 1K EHRs for a synthetic patient population (SPP); this usually yields 1K living patients plus approximately 0.17K deceased patients. Perform a minimal test (see the Synthea sketch after this list).
  • Set up a HAPI FHIR server. Perform a minimal test.
  • Upload the 1K+ EHRs into the newly set up HAPI FHIR server. Perform a minimal test (see the upload sketch after this list).
  • Set up a Patient Browser. Perform a minimal test.
  • Set up a PT Analytics Cluster.
  • Convert the EHR events stored in the CSV files generated by Synthea into files containing (key|value) pairs in (JSON) Avro format.
  • Stream the events - (key|value) pairs in (JSON) Avro format - into pre-configured Kafka topics. After this, most of the 1K+ EHR data is available as Kafka event streams (see the producer sketch after this list).
  • Configure constraints and indexes, then configure sink connectors (for Neo4j) in order to populate the analytics platform (represented by the graph database Neo4j) with the events from the streams. Once this is done, the PT Analytics Platform is ready with populated data (see the connector sketch after this list).
  • Configure source/sink connectors (for Neo4j) in order to receive Federated Analytics Requests (FAR_1 to FAR_9) from the Federal Analytics Platform via a micro service (f-to-pt) of the Access Control Gateway, and to send Federated Analytics Responses (FAS_1 to FAS_9) to the Federal Analytics Platform via another micro service (pt-to-f) of the Access Control Gateway. Once this is finished, the PT Analytics Platform is ready to serve federated queries.
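Synthea sketch: a minimal example of the generation step plus its minimal test, assuming the standard Synthea command line (jar name, -p population flag, property overrides) and the default output layout; adjust to the local install.

```python
import subprocess
from pathlib import Path

def generate_spp(population: int = 1000, out_dir: str = "output") -> None:
    """Generate the synthetic patient population with Synthea and run a minimal test."""
    # Jar name and flags follow Synthea's documented CLI; verify against the local install.
    subprocess.run(
        ["java", "-jar", "synthea-with-dependencies.jar",
         "-p", str(population),
         "--exporter.csv.export", "true",
         "--exporter.baseDirectory", out_dir],
        check=True,
    )
    # Minimal test: living patients plus the extra deceased ones should at least match the request.
    bundles = list(Path(out_dir, "fhir").glob("*.json"))
    assert len(bundles) >= population, f"expected >= {population} FHIR bundles, got {len(bundles)}"
```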
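Upload sketch: a minimal example of pushing the generated Bundles into the HAPI FHIR server and checking the result, assuming the server's base URL and that Synthea emitted transaction Bundles under output/fhir.

```python
import json
from pathlib import Path

import requests

FHIR_BASE = "http://localhost:8080/fhir"  # assumed HAPI FHIR base URL

def upload_bundles(bundle_dir: str = "output/fhir") -> None:
    """POST each Synthea transaction Bundle to the HAPI FHIR server, then run a minimal test."""
    for path in sorted(Path(bundle_dir).glob("*.json")):
        resp = requests.post(
            FHIR_BASE,
            json=json.loads(path.read_text()),
            headers={"Content-Type": "application/fhir+json"},
        )
        resp.raise_for_status()
    # Minimal test: the server should now report Patient resources.
    total = requests.get(f"{FHIR_BASE}/Patient?_summary=count").json().get("total", 0)
    assert total > 0, "no Patient resources found after upload"
```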
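Producer sketch: a minimal example of converting one Synthea CSV into (key|value) events and streaming them into a topic. It uses kafka-python with plain JSON values to keep the sketch short; the real pipeline serializes values as Avro against the schema registry. The broker address, CSV file, key column, and topic name are assumptions.

```python
import csv
import json

from kafka import KafkaProducer  # kafka-python

BOOTSTRAP = "broker:9092"  # assumed broker port exposed to the deployment host

def stream_events(csv_path: str = "output/csv/encounters.csv",
                  topic: str = "pt.encounters") -> None:
    """Convert Synthea CSV rows into (key|value) pairs and stream them into a Kafka topic."""
    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP,
        key_serializer=lambda k: k.encode("utf-8"),
        # The real pipeline uses (JSON) Avro with the schema registry; JSON keeps this sketch simple.
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            producer.send(topic, key=row["PATIENT"], value=row)  # key column is an assumption
    producer.flush()
```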
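Connector sketch: a minimal example of creating a Neo4j constraint and registering a sink connector through the Kafka Connect REST API. The connector class, Cypher template, credentials, and URLs are assumptions based on the Neo4j Kafka connector's typical configuration; the federated FAR/FAS source and sink connectors would be registered through the same /connectors endpoint with their own configs.

```python
import requests
from neo4j import GraphDatabase

NEO4J_URI = "bolt://neo4j:7687"      # assumed Neo4j address inside pt_cluster
CONNECT_URL = "http://connect:8083"  # assumed Kafka Connect REST endpoint

def prepare_graph() -> None:
    """Create the constraints/indexes the sink connectors rely on."""
    driver = GraphDatabase.driver(NEO4J_URI, auth=("neo4j", "password"))  # assumed credentials
    with driver.session() as session:
        session.run("CREATE CONSTRAINT patient_id IF NOT EXISTS "
                    "FOR (p:Patient) REQUIRE p.id IS UNIQUE")
    driver.close()

def register_sink_connector() -> None:
    """Register a sink connector that writes topic events into Neo4j."""
    connector = {
        "name": "pt-encounters-sink",
        "config": {
            # Class and property names follow the Neo4j Kafka connector; verify locally.
            "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
            "topics": "pt.encounters",
            "neo4j.server.uri": NEO4J_URI,
            "neo4j.authentication.basic.username": "neo4j",
            "neo4j.authentication.basic.password": "password",
            "neo4j.topic.cypher.pt.encounters": (
                "MERGE (p:Patient {id: event.PATIENT}) "
                "MERGE (e:Encounter {id: event.Id}) "
                "MERGE (p)-[:HAS_ENCOUNTER]->(e)"
            ),
        },
    }
    requests.post(f"{CONNECT_URL}/connectors", json=connector).raise_for_status()
```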