[DEV] Automation deployment of 13 PT Clusters #63

Open
nghia71 opened this issue Nov 15, 2023 · 0 comments
nghia71 commented Nov 15, 2023

As a developer, I want to create scripts that automate the deployment of the 13 PT Clusters.

Each cluster consists of:

  • A HAPI FHIR server, accessible from the Internet
  • A Patient Browser, accessible from the Internet
  • A PT Analytics Cluster (pt_cluster), including a Kafka cluster, a Neo4j instance, two NeoDash instances (for dashboard design/view), and a PostgreSQL instance (providing access to relational data from the PT). The cluster is not accessible from the Internet; only its broker, schema-registry, and connect external ports are reachable by the Access Control Gateway of the local Data Governance Gateway.
  • A Data Governance Gateway, providing:
    • A UI, accessible from the Internet, for data governance of both extraction and access control of PT data
    • Access Control Gateway micro services for requests/responses flowing in both the f-to-pt and pt-to-f directions. These need to access the Federal Kafka cluster's broker, schema-registry, and connect external ports.
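A minimal sketch of how a deployment script might represent this per-cluster topology so the same automation can be applied to all 13 PT clusters. The PT codes, service names, and port numbers below are illustrative assumptions, not the actual configuration.

```python
# Hypothetical per-cluster topology for the deployment scripts to iterate over.
# PT codes, service names, and ports are illustrative assumptions.
PT_CODES = ["AB", "BC", "MB", "NB", "NL", "NS", "NT", "NU", "ON", "PE", "QC", "SK", "YT"]

CLUSTER_TEMPLATE = {
    "hapi_fhir":       {"internet_facing": True},
    "patient_browser": {"internet_facing": True},
    "pt_cluster": {
        "internet_facing": False,
        "services": ["kafka-broker", "schema-registry", "connect", "neo4j",
                     "neodash-design", "neodash-view", "postgresql"],
        # Only these ports are reachable, and only by the local Access Control Gateway.
        "exposed_to_gateway": {"broker": 9092, "schema-registry": 8081, "connect": 8083},
    },
    "data_governance_gateway": {
        "ui": {"internet_facing": True},
        "access_control_services": ["f-to-pt", "pt-to-f"],
    },
}

def clusters():
    """Yield (pt_code, topology) pairs, one per PT cluster."""
    for code in PT_CODES:
        yield code, CLUSTER_TEMPLATE
```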

The deployment process is as follows:

  • Use Synthea to generate 1K EHRs for a synthetic patient population (SPP); this usually yields 1K living patients plus approximately 0.17K deceased patients. Perform a minimal test (see the Synthea sketch after this list).
  • Set up a HAPI FHIR server. Perform a minimal test.
  • Upload the 1K+ EHRs into the newly set up HAPI FHIR server. Perform a minimal test (see the upload sketch after this list).
  • Set up a Patient Browser. Perform a minimal test.
  • Set up a PT Analytics Cluster.
  • Convert the EHR events stored in the CSV files generated by Synthea into files containing (key|value) pairs in (JSON) Avro format.
  • Stream the events - (key|value) pairs in (JSON) Avro format - into pre-configured Kafka topics. After this, most of the 1K+ EHR data is available as Kafka event streams (see the producer sketch after this list).
  • Configure constraints and indexes, then configure sink connectors (for Neo4j) in order to populate the analytics platform (represented by the graph database Neo4j) with the events from the streams. Once this is done, the PT Analytics Platform is ready with populated data (see the connector sketch after this list).
  • Configure source/sink connectors (for Neo4j) in order to receive Federated Analytics Requests (FAR_1 to FAR_9) from the Federal Analytics Platform via a micro service (f-to-pt) of the Access Control Gateway, and to send Federated Analytics Responses (FAS_1 to FAS_9) to the Federal Analytics Platform via another micro service (pt-to-f) of the Access Control Gateway. Once this is finished, the PT Analytics Platform is ready to serve federated queries.
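Synthea sketch: a minimal example of the generation step plus its minimal test, assuming the standard Synthea command line (jar name, -p population flag, property overrides) and the default output layout; adjust to the local install.

```python
import subprocess
from pathlib import Path

def generate_spp(population: int = 1000, out_dir: str = "output") -> None:
    """Generate the synthetic patient population with Synthea and run a minimal test."""
    # Jar name and flags follow Synthea's documented CLI; verify against the local install.
    subprocess.run(
        ["java", "-jar", "synthea-with-dependencies.jar",
         "-p", str(population),
         "--exporter.csv.export", "true",
         "--exporter.baseDirectory", out_dir],
        check=True,
    )
    # Minimal test: living patients plus the extra deceased ones should at least match the request.
    bundles = list(Path(out_dir, "fhir").glob("*.json"))
    assert len(bundles) >= population, f"expected >= {population} FHIR bundles, got {len(bundles)}"
```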
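Upload sketch: a minimal example of pushing the generated Bundles into the HAPI FHIR server and checking the result, assuming the server's base URL and that Synthea emitted transaction Bundles under output/fhir.

```python
import json
from pathlib import Path

import requests

FHIR_BASE = "http://localhost:8080/fhir"  # assumed HAPI FHIR base URL

def upload_bundles(bundle_dir: str = "output/fhir") -> None:
    """POST each Synthea transaction Bundle to the HAPI FHIR server, then run a minimal test."""
    for path in sorted(Path(bundle_dir).glob("*.json")):
        resp = requests.post(
            FHIR_BASE,
            json=json.loads(path.read_text()),
            headers={"Content-Type": "application/fhir+json"},
        )
        resp.raise_for_status()
    # Minimal test: the server should now report Patient resources.
    total = requests.get(f"{FHIR_BASE}/Patient?_summary=count").json().get("total", 0)
    assert total > 0, "no Patient resources found after upload"
```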
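Producer sketch: a minimal example of converting one Synthea CSV into (key|value) events and streaming them into a topic. It uses kafka-python with plain JSON values to keep the sketch short; the real pipeline serializes values as Avro against the schema registry. The broker address, CSV file, key column, and topic name are assumptions.

```python
import csv
import json

from kafka import KafkaProducer  # kafka-python

BOOTSTRAP = "broker:9092"  # assumed broker port exposed to the deployment host

def stream_events(csv_path: str = "output/csv/encounters.csv",
                  topic: str = "pt.encounters") -> None:
    """Convert Synthea CSV rows into (key|value) pairs and stream them into a Kafka topic."""
    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP,
        key_serializer=lambda k: k.encode("utf-8"),
        # The real pipeline uses (JSON) Avro with the schema registry; JSON keeps this sketch simple.
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            producer.send(topic, key=row["PATIENT"], value=row)  # key column is an assumption
    producer.flush()
```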
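Connector sketch: a minimal example of creating a Neo4j constraint and registering a sink connector through the Kafka Connect REST API. The connector class, Cypher template, credentials, and URLs are assumptions based on the Neo4j Kafka connector's typical configuration; the federated FAR/FAS source and sink connectors would be registered through the same /connectors endpoint with their own configs.

```python
import requests
from neo4j import GraphDatabase

NEO4J_URI = "bolt://neo4j:7687"      # assumed Neo4j address inside pt_cluster
CONNECT_URL = "http://connect:8083"  # assumed Kafka Connect REST endpoint

def prepare_graph() -> None:
    """Create the constraints/indexes the sink connectors rely on."""
    driver = GraphDatabase.driver(NEO4J_URI, auth=("neo4j", "password"))  # assumed credentials
    with driver.session() as session:
        session.run("CREATE CONSTRAINT patient_id IF NOT EXISTS "
                    "FOR (p:Patient) REQUIRE p.id IS UNIQUE")
    driver.close()

def register_sink_connector() -> None:
    """Register a sink connector that writes topic events into Neo4j."""
    connector = {
        "name": "pt-encounters-sink",
        "config": {
            # Class and property names follow the Neo4j Kafka connector; verify locally.
            "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
            "topics": "pt.encounters",
            "neo4j.server.uri": NEO4J_URI,
            "neo4j.authentication.basic.username": "neo4j",
            "neo4j.authentication.basic.password": "password",
            "neo4j.topic.cypher.pt.encounters": (
                "MERGE (p:Patient {id: event.PATIENT}) "
                "MERGE (e:Encounter {id: event.Id}) "
                "MERGE (p)-[:HAS_ENCOUNTER]->(e)"
            ),
        },
    }
    requests.post(f"{CONNECT_URL}/connectors", json=connector).raise_for_status()
```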