From c7161aaac80e51f4980b58d830a96939e6381246 Mon Sep 17 00:00:00 2001
From: Steven Su <113653645+stevensusas@users.noreply.github.com>
Date: Tue, 19 Nov 2024 20:07:04 -0500
Subject: [PATCH] Update README.md

---
 README.md | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index dd3e3bf..bc2ee39 100644
--- a/README.md
+++ b/README.md
@@ -14,24 +14,28 @@ Password: 123456
 
 ## Motivation
 Pitch deck: [CompGrid.pdf](https://github.com/user-attachments/files/17822533/CompGrid.pdf)
+Behind every life science discovery is the hardware and computing infrastructure that supports its computational cost. You cannot have a powerful deep learning model for predicting protein folding without the GPUs required for training and inference, and you cannot have a powerful sequence alignment algorithm without the necessary data storage and distributed computing infrastructure.
+
+When bioinformatics researchers need access to computing resources beyond their personal machines, they turn to third-party cloud computing providers like AWS, GCP, and Azure, or to their local research institution's computing cores. In many cases, they are not allowed to use third-party cloud providers due to information sensitivity--for example, HIPAA-protected patient information involved in biomedical research cannot be stored on third-party servers. Therefore, researchers most often rely on their own institution's computing cluster for additional computing resources.
+
+However, management of local computing clusters poses challenges for multiple stakeholders at these research institutions. Cluster owners encounter difficulties in setting up clusters and creating custom management interfaces, while cluster users struggle with resource allocation, usage monitoring, and technical troubleshooting.
+Additionally, billing processes are inefficient, relying on manual methods like emails, phone calls, and forms, along with grant budget codes and manual tracking of resource usage, making the system cumbersome and error-prone.
+
+The goal of CompGrid is to provide a unified platform that simplifies the management of local computing resources.
+
 ## Engineering Specs
 image
 
 ### Cluster Simulation
 
-### Backend
-
-#### User Authentication
+Since we don't have a physical cluster and bare-metal instances to work with, we used **UTM** to spin up a cluster of ArchLinux virtual machines that simulate bare-metal instances and let us demo the project. Additionally, we used **FastAPI** to build a Python middleware, with endpoints exposed through **ngrok**, to manage the UTM cluster remotely.
 
-### Frontend
+### Full Stack Development
 
-#### Nginx Deployment
+Our backend is built with **NodeJS** and **ExpressJS**. Additionally, we used JWT to create an authentication middleware responsible for user authentication and login. Our frontend is built with **React** and is deployed in a production setting through **Nginx** to optimize static file serving, SPA fallbacks, and client-side performance.
 
 ### Data Storage
 
-#### Database
-
-#### Redis
+We used **PostgreSQL** hosted on **AWS RDS** for persistent, relational data storage, and **Redis** for data caching, specifically to store users' instance usage logs robustly.
 
 ### DevOps
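The Nginx deployment described above (static file serving plus SPA fallback for the React app) typically comes down to a config along these lines; the paths here are assumptions, not the project's actual configuration:

```nginx
server {
    listen 80;
    # Path to the React production build (assumed; adjust to the real deploy path)
    root /var/www/compgrid/build;
    index index.html;

    location / {
        # Serve the asset if it exists; otherwise fall back to index.html
        # so client-side routing can resolve the URL in the SPA.
        try_files $uri $uri/ /index.html;
    }

    location /static/ {
        # Fingerprinted build assets can be cached aggressively.
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}
```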
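The JWT authentication middleware itself isn't part of this patch. As a rough illustration of what such a middleware checks, here is a minimal HS256 sign/verify sketch using only Node's built-in `crypto`; the project presumably uses a library such as `jsonwebtoken`, so all names below are illustrative:

```javascript
// Minimal HS256 JWT sketch (illustrative only, not the project's middleware).
const crypto = require('crypto');

// Base64url-encode without padding, as the JWT spec requires.
const b64url = (data) =>
  Buffer.from(data).toString('base64')
    .replace(/=+$/, '').replace(/\+/g, '-').replace(/\//g, '_');

function sign(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const sig = b64url(
    crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest()
  );
  return `${header}.${body}.${sig}`;
}

function verify(token, secret) {
  const [header, body, sig] = token.split('.');
  const expected = b64url(
    crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest()
  );
  if (sig !== expected) return null; // reject tampered or wrongly-signed tokens
  return JSON.parse(Buffer.from(body, 'base64url').toString());
}
```

A real middleware would additionally use a timing-safe comparison and check standard claims such as `exp` before attaching the decoded user to the request.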
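The Redis-backed usage-log caching could be sketched as one capped Redis list per user. The helper names below are illustrative (the camel-case `lPush`/`lTrim`/`lRange` methods match node-redis v4); an in-memory stand-in client is included so the sketch runs without a Redis server:

```javascript
// Sketch of per-user instance usage logs cached as capped Redis lists.
// `client` is anything exposing lPush/lTrim/lRange (node-redis v4 names).
function makeUsageLog(client, maxEntries = 1000) {
  return {
    async record(userId, entry) {
      const key = `usage:${userId}`;
      // Newest entry goes to the head of the user's list.
      await client.lPush(key, JSON.stringify({ ...entry, at: Date.now() }));
      // Cap the list so a single user cannot grow the cache unboundedly.
      await client.lTrim(key, 0, maxEntries - 1);
    },
    async recent(userId, n = 10) {
      const raw = await client.lRange(`usage:${userId}`, 0, n - 1);
      return raw.map((s) => JSON.parse(s));
    },
  };
}

// In-memory stand-in for the Redis client (for illustration/testing only).
function memoryClient() {
  const lists = new Map();
  const get = (k) => lists.get(k) ?? lists.set(k, []).get(k);
  return {
    async lPush(k, v) { get(k).unshift(v); },
    async lTrim(k, start, stop) { lists.set(k, get(k).slice(start, stop + 1)); },
    async lRange(k, start, stop) { return get(k).slice(start, stop + 1); },
  };
}
```

Keeping logs in Redis this way makes "recent activity" reads cheap while PostgreSQL remains the durable system of record.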