From b10933c76ce5fdfa6d0868ec07de0931da6b0f05 Mon Sep 17 00:00:00 2001
From: Kalibh Halford
Date: Fri, 5 Jul 2024 11:42:30 +0100
Subject: [PATCH 1/2] DOC: Added a directory and read me for a project.

This project is for the WEX students to work on.
---
 README.md             |  5 +++++
 wex_llm_app/README.md | 49 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)
 create mode 100644 wex_llm_app/README.md

diff --git a/README.md b/README.md
index afa4deee..396c8147 100644
--- a/README.md
+++ b/README.md
@@ -106,3 +106,8 @@ One terraform script to create a private network on openstack along with a route
 
 A Python script that when run, creates a filter word cloud from the summary of tickets over a time period.
 [More Here](word_cloud_generator/)
+
+## wex_llm_app
+
+Developed by Cloud WEX Students. A web application to host and interact with a large language model instance.
+[More Here](wex_llm_app)
diff --git a/wex_llm_app/README.md b/wex_llm_app/README.md
new file mode 100644
index 00000000..cf1afc3f
--- /dev/null
+++ b/wex_llm_app/README.md
@@ -0,0 +1,49 @@
+**Project Title:** End-to-end deployment of a Large Language Model web application
+
+**Project Overview:**
+
+Create a web application that allows users to send questions to an LLM backend and receive a response. The page will display the user's question, the response and the response time.
+
+**Functional Requirements:**
+
+1. User Input:
+    * A text input field for users to enter questions.
+    * A submit button to send the question to the backend.
+2. LLM Backend:
+    * Integrate with an Ollama LLM API instance using basic auth or tokens.
+    * Send streamed/completed responses to the user's session.
+3. Response Display:
+    * Display the user's question.
+    * Display the LLM response either streamed or complete.
+4. Response Time Measurement:
+    * Measure and display the round trip time (RTT) and generation time.
+5. Instructions for self-hosting:
+    * Provide simple setup instructions for a user to host their own LLM.
+
+**Non-Functional Requirements (Priority):**
+
+1. Performance (Medium):
+    * Ensure the webpage responds in a timely manner, whether responses are streamed or returned complete.
+2. Security (High):
+    * Implement basic auth or API tokens for authentication with the backend.
+    * Set up HTTPS using Let's Encrypt to encrypt traffic to the backend.
+3. Usability (Low):
+    * Design an intuitive interface for users to submit questions and view responses.
+
+**Technical Requirements:**
+
+1. Frontend:
+    * Build the webpage using HTML, CSS and JavaScript.
+    * Use native JavaScript APIs to make HTTP requests to the backend (a minimal sketch follows this patch).
+2. Backend:
+    * Use the Ollama LLM API with authentication for receiving user questions and sending responses.
+    * (To be investigated) Use a database to store authentication information.
+3. Infrastructure:
+    * Use Terraform to provision multiple VM instances and networking.
+    * Configure machines/containers using Ansible playbooks to ensure consistency in deployment.
+    * Run the web server on multiple VM instances, each with its own Docker container.
+    * Use an Apache container to host the website within each Web VM.
+    * Set up a separate VM instance (with or without Docker) to run HAProxy and Let's Encrypt services.
+    * Configure HAProxy to load balance traffic across the multiple web server instances.
+    * Host the Ollama service directly on a separate GPU VM.
+
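As a rough illustration of the user input, response display and RTT requirements above, a minimal native JavaScript sketch might look like the following. The `/api/ask` endpoint, the element IDs and the `answer` field in the JSON reply are hypothetical placeholders rather than part of the specification, and a matching HTML form is assumed to exist on the page.

```javascript
// Minimal sketch of the question form handler.
// "/api/ask", the element IDs and the "answer" response field are placeholders.
const form = document.getElementById("question-form");
const questionInput = document.getElementById("question");
const output = document.getElementById("output");

form.addEventListener("submit", async (event) => {
  event.preventDefault();
  const question = questionInput.value.trim();
  if (!question) return;

  const started = performance.now();          // start of the round trip
  const response = await fetch("/api/ask", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  const data = await response.json();
  const rttMs = performance.now() - started;  // round trip time in milliseconds

  output.textContent =
    `Q: ${question}\nA: ${data.answer}\nRTT: ${rttMs.toFixed(0)} ms`;
});
```

The round trip time here is measured entirely in the browser with `performance.now()`; the generation time would come from the backend (see the Ollama streaming sketch after the next patch).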
From b6610c82a4ffe4f29fc65b05d1e6a1a26996fa04 Mon Sep 17 00:00:00 2001
From: Kalibh Halford
Date: Fri, 5 Jul 2024 11:53:08 +0100
Subject: [PATCH 2/2] DOC: Added useful resource links

---
 wex_llm_app/README.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/wex_llm_app/README.md b/wex_llm_app/README.md
index cf1afc3f..47a0a2e6 100644
--- a/wex_llm_app/README.md
+++ b/wex_llm_app/README.md
@@ -46,4 +46,14 @@ Create a web application that allows users to send questions to a LLM backend an
     * Set up a separate VM instance (with or without Docker) to run HAProxy and Let's Encrypt services.
     * Configure HAProxy to load balance traffic across the multiple web server instances.
     * Host the Ollama service directly on a separate GPU VM.
-
+
+**Useful Resources:**
+
+1. [HTML, CSS, JS resources](https://developer.mozilla.org/en-US/)
+2. [HTML Tutorial](https://www.w3schools.com/html/)
+3. [CSS Tutorial](https://www.w3schools.com/css/)
+4. [JavaScript Tutorial](https://www.w3schools.com/js/)
+5. [Terraform Documentation](https://registry.terraform.io/providers/terraform-provider-openstack/openstack/latest/docs)
+6. [Ansible Documentation](https://docs.ansible.com/)
+7. [Docker Documentation](https://docs.docker.com/)
+8. [Ollama Documentation](https://github.com/ollama/ollama) (see the streaming sketch after this patch)
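To complement the Ollama documentation linked above, here is a non-authoritative sketch of how a browser client might read a streamed completion from Ollama's `/api/generate` endpoint and derive the generation time from the timing data in the final streamed chunk. The host, port and model name are assumptions about a typical local deployment, and the authentication and HTTPS called for in the requirements are omitted for brevity.

```javascript
// Sketch: stream a completion from an Ollama instance and report generation time.
// The host, port and model name are assumptions; adjust them to your deployment.
async function streamAnswer(prompt, onChunk) {
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama3", prompt, stream: true }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffered = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });

    // Ollama streams newline-delimited JSON objects; keep any partial line for later.
    const lines = buffered.split("\n");
    buffered = lines.pop();
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line);
      if (chunk.response) onChunk(chunk.response);
      if (chunk.done && chunk.eval_duration) {
        // eval_duration in the final chunk is reported in nanoseconds.
        console.log(`Generation time: ${(chunk.eval_duration / 1e9).toFixed(2)} s`);
      }
    }
  }
}

// Example call: append tokens to an element with id "output" as they arrive.
// streamAnswer("Why is the sky blue?", (token) => {
//   document.getElementById("output").textContent += token;
// });
```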