Merge pull request #153 from stfc/wex-project-brief
DOC: Added a directory and read me for a project.
khalford authored Jul 9, 2024
2 parents 66d1861 + b6610c8 commit da67380
Showing 2 changed files with 64 additions and 0 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -106,3 +106,8 @@ One terraform script to create a private network on openstack along with a route

A Python script that, when run, creates a filtered word cloud from the summary of tickets over a time period.
[More Here](word_cloud_generator/)

## wex_llm_app

Developed by Cloud WEX Students. A web application to host and interact with a large language model instance.
[More Here](wex_llm_app/)
59 changes: 59 additions & 0 deletions wex_llm_app/README.md
@@ -0,0 +1,59 @@
**Project Title:** End-to-end deployment of a Large Language Model web application

**Project Overview:**

Create a web application that allows users to send questions to an LLM backend and receive a response. The page will display the user's question, the response, and the response time.

**Functional Requirements:**

1. User Input:
* A text input field for users to enter questions.
* A submit button to send the question to the backend.
2. LLM Backend:
    * Integrate with an Ollama LLM API instance using basic auth or tokens.
    * Send streamed or complete responses to the user's session.
3. Response Display:
    * Display the user's question.
    * Display the LLM response, either streamed or complete.
4. Response Time Measurement:
    * Measure and display the round trip time (RTT) and generation time (see the sketch after this list).
5. Instructions for self-hosting:
* Provide simple setup instructions for a user to host their own LLM.
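
A minimal browser-side sketch of requirements 1–4, assuming the LLM backend is reached at Ollama's `/api/generate` endpoint; the element IDs, token value, and model name below are placeholders rather than part of this brief, and the bearer header is just one of the auth options mentioned above:

```javascript
// Sketch only: element IDs, token, endpoint path and model name are illustrative.
const API_URL = "/api/generate";   // Ollama generate endpoint, e.g. behind the web server's reverse proxy
const API_TOKEN = "changeme";      // basic auth credential or API token agreed with the backend

document.getElementById("ask-button").addEventListener("click", async () => {
  const question = document.getElementById("question-input").value;
  const output = document.getElementById("output");
  output.textContent = `Q: ${question}\n`;

  const start = performance.now(); // start of the round trip
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_TOKEN}`,
    },
    // stream: false asks Ollama for one complete response object
    body: JSON.stringify({ model: "llama3", prompt: question, stream: false }),
  });
  const data = await response.json();
  const rtt = performance.now() - start; // round trip time in ms

  // Ollama reports eval_duration (generation time) in nanoseconds
  const generationMs = data.eval_duration ? data.eval_duration / 1e6 : null;
  output.textContent += `A: ${data.response}\n`;
  output.textContent += `RTT: ${rtt.toFixed(0)} ms`;
  if (generationMs !== null) {
    output.textContent += `, generation: ${generationMs.toFixed(0)} ms`;
  }
});
```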

**Non-Functional Requirements (Priority):**

1. Performance (Medium):
    * Ensure the webpage responds in a timely manner; consider complete versus streamed responses.
2. Security (High):
* Implement basic auth or API tokens for authentication with the backend.
* Set up HTTPS using Let's Encrypt to encrypt traffic to the backend.
3. Usability (Low):
* Design an intuitive interface for users to submit questions and view responses.

**Technical Requirements:**

1. Frontend:
* Build the webpage using HTML, CSS and JavaScript.
    * Use native JavaScript APIs to make HTTP requests to the backend (a streaming sketch follows this list).
2. Backend:
* Use the Ollama LLM API with authentication for receiving user questions and sending responses.
* (To be investigated) Use a database to store authentication information.
3. Infrastructure:
* Use Terraform to provision multiple VM instances and networking.
* Configure machines/containers using Ansible playbooks to ensure consistency in deployment.
* Run the web server on multiple VM instances, each with its own Docker container.
* Use an Apache container to host the website within each Web VM.
* Set up a separate VM instance (with or without Docker) to run HAProxy and Let's Encrypt services.
* Configure HAProxy to load balance traffic across the multiple web server instances.
* Host the Ollama service directly on a separate GPU VM.
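
For the streamed path mentioned in the frontend and LLM backend requirements, a sketch of reading Ollama's newline-delimited JSON stream with the native `fetch` API; as above, the endpoint path and model name are illustrative:

```javascript
// Sketch only: consumes a streamed Ollama response one chunk at a time.
async function streamAnswer(question, onToken) {
  const response = await fetch("/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream defaults to true, so Ollama sends one JSON object per line
    body: JSON.stringify({ model: "llama3", prompt: question }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffered = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });

    const lines = buffered.split("\n");
    buffered = lines.pop(); // keep any incomplete trailing line for the next read
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line);   // each line: partial "response" plus a "done" flag
      if (chunk.response) onToken(chunk.response);
      if (chunk.done) return chunk;     // final object carries timing metadata
    }
  }
}

// Usage: append tokens to the page as they arrive
// streamAnswer("What is a load balancer?", (t) => {
//   document.getElementById("output").textContent += t;
// });
```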

**Useful Resources:**

1. [HTML, CSS, JS resources](https://developer.mozilla.org/en-US/)
2. [HTML Tutorial](https://www.w3schools.com/html/)
3. [CSS Tutorial](https://www.w3schools.com/css/)
4. [JavaScript Tutorial](https://www.w3schools.com/js/)
5. [Terraform Documentation](https://registry.terraform.io/providers/terraform-provider-openstack/openstack/latest/docs)
6. [Ansible Documentation](https://docs.ansible.com/)
7. [Docker Documentation](https://docs.docker.com/)
8. [Ollama Documentation](https://github.com/ollama/ollama)
