diff --git a/examples/ChatQnA/deploy/index.rst b/examples/ChatQnA/deploy/index.rst
index 0c3c3e55..31d9cc03 100644
--- a/examples/ChatQnA/deploy/index.rst
+++ b/examples/ChatQnA/deploy/index.rst
@@ -19,9 +19,10 @@ Single Node
 Kubernetes
 **********
 
+* Getting Started
+* Using Helm Charts
 * Xeon & Gaudi with GMC
 * Xeon & Gaudi without GMC
-* Using Helm Charts
 
 Cloud Native
 ************
diff --git a/examples/ChatQnA/deploy/k8s_getting_started.md b/examples/ChatQnA/deploy/k8s_getting_started.md
index 64766ca2..c25f5c1e 100644
--- a/examples/ChatQnA/deploy/k8s_getting_started.md
+++ b/examples/ChatQnA/deploy/k8s_getting_started.md
@@ -1,11 +1,11 @@
-# Getting Started
+# Getting Started with Kubernetes for ChatQnA
 
 ## Introduction
 Kubernetes is an orchestration platform for managing containerized applications, ideal for deploying microservices based architectures like ChatQnA. It offers robust mechanisms for automating deployment, scaling, and operations of application containers across clusters of hosts.
 Kubernetes supports different deployment modes for ChatQnA, which cater to various operational preferences:
-- **Using GMC ( GenAI Microservices Connector)**: GMC can be used to compose and adjust GenAI pipelines dynamically on kubernetes for enhanced service connectivity and management. 
-- **Using Manifests**: This involves deploying directly using Kubernetes manifest files without the GenAI Microservices Connector (GMC). 
-- **Using Helm Charts**: Facilitates deployment through Helm, which manages Kubernetes applications through packages of pre-configured Kubernetes resources. 
+- **Using GMC (GenAI Microservices Connector)**: GMC can be used to compose and adjust GenAI pipelines dynamically on Kubernetes for enhanced service connectivity and management.
+- **Using Manifests**: This involves deploying directly using Kubernetes manifest files without the GenAI Microservices Connector (GMC).
+- **Using Helm Charts**: Facilitates deployment through Helm, which manages Kubernetes applications as packages of pre-configured Kubernetes resources.
 
 This guide will provide detailed instructions on using these resources. If you're already familiar with Kubernetes, feel free to skip ahead to (**Deploy using Helm**)
 
@@ -18,9 +18,9 @@ This guide will provide detailed instructions on using these resources. If you'r
 
 **Understanding Kubernetes Deployment Tools and Resources:**
 
-- **kubectl**: This command-line tool allows you to deploy applications, inspect and manage cluster resources, and view logs. For instance, `kubectl apply -f chatqna.yaml` would be used to deploy resources defined in a manifest file. 
+- **kubectl**: This command-line tool allows you to deploy applications, inspect and manage cluster resources, and view logs. For instance, `kubectl apply -f chatqna.yaml` would be used to deploy resources defined in a manifest file.
 
-- **Pods**: Pods are the smallest deployable units created and managed by Kubernetes. A pod typically encapsulates one or more containers where your application runs. 
+- **Pods**: Pods are the smallest deployable units created and managed by Kubernetes. A pod typically encapsulates one or more containers where your application runs.
 
 **Verifying Kubernetes Cluster Access with kubectl**
 ```bash
@@ -60,17 +60,17 @@ kubectl config set-context --current --namespace=chatqa
 
 **Key Components of a Helm Chart:**
 
-- **Chart.yaml**: This file contains metadata about the chart such as name, version, and description. 
-- **values.yaml**: Stores configuration values that can be customized depending on the deployment environment. These values override defaults set in the chart templates. 
-- **deployment.yaml**: Part of the templates directory, this file describes how the Kubernetes resources should be deployed, such as Pods and Services. 
+- **Chart.yaml**: This file contains metadata about the chart, such as its name, version, and description.
+- **values.yaml**: Stores configuration values that can be customized depending on the deployment environment. These values override defaults set in the chart templates.
+- **deployment.yaml**: Part of the templates directory, this file describes how the Kubernetes resources should be deployed, such as Pods and Services.
 
 **Update Dependencies:**
 
-- A script called **./update_dependency.sh** is provided which is used to update chart dependencies, ensuring all nested charts are at their latest versions. 
-- The command `helm dependency update chatqna` updates the dependencies for the `chatqna` chart based on the versions specified in `Chart.yaml`. 
+- A script called **./update_dependency.sh** is provided to update chart dependencies, ensuring all nested charts are at their latest versions.
+- The command `helm dependency update chatqna` updates the dependencies for the `chatqna` chart based on the versions specified in `Chart.yaml`.
 
 **Helm Install Command:**
 
-- `helm install [RELEASE_NAME] [CHART_NAME]`: This command deploys a Helm chart into your Kubernetes cluster, creating a new release. It is used to set up all the Kubernetes resources specified in the chart and track the version of the deployment. 
+- `helm install [RELEASE_NAME] [CHART_NAME]`: This command deploys a Helm chart into your Kubernetes cluster, creating a new release. It is used to set up all the Kubernetes resources specified in the chart and track the version of the deployment.
 
 For more detailed instructions and explanations, you can refer to the [official Helm documentation](https://helm.sh/docs/).
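+
+For illustration, assuming the `chatqna` chart directory sits in your current working directory and reusing `chatqna` as the release name, the commands above combine as follows:
+```bash
+# Refresh the nested charts listed in chatqna/Chart.yaml
+helm dependency update chatqna
+# Create a new release named chatqna from the local ./chatqna chart
+helm install chatqna ./chatqna
+```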
diff --git a/examples/ChatQnA/deploy/k8s_helm.md b/examples/ChatQnA/deploy/k8s_helm.md
index d88bb907..23b264c6 100644
--- a/examples/ChatQnA/deploy/k8s_helm.md
+++ b/examples/ChatQnA/deploy/k8s_helm.md
@@ -7,14 +7,13 @@ be covering one option of doing it for convenience :
 we will be showcasing how to build an e2e chatQnA with Redis VectorDB and neural-chat-7b-v3-3 model,
 deployed on a Kubernetes cluster. For more information on how to setup a Xeon based Kubernetes cluster along with the development pre-requisites, please follow the
 instructions here (*** ### Kubernetes Cluster and Development Environment***).
-For a quick introduction on Helm Charts, visit the helm section in (**getting started**)
+For a quick introduction to Helm Charts, visit the Helm section in [Getting Started with Kubernetes for ChatQnA](./k8s_getting_started.md)
 
 ## Overview
 
 There are several ways to setup a ChatQnA use case. Here in this tutorial, we will
 walk through how to enable the below list of microservices from OPEA
 GenAIComps to deploy a multi-node TGI megaservice solution.
-> **Note:** ChatQnA can also be deployed on a single node using Kubernetes, provided that all pods are configured to run on the same node.
 
 1. Data Prep
 2. Embedding
@@ -22,14 +21,16 @@
 4. Reranking
 5. LLM with TGI
 
+> **Note:** ChatQnA can also be deployed on a single node using Kubernetes, provided that all pods are configured to run on the same node.
+
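+For illustration, once the pods are running you can confirm that they all landed on the same node (add `-n <namespace>` if you deploy into a dedicated namespace):
+```bash
+# The NODE column shows where each pod was scheduled
+kubectl get pods -o wide
+```
+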
 ## Prerequisites
 ### Install Helm
 First, ensure that Helm (version >= 3.15) is installed on your system. Helm is an essential tool for managing Kubernetes applications. It simplifies the deployment and management of Kubernetes applications using Helm charts.
-For detailed installation instructions, please refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/)
+For detailed installation instructions, refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/)
 
 ### Clone Repository
-First step is to clone the GenAIInfra which is the containerization and cloud native suite for OPEA, including artifacts to deploy ChatQnA in a cloud native way.
+The next step is to clone GenAIInfra, the containerization and cloud native suite for OPEA, which includes the artifacts to deploy ChatQnA in a cloud native way.
 
 ```bash
 git clone https://github.com/opea-project/GenAIInfra.git
@@ -51,12 +52,12 @@ export HF_TOKEN="Your_Huggingface_API_Token"
 ```
 
 ### Proxy Settings
-Make sure to setup Proxies if you are behind a firewall.
+
 For services requiring internet access, such as the LLM microservice, embedding service, reranking service, and other backend services, proxy settings can be essential. These settings ensure services can download necessary content from the internet, especially when behind a corporate firewall.
 Proxy can be set in the `values.yaml` file, like so:
 Open the `values.yaml` file using an editor
 ```bash
-vi GenAIInfra/helm-charts/chatqna/values.yaml
+vi chatqna/values.yaml
 ```
 Update the following section and save file:
 ```yaml
@@ -569,4 +570,4 @@ chatqna-ui:
 Once you are done with the entire pipeline and wish to stop and remove all the containers, use the command below:
 ```
 helm uninstall chatqna
-```
\ No newline at end of file
+```
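+
+For illustration, after uninstalling you can verify that the release and its pods are gone:
+```bash
+# The chatqna release should no longer be listed
+helm list
+# Any remaining ChatQnA pods should be terminating or already removed
+kubectl get pods
+```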