This README will guide you through the steps to install the project locally or via IBM Code Engine. Additionally, you will learn how to access the Swagger documentation once the project is deployed.
To install this project locally, follow these steps:
- Clone the repository:
  git clone https://github.com/blashernandez43/RAG-API-client-PoC.git
- Navigate to the project directory:
  cd RAG-API-client-PoC
- Create the environment, activate it, and install the requirements:
  python3 -m venv assetEnv
  source assetEnv/bin/activate
  python3 -m pip install -r requirements.txt
- Update your secrets: copy env to .env and fill in the variables with your URLs, passwords, and API keys.
- Start the project:
  python3 app.py
- URL access: go to localhost:4050 to verify that the API is running. You should see a "Hello World" message. To access Swagger, go to http://0.0.0.0:4050/docs (or http://localhost:4050/docs if your browser does not resolve 0.0.0.0).
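The final check can also be scripted. The following is a minimal sketch, assuming the server started with `python3 app.py` is listening on port 4050 as described above. The helper names `build_urls` and `check_running` are hypothetical and not part of this project.

```python
import urllib.error
import urllib.request


def build_urls(host="localhost", port=4050):
    """Return the root and Swagger URLs described in the steps above."""
    base = f"http://{host}:{port}"
    return {"root": base + "/", "docs": base + "/docs"}


def check_running(url, timeout=5):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all land here.
        return False
```

Calling `check_running(build_urls()["docs"])` while the app is running should return `True`; if it returns `False`, confirm that the virtual environment is active and that `app.py` started without errors.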
We have created Terraform scripts to help deploy this project on the IBM Cloud Code Engine service. Make sure you have this service provisioned before you start.
- Clone the updatedTF branch of the repo:
  git clone --branch updatedTF https://github.com/ibm-build-lab/rag-codeengine-terraform-setup.git
- Change into the cloned directory:
  cd rag-codeengine-terraform-setup
- Edit the terraform.tfvars file and fill in all the required values. Note that for this API, the COS and WD variables are unnecessary and can be left at their defaults.
- Update the variables.tf file to change the value of source_url to point to https://github.com/blashernandez43/RAG-API-client-PoC
- Run terraform init to initialize your Terraform environment
- Run terraform plan to see what resources will be created
- Run terraform apply to create the resources
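The three Terraform commands above can be wrapped in a small script for repeatable runs. This is a minimal sketch, assuming the `terraform` CLI is on your PATH and the script is run from the cloned directory. `terraform_commands` and `run_terraform` are hypothetical helper names, not part of the Terraform scripts.

```python
import subprocess


def terraform_commands():
    """Return the Terraform commands in the order the steps above run them."""
    return [
        ["terraform", "init"],
        ["terraform", "plan"],
        ["terraform", "apply"],
    ]


def run_terraform(workdir=".", dry_run=True):
    """Run each command in workdir, stopping at the first failure.

    With dry_run=True the commands are only printed, never executed.
    """
    for command in terraform_commands():
        print("$ " + " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so "apply" never runs after a failed "init" or "plan".
            subprocess.run(command, cwd=workdir, check=True)


if __name__ == "__main__":
    run_terraform(dry_run=True)  # switch to dry_run=False to really deploy
```

Because `terraform apply` is called without extra flags, it still asks for interactive confirmation before creating resources.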
Verify that this has created a Code Engine project and application.
- From the IBM Cloud search bar, search for Code Engine to bring up the service
- Go to Projects and search for the project you specified in the terraform.tfvars file
- Within the project you should see an application running with a Ready status
Wait for the build to complete, then access the public URL by selecting the Domain mappings tab of the open Application pane. Alternatively, select Projects from the Code Engine side menu, open the project, and select Applications; the URL appears under Application Link.
As a quick sanity check, <url>/docs will take you to the Swagger UI.
After deploying the application, you can now test the API:
- Open Swagger by going to <url>/docs.
- Authenticate the queryWXDLLM API by clicking the lock button to the right. Enter the value you set for RAG_APP_API_KEY.
- Click the Try it out button and customize your request body (num_results sets how many results are returned from each index):
  {
    "question": "<your question>",
    "num_results": "5",
    "llm_params": {
      "model_id": "mistralai/mixtral-8x7b-instruct-v01",
      "inputs": [],
      "parameters": {
        "decoding_method": "greedy",
        "max_new_tokens": 500,
        "min_new_tokens": 1,
        "moderations": {
          "hap_input": "true",
          "hap_output": "true",
          "threshold": 0.75
        }
      }
    },
    "llm_instructions": "[INST]<<SYS>>You are a helpful, respectful, and honest assistant. Always answer as helpfully as possible, while being safe. Be brief in your answers. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please do not share false information. <</SYS>>\nGenerate the next agent response by answering the question. You are provided several documents with titles. If the answer comes from different documents please mention all possibilities and use the titles of documents to separate between topics or domains. Answer with no more than 150 words. If you cannot base your answer on the given document, please state that you do not have an answer.\n{context_str}<</SYS>>\n\n{query_str}. Answer with no more than 150 words. If you cannot base your answer on the given document, please state that you do not have an answer. [/INST]"
  }
At a minimum, specify:
{ "question": "<your question>" }
All other values have defaults; you can adjust them to improve your results.
To call this API from the command line, use this command:
curl --location '<application url>/queryWXDLLM' \
--header 'Content-Type: application/json' \
--header 'RAG-APP-API-Key: <your custom RAG-APP-API-KEY value>' \
--data '{
"question": "string"
}'
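The same call can be made from Python. The following is a minimal sketch using only the standard library, assuming the endpoint path and header name shown in the curl command above. The helper names `build_query_request` and `send` are hypothetical, and treating the response as JSON is an assumption rather than something this README states.

```python
import json
import urllib.request


def build_query_request(base_url, api_key, question, **overrides):
    """Build a POST request for the queryWXDLLM endpoint without sending it."""
    body = {"question": question}
    body.update(overrides)  # e.g. num_results="5" or llm_params={...}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/queryWXDLLM",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # urllib changes the capitalization of header names when storing
            # them; HTTP header names are case-insensitive.
            "RAG-APP-API-Key": api_key,
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_query_request("<application url>", "<your RAG_APP_API_KEY value>", "<your question>"))` sends only the required `question` field and leaves every other value at its default.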
- Open a new tab and select POST from the request type dropdown. In the URL field, paste your URL (in this example, it's localhost):
  http://127.0.0.1:4050/queryWXDLLM
- Under Authorization, choose the type API Key and add the following key/value pair:
  RAG-APP-API-Key / <value for RAG_APP_API_KEY from .env>
- Under Body, select raw and paste the following JSON:
  {
    "question": "<your question>"
  }
- Hit the blue SEND button and wait for your result.
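Strict JSON parsers reject trailing commas and comments, so it can help to validate a request body before pasting it. The following is a minimal sketch using Python's standard `json` module; `is_valid_json` is a hypothetical helper, and whether this API's server is equally strict is an assumption.

```python
import json


def is_valid_json(text):
    """Return True if text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json('{"question": "<your question>"}'))   # True
print(is_valid_json('{"question": "<your question>",}'))  # False: trailing comma
```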