API Server

(Work in progress)

Note: An OpenAI API subscription is required for this API server to work as a chat server.

=> Code

OpenAI API Key

I use the .zshrc file on my macOS machine to set my OpenAI API key in the OPENAI_API_KEY environment variable.

.zshrc

export OPENAI_API_KEY="<your OpenAI API key>"
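For reference, the official openai Python client picks this variable up automatically, so the key never needs to be hard-coded. A minimal sketch (not the project's actual code):

```python
# Minimal sketch: the openai client reads OPENAI_API_KEY from the environment,
# so the key does not have to appear anywhere in the source code.
import os
from openai import OpenAI

assert "OPENAI_API_KEY" in os.environ, "Set OPENAI_API_KEY (e.g. in .zshrc) first."

client = OpenAI()  # uses OPENAI_API_KEY from the environment
```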

Starting the API server

macOS

$ python3 app.py

RAG

This API server uses RAG (Retrieval-Augmented Generation) to answer showroom visitors' questions about the exhibition content.

For each query, the RAG pipeline retrieves information from two sources: general info from a vector DB with metadata, and scenarios from a SQL DB. This is the so-called "hybrid approach" to RAG.
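As a rough illustration of the hybrid retrieval, a sketch is shown below. The collection name, database file, table, and column names are placeholders, not the project's actual identifiers.

```python
# Sketch of hybrid retrieval: vector search in ChromaDB plus a SQL lookup.
# "general_info", "scenarios.db", and the schema below are placeholders.
import sqlite3
import chromadb

chroma = chromadb.PersistentClient(path="chroma_db")
collection = chroma.get_collection("general_info")

def retrieve_context(question: str, image_id: str) -> str:
    # 1) General info: semantic search over the vector DB (with metadata).
    vector_hits = collection.query(query_texts=[question], n_results=3)
    general_info = "\n".join(vector_hits["documents"][0])

    # 2) Scenario: exact lookup in the SQL DB for the current image/scene.
    with sqlite3.connect("scenarios.db") as conn:
        row = conn.execute(
            "SELECT scenario FROM scenarios WHERE image_id = ?", (image_id,)
        ).fetchone()
    scenario = row[0] if row else ""

    # Both results are combined into the context passed to the LLM.
    return f"{general_info}\n\n{scenario}"
```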

Maintaining the RAG

General info on the locations in the scenes (Vector DB with metadata)

Run this Jupyter Notebook to update the embeddings in ChromaDB.
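The notebook's exact steps are not reproduced here, but refreshing embeddings in ChromaDB generally looks like the following sketch; the collection name and metadata fields are assumptions, not what the notebook actually uses.

```python
# Sketch of refreshing embeddings in ChromaDB; "general_info" and the
# metadata fields are placeholders.
import chromadb

chroma = chromadb.PersistentClient(path="chroma_db")
collection = chroma.get_or_create_collection("general_info")

documents = ["Example description of a location in the scene (generated by ChatGPT)."]
metadatas = [{"location": "example_location", "scene": "scene_01"}]
ids = ["example_location_01"]

# upsert() re-embeds and overwrites entries with the same ids, which is
# convenient when the source documents are regenerated.
collection.upsert(documents=documents, metadatas=metadatas, ids=ids)
```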

For now, I use the following documents generated by ChatGPT as inputs to RAG.

Scenarios for each image (SQL DB)

Run this Jupyter Notebook to update the scenario data in the SQL DB.
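A minimal sketch of what loading scenarios into the SQL DB could look like; the database file, table, and column names are assumptions.

```python
# Sketch only: a simple SQLite table mapping each image to its scenario text.
# "scenarios.db", the table name, and the columns are placeholders.
import sqlite3

with sqlite3.connect("scenarios.db") as conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scenarios (image_id TEXT PRIMARY KEY, scenario TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO scenarios (image_id, scenario) VALUES (?, ?)",
        ("image_01", "Example scenario text for image_01."),
    )
    conn.commit()
```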

Multimodal AI

This API server also accepts base64-encoded image data from the client and includes it as additional information in the request to OpenAI's LLM.

=> Details
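For reference, attaching a base64-encoded image to a chat request with the official openai Python client looks roughly like the sketch below; the model name and prompt are placeholders, not necessarily what this server sends.

```python
# Sketch of attaching a base64-encoded image to a chat completion request.
# The model name and prompt are placeholders.
import base64
from openai import OpenAI

client = OpenAI()

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what the visitor is looking at."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```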

Text-to-Speech

Combined with OpenAI's TTS service, this API server provides a Text-to-Speech API to the client: samples I generated with TTS.
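A minimal sketch of calling OpenAI's TTS endpoint is shown below; the model and voice are example choices, not necessarily the ones this server uses.

```python
# Sketch of generating speech with OpenAI's TTS API; "tts-1" and "alloy"
# are example choices.
from openai import OpenAI

client = OpenAI()

speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Welcome to the showroom.",
)

# The response body is the binary audio; write it out as an MP3 file.
with open("sample.mp3", "wb") as f:
    f.write(speech.content)
```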

This capability is tentative, because OpenAI will include TTS in GPT-4o-mini in the near future.

Reference: https://platform.openai.com/docs/guides/text-to-speech/overview