API Server

(Work in progress)

Note: An OpenAI API subscription is required for this API server to work as a chat server.

=> Code

OpenAI API Key

I use the .zshrc file on my macOS machine to set my OpenAI API key in the OPENAI_API_KEY environment variable.

.zshrc

export OPENAI_API_KEY="<your OpenAI API key>"
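For reference, the official openai Python client picks this variable up automatically, so the key never needs to be hard-coded. A minimal sketch (not the project's actual code):

```python
# Minimal sketch: the openai client reads OPENAI_API_KEY from the environment,
# so the key does not have to appear anywhere in the source code.
import os
from openai import OpenAI

assert "OPENAI_API_KEY" in os.environ, "Set OPENAI_API_KEY (e.g. in .zshrc) first."

client = OpenAI()  # uses OPENAI_API_KEY from the environment
```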

Starting the API server

macOS

$ python3 app.py

RAG

This API server uses RAG (Retrieval-Augmented Generation) to answer showroom visitors' questions about the exhibition content.

For each query, the RAG pipeline retrieves information from two sources: general info from a vector DB with metadata, and scenarios from a SQL DB. This is the so-called "hybrid approach" to RAG.
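As a rough illustration of the hybrid retrieval, a sketch is shown below. The collection name, database file, table, and column names are placeholders, not the project's actual identifiers.

```python
# Sketch of hybrid retrieval: vector search in ChromaDB plus a SQL lookup.
# "general_info", "scenarios.db", and the schema below are placeholders.
import sqlite3
import chromadb

chroma = chromadb.PersistentClient(path="chroma_db")
collection = chroma.get_collection("general_info")

def retrieve_context(question: str, image_id: str) -> str:
    # 1) General info: semantic search over the vector DB (with metadata).
    vector_hits = collection.query(query_texts=[question], n_results=3)
    general_info = "\n".join(vector_hits["documents"][0])

    # 2) Scenario: exact lookup in the SQL DB for the current image/scene.
    with sqlite3.connect("scenarios.db") as conn:
        row = conn.execute(
            "SELECT scenario FROM scenarios WHERE image_id = ?", (image_id,)
        ).fetchone()
    scenario = row[0] if row else ""

    # Both results are combined into the context passed to the LLM.
    return f"{general_info}\n\n{scenario}"
```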

Maintaining the RAG

General info on the locations in the scenes (Vector DB with metadata)

Run this Jupyter Notebook to update the embeddings in ChromaDB.
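The notebook's exact steps are not reproduced here, but refreshing embeddings in ChromaDB generally looks like the following sketch; the collection name and metadata fields are assumptions, not what the notebook actually uses.

```python
# Sketch of refreshing embeddings in ChromaDB; "general_info" and the
# metadata fields are placeholders.
import chromadb

chroma = chromadb.PersistentClient(path="chroma_db")
collection = chroma.get_or_create_collection("general_info")

documents = ["Example description of a location in the scene (generated by ChatGPT)."]
metadatas = [{"location": "example_location", "scene": "scene_01"}]
ids = ["example_location_01"]

# upsert() re-embeds and overwrites entries with the same ids, which is
# convenient when the source documents are regenerated.
collection.upsert(documents=documents, metadatas=metadatas, ids=ids)
```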

For now, I use the following documents generated by ChatGPT as inputs to RAG.

Scenarios for each image (SQL DB)

Run this Jupyter Notebook to update the scenario data in the SQL DB.
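A minimal sketch of what loading scenarios into the SQL DB could look like; the database file, table, and column names are assumptions.

```python
# Sketch only: a simple SQLite table mapping each image to its scenario text.
# "scenarios.db", the table name, and the columns are placeholders.
import sqlite3

with sqlite3.connect("scenarios.db") as conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scenarios (image_id TEXT PRIMARY KEY, scenario TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO scenarios (image_id, scenario) VALUES (?, ?)",
        ("image_01", "Example scenario text for image_01."),
    )
    conn.commit()
```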

Multimodal AI

This API server also accepts base64-encoded image data from the client and includes it as additional information in the request to OpenAI's LLM.

=> Details
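For reference, attaching a base64-encoded image to a chat request with the official openai Python client looks roughly like the sketch below; the model name and prompt are placeholders, not necessarily what this server sends.

```python
# Sketch of attaching a base64-encoded image to a chat completion request.
# The model name and prompt are placeholders.
import base64
from openai import OpenAI

client = OpenAI()

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what the visitor is looking at."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```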

Text-to-Speech

Combined with OpenAI's TTS service, this API server provides a Text-to-Speech API to the client: samples I generated with TTS.
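A minimal sketch of calling OpenAI's TTS endpoint is shown below; the model and voice are example choices, not necessarily the ones this server uses.

```python
# Sketch of generating speech with OpenAI's TTS API; "tts-1" and "alloy"
# are example choices.
from openai import OpenAI

client = OpenAI()

speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Welcome to the showroom.",
)

# The response body is the binary audio; write it out as an MP3 file.
with open("sample.mp3", "wb") as f:
    f.write(speech.content)
```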

This capability is tentative, because OpenAI will include TTS in GPT-4o-mini in the near future.

Reference: https://platform.openai.com/docs/guides/text-to-speech/overview