[GenAI orchestrator] add reranking embedding and guardrail (#1721)

* versions
* add Bloomz guardrail and embedding
* add compressor
* minor changes
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/guardrail_provider.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_provider.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/guardrail_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/routers/em_providers_router.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/contextual_compressor/bloomz_rerank.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/rag_chain.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/rag_chain.py
  Co-authored-by: Diverrez morgan <[email protected]>
* changes for the PR
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/bloomz/bloomz_guardrail_setting.py
  Co-authored-by: Diverrez morgan <[email protected]>
* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/contextual_compressor/bloomz_rerank.py
  Co-authored-by: assouktim <[email protected]>
* Update guardrail error type
* Remove dead code
* handle unknown label error
* modifications to solve conflicts
* 📝 🐛 Fixing several things and adding documentation about guardrail / compressor (reranking) and embedding using Bloomz
* ✏️ IAFMLOPS-516 Review fixes
* ✏️ IAFMLOPS-516 Review fixes, missing max_documents
* ✏️ IAFMLOPS-516 Poetry lock update

---------

Co-authored-by: Diverrez morgan <[email protected]>
Co-authored-by: Killian Mahé <[email protected]>
Co-authored-by: assouktim <[email protected]>
Co-authored-by: Killian Mahé <[email protected]>
Co-authored-by: Benjamin BERNARD <[email protected]>
theopenconversationkit · Nov 18, 2024 · 0c76ae7
1 parent 0cb7765
commit 0c76ae7
Showing 46 changed files with 2,816 additions and 1,283 deletions.
diff --git a/gen-ai/orchestrator-server/src/main/python/server/.pre-commit-config.yaml b/gen-ai/orchestrator-server/src/main/python/server/.pre-commit-config.yaml
@@ -33,6 +33,6 @@ repos:
       hooks:
         - id: pycln
   - repo: https://github.com/pypa/pip-audit
-    rev: v2.7.0
+    rev: v2.7.3
     hooks:
       -   id: pip-audit
diff --git a/gen-ai/orchestrator-server/src/main/python/server/README.md b/gen-ai/orchestrator-server/src/main/python/server/README.md
@@ -49,8 +49,8 @@ Basic usage to create a venv with a specific version of Python for this project
 
 ```sh
 # In gen-ai/orchestrator-server/src/main/python/server
-pyenv install 3.9.18
-pyenv local 3.9.18  # Activate Python 3.9 for the current
+pyenv install 3.10.6
+pyenv local 3.10.6  # Activate Python 3.10 for the current directory
 which python # Check that you use the python version installed by pyenv
 python --version # Check your python version
 python -m venv .venv # Create a virtual env based on this python version
@@ -65,7 +65,7 @@ poetry install # Install dependencies for this project in the virtual env
 Install python3.9 and poetry :
 
 ```sh
-apt install python3.9 poetry
+apt install python3.10 poetry
 ```
 
 Create a virtual env then install dependencies :
@@ -171,6 +171,81 @@ print(f"Nb of tokens: {num_tokens}")
 
 If your prompt contents can be made public, you can also use [OpenAI's tokenizer](https://platform.openai.com/tokenizer) as a more convenient method.
 
+## RAG - Reranking (Compressor), Guardrail and embedding - Bloomz based models
+
+The Gen AI Orchestrator server RAG chain supports embedding, reranking and guardrails for Bloomz-based models.
+Currently, this implementation relies on a **custom inference server, also open-sourced at https://github.com/creditMutuelArkea/llm-inference/**.
+
+### Embedding configuration
+
+See [embeddings_bloomz_settings.json](./../tock-llm-indexing-tools/examples/embeddings_bloomz_settings.json) for an example configuration.
+
+### Reranking / Compressor settings
+
+The compressor (aka reranker) computes a similarity score between the user query and a document (context).
+This score is usually the output of the reranking model for a specific label. The label name can typically be found in
+the model card: for instance, LABEL_1 holds the similarity score for the *cmarkea/bloomz-3b-reranking* model, as [mentioned in the model
+card here](https://huggingface.co/cmarkea/bloomz-3b-reranking#:~:text=lambda%20x%3A%20x%5B0%5D%5B%27label%27%5D%20%3D%3D%20%22LABEL_1%22%2C).
+
+You can also specify a `min_score`, used to filter out documents that are not relevant (similar) enough.
+
+Here is an example of a compressor setting that uses our llm-inference server with the *cmarkea/bloomz-3b-reranking* model:
+```json
+{
+  "compressor_setting": {
+    "endpoint": "http://localhost:8082",
+    "min_score": 0.7,
+    "label": "LABEL_1",
+    "provider": "BloomzRerank"
+  }
+}
+``` 
+
+### Guardrail settings
+
+Our guardrail models, for instance [cmarkea/bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail)
+(see our organisation for other variants), return the following scores through the llm-inference server:
+```json
+{
+    "response": [
+        [
+            {
+                "label": "insult",
+                "score": 0.7866228222846985
+            },
+            {
+                "label": "obscene",
+                "score": 0.4258439540863037
+            },
+            {
+                "label": "sexual_explicit",
+                "score": 0.1550784707069397
+            },
+            {
+                "label": "identity_attack",
+                "score": 0.05749328061938286
+            },
+            {
+                "label": "threat",
+                "score": 0.022629201412200928
+            }
+        ]
+    ]
+}
+```
+
+When a guardrail setting is added to the RAG chain, the response is rejected if any of the toxicity scores shown above
+is higher than the `max_score` setting. Here is an example of a guardrail setting:
+```json
+{
+  "guardrail_setting": {
+    "api_base": "http://localhost:8083",
+    "provider": "BloomzGuardrail",
+    "max_score": 0.5
+  }
+}
+```
+
 <p align="right">(<a href="#readme-top">back to top</a>)</p>
 
 [product-screenshot]: images/screenshot.png
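
The `max_score` rule described in the README hunk above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the orchestrator's actual implementation: the helper name `is_rejected` is hypothetical, and the payload shape is simply copied from the sample llm-inference response quoted in the README.

```python
from typing import Any, Dict


def is_rejected(payload: Dict[str, Any], max_score: float) -> bool:
    """Return True if any toxicity score in the payload is higher than max_score.

    Hypothetical helper: `payload` is assumed to follow the sample response,
    i.e. {"response": [[{"label": ..., "score": ...}, ...]]}.
    """
    return any(
        item['score'] > max_score
        for scores in payload['response']
        for item in scores
    )


# Sample payload taken from the README example.
sample = {
    'response': [
        [
            {'label': 'insult', 'score': 0.7866228222846985},
            {'label': 'obscene', 'score': 0.4258439540863037},
            {'label': 'sexual_explicit', 'score': 0.1550784707069397},
            {'label': 'identity_attack', 'score': 0.05749328061938286},
            {'label': 'threat', 'score': 0.022629201412200928},
        ]
    ]
}

# With max_score=0.5, the 'insult' score is above the threshold.
rejected = is_rejected(sample, max_score=0.5)
```

With the README's `max_score` of 0.5, this sample would be rejected because of the `insult` score alone; the other labels stay under the threshold.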

diff --git a/gen-ai/orchestrator-server/src/main/python/server/poetry.lock b/gen-ai/orchestrator-server/src/main/python/server/poetry.lock
diff --git a/...tor-server/src/main/python/server/src/gen_ai_orchestrator/errors/exceptions/exceptions.py b/...tor-server/src/main/python/server/src/gen_ai_orchestrator/errors/exceptions/exceptions.py
@@ -90,3 +90,10 @@ class GenAIPromptTemplateException(GenAIOrchestratorException):
 
     def __init__(self, info: ErrorInfo):
         super().__init__(ErrorCode.GEN_AI_PROMPT_TEMPLATE_ERROR, info)
+
+
+class GenAIUnknownLabelException(GenAIOrchestratorException):
+    """Unknown Label error"""
+
+    def __init__(self, info: ErrorInfo):
+        super().__init__(ErrorCode.GEN_AI_UNKNOWN_LABEL_ERROR, info)
diff --git a/...r/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/__init__.py b/...r/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/__init__.py
@@ -0,0 +1,14 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
diff --git a/...ain/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/__init__.py b/...ain/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/__init__.py
@@ -0,0 +1,14 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
diff --git a/.../src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py b/.../src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py
@@ -0,0 +1,48 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
+from typing import Literal, Optional
+
+from pydantic import Field
+
+from gen_ai_orchestrator.models.contextual_compressor.compressor_provider import (
+    ContextualCompressorProvider,
+)
+from gen_ai_orchestrator.models.contextual_compressor.compressor_setting import (
+    BaseCompressorSetting,
+)
+from gen_ai_orchestrator.services.contextual_compressor.bloomz_rerank import (
+    BloomzRerank,
+)
+
+
+class BloomzCompressorSetting(BaseCompressorSetting):
+    provider: Literal[ContextualCompressorProvider.BLOOMZ] = Field(
+        description='The contextual compressor provider.',
+        examples=[ContextualCompressorProvider.BLOOMZ],
+        default=ContextualCompressorProvider.BLOOMZ.value,
+    )
+    min_score: Optional[float] = Field(
+        description='Minimum entailment score.',
+        default=BloomzRerank.__fields__['min_score'].default,
+    )
+    endpoint: str = Field(description='Bloomz scoring endpoint.')
+    max_documents: Optional[int] = Field(
+        description='Maximum number of documents to return to avoid exceeding max tokens for text generation.',
+        default=BloomzRerank.__fields__['max_documents'].default,
+    )
+    label: Optional[str] = Field(
+        description='Label to use for reranking. The output label is usually documented on the huggingface model card '
+                    'or in the model\'s config.json file (id2label..).', default='entailment'
+    )
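
To make the roles of `label`, `min_score` and `max_documents` concrete, here is a rough sketch of how a reranker could apply them. Several things are assumptions rather than facts from this commit: the function name `rerank`, the per-document score shape (a list of `{"label", "score"}` dicts, mirroring the guardrail sample in the README), the `>=` comparison, the descending sort, and the use of `ValueError` as a stand-in for `GenAIUnknownLabelException`. The real logic lives in `bloomz_rerank.py`, which is not shown here.

```python
from typing import Any, Dict, List, Optional, Tuple

# Assumed shape: (document text, [{"label": str, "score": float}, ...])
ScoredDoc = Tuple[str, List[Dict[str, Any]]]


def rerank(
    scored_docs: List[ScoredDoc],
    label: str,
    min_score: float,
    max_documents: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Keep documents whose score for `label` reaches `min_score`."""
    kept = []
    for text, scores in scored_docs:
        by_label = {item['label']: item['score'] for item in scores}
        if label not in by_label:
            # Stand-in for an "unknown label" error.
            raise ValueError(f'Unknown label: {label!r}')
        if by_label[label] >= min_score:
            kept.append((text, by_label[label]))
    # Most similar documents first, then cap the number returned.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if max_documents is None else kept[:max_documents]


docs = [
    ('doc A', [{'label': 'LABEL_0', 'score': 0.1}, {'label': 'LABEL_1', 'score': 0.9}]),
    ('doc B', [{'label': 'LABEL_0', 'score': 0.6}, {'label': 'LABEL_1', 'score': 0.4}]),
    ('doc C', [{'label': 'LABEL_0', 'score': 0.2}, {'label': 'LABEL_1', 'score': 0.8}]),
]

top = rerank(docs, label='LABEL_1', min_score=0.7, max_documents=1)
```

In this sketch `doc B` is dropped by `min_score`, and `max_documents=1` keeps only the best of the two remaining documents.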
diff --git a/...python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_provider.py b/...python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_provider.py
@@ -0,0 +1,24 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
+from enum import Enum, unique
+
+
+@unique
+class ContextualCompressorProvider(str, Enum):
+    BLOOMZ = 'BloomzRerank'
+
+    @classmethod
+    def has_value(cls, value) -> bool:
+        return value in cls._value2member_map_
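
As a quick illustration of what `has_value` does, the enum from this file can be exercised on its own. The class body below is copied from the hunk above; only the three calls at the bottom are added for illustration.

```python
from enum import Enum, unique


@unique
class ContextualCompressorProvider(str, Enum):
    BLOOMZ = 'BloomzRerank'

    @classmethod
    def has_value(cls, value) -> bool:
        return value in cls._value2member_map_


# The lookup is an exact, case-sensitive match on the enum values.
known = ContextualCompressorProvider.has_value('BloomzRerank')
unknown = ContextualCompressorProvider.has_value('bloomzrerank')

# Because the enum mixes in `str`, a member compares equal to its raw value.
member = ContextualCompressorProvider('BloomzRerank')
```

The `str` mixin is what lets the raw string `"BloomzRerank"` from a JSON payload be compared directly with the enum member.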
diff --git a/.../python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_setting.py b/.../python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_setting.py
@@ -0,0 +1,26 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
+from pydantic import BaseModel, Field
+
+from gen_ai_orchestrator.models.contextual_compressor.compressor_provider import (
+    ContextualCompressorProvider,
+)
+
+
+class BaseCompressorSetting(BaseModel):
+    provider: ContextualCompressorProvider = Field(
+        description='The contextual compressor provider.',
+        examples=[ContextualCompressorProvider.BLOOMZ],
+    )
diff --git a/...in/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_types.py b/...in/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_types.py
@@ -0,0 +1,25 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
+from typing import Annotated, Union
+
+from fastapi import Body
+
+from gen_ai_orchestrator.models.contextual_compressor.bloomz.bloomz_compressor_setting import (
+    BloomzCompressorSetting,
+)
+
+CompressorSetting = Annotated[
+    Union[BloomzCompressorSetting], Body(discriminator='provider')
+]
diff --git a/...trator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/bloomz/__init__.py b/...trator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/bloomz/__init__.py
@@ -0,0 +1,14 @@
+#   Copyright (C) 2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
diff --git a/...rver/src/main/python/server/src/gen_ai_orchestrator/models/em/bloomz/bloomz_em_setting.py b/...rver/src/main/python/server/src/gen_ai_orchestrator/models/em/bloomz/bloomz_em_setting.py
@@ -0,0 +1,38 @@
+#   Copyright (C) 2023-2024 Credit Mutuel Arkea
+#
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#
+"""Model for creating BloomzEMSetting."""
+
+from typing import Literal, Optional
+
+from pydantic import Field
+
+from gen_ai_orchestrator.models.em.em_provider import EMProvider
+from gen_ai_orchestrator.models.em.em_setting import BaseEMSetting
+
+
+class BloomzEMSetting(BaseEMSetting):
+    """A class for Bloomz Embedding Model Setting."""
+
+    provider: Literal[EMProvider.BLOOMZ] = Field(
+        description='The Embedding Model provider.', examples=[EMProvider.BLOOMZ]
+    )
+    api_base: str = Field(
+        description='The base url of the provider API.', examples=['http://doc.tock.ai']
+    )
+    pooling: Optional[str] = Field(
+        description='Pooling method.',
+        default='last',
+        examples=['mean', 'last'],
+    )
diff --git a/...chestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_provider.py b/...chestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_provider.py
@@ -24,6 +24,7 @@ class EMProvider(str, Enum):
     OPEN_AI = 'OpenAI'
     AZURE_OPEN_AI_SERVICE = 'AzureOpenAIService'
     OLLAMA = 'Ollama'
+    BLOOMZ = 'Bloomz'
 
     @classmethod
     def has_value(cls, value) -> bool:

diff --git a/...rchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_setting.py b/...rchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_setting.py
@@ -18,7 +18,9 @@
 from pydantic import BaseModel, Field
 
 from gen_ai_orchestrator.models.em.em_provider import EMProvider
-from gen_ai_orchestrator.models.security.raw_secret_key.raw_secret_key import RawSecretKey
+from gen_ai_orchestrator.models.security.raw_secret_key.raw_secret_key import (
+    RawSecretKey,
+)
 from gen_ai_orchestrator.models.security.security_types import SecretKey
 
 
@@ -31,5 +33,5 @@ class BaseEMSetting(BaseModel):
     api_key: Optional[SecretKey] = Field(
         description='The secret that stores the API key used to authenticate requests to the AI Provider API.',
         examples=[RawSecretKey(value='ab7-14Ed2-dfg2F-A1IV4B')],
-        default=None
+        default=None,
     )
diff --git a/.../orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_types.py b/.../orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/em/em_types.py
@@ -21,11 +21,17 @@
 from gen_ai_orchestrator.models.em.azureopenai.azure_openai_em_setting import (
     AzureOpenAIEMSetting,
 )
-from gen_ai_orchestrator.models.em.ollama.ollama_em_setting import OllamaEMSetting
+from gen_ai_orchestrator.models.em.bloomz.bloomz_em_setting import (
+    BloomzEMSetting,
+)
+from gen_ai_orchestrator.models.em.ollama.ollama_em_setting import (
+    OllamaEMSetting,
+)
 from gen_ai_orchestrator.models.em.openai.openai_em_setting import (
     OpenAIEMSetting,
 )
 
 EMSetting = Annotated[
-    Union[OpenAIEMSetting, AzureOpenAIEMSetting, OllamaEMSetting], Body(discriminator='provider')
+    Union[OpenAIEMSetting, AzureOpenAIEMSetting, OllamaEMSetting, BloomzEMSetting],
+    Body(discriminator='provider'),
 ]
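
The `discriminator='provider'` argument means the `provider` field decides which setting class a payload is parsed into. The sketch below is a plain-Python analogue of that idea using only the standard library; it does not use pydantic or FastAPI, and the names `OpenAISettingSketch`, `BloomzSettingSketch`, `SETTING_BY_PROVIDER` and `parse_em_setting` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class OpenAISettingSketch:
    provider: str
    api_key: str


@dataclass
class BloomzSettingSketch:
    provider: str
    api_base: str
    pooling: str = 'last'  # same default as BloomzEMSetting.pooling


# The provider value selects the class, like a discriminated union would.
SETTING_BY_PROVIDER: Dict[str, Type[Any]] = {
    'OpenAI': OpenAISettingSketch,
    'Bloomz': BloomzSettingSketch,
}


def parse_em_setting(payload: Dict[str, Any]) -> Any:
    """Build the setting object matching the payload's `provider` field."""
    provider = payload.get('provider')
    if provider not in SETTING_BY_PROVIDER:
        raise ValueError(f'Unsupported provider: {provider!r}')
    return SETTING_BY_PROVIDER[provider](**payload)


setting = parse_em_setting({'provider': 'Bloomz', 'api_base': 'http://doc.tock.ai'})
```

Adding `BloomzEMSetting` to the `Union` plays the same role as adding an entry to the mapping here: without it, a payload with `"provider": "Bloomz"` would have no matching class.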