[GenAI orchestrator] add reranking embedding and guardrail (#1721)
* versions

* add Bloomz guardrail and embedding

* add compressor

* minor changes

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/guardrail_provider.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/compressor_provider.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/guardrail_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/routers/em_providers_router.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/contextual_compressor/bloomz/bloomz_compressor_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/contextual_compressor/bloomz_rerank.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/factories/langchain_factory.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/rag_chain.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/langchain/rag_chain.py

Co-authored-by: Diverrez morgan <[email protected]>

* changes for the PR

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/models/guardrail/bloomz/bloomz_guardrail_setting.py

Co-authored-by: Diverrez morgan <[email protected]>

* Update gen-ai/orchestrator-server/src/main/python/server/src/gen_ai_orchestrator/services/contextual_compressor/bloomz_rerank.py

Co-authored-by: assouktim <[email protected]>

* Update guardrail error type

* Remove dead code

* handle unknown label error

* modifications to solve conflicts

* 📝 🐛 Fixing several issues and adding documentation about guardrail / compressor (reranking) and embedding using Bloomz

* ✏️ IAFMLOPS-516 Review fixes

* ✏️ IAFMLOPS-516 Review fixes, missing max_documents

* ✏️ IAFMLOPS-516 Poetry lock update

---------

Co-authored-by: Diverrez morgan <[email protected]>
Co-authored-by: Killian Mahé <[email protected]>
Co-authored-by: assouktim <[email protected]>
Co-authored-by: Killian Mahé <[email protected]>
Co-authored-by: Benjamin BERNARD <[email protected]>
6 people authored Nov 18, 2024
1 parent 0cb7765 commit 0c76ae7
Showing 46 changed files with 2,816 additions and 1,283 deletions.
@@ -33,6 +33,6 @@ repos:
hooks:
- id: pycln
- repo: https://github.com/pypa/pip-audit
-rev: v2.7.0
+rev: v2.7.3
hooks:
- id: pip-audit
81 changes: 78 additions & 3 deletions gen-ai/orchestrator-server/src/main/python/server/README.md
@@ -49,8 +49,8 @@ Basic usage to create a venv with a specific version of Python for this project

```sh
# In gen-ai/orchestrator-server/src/main/python/server
-pyenv install 3.9.18
-pyenv local 3.9.18 # Activate Python 3.9 for the current
+pyenv install 3.10.6
+pyenv local 3.10.6 # Activate Python 3.10 for the current directory
which python # Check that you use the python version installed by pyenv
python --version # Check your python version
python -m venv .venv # Create a virtual env based on this python version
@@ -65,7 +65,7 @@ poetry install # Install dependencies for this project in the virtual env
Install python3.10 and poetry :

```sh
-apt install python3.9 poetry
+apt install python3.10 poetry
```

Create a virtual env then install dependencies :
@@ -171,6 +171,81 @@ print(f"Nb of tokens: {num_tokens}")

If your prompt contents can be made public, you can also use [OpenAI's tokenizer](https://platform.openai.com/tokenizer) as a more convenient method.

## RAG - Reranking (Compressor), Guardrail and embedding - Bloomz based models

The Gen AI Orchestrator server RAG chain supports embedding, reranking and guardrails for Bloomz-based models.
Currently, this implementation relies on a **custom inference server, also open-sourced at https://github.com/creditMutuelArkea/llm-inference/**

### Embedding configuration

See [embeddings_bloomz_settings.json](./../tock-llm-indexing-tools/examples/embeddings_bloomz_settings.json) for an example configuration.

### Reranking / Compressor settings

The compressor (a.k.a. reranker) computes a similarity score between the user query and each document (context).
This score is usually the output that the reranking model associates with a specific label. The label name is typically
given in the model card; for instance, `LABEL_1` carries the similarity score for the *cmarkea/bloomz-3b-reranking* model, as [mentioned in the model
card here](https://huggingface.co/cmarkea/bloomz-3b-reranking#:~:text=lambda%20x%3A%20x%5B0%5D%5B%27label%27%5D%20%3D%3D%20%22LABEL_1%22%2C).

You can also specify a `min_score`, used to filter out documents that are not relevant / similar enough.

Here is an example of `compressor_setting` that uses our llm-inference server with the *cmarkea/bloomz-3b-reranking* model:
```json
{
  "compressor_setting": {
    "endpoint": "http://localhost:8082",
    "min_score": 0.7,
    "label": "LABEL_1",
    "provider": "BloomzRerank"
  }
}
```
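
As a rough illustration of how `label`, `min_score` and `max_documents` could interact, here is a minimal Python sketch. It is not the orchestrator's actual `BloomzRerank` code: the function name `keep_relevant`, the input shape (one `{label: score}` mapping per document), the `>=` comparison and the descending sort are assumptions made for this example, and `ValueError` merely stands in for the project's unknown-label error.

```python
from typing import Dict, List, Optional, Tuple


def keep_relevant(
    scored: List[Tuple[str, Dict[str, float]]],
    label: str = "LABEL_1",
    min_score: float = 0.7,
    max_documents: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Keep documents whose score for `label` reaches `min_score` (hypothetical helper)."""
    kept = []
    for text, scores in scored:
        if label not in scores:
            # Stand-in for an "unknown label" error: the configured label
            # must be one of the labels returned by the reranking model.
            raise ValueError(f"Unknown label {label!r}; available: {sorted(scores)}")
        if scores[label] >= min_score:  # assumption: inclusive threshold
            kept.append((text, scores[label]))
    # Assumption: most similar documents first, then truncate if requested.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept if max_documents is None else kept[:max_documents]


# Made-up scores, for illustration only.
example = [
    ("Doc about opening hours", {"LABEL_0": 0.08, "LABEL_1": 0.92}),
    ("Unrelated doc", {"LABEL_0": 0.85, "LABEL_1": 0.15}),
    ("Doc about holidays", {"LABEL_0": 0.25, "LABEL_1": 0.75}),
]
print(keep_relevant(example))
```

With the settings above (`label` set to `LABEL_1`, `min_score` set to 0.7), only the two documents scoring at least 0.7 on `LABEL_1` would be passed on to text generation.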

### Guardrail settings

Our guardrail models, for instance [cmarkea/bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail)
(see our organisation for other variants), return the following scores through the llm-inference server:
```json
{
  "response": [
    [
      {
        "label": "insult",
        "score": 0.7866228222846985
      },
      {
        "label": "obscene",
        "score": 0.4258439540863037
      },
      {
        "label": "sexual_explicit",
        "score": 0.1550784707069397
      },
      {
        "label": "identity_attack",
        "score": 0.05749328061938286
      },
      {
        "label": "threat",
        "score": 0.022629201412200928
      }
    ]
  ]
}
```

When the guardrail setting is added to the RAG chain, the response is rejected if any of the evaluated toxicity scores
is higher than the `max_score` setting. Here is an example of guardrail setting:
```json
{
  "guardrail_setting": {
    "api_base": "http://localhost:8083",
    "provider": "BloomzGuardrail",
    "max_score": 0.5
  }
}
```
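
The rejection rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the orchestrator's guardrail implementation: the helper name `exceeded_labels` is hypothetical, the payload is assumed to have exactly the shape of the sample response above, and "higher than" is read as a strict `>` comparison.

```python
from typing import Dict


def exceeded_labels(payload: dict, max_score: float = 0.5) -> Dict[str, float]:
    """Return every label whose score is strictly above `max_score` (hypothetical helper)."""
    flagged: Dict[str, float] = {}
    # "response" holds one list of {label, score} entries per evaluated text.
    for per_text in payload["response"]:
        for item in per_text:
            if item["score"] > max_score:
                # Keep the highest score seen for a given label.
                flagged[item["label"]] = max(
                    item["score"], flagged.get(item["label"], 0.0)
                )
    return flagged


# Sample response copied from the guardrail example above.
payload = {
    "response": [
        [
            {"label": "insult", "score": 0.7866228222846985},
            {"label": "obscene", "score": 0.4258439540863037},
            {"label": "sexual_explicit", "score": 0.1550784707069397},
            {"label": "identity_attack", "score": 0.05749328061938286},
            {"label": "threat", "score": 0.022629201412200928},
        ]
    ]
}

flagged = exceeded_labels(payload, max_score=0.5)
rejected = bool(flagged)  # any label above the threshold rejects the response
print(rejected, flagged)
```

With `max_score` set to 0.5, only the `insult` score exceeds the threshold, which is enough for the response to be rejected.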

<p align="right">(<a href="#readme-top">back to top</a>)</p>

[product-screenshot]: images/screenshot.png
1,112 changes: 553 additions & 559 deletions gen-ai/orchestrator-server/src/main/python/server/poetry.lock

Large diffs are not rendered by default.

@@ -90,3 +90,10 @@ class GenAIPromptTemplateException(GenAIOrchestratorException):

    def __init__(self, info: ErrorInfo):
        super().__init__(ErrorCode.GEN_AI_PROMPT_TEMPLATE_ERROR, info)


class GenAIUnknownLabelException(GenAIOrchestratorException):
    """Unknown Label error"""

    def __init__(self, info: ErrorInfo):
        super().__init__(ErrorCode.GEN_AI_UNKNOWN_LABEL_ERROR, info)
@@ -0,0 +1,14 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
@@ -0,0 +1,14 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
@@ -0,0 +1,48 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from typing import Literal, Optional

from pydantic import Field

from gen_ai_orchestrator.models.contextual_compressor.compressor_provider import (
    ContextualCompressorProvider,
)
from gen_ai_orchestrator.models.contextual_compressor.compressor_setting import (
    BaseCompressorSetting,
)
from gen_ai_orchestrator.services.contextual_compressor.bloomz_rerank import (
    BloomzRerank,
)


class BloomzCompressorSetting(BaseCompressorSetting):
    provider: Literal[ContextualCompressorProvider.BLOOMZ] = Field(
        description='The contextual compressor provider.',
        examples=[ContextualCompressorProvider.BLOOMZ],
        default=ContextualCompressorProvider.BLOOMZ.value,
    )
    min_score: Optional[float] = Field(
        description='Minimum retailment score.',
        default=BloomzRerank.__fields__['min_score'].default,
    )
    endpoint: str = Field(description='Bloomz scoring endpoint.')
    max_documents: Optional[int] = Field(
        description='Maximum number of documents to return to avoid exceeding max tokens for text generation.',
        default=BloomzRerank.__fields__['max_documents'].default,
    )
    label: Optional[str] = Field(
        description='Label to use for reranking. The output label is usually documented on the huggingface model card '
        'or in the model\'s config.json file (id2label..).', default='entailment'
    )
@@ -0,0 +1,24 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from enum import Enum, unique


@unique
class ContextualCompressorProvider(str, Enum):
    BLOOMZ = 'BloomzRerank'

    @classmethod
    def has_value(cls, value) -> bool:
        return value in cls._value2member_map_
@@ -0,0 +1,26 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from pydantic import BaseModel, Field

from gen_ai_orchestrator.models.contextual_compressor.compressor_provider import (
    ContextualCompressorProvider,
)


class BaseCompressorSetting(BaseModel):
    provider: ContextualCompressorProvider = Field(
        description='The contextual compressor provider.',
        examples=[ContextualCompressorProvider.BLOOMZ],
    )
@@ -0,0 +1,25 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from typing import Annotated, Union

from fastapi import Body

from gen_ai_orchestrator.models.contextual_compressor.bloomz.bloomz_compressor_setting import (
    BloomzCompressorSetting,
)

CompressorSetting = Annotated[
    Union[BloomzCompressorSetting], Body(discriminator='provider')
]
@@ -0,0 +1,14 @@
# Copyright (C) 2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
@@ -0,0 +1,38 @@
# Copyright (C) 2023-2024 Credit Mutuel Arkea
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
"""Model for creating BloomzEMSetting."""

from typing import Literal, Optional

from pydantic import Field

from gen_ai_orchestrator.models.em.em_provider import EMProvider
from gen_ai_orchestrator.models.em.em_setting import BaseEMSetting


class BloomzEMSetting(BaseEMSetting):
    """A class for Bloomz Embedding Model Setting."""

    provider: Literal[EMProvider.BLOOMZ] = Field(
        description='The Embedding Model provider.', examples=[EMProvider.BLOOMZ]
    )
    api_base: str = Field(
        description='The base url of the provider API.', examples=['http://doc.tock.ai']
    )
    pooling: Optional[str] = Field(
        description='Pooling method.',
        default='last',
        examples=['mean', 'last'],
    )
@@ -24,6 +24,7 @@ class EMProvider(str, Enum):
    OPEN_AI = 'OpenAI'
    AZURE_OPEN_AI_SERVICE = 'AzureOpenAIService'
    OLLAMA = 'Ollama'
    BLOOMZ = 'Bloomz'

    @classmethod
    def has_value(cls, value) -> bool:
@@ -18,7 +18,9 @@
from pydantic import BaseModel, Field

from gen_ai_orchestrator.models.em.em_provider import EMProvider
-from gen_ai_orchestrator.models.security.raw_secret_key.raw_secret_key import RawSecretKey
+from gen_ai_orchestrator.models.security.raw_secret_key.raw_secret_key import (
+    RawSecretKey,
+)
from gen_ai_orchestrator.models.security.security_types import SecretKey


@@ -31,5 +33,5 @@ class BaseEMSetting(BaseModel):
    api_key: Optional[SecretKey] = Field(
        description='The secret that stores the API key used to authenticate requests to the AI Provider API.',
        examples=[RawSecretKey(value='ab7-14Ed2-dfg2F-A1IV4B')],
-        default=None
+        default=None,
    )
@@ -21,11 +21,17 @@
from gen_ai_orchestrator.models.em.azureopenai.azure_openai_em_setting import (
    AzureOpenAIEMSetting,
)
-from gen_ai_orchestrator.models.em.ollama.ollama_em_setting import OllamaEMSetting
+from gen_ai_orchestrator.models.em.bloomz.bloomz_em_setting import (
+    BloomzEMSetting,
+)
+from gen_ai_orchestrator.models.em.ollama.ollama_em_setting import (
+    OllamaEMSetting,
+)
from gen_ai_orchestrator.models.em.openai.openai_em_setting import (
    OpenAIEMSetting,
)

EMSetting = Annotated[
-    Union[OpenAIEMSetting, AzureOpenAIEMSetting, OllamaEMSetting], Body(discriminator='provider')
+    Union[OpenAIEMSetting, AzureOpenAIEMSetting, OllamaEMSetting, BloomzEMSetting],
+    Body(discriminator='provider'),
]