When the user sends a message to the backend, the evaluator LLM is used to check if the user's message is malicious in any way.
Like other LLM calls, this takes a while (maybe ~5s), and the backend waits for this to complete before proceeding.
We could possibly set this evaluator call going in the background while the backend continues as if the message is safe.
At some point the evaluator call will resolve; at that point we can decide whether or not to block the message.
Some considerations:
Do not send the reply until the evaluator call has resolved.
The user's session is modified during message processing. We shouldn't apply those modifications until the evaluator has confirmed the message is ok (see the sketch below).
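One possible shape for this, as a minimal sketch only (TypeScript with Promise.all; `evaluateMessage`, `generateReply` and `handleUserMessage` are made-up placeholder names, not the actual functions in the linked PR):

```ts
// Hypothetical sketch: kick off the evaluator in the background, carry on
// generating the reply, and only commit/send once the evaluator says "no".

interface Session {
  history: string[];
}

interface ReplyResult {
  reply: string;
  sessionChanges: string[]; // staged history entries, not yet applied
}

// Placeholder for the evaluator LLM call (the real one takes ~5s).
async function evaluateMessage(message: string): Promise<{ isMalicious: boolean }> {
  return { isMalicious: /ignore previous instructions/i.test(message) };
}

// Placeholder for the main model / QA LLM pipeline. Session changes are
// staged rather than applied, so nothing is committed before the evaluator answers.
async function generateReply(message: string, session: Session): Promise<ReplyResult> {
  return { reply: `Echo: ${message}`, sessionChanges: [message] };
}

async function handleUserMessage(message: string, session: Session): Promise<string> {
  // Start the evaluator call without awaiting it, so it runs in the background.
  const evaluationPromise = evaluateMessage(message);

  // Continue as if the message is safe: talk to the main model (and QA LLM if needed).
  const replyPromise = generateReply(message, session);

  // Do not send the reply until the evaluator call has resolved.
  const [evaluation, result] = await Promise.all([evaluationPromise, replyPromise]);

  if (evaluation.isMalicious) {
    // Discard the staged reply and session changes; block instead.
    return "Message blocked by the evaluator.";
  }

  // Only now do we mutate the session and return the reply.
  session.history.push(...result.sessionChanges);
  return result.reply;
}
```

The point of the Promise.all is that the evaluator and the main model run concurrently, but the reply is never posted (and the session never touched) before the evaluator has resolved.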
Info on implementation in linked PR
Acceptance criteria? Smoke test that everything works OK: defences turned on, messages blocked, etc.
Tested a few examples, attaching the simplest one. Happy with the behaviour: the backend does not wait or stop completely to check the evaluator LLM result but proceeds to talk to the actual model (and even the QA LLM if required), yet it does not post a reply until it has got a 'NO' from the evaluator LLM.