
Fail gracefully when OpenAI model stops working #29284

Closed
Tracked by #28813
victoralfaro-dotcms opened this issue Jul 18, 2024 · 4 comments · Fixed by #29748, #29761, #29774 or #29806


victoralfaro-dotcms commented Jul 18, 2024

Parent Issue

#28813

User Story

As a Java developer, I want to provide a fallback strategy for when our OpenAI client fails. We need to identify the cause of the failure, which falls into one of three main categories:

  • The current model in use has been decommissioned by OpenAI
  • The current model is invalid
  • Any other reason like network failure, timeout, etc.

The fallback strategy consists of the following steps:

  • When a model in use has been decommissioned or is invalid:
    • Log a multi-line message in the logs
    • Internally mark the model as unavailable
    • Switch to the next model in the list
  • When an attempt to use a model in the list fails for other reasons (timeout, network, etc.):
    • Log a single-line message in the logs
    • Try the next model in the list
  • When all models in the list fail:
    • Log a multi-line message in the logs
    • Post a growl
    • Return an error from dotAI
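The steps above can be sketched in Java roughly as follows. This is a minimal illustration of the intended behavior, not the actual dotCMS implementation; all class and method names (AiModel, callWithFallback, InvalidModelException) are hypothetical:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the fallback strategy; not the dotCMS code.
public class ModelFallbackSketch {

    enum ModelStatus { ACTIVE, INVALID }

    static final class AiModel {
        final String name;
        ModelStatus status = ModelStatus.ACTIVE;
        AiModel(String name) { this.name = name; }
    }

    /** Thrown when a model is invalid or decommissioned. */
    static final class InvalidModelException extends RuntimeException {
        InvalidModelException(String msg) { super(msg); }
    }

    /**
     * Tries each model in order. Invalid/decommissioned models are marked
     * unavailable and never retried; transient failures (timeout, network)
     * are logged on a single line and the next model is tried.
     */
    static String callWithFallback(List<AiModel> models, Function<String, String> client) {
        for (AiModel model : models) {
            if (model.status != ModelStatus.ACTIVE) {
                continue; // skip models already marked unavailable
            }
            try {
                return client.apply(model.name);
            } catch (InvalidModelException e) {
                // decommissioned or invalid: mark unavailable and fall back
                System.err.printf("Model [%s] failed then setting its status to [INVALID].%n", model.name);
                model.status = ModelStatus.INVALID;
            } catch (RuntimeException e) {
                // any other failure: single-line log, try the next model
                System.err.printf("Model [%s] failed with error: [%s]. Trying next model.%n",
                        model.name, e.getMessage());
            }
        }
        // all models failed: surface an error to the caller
        throw new IllegalStateException("All models have been exhausted");
    }

    public static void main(String[] args) {
        List<AiModel> models = List.of(new AiModel("invalid-model"), new AiModel("gpt-4o-mini"));
        String result = callWithFallback(models, name -> {
            if (name.equals("invalid-model")) {
                throw new InvalidModelException("not operational");
            }
            return "ok from " + name;
        });
        System.out.println(result); // prints: ok from gpt-4o-mini
    }
}
```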

Acceptance Criteria

  • The system can detect if a model has been decommissioned or is invalid.
  • The system logs a multi-line message when a model is decommissioned or invalid.
  • The system marks the decommissioned or invalid model as unavailable internally.
  • The system switches to the next model in the list when a model is decommissioned or invalid.
  • The system logs a single-line message when a model fails due to reasons like timeout or network failure.
  • The system retries with the next model in the list when a model fails due to timeout or network failure.
  • The system logs a multi-line message, posts a growl, and returns an error when all models in the list fail.
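To make the first criterion concrete, a classifier could bucket an OpenAI error response into the three failure categories. The heuristics below (error code model_not_found, a message containing "deprecated") are inferred from the sample error responses quoted in the QA notes further down this issue; the real detection logic in dotCMS may differ:

```java
// Hypothetical classifier for OpenAI error responses; not the dotCMS code.
public class ModelFailureClassifier {

    enum FailureKind { DECOMMISSIONED, INVALID, OTHER }

    static FailureKind classify(String errorCode, String errorMessage) {
        if ("model_not_found".equals(errorCode)) {
            // OpenAI returns model_not_found both for deprecated and for
            // unknown models; the message text distinguishes the two.
            return errorMessage != null && errorMessage.contains("deprecated")
                    ? FailureKind.DECOMMISSIONED
                    : FailureKind.INVALID;
        }
        return FailureKind.OTHER; // timeout, network failure, etc.
    }

    public static void main(String[] args) {
        System.out.println(classify("model_not_found",
                "The model `text-davinci-003` has been deprecated")); // DECOMMISSIONED
        System.out.println(classify("model_not_found",
                "The model `invalid-model` does not exist"));         // INVALID
        System.out.println(classify(null, "Connection timed out"));   // OTHER
    }
}
```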

dotCMS Version

master

Proposed Objective

Core Features

Proposed Priority

Priority 2 - Important

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

to_define

Assumptions & Initiation Needs

  • Clear identification of the list of models to be used as fallback.
  • Access to logging and error reporting mechanisms.
  • Development environment set up to simulate different failure scenarios.

Quality Assurance Notes & Workarounds

  • Test cases to simulate decommissioned models, invalid models, and other failure scenarios (timeout, network issues).
  • Verify logging for both multi-line and single-line messages.
  • Ensure the system correctly switches to the next model in the list.
  • Validate the error handling and reporting when all models fail.
@victoralfaro-dotcms victoralfaro-dotcms moved this from New to Next 1-3 Sprints in dotCMS - Product Planning Jul 18, 2024
@john-thomas-dotcms john-thomas-dotcms moved this from Next 1-3 Sprints to Current Sprint Backlog in dotCMS - Product Planning Jul 24, 2024
@victoralfaro-dotcms victoralfaro-dotcms self-assigned this Jul 30, 2024
@victoralfaro-dotcms victoralfaro-dotcms moved this from Current Sprint Backlog to In Progress in dotCMS - Product Planning Jul 30, 2024
github-merge-queue bot pushed a commit that referenced this issue Jul 30, 2024
…29236)

Removing hardcoded OpenAI models from the OpenAIModel enum class. Instead,
they are now part of the `dotAI.yml` application descriptor so the user can
configure not just one but multiple models for `text`, `image`, and
`embeddings`.
More than one model is specified as a comma-delimited list in the new dotAI
App params.
Sometimes we accept a model in the payload of our AI endpoints; in that case
we validate it and throw an exception if it is invalid.
When no model is present in the payload, our backend injects the current
model.
Which leads us to the question: how is the current model determined?
This is part of the work defined for
#29284 (model fallback)
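Splitting a comma-delimited app param into an ordered list of model names could look like the following sketch; the method name and trimming behavior are assumptions, not the actual dotCMS parsing code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of parsing a comma-delimited model list such as
// "invalid-model,text-davinci-003,gpt-4o-mini" from the dotAI App params.
public class ModelListParser {

    static List<String> parseModelNames(String raw) {
        if (raw == null || raw.isBlank()) {
            return List.of();
        }
        return Arrays.stream(raw.split(","))
                .map(String::trim)           // tolerate spaces around commas
                .filter(s -> !s.isEmpty())   // drop empty entries
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseModelNames("invalid-model, text-davinci-003 ,gpt-4o-mini"));
        // prints: [invalid-model, text-davinci-003, gpt-4o-mini]
    }
}
```

Preserving the order of the list matters here, since the fallback strategy tries models in the order the user configured them.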
victoralfaro-dotcms added a commit that referenced this issue Jul 31, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 1, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 2, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 2, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 9, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 10, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 16, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 20, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 22, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 23, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 27, 2024
…29748)

Adding multi-model support with a more robust class than just a String: the AIModel class. Introducing a README file to document the new usage of AIClient instead of the OpenAIRequest class. Adding AI testing utils as well as the necessary Wiremock templates for IT and postman tests.

Refs: #29284
github-merge-queue bot pushed a commit that referenced this issue Aug 27, 2024
…29761)

Adding multi-model support with a more robust class than just a String: the
AIModel class. Introducing a README file to document the new usage of
AIClient instead of the OpenAIRequest class. Adding AI testing utils as well
as the necessary Wiremock templates for IT and postman tests.

Refs: #29284
victoralfaro-dotcms added a commit that referenced this issue Aug 28, 2024
Removing the `OpenAIRequest` class in favor of a set of classes explained in `src/main/java/com/dotcms/ai/client/README.md` and integrating it with the corresponding consuming components. Integration tests added/updated.

Refs: #29284
victoralfaro-dotcms added a commit that referenced this issue Aug 28, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 28, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 28, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 28, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 29, 2024

victoralfaro-dotcms commented Aug 29, 2024

QA Notes

Configure a dotAI application using a valid key:

  • Specify the value invalid-model,text-davinci-003,gpt-4o-mini in the Model Names field
  • Specify the value other-model,dall-e-3 in the Image Model Names field
  • Specify the value something-else,text-embedding-ada-002 in the Embeddings Model Names field
  • Add a custom property named com.dotcms.ai.debug.logging with a value of true and press the Save button in the Custom Properties section
  • Leave the rest of the fields as default
  • When you hit the Save button you should see a warning notification indicating that some of the models are invalid or decommissioned (invalid-model and text-davinci-003 in this case)

Reset DB for embeddings

  • Run the following command to apply the vector changes to the database schema:
# extract basic auth from our postman collections, it's all over the place
curl --location --request DELETE 'http://localhost:8080/api/v1/ai/embeddings/db' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic <basic auth>'

dotAI Portlet:

Go to the Manage Embeddings/Indexes tab

  • In the Content to Index by Query field specify: +contentType:(Blog)
  • Hit the Build Index button
    • Check the logs; you should see something like:
Model [something-else] is not suitable for encoding, marking it as invalid and falling back to other models
  • When finished you should see some indexes listed under the Indexes section; if not, refresh the page and go back to the embeddings tab

Go to the Search and Chat with dotCMS tab

  • In the prompt field specify something like: What is French Polynesia like?
  • Choose Semantic Search as the Response Type and hit Submit button
  • You should see some results displayed and coming from our local AI embeddings
  • Choose Streaming Chat as the Response Type and hit Submit button
  • You should see some results displayed and coming from our local AI embeddings
  • Choose REST/JSON Chat as the Response Type and hit Submit button
  • You should see some results displayed and coming from our local AI embeddings
  • Check the logs; you should see something like:
Model [invalid-model] is not suitable for encoding, marking it as invalid and falling back to other models
...
Model [text-davinci-003] found, setting as current model
...
Model [invalid-model] failed with error: [Model [invalid-model] is not operational]. Trying next model.
...
Model [invalid-model] failed then setting its status to [INVALID].
...
Model [text-davinci-003] failed with response:
{
  "error": {
      "message": "The model `text-davinci-003` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations",
      "type": "invalid_request_error",
      "param": null,
      "code": "model_not_found"
    }
}
Trying next model.
...
Model [text-davinci-003] failed then setting its status to [DECOMMISSIONED].
...
Model [gpt-4o-mini] succeeded. No need to fallback.
  • Refresh the page
  • Toggle the Advanced button and you should see that the gpt-4o-mini model is selected; this means a fallback has happened and the invalid-model and text-davinci-003 models are excluded from use

Go to Image Playground

  • Add something in the prompt to be used when generating the image
  • You should see an image created from what you just specified
  • Check the logs; you should see something like:
Model [other-model] failed with response:
{
  "error": {
    "code": "invalid_size",
    "message": "The size is not supported by this model.",
    "param": null,
    "type": "invalid_request_error"
  }
}
Trying next model.
...
Model [dall-e-3] succeeded. No need to fallback.

Blog content block editor

  • Repeat the dotAI app configuration process to reset the models once again
  • Open a Blog content
  • Scroll down to Blog Content and inside the editor add an AI Content block (by pressing / + AI and choosing the AI Content option)
  • In the opened dialog write something to be used when generating text
  • You should see some results displayed
  • Check the logs; you should see something like:
Model [invalid-model] failed with response:
{
    "error": {
        "message": "The model `xxx` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
Trying next model.
...
Model [invalid-model] failed then setting its status to [INVALID].
...
Model [text-davinci-003] failed with response:
{
   "error": {
       "message": "The model `text-davinci-003` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations",
       "type": "invalid_request_error",
       "param": null,
       "code": "model_not_found"
   }
}
Trying next model.
...
 Model [text-davinci-003] failed then setting its status to [DECOMMISSIONED].
...
Model [gpt-4o-mini] succeeded. No need to fallback
  • Insert the generated text
  • Inside the editor add an AI Image (by pressing / + AI and choosing the AI Image option)
  • Add something in the prompt to be used when generating the Image
  • You should see an image created from what you just specified
  • Check the logs; you should see something like:
Model [other-model] failed with response:
{
  "error": {
    "code": "invalid_size",
    "message": "The size is not supported by this model.",
    "param": null,
    "type": "invalid_request_error"
  }
}
Trying next model.
...
Model [dall-e-3] succeeded. No need to fallback.
  • Insert the image
  • Save and publish the Blog
  • Close it and open it again and make sure the generated content is there

Blog content SEO tab

  • Repeat the dotAI app configuration process to reset the models once again
  • Open a Blog content
  • Go to the SEO tab
  • Hit the Generate With DotAI button
  • Verify that in fact the text has been generated
  • Check the logs; you should see something like:
Model [invalid-model] failed with response:
{
    "error": {
        "message": "The model `xxx` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
Trying next model.
...
Model [invalid-model] failed then setting its status to [INVALID].
...
Model [text-davinci-003] failed with response:
{
   "error": {
       "message": "The model `text-davinci-003` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations",
       "type": "invalid_request_error",
       "param": null,
       "code": "model_not_found"
   }
}
Trying next model.
...
 Model [text-davinci-003] failed then setting its status to [DECOMMISSIONED].
...
Model [gpt-4o-mini] succeeded. No need to fallback

No models available scenario

  • Repeat the dotAI app configuration process to reset the models once again, but remove the valid models gpt-4o-mini, dall-e-3, and text-embedding-ada-002 from the corresponding fields
  • Do the Blog content AI Content tests
  • You should see an error message stating that it was not possible to generate the content
  • A notification should be displayed with a message like: All the TEXT models: [invalid-model, text-davinci-003] have been exhausted since they are invalid or has been decommissioned
  • Check the logs; you should see something like:
com.dotcms.ai.client.AIModelFallbackStrategy:info(99): Model [invalid-model] failed with response:
{
    "error": {
        "message": "The model `invalid-model` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
Trying next model.
...
Model [invalid-model] failed then setting its status to [INVALID].
...
Model [text-davinci-003] failed with response:
{
    "error": {
        "message": "The model `text-davinci-003` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
Trying next model.
...
Model [text-davinci-003] failed then setting its status to [DECOMMISSIONED].
...
Model [text-davinci-003] is the last one. Cannot fallback anymore.
...

Network/Timeout Error Scenario

  • Repeat the dotAI app configuration process to reset the models once again
  • Point api.openai.com at an unreachable IP by editing your /etc/hosts file and adding a line like this:
123.123.123.123  api.openai.com
  • Perform the Blog content AI Content tests
  • After a while you should see an error message stating that it was not possible to generate the content
  • Check the logs; you should see something like:
INVALID REQUEST: Connect to api.openai.com:443 [api.openai.com/123.123.123.123] failed: Operation timed out (Connection timed out)
org.apache.http.conn.HttpHostConnectException: Connect to api.openai.com:443 [api.openai.com/123.123.123.123] failed: Operation timed out (Connection timed out)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156) ~[httpclient-4.5.13.jar:4.5.13]
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376) ~[httpclient-4.5.13.jar:4.5.13]
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[httpclient-4.5.13.jar:4.5.13]
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[httpclient-4.5.13.jar:4.5.13]
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[httpclient-4.5.13.jar:4.5.13]
...
  • Undo the changes to the /etc/hosts file
  • Perform the Blog content AI Content tests
  • Verify that everything now works correctly

github-merge-queue bot pushed a commit that referenced this issue Aug 29, 2024
…29774)

Removing the `OpenAIRequest` class in favor of a set of classes explained in
`src/main/java/com/dotcms/ai/client/README.md` and integrating it with the
corresponding consuming components. Integration tests added/updated.

Refs: #29284
@victoralfaro-dotcms victoralfaro-dotcms moved this from In Progress to In Review in dotCMS - Product Planning Aug 30, 2024
@victoralfaro-dotcms victoralfaro-dotcms moved this from In Review to Internal QA in dotCMS - Product Planning Aug 30, 2024
victoralfaro-dotcms added a commit that referenced this issue Aug 30, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 30, 2024
Applying missing feedback
@jcastro-dotcms
Contributor

INTERNAL QA: PASSED

  • Docker Image: trunk_ea541c2

The expected results and error messages described in Victor's comment show up as expected. Note that it is VERY IMPORTANT that, in some scenarios, you reset the dotAI configuration as Victor said:

Repeat the dotAI app configuration process to reset the models once again

Otherwise, you may not get the expected result in the scenario.

@jcastro-dotcms jcastro-dotcms removed their assignment Sep 3, 2024
@jcastro-dotcms jcastro-dotcms moved this from Internal QA to QA - Backlog in dotCMS - Product Planning Sep 3, 2024
@dsilvam dsilvam closed this as completed Sep 3, 2024
@github-project-automation github-project-automation bot moved this from QA - Backlog to Internal QA in dotCMS - Product Planning Sep 3, 2024
@dsilvam dsilvam moved this from Internal QA to QA - Backlog in dotCMS - Product Planning Sep 3, 2024
@josemejias11
Contributor

Approved: Tested on trunk_e622545, Docker, macOS 14.5, FF v126.0.1

@josemejias11 josemejias11 moved this from QA - In Progress to Done in dotCMS - Product Planning Sep 4, 2024