
[BUG] o1-preview Model Error: Unsupported system Role in Messages in big-agi-2 branch #639

Open
zubus opened this issue Sep 13, 2024 · 9 comments
Labels
type: bug Something isn't working

Comments


zubus commented Sep 13, 2024

Description

I encountered an error while using the o1-preview model in the big-agi-2 branch. The error message states that messages.role does not support system with this model. This appears to be a limitation of the o1-preview and o1-mini models, which only accept the user and assistant roles in the messages array.

[screenshot: error message returned by the OpenAI API]
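
For context, here's a minimal reproduction sketch (not big-AGI code; the endpoint and role constraint are OpenAI's Chat Completions API, the helper name is made up), run as an ES module on Node 18+:

```typescript
// Minimal reproduction sketch, assuming a valid OPENAI_API_KEY in the environment.
// o1-preview rejects any message with role 'system', so the usual workaround is
// to remap those messages to role 'user' before sending the request.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

function remapSystemForO1(modelId: string, messages: ChatMessage[]): ChatMessage[] {
  if (!modelId.startsWith('o1')) return messages;
  return messages.map((m) => (m.role === 'system' ? { ...m, role: 'user' as const } : m));
}

const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a helpful assistant.' }, // triggers the error if sent as-is
  { role: 'user', content: 'Hello!' },
];

const res = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'o1-preview',
    messages: remapSystemForO1('o1-preview', messages),
  }),
});
console.log(await res.json());
```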

Device and browser

Firefox

Screenshots and more

No response

Willingness to Contribute

  • 🙋‍♂️ Yes, I would like to contribute a fix.
zubus added the type: bug label Sep 13, 2024

hourianto commented Sep 14, 2024

Yeah, o1 is quite weird currently; here's a quick-and-dirty patch for it (partially Sonnet-made). Also beware that the model itself currently does not support images or streaming.

As a file: https://paste.debian.net/plainh/a4a9f800, so you can just do:

```bash
curl -O https://paste.debian.net/plainh/a4a9f800
git apply a4a9f800
npm run build
```

Regarding the max_tokens change, if you're curious: OpenAI has made max_completion_tokens the new argument that's supposed to be used instead. I didn't completely remove max_tokens, as older proxies and services might still only accept it, but OpenAI themselves recommend that developers use only max_completion_tokens for all models from now on, not just o1.

As plain text:

```diff
diff --git a/src/modules/aix/client/aix.client.ts b/src/modules/aix/client/aix.client.ts
index efd3f843..a19d81ea 100644
--- a/src/modules/aix/client/aix.client.ts
+++ b/src/modules/aix/client/aix.client.ts
@@ -256,14 +256,14 @@ async function _aix_LL_ChatGenerateContent(
   const contentReassembler = new ContentReassembler(llAccumulator);
 
   try {
-
+    const disableStreaming = getLabsDevNoStreaming() || aixModel.id.startsWith("o1")
     // tRPC Aix Chat Generation (streaming) API - inside the try block for deployment path errors
     const particles = await apiStream.aix.chatGenerateContent.mutate({
       access: aixAccess,
       model: aixModel,
       chatGenerate: aixChatGenerate,
       context: aixContext,
-      streaming: getLabsDevNoStreaming() ? false : aixStreaming, // [DEV] disable streaming if set in the UX (testing)
+      streaming: disableStreaming ? false : aixStreaming, // [DEV] disable streaming if set in the UX (testing)
       connectionOptions: getLabsDevMode() ? { debugDispatchRequestbody: true } : undefined,
     }, {
       signal: abortSignal,
diff --git a/src/modules/aix/server/dispatch/chatGenerate/adapters/openai.chatCompletions.ts b/src/modules/aix/server/dispatch/chatGenerate/adapters/openai.chatCompletions.ts
index b8de11ba..ec1f9aa0 100644
--- a/src/modules/aix/server/dispatch/chatGenerate/adapters/openai.chatCompletions.ts
+++ b/src/modules/aix/server/dispatch/chatGenerate/adapters/openai.chatCompletions.ts
@@ -35,6 +35,8 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
   const hotFixSquashMultiPartText = openAIDialect === 'deepseek';
   const hotFixThrowCannotFC = openAIDialect === 'deepseek' || openAIDialect === 'openrouter' /* OpenRouter FC support is not good (as of 2024-07-15) */ || openAIDialect === 'perplexity';
 
+  // Check if the model ID starts with "o1"
+  const isO1Model = model.id.startsWith('o1');
 
   // Throw if function support is needed but missing
   if (chatGenerate.tools?.length && hotFixThrowCannotFC)
@@ -53,6 +55,12 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
   if (hotFixAlternateUserAssistantRoles)
     chatMessages = _fixAlternateUserAssistantRoles(chatMessages);
 
+  // New hotfix: Replace system messages with user messages if model.id starts with "o1"
+  if (isO1Model) {
+    chatMessages = chatMessages.map(message => 
+      message.role === 'system' ? { ...message, role: 'user' } : message
+    );
+  }
 
   // Construct the request payload
   let payload: TRequest = {
@@ -61,8 +69,9 @@ export function aixToOpenAIChatCompletions(openAIDialect: OpenAIDialects, model:
     tools: chatGenerate.tools && _toOpenAITools(chatGenerate.tools),
     tool_choice: chatGenerate.toolsPolicy && _toOpenAIToolChoice(openAIDialect, chatGenerate.toolsPolicy),
     parallel_tool_calls: undefined,
-    max_tokens: model.maxTokens !== undefined ? model.maxTokens : undefined,
-    temperature: model.temperature !== undefined ? model.temperature : undefined,
+    max_tokens: isO1Model ? undefined : (model.maxTokens !== undefined ? model.maxTokens : undefined),
+    max_completion_tokens: isO1Model ? model.maxTokens : undefined,
+    temperature: isO1Model ? 1 : (model.temperature !== undefined ? model.temperature : undefined),
     top_p: undefined,
     n: hotFixOnlySupportN1 ? undefined : 0, // NOTE: we choose to not support this at the API level - most downstram ecosystem supports 1 only, which is the default
     stream: streaming,
diff --git a/src/modules/aix/server/dispatch/wiretypes/openai.wiretypes.ts b/src/modules/aix/server/dispatch/wiretypes/openai.wiretypes.ts
index 290afd5e..10b60260 100644
--- a/src/modules/aix/server/dispatch/wiretypes/openai.wiretypes.ts
+++ b/src/modules/aix/server/dispatch/wiretypes/openai.wiretypes.ts
@@ -199,6 +199,7 @@ export namespace OpenAIWire_API_Chat_Completions {
 
     // common model configuration
     max_tokens: z.number().optional(),
+    max_completion_tokens: z.number().optional(),
     temperature: z.number().min(0).max(2).optional(),
    top_p: z.number().min(0).max(1).optional(),
```
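
For reference, the same parameter handling distilled into a standalone sketch (the function and variable names are illustrative, not big-AGI's actual code; max_tokens, max_completion_tokens, and temperature are the real OpenAI request fields):

```typescript
// Illustrative sketch of the o1 parameter handling in the patch above:
// o1 models get max_completion_tokens and a fixed temperature of 1, while
// other models keep the legacy max_tokens field for proxy compatibility.
function buildSamplingParams(modelId: string, maxTokens?: number, temperature?: number) {
  const isO1Model = modelId.startsWith('o1');
  return {
    max_tokens: isO1Model ? undefined : maxTokens,
    max_completion_tokens: isO1Model ? maxTokens : undefined,
    temperature: isO1Model ? 1 : temperature, // o1 only accepts the default of 1
  };
}
```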

@enricoros
Owner

Thanks guys, @hourianto @zubus. Full support landed in the 2 branch, including workarounds, cost accounting, and a breakdown of the reasoning tokens.

Could you try it out and let me know how your experience is?

[screenshot: Screenshot_20240914_134304_Chrome.jpg]
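
For anyone curious where the reasoning-tokens breakdown comes from: o1 responses report it in the usage object under completion_tokens_details.reasoning_tokens. A small sketch of reading it (the summarizing helper is illustrative, not big-AGI's code):

```typescript
// Sketch: the o1 usage object reports reasoning tokens separately; they are
// billed as completion tokens even though they never appear in the output.
interface O1Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  completion_tokens_details?: { reasoning_tokens: number };
}

function summarizeUsage(usage: O1Usage): string {
  const reasoning = usage.completion_tokens_details?.reasoning_tokens ?? 0;
  const visible = usage.completion_tokens - reasoning;
  return `prompt=${usage.prompt_tokens} reasoning=${reasoning} visible=${visible}`;
}
```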

@hourianto

Thanks, yeah, the new patch works, although I had to go into model settings for OpenAI and press the Model check button again so it updated the model entries.


zubus commented Sep 15, 2024

> Thanks, yeah, the new patch works, although I had to go into model settings for OpenAI and press the Model check button again so it updated the model entries.

I had to follow the same steps. Now both models work well. Thanks for the fix.

@enricoros
Owner

Should I add a flag to force-reload the models for users when updates are needed?

In this case I added a special flag to the model to make the UI more aware of it (and I still want to message this to users better, so they're informed about the tradeoffs). I don't know whether people would enjoy the auto-reload or not, but for large changes like the o1 series it's important.


SamKr commented Sep 16, 2024

Not related to this ticket, but a setting to auto-reload models would be fantastic. The people I share big-AGI with aren't all tech-savvy, and getting them to go to preferences -> models -> click model reload -> select the next provider -> etc. is hard. A notification like 'provider X has new models available, would you like to reload?' would be great as well.

@enricoros
Owner

That's a good experience suggestion, thank you @SamKr. For the sake of time, I can't do that notification yet: it would require continuously re-scanning the models and showing a notification when there are deltas. That's easy and not easy at the same time, because the user may have modified or deleted models in the meantime.

How would you rank the following options:

  1. (current) manual only with auto-refresh when the server changes environment variables
  2. also auto-refresh when a new version of big-AGI is released (1.15, 1.16, 2.0, etc.) - see the sketch after this list
  3. auto-refresh upon every new load in every user's browser, notifying the user of new models
  4. ...?
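
As a strawman for option 2, a minimal sketch (hypothetical helper names and storage interface; not big-AGI's actual code) of refreshing the model list once per released app version:

```typescript
// Hypothetical sketch for option 2: auto-refresh the model list exactly once
// per released big-AGI version, by remembering which version last refreshed.
const APP_VERSION = '2.0.0'; // assumed to be injected at build time

async function maybeAutoRefreshModels(
  storage: { get(key: string): string | null; set(key: string, value: string): void },
  refreshAllProviders: () => Promise<void>, // hypothetical: re-runs the per-provider models check
): Promise<void> {
  if (storage.get('modelsRefreshedForVersion') === APP_VERSION)
    return; // already refreshed for this release
  await refreshAllProviders();
  storage.set('modelsRefreshedForVersion', APP_VERSION);
}
```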


SamKr commented Sep 19, 2024

In my experience, new models only appear when I sync after an update. Is that correct? That's the only time I sync, so perhaps big-AGI can actually fetch new models even without an update?

Either way, I'd say 2 is a superb option, no user interaction at all. That would probably cover all my use cases. It also sounds like it wouldn't cost you too much effort. If there is a new model released between updates, and a user really requires it for some reason, they'd just have to refresh the list manually or wait for an update.

3 is a good option as well, if big-AGI can fetch new models even between updates. Not sure if starting a new conversation counts as a new load, but I'd say exclude that, so it would just auto-refresh when a user browses to big-AGI in the morning (I and most other users I know of keep it open all day).

Option 1 rarely applies I think, since that's not something you'd do on a regular basis.

Thanks for listening :)

@enricoros
Copy link
Owner

Thanks for the suggestion. Option 2 is probably the best compromise; great way to go.
