
Mismatch of parameters when using OpenAI Compatible API causing alternative tokens to not be returned? #101

Open
Mithrillion opened this issue Dec 8, 2024 · 5 comments

Comments

@Mithrillion

Expected:
When using the OpenAI Compatible API, you should be able to obtain the top 10 alternative tokens and their probabilities, as when using the llama.cpp API.

Observed:
When using the OpenAI Compatible API (tested locally with oobabooga and remotely with OpenRouter), the top token options are never returned.

Possible reason:
Mikupad is trying to pass logprobs: 10 in the completion requests, but as per the OpenRouter documentation (https://openrouter.ai/docs/requests#request-headers) and this OpenRouter example, the correct parameter name should be top_logprobs, with logprobs being a boolean that acts as a switch. I think the problem may be that, when using the OpenAI Compatible API, this parameter is not recognised on the server side because of this mismatch.

@Mithrillion
Author

Mithrillion commented Dec 8, 2024

Additional note: it seems most models on OpenRouter actually do not return logprobs anyway, but some do, like the 4o models. However, even with these models, the logprobs are not being displayed in Mikupad. I tested the API and it does return a list of top token logprobs. I have yet to test whether streaming vs. full message produces different results, though.

@lmg-anon
Owner

When using the OpenAI Compatible API (tested locally with oobabooga and remotely with OpenRouter), the top token options are never returned.

Are you sure you're using an hf_* model loader in oobabooga? Unless it stopped working recently, the top tokens used to work correctly.

the correct parameter name should be top_logprobs

top_logprobs is only used for the chat completion API; the text completion API uses logprobs as per the OpenAI API reference.
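For reference, a rough sketch of how the two request shapes differ per the OpenAI API reference (the model and prompt below are just placeholders taken from this thread):

Text completion (POST /v1/completions), where logprobs is an integer:

```json
{
  "model": "openai/gpt-4o-2024-11-20",
  "prompt": "Tell me a short joke.",
  "logprobs": 10
}
```

Chat completion (POST /v1/chat/completions), where logprobs is a boolean switch and top_logprobs carries the count:

```json
{
  "model": "openai/gpt-4o-2024-11-20",
  "messages": [{ "role": "user", "content": "Tell me a short joke." }],
  "logprobs": true,
  "top_logprobs": 10
}
```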

Additional note: it seems most models on OpenRouter actually do not return logprobs anyway, but some do, like the 4o models. However, even with these models, the logprobs are not being displayed in Mikupad. I tested the API and it does return a list of top token logprobs. I have yet to test whether streaming vs. full message produces different results, though.

You mean, using the text completion API and top_logprobs in the request? If that's the case, I think we could always send the top_logprobs field as well; it shouldn't hurt the other backends.
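Just to sketch that idea (hypothetical request body, not tested against any backend), the text completion request would then carry both fields:

```json
{
  "prompt": "Tell me a short joke.",
  "logprobs": 10,
  "top_logprobs": 10
}
```

Backends that follow the text completion spec would presumably keep reading logprobs and ignore the extra field, which is why it shouldn't hurt them.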

@Mithrillion
Author

Mithrillion commented Dec 15, 2024

Are you sure you're using an hf_* model loader in oobabooga? Unless it stopped working recently, the top tokens used to work correctly.

I was not aware that only certain model loaders pass through the top token information. I assumed the API would automatically provide the same functionality as the underlying llama.cpp API. I need to double-check.

top_logprobs is only used for the chat completion API; the text completion API uses logprobs as per the OpenAI API reference.

This is quite confusing but you are right.

You mean, using the text completion API and top_logprobs in the request? If that's the case, I think we could always send the top_logprobs field as well; it shouldn't hurt the other backends.

I had a look at the response I get when I curl the API endpoint directly. The response does contain all the top token information. I wonder whether this response format conforms to what Mikupad is expecting?

test_response.json

The corresponding request is like this:

curl https://openrouter.ai/api/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer xxx" \
  -d '{
    "model": "openai/gpt-4o-2024-11-20",
    "prompt": "Tell me a short joke.",
    "temperature": 0.8,
    "logprobs": true,
    "top_logprobs": 10
  }'

I think OpenRouter may have rerouted traffic to chat completion...

@lmg-anon
Owner

lmg-anon commented Dec 16, 2024

I think OpenRouter may have rerouted traffic to chat completion...

I can confirm that OpenRouter is faking text completion by using the chat API for models like gpt-4o (which are available only through a chat API), and that's what is causing this issue.
The only way I see to solve this would be to add a checkbox like "force chat API compat" in the UI, but this feels far from ideal. The best thing to do seems to be to finish the Chat API PR instead.
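For anyone comparing raw responses, the top tokens also come back in different shapes from the two APIs, which would explain why a chat-style answer to a completions request slips past Mikupad's parsing. Roughly (values made up):

Text completion response fragment:

```json
{
  "choices": [{
    "text": " Why did the chicken cross the road?",
    "logprobs": {
      "tokens": [" Why"],
      "token_logprobs": [-0.12],
      "top_logprobs": [{ " Why": -0.12, " What": -2.3 }]
    }
  }]
}
```

Chat completion response fragment:

```json
{
  "choices": [{
    "message": { "role": "assistant", "content": "Why did the chicken cross the road?" },
    "logprobs": {
      "content": [{
        "token": "Why",
        "logprob": -0.12,
        "top_logprobs": [
          { "token": "Why", "logprob": -0.12 },
          { "token": "What", "logprob": -2.3 }
        ]
      }]
    }
  }]
}
```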

@Mithrillion
Author

Mithrillion commented Jan 7, 2025

It seems that after switching to the Chat Completion API in the configuration, the alternative tokens now show properly for models with top-token support. However, I am not able to click on an alternative token and re-generate from there. Is this a limitation of the API, or is it fixable?
