
Alexa's 10 second timeout for skill response workaround #10

Open
ghzeni opened this issue Nov 22, 2023 · 9 comments

@ghzeni (Contributor) commented Nov 22, 2023

Hello! I've been using this skill for quite a lot of things recently, but the 10-second timeout really is a buzzkill for me.

For those who are not aware, this is what I'm talking about.

Since this limitation applies to the skill providing a response to the user, not to a request inside the skill getting its own response, I was wondering whether it's possible to run the request on a separate thread and, in case it takes more than 7 seconds, have Alexa provide an initial response like "One moment, please." and give the actual answer afterwards (a rough sketch of the idea follows below).

Any feedback on this? Thanks!
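To illustrate the idea, here is a rough sketch, not project code; `call_gpt_api` and `new_question` are hypothetical placeholders. Note that Alexa only allows one response per request, so the actual answer would still have to be delivered on a follow-up turn.

```python
import threading

result = {}

def fetch_answer(question):
    # Hypothetical slow API call that may exceed Alexa's timeout.
    result["text"] = call_gpt_api(question)

worker = threading.Thread(target=fetch_answer, args=(new_question,), daemon=True)
worker.start()
worker.join(timeout=7)  # wait up to 7 seconds for the answer

if worker.is_alive():
    speech = "One moment, please."  # interim reply while the call finishes
else:
    speech = result["text"]
```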

@xinyonghu2015

Same question. During testing, I frequently encounter a timeout error with the message: "There was a problem with the requested skill's response."

I believe this issue arises when the response from the LaunchRequestHandler exceeds the 8-second limit imposed by the Alexa service. In our current implementation, the handler calls the GPT-3 API, processes the data, and generates a response. This process sometimes takes longer than the allotted time frame, particularly when the prompt content is complex, resulting in a timeout error.

However, I've observed that when the prompt content is relatively simple, the GPT-3 API response time is usually within the 8-second limit and the skill functions as expected.

@xinyonghu2015

For now, after I switched the API from the original gpt-3.5-turbo-0613 to gpt-3.5-turbo-1106, responses are noticeably faster. The response time stays under 8 seconds, which works around the problem temporarily, but it isn't solved in the code.

@k4l1sh (Owner) commented Nov 23, 2023

The gpt-3.5-turbo-1106 model is very fast and helps avoid this issue. Ideally, we would like the Alexa skill to speak each part of the text as it's being generated, similar to what's shown in this streaming example. But Alexa Skills don't work that way: they can't say each word one by one as separate responses.

A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses:

"Your skill can send progressive responses to keep the user engaged while your skill prepares a full response to the user's request. A progressive response is interstitial SSML content (including text-to-speech and short audio) that Alexa plays while waiting for your full skill response."

@ghzeni (Contributor, Author) commented Nov 24, 2023

> A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses.

This is precisely what I was looking for!! I think this would be a great addition to the project. I'm gonna try to implement this in the next few days.

@xinyonghu2015

> A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses.

> This is precisely what I was looking for!! I think this would be a great addition to the project. I'm gonna try to implement this in the next few days.

Hi bro, if you modified the code, can you share it?

@ghzeni (Contributor, Author) commented Nov 27, 2023

> Hi bro, if you modified the code, can you share it?

Hey! I still haven't gotten around to it, but as soon as I modify it, I'll share it :)

@taueres commented Feb 24, 2024

I solved this issue by changing this line:

messages.append({"role": "user", "content": new_question + ". Write max 50 words in the response."})

This limits the response to around 50 words, which really improves the response time.

k4l1sh added a commit that referenced this issue Feb 27, 2024
@k4l1sh (Owner) commented Feb 27, 2024

Limiting the answers in the prompt to 50 words really does avoid long wait times; it's a good and simple temporary solution. I put this phrase in the system content instead of the user content to avoid repeating it with every question.
Changes done in commit 65945ae
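For illustration, moving the instruction into the system content amounts to something like this sketch (the exact code is in the commit; `new_question` follows the snippet quoted above):

```python
# Sketch: state the length limit once in the system message instead of
# appending it to every user question.
messages = [{"role": "system", "content": "Write max 50 words in the response."}]
messages.append({"role": "user", "content": new_question})
```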

@badmin-c commented May 20, 2024

It seems like the Progressive Response feature is not an option to buy us any time while waiting for the response, as the manual clearly states:

"Note: Progressive responses don't change the overall time allowed for a response. When a user invokes a skill, the skill has approximately eight seconds to return a full response. The skill must finish processing any progressive responses as well as the full response within this time."
(https://developer.amazon.com/en-US/docs/alexa/custom-skills/send-the-user-a-progressive-response.html)

I wasn't too happy with the workaround, so I fiddled a bit with different models and custom instructions. I found that I am able to use gpt-4o (which really is a lot faster than the former gpt-4 models) using

"max_tokens": 1000,

and

messages = [{"role": "system", "content": "Answer all questions concisely and precisely"}]

Sometimes the first answer tends to be really brief, but follow-up questions like "Please be more precise" or "Tell me more details" work just fine, even on complex topics like "What do we know about the universe" or "Explain the connection between our universe and quantum mechanics".

I feel like this is a much smoother approach than the hard 50-word limit.
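Put together, the configuration described above might look like the following sketch (assuming a direct call to the OpenAI Chat Completions endpoint with the `requests` library; the key handling and timeout are illustrative, not the project's actual code):

```python
import requests

API_KEY = "YOUR_OPENAI_API_KEY"  # illustrative; load from config in practice

messages = [{"role": "system", "content": "Answer all questions concisely and precisely"}]
messages.append({"role": "user", "content": new_question})

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",
        "max_tokens": 1000,
        "messages": messages,
    },
    timeout=7,  # stay within Alexa's ~8-second response window
)
answer = response.json()["choices"][0]["message"]["content"]
```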
