Description
Correct me if I am mistaken, but before sending a request, HttpTransport.executeWithRetry increases the request timeout based on the number of retries. After the response is received, it uses RetryStrategy to decide what to do with the result.
In the case of a request timeout, RetryStrategy.decide always retries.
During an Indexer service outage on July 13th, this appears to have happened for the entire length of the outage, causing the Promises to hang and never fail. This becomes a pretty big issue when using BatchIndexingResponse.waitTask() as in the API examples.
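The behaviour described above can be illustrated with a small standalone simulation. This is only a sketch and not the client's code: the class and method names are hypothetical, and the linear growth formula is an assumption (the description only says the timeout increases with the number of retries).

```java
import java.util.ArrayList;
import java.util.List;

class TimeoutGrowthSketch {
    // Assumed growth rule: the timeout scales linearly with the retry count.
    static int timeoutFor(int baseTimeoutMs, int retryCount) {
        return baseTimeoutMs * (retryCount + 1);
    }

    // Stand-in for the decision described above: a timeout always means "retry".
    static boolean shouldRetryOnTimeout() {
        return true;
    }

    // Simulates an outage in which every attempt times out and records the
    // timeout used for each attempt. The maxAttempts cap exists only so the
    // demo terminates; the behaviour described in this issue has no such cap.
    static List<Integer> simulateOutage(int baseTimeoutMs, int maxAttempts) {
        List<Integer> timeouts = new ArrayList<>();
        int retryCount = 0;
        while (shouldRetryOnTimeout() && timeouts.size() < maxAttempts) {
            timeouts.add(timeoutFor(baseTimeoutMs, retryCount));
            retryCount++;
        }
        return timeouts;
    }

    public static void main(String[] args) {
        // prints [2000, 4000, 6000, 8000, 10000]
        System.out.println(simulateOutage(2000, 5));
    }
}
```

Without the artificial cap, the loop never exits during an outage, which matches the hanging Promises seen by the caller.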
Proposed change
Add a new configurable property to ConfigBase, called something along the lines of maxRetriesPerHost, and update RetryStrategy to save that value in its constructor.
Update RetryStrategy.decide so that a request timeout no longer always results in a retry and the configured limit is enforced instead.
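One way this could look, as a minimal sketch under stated assumptions rather than the client's actual implementation: the Outcome, Host, and Strategy types below are hypothetical, simplified stand-ins, and only the maxRetriesPerHost name comes from the proposal. The sketch counts timeouts per host and marks the host as down once the limit is reached, so an outage ends in failure once every host is down.

```java
import java.util.ArrayList;
import java.util.List;

class BoundedRetrySketch {
    // Hypothetical outcome type; the real client's names may differ.
    enum Outcome { SUCCESS, RETRY, FAILURE }

    // Hypothetical, simplified host state holding only what the sketch needs.
    static class Host {
        final String name;
        boolean up = true;
        int timeouts = 0;

        Host(String name) { this.name = name; }
    }

    static class Strategy {
        private final int maxRetriesPerHost;

        Strategy(int maxRetriesPerHost) {
            this.maxRetriesPerHost = maxRetriesPerHost;
        }

        // Decides what to do after a response (or a timeout) from the given host.
        Outcome decide(Host host, int httpStatus, boolean timedOut) {
            if (timedOut) {
                host.timeouts++;
                if (host.timeouts >= maxRetriesPerHost) {
                    host.up = false; // limit reached: stop sending requests to this host
                }
                return Outcome.RETRY; // the caller retries on whichever host is still up
            }
            if (httpStatus / 100 == 2) {
                host.timeouts = 0; // a healthy response resets the counter
                return Outcome.SUCCESS;
            }
            if (httpStatus / 100 == 4) {
                return Outcome.FAILURE; // client errors are not retried
            }
            host.up = false; // any other status: turn the host off
            return Outcome.RETRY;
        }
    }

    // Simulates an outage where every request times out and returns how many
    // attempts are made before no host is left and the call can fail.
    static int attemptsUntilAllHostsDown(List<Host> hosts, Strategy strategy) {
        int attempts = 0;
        while (true) {
            Host next = null;
            for (Host h : hosts) {
                if (h.up) { next = h; break; }
            }
            if (next == null) {
                return attempts; // every host is down: fail instead of hanging
            }
            attempts++;
            strategy.decide(next, 0, true);
        }
    }

    public static void main(String[] args) {
        List<Host> hosts = new ArrayList<>();
        hosts.add(new Host("host-1"));
        hosts.add(new Host("host-2"));
        hosts.add(new Host("host-3"));
        // prints 6: two timed-out attempts on each of the three hosts
        System.out.println(attemptsUntilAllHostsDown(hosts, new Strategy(2)));
    }
}
```

With this shape, the total number of attempts during an outage is bounded by maxRetriesPerHost times the number of hosts, so the caller gets a failure it can act on.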
Why make the change
During the indexing outage, it was not immediately clear why our requests were stuck. Looking at the code, it seems that if the servers had actually responded with server errors, the StatefulHosts would have been turned off one by one, which would be closer to what I would expect from an outage. The current logic has no way of detecting this kind of issue, causing ever-expanding timeouts without any way of intervening.