[Java SDK] RetryInterceptor for streaming calls #398

Closed
rishtigupta opened this issue Oct 24, 2024 · 1 comment

rishtigupta commented Oct 24, 2024

Issue Description:

We need to implement a retry interceptor for streaming calls in the Java SDK, based on the investigation conducted in this ticket. During the investigation, it was found that the same RetryInterceptor logic used for unary calls could be applied to streaming calls as well.

However, testing this implementation against Momento Local revealed an issue: after a retry is triggered, the program hangs when performing scalar or batch operations. This suggests that Momento Local may behave differently from the actual server, and that the difference could be what causes the hang.

Next Steps:

To resolve this issue, we need to collaborate with a backend engineer to set up a development environment (dev cell) where we can simulate a "SERVER_UNAVAILABLE" error locally. This will allow us to test the retry logic under conditions that better mimic the actual server behavior, ensuring the retry mechanism functions correctly for both scalar and batch operations.

Action Items:

[x] Coordinate with a backend engineer to configure the dev cell for simulating the "SERVER_UNAVAILABLE" error.
[x] Re-test the RetryInterceptor logic with the new setup to verify if the issue persists.
[x] Ensure that both scalar and batch operations are handled correctly during retries without causing the program to hang.

This will help isolate whether the discrepancy between Momento Local and the actual server is contributing to the issue and enable us to refine the retry logic accordingly.
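
For illustration, the kind of fault being simulated can be pictured as a fake backend that fails a fixed number of times with an unavailable-style error before succeeding, wrapped in a bounded retry loop. This is a minimal JDK-only sketch, not the SDK's RetryInterceptor: `UnavailableException`, `FlakyBackend`, and `callWithRetry` are hypothetical names that do not exist in the Momento SDK or gRPC.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a "SERVER_UNAVAILABLE" response; not an SDK type. */
class UnavailableException extends RuntimeException {
    UnavailableException(String message) {
        super(message);
    }
}

/** Hypothetical fake backend: fails the first `failures` calls, then succeeds. */
class FlakyBackend {
    private final int failures;
    private final AtomicInteger calls = new AtomicInteger();

    FlakyBackend(int failures) {
        this.failures = failures;
    }

    String get(String key) {
        int attempt = calls.incrementAndGet();
        if (attempt <= failures) {
            throw new UnavailableException("simulated SERVER_UNAVAILABLE on attempt " + attempt);
        }
        return "value-for-" + key;
    }

    int callCount() {
        return calls.get();
    }
}

class RetrySketch {
    /** Retries only on UnavailableException, up to maxAttempts attempts in total. */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        UnavailableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (UnavailableException e) {
                last = e; // retryable: loop again if attempts remain
            }
        }
        throw last; // every attempt failed with the retryable error
    }

    public static void main(String[] args) {
        FlakyBackend backend = new FlakyBackend(2);
        String value = callWithRetry(() -> backend.get("k"), 3);
        System.out.println(value + " after " + backend.callCount() + " attempts");
    }
}
```

A real interceptor would normally add a delay or backoff between attempts; the sketch omits that to keep the control flow visible.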

Code References:

For details on how testing against momento-local was done, check this PR: #394 [WIP-POC]

rishtigupta self-assigned this Nov 7, 2024

rishtigupta commented

Investigation Summary:

While testing in my development environment (developer-rishti-dev) rather than against momento-local, I simulated server-side errors (5XX) to examine how both unary and streaming calls behave under realistic service conditions. The investigation revealed the following:

  1. Proper Program Exit: The program exits without hanging when cacheClient.close() is explicitly called. If cacheClient is not closed, some resources stay open and keep the program alive, which suggests that the RetryInterceptor is holding those resources.
  2. Retry Success: I successfully retried streaming calls (e.g., getBatch) using the existing RetryInterceptor, confirming its effectiveness for both unary and streaming calls.
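
The first finding is consistent with how the JVM treats non-daemon threads: the process stays alive until every such thread has finished, so a client that owns a thread pool has to shut it down on close. The following is a minimal JDK-only sketch of that effect. `SketchClient` is a hypothetical stand-in, and the idea that the real client or its RetryInterceptor owns a scheduler like this one is an assumption made for illustration, not something this investigation confirmed.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical client that owns a scheduler for delayed retries; not the SDK's CacheClient. */
class SketchClient implements AutoCloseable {
    // The default thread factory creates non-daemon threads, which keep the JVM alive.
    private final ScheduledExecutorService retryScheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Schedules a pretend retry after a delay; this starts the worker thread. */
    ScheduledFuture<String> scheduleRetry(long delayMillis) {
        return retryScheduler.schedule(() -> "retried", delayMillis, TimeUnit.MILLISECONDS);
    }

    boolean isClosed() {
        return retryScheduler.isShutdown();
    }

    @Override
    public void close() {
        // Lets the worker thread terminate so the process can exit.
        retryScheduler.shutdown();
    }
}

class CloseSketch {
    public static void main(String[] args) throws Exception {
        try (SketchClient client = new SketchClient()) {
            System.out.println(client.scheduleRetry(10).get());
        } // close() runs here; without it the idle scheduler thread would keep the JVM running
    }
}
```

Using try-with-resources, as in `main` above, is one way to make sure close() is called even when an operation throws.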

Conclusions:

  1. Consistency Across Environments: momento-local and the actual service behaved identically in this testing, which indicates that the hang was not caused by the momento-local implementation.
  2. Versatility of RetryInterceptor: The current RetryInterceptor, initially designed for unary calls, can also effectively handle retries for streaming calls.
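
As a rough picture of what retrying a streaming call involves, the sketch below re-issues a server-streaming call when it fails before delivering any item, and propagates the failure once items have already reached the caller so that nothing is delivered twice. It uses only the JDK. `StreamingCall`, `StreamUnavailableException`, and `streamWithRetry` are hypothetical names, and the "no retry after the first item" rule is an assumption of this sketch rather than a description of the SDK's RetryInterceptor.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical stand-in for an unavailable-style status; not an SDK or gRPC type. */
class StreamUnavailableException extends RuntimeException {
    StreamUnavailableException(String message) {
        super(message);
    }
}

/** Hypothetical server-streaming call: pushes items to the consumer or throws. */
interface StreamingCall<T> {
    void start(Consumer<T> onNext);
}

class StreamingRetrySketch {
    /** Re-issues the stream only if it failed before any item reached the caller. */
    static <T> void streamWithRetry(StreamingCall<T> call, Consumer<T> onNext, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            AtomicBoolean delivered = new AtomicBoolean(false);
            try {
                call.start(item -> {
                    delivered.set(true);
                    onNext.accept(item);
                });
                return; // stream completed normally
            } catch (StreamUnavailableException e) {
                // Mid-stream failures and exhausted attempts are surfaced to the caller.
                if (delivered.get() || attempt == maxAttempts) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        StreamingCall<String> flaky = sink -> {
            if (++attempts[0] <= 2) {
                throw new StreamUnavailableException("simulated unavailable");
            }
            sink.accept("a");
            sink.accept("b");
        };
        streamWithRetry(flaky, System.out::println, 3);
    }
}
```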
