Redis probes list issues - Multi-instance API problems #120
Comments
Here's the problem:
Hey @patrykcieszkowski, I am trying to address this issue, and Dmitriy told me that you had a script to get the list of probes from a specific node instance/process. Could you share it, please, if you don't mind? Also, if you have any info on how the issue can be reliably reproduced, that would be very helpful. Thanks!
I don't recall writing such a script, but it should be as simple as adding a node identifier k/v to the probe data. https://github.com/jsdelivr/globalping/blob/master/src/probe/builder.ts#L90-L105 I also never figured out how to consistently replicate the issue. In fact, it never happened on my local network, even while running over 500 probes. One thing is certain: even when connecting to the WS pool externally and pulling the probe list while bypassing the HTTP server, the behaviour mentioned in the comment above was still present. I came to the conclusion that some nodes would either never receive the pub requesting the data, or wouldn't respond to it in time (the redis-adapter has a timeout).
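A minimal sketch of what adding such an identifier could look like, assuming a simplified `buildProbe` shape. The `nodeId` field name and the env-var fallback are illustrative assumptions, not the actual builder.ts code:

```ts
// Sketch only: attach an identifier of the API instance that built the probe,
// so /probes responses reveal which node each probe record came from.
import * as os from 'node:os';
import type { Socket } from 'socket.io';

// Hypothetical identifier; a real deployment might use a container/pod name instead.
const NODE_ID = process.env.HOSTNAME ?? `${os.hostname()}-${process.pid}`;

type Probe = {
	client: string;
	nodeId: string; // hypothetical k/v added to the probe data
	// ...the rest of the fields built in builder.ts
};

const buildProbe = (socket: Socket): Probe => ({
	client: socket.id,
	nodeId: NODE_ID,
});
```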
I was constantly requesting the /probes endpoint from both APIs, and what I am observing is:
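The observed numbers themselves aren't captured in this thread, but a sketch of the kind of check described above might look like this (the instance URLs and polling interval are placeholders):

```ts
// Sketch: poll GET /probes on both API instances and log when the counts diverge.
const endpoints = [
	'https://api-1.example.com/v1/probes', // placeholder URLs, one per API instance
	'https://api-2.example.com/v1/probes',
];

const poll = async () => {
	const counts = await Promise.all(endpoints.map(async (url) => {
		const res = await fetch(url);
		const probes = await res.json() as unknown[];
		return probes.length;
	}));

	if (new Set(counts).size > 1) {
		console.warn(`Probe count mismatch: ${counts.join(' vs ')}`);
	}
};

setInterval(() => { poll().catch(console.error); }, 5000);
```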
As I see it, the only thing we can do here is try another adapter implementation and compare the behaviour. For some teams, the AMQP adapter showed really good results.
The AMQP adapter does not support some of the required operations (e.g. fetchSockets()).
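For context on why fetchSockets() matters here, a rough sketch of how the cross-instance socket list is gathered with the Redis adapter; the '/probes' namespace, Redis URL, and timeout value are assumptions for illustration:

```ts
// With the Redis adapter, fetchSockets() asks every API instance (over Redis pub/sub)
// for its locally connected sockets and merges the replies, subject to requestsTimeout.
// If an instance never receives the request or replies too late, its probes are
// silently missing from the merged list rather than causing an error.
import { createServer } from 'node:http';
import { Server } from 'socket.io';
import { createClient } from 'redis';
import { createAdapter } from '@socket.io/redis-adapter';

const pubClient = createClient({ url: 'redis://localhost:6379' });
const subClient = pubClient.duplicate();
await Promise.all([pubClient.connect(), subClient.connect()]);

const io = new Server(createServer(), {
	adapter: createAdapter(pubClient, subClient, { requestsTimeout: 5000 }),
});

// Collect sockets from ALL instances; a slow or missed pub/sub reply
// produces a partial list.
const sockets = await io.of('/probes').fetchSockets();
console.log(`visible probes: ${sockets.length}`);
```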
I think we can close this, as under usual load GET /probes works without issues. Only under high load (when Redis operations start to take >30 sec) do we observe the issue, along with 500 errors. So we should focus on the root cause (Redis performance) in other GitHub issues, which we are already doing.
This is a task to track the issue where an API instance sometimes shows a partial list of the connected probes.
It has happened both in a simple CLI script and in the production API.
We need to verify that our API is 100% stable when running multiple instances with a central Redis DB.
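A hedged sketch of the kind of stability check this implies, loosely modelled on the CLI script mentioned above: connect many WS clients and confirm the API reports them all. Real probes perform a handshake and send metadata that this skeleton skips, and the URL, namespace, and count are placeholders:

```ts
// Reproduction skeleton only: real probes authenticate and register metadata,
// which this sketch does not attempt.
import { io as connect } from 'socket.io-client';

const API_URL = 'http://localhost:3000'; // placeholder instance URL
const PROBE_COUNT = 500;

const sockets = Array.from({ length: PROBE_COUNT }, () =>
	connect(`${API_URL}/probes`, { transports: ['websocket'] }));

// Give all connections time to register before querying the list.
await new Promise((resolve) => setTimeout(resolve, 10_000));

const res = await fetch(`${API_URL}/v1/probes`);
const probes = await res.json() as unknown[];

if (probes.length !== PROBE_COUNT) {
	console.error(`Partial list: expected ${PROBE_COUNT}, got ${probes.length}`);
}

sockets.forEach((socket) => socket.disconnect());
```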