
The driver does not deal with changing redis IP addresses in cluster mode #183

Open
akshaymankar opened this issue May 30, 2022 · 20 comments

@akshaymankar

Hedis loses its connection to a Redis cluster if the cluster nodes change their IP addresses during a restart.

How to reproduce the problem

  1. Start a Haskell app that talks to Redis in cluster mode
  2. Restart each Redis node in a way that changes its IP address. This happens when using the bitnami redis-cluster Helm chart.
  3. Observe that the app can no longer talk to Redis

While the app cannot talk to Redis, we see errors like:

service: Network.Socket.connect: <socket: 51>: does not exist (No route to host)

Impact

This makes using hedis in Kubernetes very difficult.

Speculation

This could be a result of using socketToHandle here. The docs for the function say:

Caveat Handle is not recommended for network programming in Haskell, e.g. merely performing hClose on a TCP socket won't cooperate with peer's gracefulClose, i.e. proper shutdown sequence with appropriate handshakes specified by the protocol.

It could also be some other problem; this is just speculation.
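
For reference, the alternative the network docs point to is using the socket directly rather than wrapping it in a Handle. A minimal sketch of that pattern (not hedis code, just an illustration of what the caveat describes):

import qualified Data.ByteString.Char8 as B
import Network.Socket (Socket, gracefulClose)
import Network.Socket.ByteString (recv, sendAll)

-- Talk to the socket directly instead of via a Handle;
-- gracefulClose performs the shutdown handshake the caveat mentions.
pingRaw :: Socket -> IO B.ByteString
pingRaw sock = do
  sendAll sock (B.pack "PING\r\n")
  reply <- recv sock 4096
  gracefulClose sock 5000  -- wait up to 5000 ms for the peer to close
  pure reply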

Additional Info

I create ConnectInfo like this:

Redis.defaultConnectInfo
  { Redis.connectHost = "<kubernetes-service-name>",
    Redis.connectPort = Redis.PortNumber 6379,
    Redis.connectTimeout = Just (secondsToNominalDiffTime 5),
    Redis.connectMaxConnections = 100
  }

And connect to redis like this:

Redis.connectCluster connInfo

This connection does not use TLS.
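
For completeness, a minimal loop along these lines reproduces the symptom (a sketch of my setup, with the host name as a placeholder):

import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import qualified Database.Redis as Redis

-- PING once a second; after the cluster nodes rotate their IPs,
-- every call starts failing with "No route to host".
main :: IO ()
main = do
  conn <- Redis.connectCluster Redis.defaultConnectInfo
            { Redis.connectHost = "<kubernetes-service-name>" }
  forever $ do
    reply <- Redis.runRedis conn Redis.ping
    print reply
    threadDelay 1000000  -- 1 second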

@k-bx
Collaborator

k-bx commented May 30, 2022

Please see #184

@flokli

flokli commented May 30, 2022

@michaelglass, #168 suggests you're using cluster support quite extensively. Is this something you could look into?

@aravindgopall
Collaborator

We need to run refreshShardMap, which fetches the new cluster info (by issuing a cluster info call to Redis). But in your case, how would this query even run? You would need at least one node with a static IP to run the query and get the info needed to connect to all the nodes at their new IPs.

PS: the connections currently held in the pool might still be used; they are only removed after exceptions are thrown.
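
For illustration, the refresh boils down to asking any node that still answers for the cluster topology. A rough sketch using hedis's generic sendRequest (just the idea, not the actual refreshShardMap internals):

{-# LANGUAGE OverloadedStrings #-}
import Database.Redis

-- Ask one still-reachable node for the current slot-to-node mapping;
-- a new shard map can then be rebuilt from the reply.
fetchClusterTopology :: Connection -> IO (Either Reply Reply)
fetchClusterTopology conn =
  runRedis conn (sendRequest ["CLUSTER", "SLOTS"])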

@michaelglass
Contributor

Sorry about the late response. Maybe @stoeffel or @omnibs have more context. I don't write much Haskell anymore 😔

@aravindgopall
Collaborator

For anyone else looking here: we have a fork at Juspay (https://github.com/juspay/hedis) with some of these fixes already done. We would like to upstream them at some point.

@ysangkok
Contributor

@aravindgopall Glad to see you forked. Is it the fix/connection branch people are supposed to use? Why is it 104 commits behind? Would you be interested in taking over hedis maintainership?

@aravindgopall
Collaborator

Right now it is the fix/connection branch; we are testing the changes in our setup. Once done, we will merge everything into the cluster branch, followed by master.

> Why is it 104 commits behind?

The cluster branch has all of the changes from before they were merged into upstream hedis (done by @alexjg); there is a PR waiting to bring it up to date with upstream. This all might take a week or two to settle.

@ysangkok I would be very much interested in this.

@aravindgopall
Collaborator

aravindgopall commented Jan 18, 2023

JFI: the fix/connection branch is up to date with upstream master and is being tested. Once done, we will merge to master and release.

@aravindgopall
Collaborator

JFI: the cluster branch has the latest code with the above fix, plus some other fixes from the Juspay fork.

@ysangkok how do you want me to continue: shall I create a PR to upstream, or shall we continue on the Juspay fork?

@jbrechtel
Collaborator

> JFI: the cluster branch has the latest code with the above fix, plus some other fixes from the Juspay fork.
>
> @ysangkok how do you want me to continue: shall I create a PR to upstream, or shall we continue on the Juspay fork?

I would suggest creating a PR to this repo and we can get it merged.

@jbrechtel
Collaborator

> Right now it is the fix/connection branch; we are testing the changes in our setup. Once done, we will merge everything into the cluster branch, followed by master.
>
> The cluster branch has all of the changes from before they were merged into upstream hedis (done by @alexjg); there is a PR waiting to bring it up to date with upstream. This all might take a week or two to settle.

Hi @aravindgopall - I mistakenly offered to maintain this library some months ago and it turns out that I simply don't have the time. Could you email me and we can discuss handing over maintainership? james at flipstone . com

@stefanwire

> We need to run refreshShardMap, which fetches the new cluster info (by issuing a cluster info call to Redis). But in your case, how would this query even run? You would need at least one node with a static IP to run the query and get the info needed to connect to all the nodes at their new IPs.
>
> PS: the connections currently held in the pool might still be used; they are only removed after exceptions are thrown.

This issue won't be solved by refreshing the shard map from IP addresses already stored in the map: after Helm updates the Redis cluster, all Redis nodes have new IP addresses, so no IP address in the existing shard map would reach a then-running Redis node. The only way to reconnect the client is to resolve a fresh, valid IP address via the Redis cluster's DNS name and then obtain an entirely new shard map.

For our use case, we retain the DNS name outside Hedis. This is suboptimal, since the DNS name is already passed to Hedis for establishing the initial connection and could be kept for failovers when all IPs have become invalid - for instance, due to Helm being a villain.
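
Roughly, our wrapper amounts to this (all names hypothetical, not hedis API; it leans on connectCluster re-resolving connectHost):

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, try)
import qualified Database.Redis as Redis

-- Keep the ConnectInfo (whose connectHost is the cluster's DNS name)
-- and rebuild the connection from scratch when a command fails, so
-- fresh node IPs are picked up via DNS.
runWithDnsFallback :: Redis.ConnectInfo -> Redis.Connection
                   -> Redis.Redis a -> IO a
runWithDnsFallback connInfo conn action = do
  r <- try (Redis.runRedis conn action)
  case r of
    Right x -> pure x
    Left (_ :: SomeException) -> do
      conn' <- Redis.connectCluster connInfo  -- re-resolves the DNS name
      Redis.runRedis conn' action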

@aravindgopall
Collaborator

aravindgopall commented Jan 31, 2023

@stefanwire This is a valid scenario. Right now the fix resolves the issue where at least one Redis node has kept its IP address, which covers the more frequent scenarios like failovers, scaling, etc.

@stefanwire

Feel free to check and adapt the code linked in my previous comment to solve the case where no IP connects anymore. Also, check my open PR, which adds the CLUSTER INFO command in order to improve robustness for Redis in cluster mode. We would be happy to use Hedis without local wrappers once this is done and merged.

@qnikst qnikst added the bug label Jun 11, 2023
@omnibs

omnibs commented Mar 11, 2024

I did a quick check on Juspay's branch with the following test scenario:

  • AWS Elasticache Redis cluster, 1 shard, 2 nodes
  • Launch program, issue some redis commands
  • Trigger "failover primary" on the EC UI
  • Try to issue any additional redis commands

It doesn't seem to work:

  • Without Juspay's branch: timeout executing redis commands, forever
  • With Juspay's branch: NoNodeException, forever

This means the shardMap MVar introduced in Juspay's branch somehow becomes empty during failover, and there's no recovery mechanism for when that happens.
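
Schematically, the failure mode I'm imagining looks like this (hypothetical code, not the branch's actual internals):

import Control.Concurrent.MVar (MVar, tryReadMVar)

-- If the shard map MVar is left empty during failover and nothing
-- ever refills it, every subsequent lookup fails.
withShardMap :: MVar shardMap -> (shardMap -> IO a) -> IO a
withShardMap var k = do
  m <- tryReadMVar var
  case m of
    Just sm -> k sm
    Nothing -> ioError (userError "NoNodeException: shard map unavailable")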

@aravindgopall
Collaborator

Hi @omnibs, can you give the Juspay master branch a try and share the results? In our internal tests it recovers without NoNodeException. Please let us know if you need any more info.

@omnibs

omnibs commented Mar 12, 2024

With the master branch it does recover from NoNodeException. Thank you for the pointer!

It seemed to take several minutes, though (I can't re-test because I ran out of test failovers for today).

I don't follow in the code what's making it recover, but I'm curious if there's something we can tweak to make it recover faster.

@aravindgopall
Collaborator

It's a retry mechanism based on MOVED/timeout errors (this can be configured to recover faster): when one occurs, the client fetches new shardMap info by picking a random node (which might itself be down, adding to the recovery time). After this it sends the command to the node that is now responsible.

It would be great to see some numbers and graphs to understand this (e.g. resharding time, throughput, how many errors, etc.).
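
Conceptually the loop is something like this (a sketch with made-up types, not the actual code):

import Control.Concurrent (threadDelay)

-- Stand-in types for the sketch.
data Reply = Ok String | Moved Int String | TimedOut
type ShardMap = [(Int, String)]  -- slot -> node address

-- On MOVED or timeout, refresh the shard map from a random live node,
-- then resend the command to the node that now owns the slot.
retryOnMoved :: Int -> IO ShardMap -> (ShardMap -> IO Reply) -> IO Reply
retryOnMoved 0 _       _    = pure TimedOut
retryOnMoved n refresh send = do
  shardMap <- refresh
  reply    <- send shardMap
  case reply of
    Moved _ _ -> threadDelay 100000 >> retryOnMoved (n - 1) refresh send
    TimedOut  -> retryOnMoved (n - 1) refresh send
    ok        -> pure ok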

@omnibs

omnibs commented Mar 13, 2024

Thank you for explaining!

So I re-ran my test in a simple manner:

  • I left curl on a loop, poking the health-check endpoint of our app in an environment every 2s
    • The health-check endpoint runs a PING through hedis, nothing more.
  • I triggered a failover
  • I waited for things to normalize

It took a pretty long time:

  • 15:12:24 - good
  • 15:12:26 - Connection Lost (failover happened)
  • 15:12:29 - NoNodeException
  • ...
  • 15:21:04 - good (recovery)

So it took 8min30s to recover from NoNodeException.

The failover itself is quick. I didn't time it, but restarting another instance of the app brought it back to life immediately, while the instance I was curling kept returning NoNodeException for several minutes after that.

> It's a retry mechanism based on MOVED/timeout errors (this can be configured to recover faster): when one occurs, the client fetches new shardMap info by picking a random node

Do my results make sense with this mechanism?

It sounds like after the failover my shardMap was suddenly empty, and stayed empty for a long time, so it couldn't have tried to pick a random node. But if this is correct, I don't understand what makes it recover after the 8min30s.

I plan to debug the mechanism further to figure out what's up, but any pointers you might have on what's happening would be appreciated!

ps: just to clarify, there's no resharding time; the failover happens within a single shard

@omnibs

omnibs commented Mar 14, 2024

OK, my test was too simplistic and, for Redis Cluster's intended usage, perhaps unrealistic.

Redis outside Elasticache requires at least 3 primaries for cluster mode; Elasticache allows you to set it up with a single primary.

I tested a local Redis cluster with 3 primaries, triggered failovers both with DEBUG SEGFAULT and manually, and didn't even notice them. No errors.

I then tested failing over an Elasticache cluster with 2 primaries, and all went well: we got a single NoNodeException, which turned into a timeout after 500ms, and everything fully recovered after 12s, which is very reasonable.

@aravindgopall one last question, then: does Juspay plan to upstream its fixes to cluster mode? I noticed there were 2 PRs for upstreaming them which got closed (#192 and #191).
