All slaves down, but no fallback to the master #186
Comments
I could narrow it down to non-select queries (i.e., queries that absolutely require a master) being executed at this point in proxy.rb:
Either I end up with an error that leads to the master node being blacklisted, and since there are no live slaves everything falls flat, or it stalls endlessly, probably because it can't find a live master that it "can" use. (Note that restarting one slave node instantly restores functionality.) I wonder whether this is a case of refusing to use the same context because of the current strategy/stickiness logic. I haven't dived deep into the internals yet, so I can't confirm this, but it really feels like it tries to avoid using the master for "update" queries (or anything not matching the appropriate regexp) once the master has already been used for a "select" as a fallback, until a slave comes back. I can also confirm that it insistently tries to connect to the slaves before getting a connection refused. (This might be even more troublesome if the host were down and the connection had to time out...)
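To make the behaviour I would expect concrete, here is a minimal Ruby sketch of regexp-based routing with a master fallback for reads. It is an assumption for illustration only: the `READ_ONLY_SQL` pattern and the `preferred_role` and `route` helpers are hypothetical names, not the gem's actual code or regexp.

```ruby
# Hypothetical sketch: not the library's actual code or regexp.
READ_ONLY_SQL = /\A\s*(select|show|explain)\b/i

# Returns :slave for statements that look read-only, :master otherwise.
def preferred_role(sql)
  sql.match?(READ_ONLY_SQL) ? :slave : :master
end

# Picks the role to actually use, given which roles currently have live nodes.
# Reads may fall back to the master; writes can only ever go to the master.
def route(sql, alive_roles)
  role = preferred_role(sql)
  return role if alive_roles.include?(role)
  return :master if role == :slave && alive_roles.include?(:master)
  raise "no usable node for: #{sql}"
end
```

With all slaves down (`alive_roles` is `[:master]`), both `route("SELECT 1", [:master])` and `route("UPDATE users SET admin = true", [:master])` should return `:master`; a select served by the master as a fallback should not stop a later update from using it.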
Hello, did you solve this? Apparently I'm experiencing the same problem. Thanks!
@darksoul42 one way I solved this problem is by setting
@NKeerthi
Are these errors specifically there to handle blacklisting?
I have a Redmine cluster with one master and two slaves running Postgres 9.5 on FreeBSD servers, and everything works as expected when all nodes are up (updates go to the master, selects go to the slaves).
However, if one slave node goes down, I get a set_client_encoding error (though I suspect this is due to how the "pg" gem handles its own errors); this does not happen when "encoding" is not defined. I could get that "invalid encoding" error to be handled gracefully, but this only revealed an underlying issue.
If both slave nodes are down or blacklisted, 0.3.9 never seems to fall back to the master and retries forever on the slaves only, leading to an application timeout. This means things do not work as advertised in the README.
I was wondering whether there should be a tunable to say whether one wants to fall back to the master or not. I did look in the source code but could not find one.
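As a rough illustration of the kind of tunable I mean, here is a small Ruby sketch. Every name in it (`Node`, `ReadPool`, `master_fallback`, `blacklisted_until`) is made up for the example and is an assumption on my part; none of it is taken from the gem's source or configuration.

```ruby
# Hypothetical sketch of the requested behaviour; every name here is made up
# for illustration and none of it is taken from the gem.
Node = Struct.new(:name, :blacklisted_until) do
  # A node is usable when it was never blacklisted or its blacklist expired.
  def usable?(now)
    blacklisted_until.nil? || blacklisted_until <= now
  end
end

class ReadPool
  def initialize(master:, slaves:, master_fallback: true)
    @master = master
    @slaves = slaves
    @master_fallback = master_fallback # the hypothetical tunable
  end

  # Returns a node for a read query, or raises when nothing is usable.
  def node_for_read(now = Time.now)
    slave = @slaves.find { |s| s.usable?(now) }
    return slave if slave
    return @master if @master_fallback && @master.usable?(now)
    raise "no usable node for reads"
  end
end
```

With `master_fallback: true` and both slaves blacklisted, `node_for_read` hands back the master instead of retrying the slaves; with `master_fallback: false` it fails fast rather than stalling until the application times out.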
(Also, as a side-note, if the master is down, given that Redmine requires updating stuff like authentication tokens, only having a slave alive is not enough)
Here is my database.yml :