Upstream Health Checks #50
Comments
A few questions: …
We probably need a separate issue opened in Kiln, I think, that would do the Kubernetes health check. As a side note, if the Kubernetes API returns the info about the pods' health checks, we might be able to just use that info instead of creating separate annotations.
I like falling back to TCP. It's not perfect, but it's a lot better than …

Martin Nally, Apigee Engineering
My updated proposal, after reviewing the Kubernetes API, is to calculate the nginx upstream health checks from the `livenessProbe` of the container in the pod. This would allow us to keep the annotations to a minimum. Kiln/Enrober will have the option to populate either an HTTP `livenessProbe` or a TCP one, and k8s-router will fall back to TCP if neither is specified.

Based on the Kubernetes options we can map almost one-for-one onto the nginx upstream check module. Mapping from Kubernetes to nginx: `periodSeconds` -> `interval` (the check request's interval time). I think the only useful extra annotation that could be used on the pod could be …

The only issue I can see with this approach is that each pod could potentially have different Kubernetes liveness probe settings, and because nginx configures the checks per upstream cluster, there may need to be some merging of values. I don't see this happening with normal use of Shipyard, because all the pods should be spun up with the same spec from the replication controller, but it is theoretically possible.
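For illustration, a minimal sketch of what this could look like on the Kubernetes side. Only the `periodSeconds` -> `interval` pairing comes from the comment above; the other pairings, names, and values are assumptions about how the remaining `livenessProbe` fields might line up with the check module's options.

```yaml
# Illustrative container spec; pairings beyond periodSeconds -> interval are assumed.
containers:
  - name: example-app              # hypothetical container name
    image: example/app:latest      # hypothetical image
    livenessProbe:
      httpGet:
        path: /status              # could drive the HTTP request nginx sends for the check
        port: 8080
      periodSeconds: 5             # -> check "interval" (per the mapping above)
      timeoutSeconds: 30           # -> check "timeout" (assumed)
      successThreshold: 1          # -> could map to check "rise" (assumed; Kubernetes requires 1 for liveness probes)
      failureThreshold: 5          # -> check "fall" (assumed)
```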
HAProxy has another useful option: the ability to take an upstream server (HAProxy calls them backend servers) out of the pool if the server returns 50x errors on regular traffic, not just health-check traffic. Does nginx have an equivalent?
Also, the condition where pods with different health-check values are in the same upstream can happen in Shipyard: multiple applications can claim the same path. This is actually an important capability; it is what allows you to refactor apps. Having said that, I wouldn't worry about it too much.
@mpnally No, it doesn't. This isn't available in standard nginx at all; it only comes from this module: https://github.com/apigee/nginx_upstream_check_module
Is …
I agree that it would be better to have an OSS-only solution, and we should …

Martin Nally, Apigee Engineering
Well, …
The … As for the annotations, I'd agree that if we have to put more stuff in there, a JSON object is probably better. However, with the nginx module's health checks and Kubernetes' being configured almost the same way, we could just piggyback on what's configured in the Kubernetes liveness probe.
I understand; I'm just saying we've already got a few annotations for routing, and we'll soon be making them more complex for weight and other things (this included), so we might get there sooner than I had expected.
I agree. We could also see if we can find a better solution than JSON, which is not very human-friendly.
I think it would be simpler to just use the Pod's `livenessProbe`.
Yes, that is the current plan. Adam wrote this up.
In order to provide robustness to backend services, we should have nginx do basic health checks on each pod, taking unhealthy pods out of the round robin. Using https://github.com/30x/nginx-upstream-check would allow us to do this.
More information from each pod might be needed to enable HTTP checks, for example (a rough sketch of the resulting nginx configuration follows this list):

- Health-check resource path: `/status`
- HTTP method: `GET|HEAD`
- Check interval: 5s
- Check timeout: 30s
- Rise (consecutive successes before a pod is marked healthy): 2
- Fall (consecutive failures before a pod is marked unhealthy): 5
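As a sketch, not a definitive configuration: an upstream block using the nginx upstream check module with values like the ones above might look roughly like this. Directive names follow the module's documented `check`, `check_http_send`, and `check_http_expect_alive` directives; the upstream name and server addresses are made up.

```nginx
upstream example_app {
    server 10.1.0.11:8080;   # hypothetical pod endpoints
    server 10.1.0.12:8080;

    # interval and timeout are in milliseconds for this module
    check interval=5000 timeout=30000 rise=2 fall=5 type=http;
    check_http_send "GET /status HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}
```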
Should we allow all of these to be configured? Are there any we can default intelligently? If nothing is provided, should we fall back to a basic TCP open check?
My proposal for an implementation is adding a separate `healthCheckResource` annotation to each pod that would look like `GET /`, identifying the method and path of the HTTP health check. We could also have four other optional annotations, or combine them into one, to specify the other options like timeout etc.
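A minimal sketch of what such an annotation might look like on a pod. The annotation key matches the one proposed above; the companion option annotations are purely illustrative names, not a settled scheme.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical pod name
  annotations:
    healthCheckResource: "GET /"     # method and path for the HTTP health check
    # Possible companion annotations (names are illustrative only):
    # healthCheckInterval: "5s"
    # healthCheckTimeout: "30s"
spec:
  containers:
    - name: example-app
      image: example/app:latest
```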