Outside access over hostport #194
Comments
It sounds interesting, though I'd guess that most clusters need more than a hostPort for outside access. What do you mean by "all outside traffic goes to the desired broker directly"? How is that different from the current approach with individual services? Do you mean as opposed to going through iptables and firewalls? I'm asking because any setup where clients don't address individual brokers will most likely fail. Edit: I think people are interested, but based on https://kubernetes.io/docs/concepts/configuration/overview/#services I wouldn't merge it: "Don’t specify a hostPort for a Pod unless it is absolutely necessary."
The difference between a hostPort and a nodePort is that a setup with nodePort will forward traffic if it reaches a node that doesn't run the target pod. I assumed that your setup does exactly that, which creates unnecessary traffic in the cluster. Correct me if I'm wrong. Can you explain to me how traffic from an outside client reaches a specific node? Let's split the discussion up: Question 1: Could you use a mechanism to determine the public DNS name/IP of the node running a specific Kafka broker, so that the broker can advertise that address? Question 2: Is using hostPort a good idea or not? My motivation: I'm sending data from outside to Kafka from some "IoT devices" which generate up to 50 MBytes/s each. I want to eliminate any unnecessary traffic inside the cluster.
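To make the hostPort/nodePort difference above concrete, here is a toy model in Python. It is only a sketch: the `route` function and the node names are made up for illustration, and the nodePort branch assumes the default behaviour where any node accepts the connection and forwards it to the node running the pod.

```python
from typing import List, Optional


def route(mode: str, entry_node: str, pod_node: str) -> Optional[List[str]]:
    """Return the nodes a client connection passes through, or None if refused.

    Toy model only; `mode` is "hostPort" or "nodePort".
    """
    if mode == "hostPort":
        # The port is only bound on the node that actually runs the pod.
        return [entry_node] if entry_node == pod_node else None
    if mode == "nodePort":
        # Every node listens; a node without the pod forwards the traffic,
        # which is the extra in-cluster hop discussed above.
        if entry_node == pod_node:
            return [entry_node]
        return [entry_node, pod_node]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(route("nodePort", "node-a", "node-b"))  # extra hop via node-a
    print(route("hostPort", "node-b", "node-b"))  # direct to the broker's node
```

With hostPort the client has to know which node runs the broker, which is why the advertised address matters.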
Specifically what I don't get is this line: In my cluster this returns the cluster internal address which can't be reached from outside. However this is actually something very similar to what I did with the python lib. UPDATE:
Looks like the suggestion in #187. While Maybe question 2 then simply becomes a matter of taste, or maybe if you want to bypass iptables or not :)
Using hostPort would also remove the need for the outside-services/ folder, which contains one service per node. This seems clumsy, as you might want your StatefulSet size to change over time (maybe automatically?), and that would require creating or deleting a service per broker. Or is there some nice way to handle that?
This repo being mostly an example that you fork, the purpose of the services is that you can change them to whatever service type that fits your hosting scenario. It's easy to generate ten such service manifests. I agree the hostport solution wins on simplicity. How about we extend https://github.com/Yolean/kubernetes-kafka/blob/master/outside-services/README.md with the insights from this issue and #187? Then it'll be up to individual use cases to simply not apply the outside-services folder and go directly for node IPs instead.
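As a rough illustration of "generating ten such service manifests", the sketch below builds one NodePort Service per broker as plain Python dicts and prints them as a JSON `List`. The names here are assumptions, not taken from the repo: the `outside-N` service names, the `kafka` StatefulSet and namespace, port 9094, and the 32400 base node port are all hypothetical. The selector uses the `statefulset.kubernetes.io/pod-name` label that the StatefulSet controller sets on each pod.

```python
import json
from typing import List


def outside_service(broker_index: int, statefulset: str = "kafka",
                    namespace: str = "kafka", base_node_port: int = 32400,
                    target_port: int = 9094) -> dict:
    """Build a NodePort Service manifest targeting exactly one broker pod.

    All default values are illustrative assumptions.
    """
    node_port = base_node_port + broker_index
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"outside-{broker_index}", "namespace": namespace},
        "spec": {
            "type": "NodePort",
            # Label added automatically to every StatefulSet pod.
            "selector": {
                "statefulset.kubernetes.io/pod-name": f"{statefulset}-{broker_index}"
            },
            "ports": [{
                "protocol": "TCP",
                "port": node_port,
                "targetPort": target_port,
                "nodePort": node_port,
            }],
        },
    }


def outside_services(replicas: int, **kwargs) -> List[dict]:
    """One Service per broker index 0..replicas-1."""
    return [outside_service(i, **kwargs) for i in range(replicas)]


if __name__ == "__main__":
    manifest = {"apiVersion": "v1", "kind": "List", "items": outside_services(10)}
    print(json.dumps(manifest, indent=2))
```

Scaling the StatefulSet still means regenerating and applying (or deleting) services, which is the bookkeeping the hostPort approach avoids.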
Lovely, I'll rewrite the code and make sure it works first, then I'll write up an update to the readme with the snippets and make a pull request 👍
Thanks a lot for the PR. Relevant to #196 as well.
I have written an init container that uses the Python Kubernetes client to query the Kubernetes API server for the public DNS name of the node that runs Kafka. I also set up RBAC permissions that only allow the pod to query node information.
This can be used as the advertised address for Kafka together with a hostPort. With such a setup all outside traffic goes to the desired broker directly. The disadvantage is that you can't run two Kafka instances on the same node (which is a bad idea anyway).
This works on AWS; I haven't tested it in other deployments.
Would this be a good change to try and upstream into this repo or is it too specific for my setup?
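For readers who want a feel for the lookup step, here is a minimal sketch of picking a public address out of a Node object. It is not the init container from this issue: it works on the Node's JSON form as a plain dict so it runs without a cluster, and the function names, the `OUTSIDE` listener name, port 9094, and the example hostnames are all assumptions. In a real init container the dict would come from the API server, for example via the Python client's `CoreV1Api().read_node(name)`.

```python
from typing import Iterable, Optional


def pick_external_address(
    node: dict,
    preferred: Iterable[str] = ("ExternalDNS", "ExternalIP"),
) -> Optional[str]:
    """Return the first address whose type is in `preferred`, else None.

    `node` is a Node object in its JSON/dict form; addresses live under
    status.addresses as {"type": ..., "address": ...} entries.
    """
    addresses = node.get("status", {}).get("addresses", [])
    by_type = {a["type"]: a["address"] for a in addresses}
    for address_type in preferred:
        if address_type in by_type:
            return by_type[address_type]
    return None


def advertised_listener(host: str, port: int, listener_name: str = "OUTSIDE") -> str:
    """Format one entry for Kafka's advertised.listeners setting."""
    return f"{listener_name}://{host}:{port}"


# Hypothetical node, shaped like what a cloud provider might report.
example_node = {
    "metadata": {"name": "ip-10-0-1-23.example.internal"},
    "status": {
        "addresses": [
            {"type": "InternalIP", "address": "10.0.1.23"},
            {"type": "ExternalDNS", "address": "node-1.public.example.com"},
            {"type": "Hostname", "address": "ip-10-0-1-23.example.internal"},
        ]
    },
}

if __name__ == "__main__":
    host = pick_external_address(example_node)
    if host is not None:
        print(advertised_listener(host, 9094))
```

Whether a node reports an `ExternalDNS` or `ExternalIP` address depends on the cloud provider, so the fallback order may need adjusting outside AWS.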