OUTSIDE_HOST is now resolved with ExternalIP address #187

Open
wants to merge 1 commit into base: master

@scottc-git scottc-git commented Jun 18, 2018

I believe we need to use ExternalIP (instead of InternalIP) to configure advertised.listeners so that apps outside the Kubernetes cluster can access the brokers. This was tested and verified in GKE, and I can now use my various tools on my PC to interact with the Kafka brokers in Kubernetes. (Brokers and ZooKeeper nodes still communicate with each other using InternalIP.)

Below is the relevant information from GKE. Please note that minikube does NOT report ExternalIP, so this change will not work for minikube. (To make that work I had to hardcode OUTSIDE_HOST; obtaining the minikube IP should be a separate PR.)

> kubectl get node gke-my-awesome-cluster-abcdf123456 -o json
{
    "apiVersion": "v1",
    "kind": "Node",
    "metadata": {
        "annotations": {
            "node.alpha.kubernetes.io/ttl": "0",
            "volumes.kubernetes.io/controller-managed-attach-detach": "true"
        },
        "creationTimestamp": "2018-05-31T22:50:12Z",
        "labels": {
            "beta.kubernetes.io/arch": "amd64",
            "beta.kubernetes.io/fluentd-ds-ready": "true",
            "beta.kubernetes.io/instance-type": "n1-highmem-2",
            "beta.kubernetes.io/os": "linux",
            "cloud.google.com/gke-nodepool": "default-pool",
            "failure-domain.beta.kubernetes.io/region": "us-central1",
            "failure-domain.beta.kubernetes.io/zone": "us-central1-a",
            "kubernetes.io/hostname": "gke-my-awesome-cluster-abcdf123456"
        },
        "name": "gke-my-awesome-cluster-abcdf123456",
        "resourceVersion": "3189957",
        "selfLink": "/api/v1/nodes/gke-my-awesome-cluster-abcdf123456",
        "uid": "bla-bla-bla"
    },
    "spec": {
        "externalID": "1234567890",
        "podCIDR": "10.0.0.0/24",
        "providerID": "gce://my-cool-project/us-central1-a/gke-my-awesome-cluster-abcdf123456"
    },
    "status": {
        "addresses": [
            {
                "address": "xx.xxx.x.xx",
                "type": "InternalIP"
            },
            {
                "address": "xxx.xxx.xxx.xxx",
                "type": "ExternalIP"
            },
            {
                "address": "gke-my-awesome-cluster-abcdf123456",
                "type": "Hostname"
            }
        ],
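
The address lookup described above can be sketched in Python. This is only an illustration of the selection logic, not the init script from this PR: the function name `pick_outside_host`, the fallback to InternalIP when no ExternalIP is reported (the minikube case), and the sample addresses (documentation-range placeholders) are all assumptions.

```python
import json

def pick_outside_host(node, preferred=("ExternalIP", "InternalIP")):
    """Return the first address whose type appears in `preferred`, in that order.

    `node` is the parsed output of `kubectl get node <name> -o json`.
    Returns None when the node reports none of the preferred address types.
    """
    by_type = {}
    for entry in node.get("status", {}).get("addresses", []):
        # Keep the first address seen for each type.
        by_type.setdefault(entry["type"], entry["address"])
    for address_type in preferred:
        if address_type in by_type:
            return by_type[address_type]
    return None

if __name__ == "__main__":
    # Hypothetical node JSON, trimmed to the part that matters here.
    sample = json.loads("""
    {"status": {"addresses": [
        {"address": "10.128.0.2", "type": "InternalIP"},
        {"address": "203.0.113.10", "type": "ExternalIP"},
        {"address": "gke-my-awesome-cluster-abcdf123456", "type": "Hostname"}
    ]}}
    """)
    print(pick_outside_host(sample))
```

Passing `preferred=("InternalIP",)` would reproduce the lookup order that existing setups rely on.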

Once this change is applied to the broker containers, you will get the following in server.properties:

listeners=OUTSIDE://:9094,PLAINTEXT://:9092
advertised.listeners=OUTSIDE://xxx.xxx.xxx.xxx:32400,PLAINTEXT://:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,OUTSIDE:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
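
For illustration, the advertised.listeners line above can be assembled from the resolved host and the outside port. This is a minimal sketch: the helper name `advertised_listeners` is hypothetical, the host is a placeholder address, and the port is simply passed in rather than derived the way the broker init script derives it.

```python
def advertised_listeners(outside_host, outside_port, inside_port=9092):
    """Build the advertised.listeners property line for one broker.

    The OUTSIDE listener advertises the externally reachable host and port;
    the PLAINTEXT listener keeps an empty host, as in the config above.
    """
    return (
        f"advertised.listeners=OUTSIDE://{outside_host}:{outside_port},"
        f"PLAINTEXT://:{inside_port}"
    )

if __name__ == "__main__":
    # Placeholder external address and the outside port from the example above.
    print(advertised_listeners("203.0.113.10", 32400))
```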

@solsson
Contributor

solsson commented Jun 19, 2018

We use the current configuration to mirror between GKE clusters. I've now verified that kubectl -n kafka describe pod kafka-0 | grep outside- shows the internal IP. Maybe your use case is different, or maybe something has changed in new GKE clusters?

@scottc-git
Author

scottc-git commented Jun 19, 2018

Hi Solsson,

My scenario is to use various tools such as console-(consumer|producer).sh or KafkaTools from outside the Kubernetes cluster. For that, I believe the brokers need to 'advertise' their external IP addresses to the outside world. If they advertise their InternalIPs, I wonder how it would work. Maybe I am missing something completely. However, as described above, both internal and external access are working fine and I am very happy with the change. Perhaps your mirror setup was using internal IPs? Testing and verification are easy:

  1. Have a Kafka distribution installed.
  2. Open a shell and run ./bin/kafka-console-producer.sh --broker-list <>:<> --topic foo
  3. Open another shell and run ./bin/kafka-console-consumer.sh --bootstrap-server <>:<> --topic foo --from-beginning
  4. Produce some messages in the first shell and check that they are consumed in the second.
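
Before running the steps above, it can help to confirm that the advertised host and port accept TCP connections from outside the cluster at all. A minimal sketch using only the Python standard library follows; the function name `is_reachable` and the placeholder host and port are assumptions, and a successful connection only shows that the port is open, not that Kafka is healthy behind it.

```python
import socket

def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    # Placeholder values; substitute the broker's advertised OUTSIDE host and port.
    print(is_reachable("203.0.113.10", 32400))
```

If this returns False, firewall rules are a likely place to look before suspecting the listener configuration.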

By the way, I would like to thank you for this great work. I was able to start up my Kafka cluster in GKE within a few hours. This was amazing.

@solsson
Contributor

solsson commented Jun 20, 2018

By "outside world", do you mean outside your gcloud project? If so, I assume you have also changed some firewall settings?

Thanks for giving credit. I appreciate it.

@scottc-git
Author

That is correct. For troubleshooting and development purposes, we have already whitelisted our work/home IP addresses (or CIDR ranges) and ports for our gcloud projects. The GKE cluster uses the same firewall rules.

@solsson
Contributor

solsson commented Jun 21, 2018

OK, whitelisted access is certainly an argument for access via the external IP. To my knowledge, whitelisting can't be done using Kubernetes Services.

That said, external access use cases will vary, as shown by the path to #78. Most setups will have their own listener address lookup. Also, as argued in #172, I suspect (though I haven't tried it yet) that Kafka 1.1+ dynamic config is a better fit for listener (re)configuration than the init script is.

I'd like to see some upvotes on this PR before I merge it, because it might be a breaking change for existing setups that want to upgrade. It will also cause conflicts in forks that have edited the listener lookup.

@Hermain
Contributor

Hermain commented Jul 2, 2018

My use case is IoT devices streaming to Kafka. For that I need outside access to Kafka and ZooKeeper.

@stigok

stigok commented Jul 2, 2018

FWIW, @Hermain, an alternative is to set up an MQTT-to-Kafka adapter and handle authentication in the MQTT server (e.g. Mosquitto).

@Hermain
Contributor

Hermain commented Jul 6, 2018

@stigok thank you for your input. We stream directly to Kafka at up to 50 Mbit/s per device and it works fairly well, so we don't want to change to MQTT.
