Connection refused - Failed to get broker metrics with Kafka Manager #192

Closed
solsson opened this issue Jul 21, 2018 · 10 comments

@solsson
Contributor

solsson commented Jul 21, 2018

Topic management works, but the metrics part of Kafka Manager is non-functional. Logs display repeated errors like:

[error] k.m.a.c.BrokerViewCacheActor - Failed to get broker metrics for BrokerIdentity(1,10.132.0.2,5555,false,true,Map(PLAINTEXT -> 32401))
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.132.0.2; nested exception is: 
	java.net.ConnectException: Connection refused (Connection refused)]
	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369) ~[na:1.8.0_144]
	at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270) ~[na:1.8.0_144]
	at kafka.manager.jmx.KafkaJMX$.doWithConnection(KafkaJMX.scala:57) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at kafka.manager.actor.cluster.BrokerViewCacheActor$$anonfun$kafka$manager$actor$cluster$BrokerViewCacheActor$$updateBrokerMetrics$1$$anonfun$apply$27$$anonfun$apply$3.apply$mcV$sp(BrokerViewCacheActor.scala:358) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at kafka.manager.actor.cluster.BrokerViewCacheActor$$anonfun$kafka$manager$actor$cluster$BrokerViewCacheActor$$updateBrokerMetrics$1$$anonfun$apply$27$$anonfun$apply$3.apply(BrokerViewCacheActor.scala:355) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at kafka.manager.actor.cluster.BrokerViewCacheActor$$anonfun$kafka$manager$actor$cluster$BrokerViewCacheActor$$updateBrokerMetrics$1$$anonfun$apply$27$$anonfun$apply$3.apply(BrokerViewCacheActor.scala:355) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) ~[org.scala-lang.scala-library-2.11.12.jar:na]
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) ~[org.scala-lang.scala-library-2.11.12.jar:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_144]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_144]
Caused by: javax.naming.ServiceUnavailableException: null
	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:136) ~[na:1.8.0_144]
	at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[na:1.8.0_144]
	at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[na:1.8.0_144]
	at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[na:1.8.0_144]
	at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[na:1.8.0_144]
	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[na:1.8.0_144]
	at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270) ~[na:1.8.0_144]
	at kafka.manager.jmx.KafkaJMX$.doWithConnection(KafkaJMX.scala:57) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at kafka.manager.actor.cluster.BrokerViewCacheActor$$anonfun$kafka$manager$actor$cluster$BrokerViewCacheActor$$updateBrokerMetrics$1$$anonfun$apply$27$$anonfun$apply$3.apply$mcV$sp(BrokerViewCacheActor.scala:358) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
	at kafka.manager.actor.cluster.BrokerViewCacheActor$$anonfun$kafka$manager$actor$cluster$BrokerViewCacheActor$$updateBrokerMetrics$1$$anonfun$apply$27$$anonfun$apply$3.apply(BrokerViewCacheActor.scala:355) ~[kafka-manager.kafka-manager-1.3.3.18-sans-externalized.jar:na]
Caused by: java.rmi.ConnectException: Connection refused to host: 10.132.0.2; nested exception is: 
	java.net.ConnectException: Connection refused (Connection refused)
	at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[na:1.8.0_144]
	at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[na:1.8.0_144]
	at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[na:1.8.0_144]
	at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338) ~[na:1.8.0_144]
	at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112) ~[na:1.8.0_144]
	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132) ~[na:1.8.0_144]
	at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[na:1.8.0_144]
	at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[na:1.8.0_144]
	at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[na:1.8.0_144]
	at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[na:1.8.0_144]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_144]
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_144]
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_144]
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_144]
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_144]
	at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_144]
	at java.net.Socket.connect(Socket.java:538) ~[na:1.8.0_144]
	at java.net.Socket.<init>(Socket.java:434) ~[na:1.8.0_144]
	at java.net.Socket.<init>(Socket.java:211) ~[na:1.8.0_144]
	at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[na:1.8.0_144]
@solsson
Contributor Author

solsson commented Jul 23, 2018

Based on the first line of the error message, it looks like Kafka Manager tries the IP from the "outside" listener. That's a host IP, which won't work with the JMX port (unless we also make it a hostPort).

Based on 4c202f4 I think we list listeners in the proper order.
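
One way to check this diagnosis from inside the Kafka Manager pod is a plain TCP probe against the host and JMX port printed in the log line. The following is only a minimal Python sketch, not part of Kafka Manager. It assumes that the second and third fields of `BrokerIdentity(...)` are the host and the JMX port, and the helper names `parse_broker_identity` and `can_connect` are made up for illustration.

```python
import re
import socket

# Assumption: the log prints BrokerIdentity(id,host,jmxPort,...);
# only the first three fields are read here.
_BROKER_RE = re.compile(r"BrokerIdentity\((\d+),([^,]+),(\d+),")


def parse_broker_identity(log_line):
    """Return (broker_id, host, jmx_port) from a Kafka Manager error line."""
    match = _BROKER_RE.search(log_line)
    if match is None:
        raise ValueError("no BrokerIdentity(...) found in log line")
    broker_id, host, jmx_port = match.groups()
    return int(broker_id), host, int(jmx_port)


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

If `can_connect(host, jmx_port)` returns False for the values parsed from the error line, while the same port is reachable through the pod's own address, that would be consistent with Kafka Manager picking the host IP of the outside listener.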

@stigok

stigok commented Jul 25, 2018

The two commits look very relevant.

@solsson
Contributor Author

solsson commented Jul 27, 2018

Indeed. Maybe something will behave differently with the next release. However, our outside listener is also PLAINTEXT (by default). A single PLAINTEXT listener probably works.

@hfgbarrigas

@solsson the configuration to take into account is advertised.listeners or listeners. The order does matter, as the official documentation mentions. The configuration you mention here is not the one relevant for ZooKeeper consumers (I might have misinterpreted it, though).
Placing the internal listener first makes everything work as expected.
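
The reordering described above can be sketched as a small string transformation. This is a hypothetical Python helper (`put_listener_first` is not part of Kafka or of this repository), and it assumes listener entries have the `NAME://host:port` shape used in this thread.

```python
def put_listener_first(advertised_listeners, name):
    """Move the listener called `name` to the front, keeping the rest in order."""
    entries = [e.strip() for e in advertised_listeners.split(",") if e.strip()]
    # The listener name is everything before "://".
    first = [e for e in entries if e.split("://", 1)[0] == name]
    if not first:
        raise ValueError(f"listener {name!r} not found")
    rest = [e for e in entries if e.split("://", 1)[0] != name]
    return ",".join(first + rest)
```

For example, `put_listener_first("OUTSIDE://172.31.221.5:32400,PLAINTEXT://:9092", "PLAINTEXT")` puts the internal PLAINTEXT entry ahead of OUTSIDE without changing either address.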

@solsson
Contributor Author

solsson commented Dec 11, 2018

That's interesting. From the commit comment you reference, it looks like I made the switch based on Kafka Manager. Switching back is a risky change to make now because I don't know what it affects.

Do you mean that the referenced configuration is relevant to Kafka clients, but not to stuff that contacts Zookeeper directly?

@TattiQ

TattiQ commented Mar 19, 2019

Hi guys, sorry, this may be a dumb question, but why is advertised.listeners set to point to the k8s host IP? (#init#advertised.listeners=OUTSIDE://#init#,PLAINTEXT://:9092 results in advertised.listeners=OUTSIDE://172.31.221.5:32400,PLAINTEXT://:9092.) Nothing listens on 32400 on the hosts.

Why isn't it the pod's hostname, which would resolve to a cluster IP, so that there would be a connection on the JMX port for that IP?

@solsson
Contributor Author

solsson commented Mar 19, 2019

@TattiQ That's for #78 but see also #187

blinkeye pushed a commit to blinkeye/kubernetes-kafka that referenced this issue Mar 23, 2019
@blinkeye

> Hi guys, sorry may be a dumb one but why is advertised.listeners set to point to the k8s host ip ? (#init#advertised.listeners=OUTSIDE://#init#,PLAINTEXT://:9092 results in advertised.listeners=OUTSIDE://172.31.221.5:32400,PLAINTEXT://:9092 ) . Nothing listens on 3240 on the hosts.

Thanks for the pointer, implemented and verified with #251.

solsson added a commit that referenced this issue Mar 28, 2019
#192 change order of advertised.listeners (internal first) to fix jmx…
@blinkeye

@solsson: I've been experimenting with microk8s and the Confluent https://github.com/confluentinc/cp-helm-charts repo. Looking at Confluent's broker configuration:

cat  /etc/kafka/kafka.properties

broker.id=0
zookeeper.connect=my-confluent-cp-zookeeper-headless:2181
advertised.listeners=PLAINTEXT://my-confluent-cp-kafka-0.my-confluent-cp-kafka-headless.default:9092,EXTERNAL://10.10.10.10:31090
offsets.topic.replication.factor=3
heap.opts=-Xms512M -Xmx512M
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
log.dirs=/opt/kafka/data-0/logs
listeners=PLAINTEXT://0.0.0.0:9092,EXTERNAL://0.0.0.0:31090
jmx.port=5555

I'd say this confirms that we made the proper change (switching the order of OUTSIDE and PLAINTEXT).
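
To make the same comparison on another broker, the properties dump can be inspected programmatically. The snippet below is a rough Python sketch under stated assumptions: it handles only simple `key=value` lines like the ones shown above, not the full Java properties syntax, and the function names `parse_properties` and `first_advertised_listener` are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and '#' comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def first_advertised_listener(props):
    """Return (name, host, port) of the first advertised.listeners entry."""
    entry = props["advertised.listeners"].split(",")[0].strip()
    name, address = entry.split("://", 1)
    # The port is whatever follows the last ':' in the address.
    host, _, port = address.rpartition(":")
    return name, host, int(port)
```

Applied to the configuration above, the first advertised listener is the internal PLAINTEXT one on port 9092, and `jmx.port` is read back as the string `5555`.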

@solsson
Contributor Author

solsson commented Mar 31, 2019

Great. Working on Kafka Manager upgrade in #257.

@solsson solsson closed this as completed Mar 31, 2019