I have created a cluster (Google Cloud) with 3 nodes. ZooKeeper is running on all nodes and I have started Kafka on one of them. I can communicate (publish/consume) from any machine in the cluster, but when I try to connect from a remote machine I get a NoBrokersAvailable exception.
I have opened ports in the firewall for testing and I have tried changing the advertised host and port settings in the Kafka config, but I am still unable to connect.
What is the expected configuration? With suitable defaults, I would have expected my configuration to work in both the internal and the remote case, but it does not. I am not sure which part of the ZooKeeper/Kafka configuration I need to change.
What is to be done?
Set advertised.listeners=PLAINTEXT://<broker_ip>:9092 in the server.properties file, where <broker_ip> is an address the remote client can reach, and make sure ingress to this advertised address and port is allowed through the GCP VPC firewall. Then restart the Kafka server, along with any producer or consumer that is running.
Please check my answer to the same problem in another thread
NoBrokersAvailable: NoBrokersAvailable-Kafka Error
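Before changing more broker settings, it can help to confirm from the remote machine that the advertised address is reachable at the TCP level at all. The following is a minimal sketch using only the Python standard library; the helper names parse_listener and can_connect are made up for this example and are not part of any Kafka client.

```python
import socket


def parse_listener(listener):
    """Split a value such as 'PLAINTEXT://10.0.0.5:9092' into (name, host, port)."""
    name, sep, address = listener.strip().partition("://")
    if not sep:
        raise ValueError(f"not a listener URI: {listener!r}")
    # Split on the last colon so the port is always the final component.
    host, _, port = address.rpartition(":")
    return name, host, int(port)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run on the remote machine, can_connect(*parse_listener("PLAINTEXT://<broker_ip>:9092")[1:]) with the real advertised value tells you whether the firewall rule is in effect. A False result points at the network rather than at the Kafka client.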
Related
I am not able to connect to a standalone MongoDB node; I get the error below.
ERROR [cluster-ClusterId{value='asdfs', description='null'}-x.x.x.x:27017] org.mongodb.driver.cluster - Expecting replica set member, but found a STANDALONE. Removing x.x.x.x:27017 from client view of cluster.
Is it okay to give multiple IPs in the config file when there is only one mongo node?
Not for standalone connections, no.
Specify the address of your standalone only.
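As an illustration, a connection string can be checked for the two things that make a driver expect a replica set: more than one host in the seed list, or a replicaSet option. This is a rough sketch with a hypothetical helper name (standalone_problems); it only inspects plain mongodb:// URIs and does not talk to a server.

```python
from urllib.parse import parse_qs


def standalone_problems(uri):
    """List reasons why a mongodb:// URI does not look like a standalone connection."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("expected a mongodb:// URI")
    body, _, query = uri[len(prefix):].partition("?")
    hosts_part = body.split("/", 1)[0]
    # Drop optional "user:password@" credentials before the host list.
    hosts = hosts_part.rsplit("@", 1)[-1].split(",")
    problems = []
    if len(hosts) > 1:
        problems.append(f"{len(hosts)} hosts listed; a standalone needs exactly one")
    # Option names are compared case-insensitively.
    options = {key.lower() for key in parse_qs(query)}
    if "replicaset" in options:
        problems.append("replicaSet option is set")
    return problems
```

For example, standalone_problems("mongodb://x.x.x.x:27017/mydb") returns an empty list, while a URI with two hosts and ?replicaSet=rs0 returns two entries.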
The goal is to separate client traffic from inter-broker replication traffic and to introduce security.
The question is: can this separation be done with a procedure such as a rolling restart, without downtime for the whole cluster?
Configuration as is (simple, with one port for everything and without security):
listeners=PLAINTEXT://server1:9092
Wanted configuration (different ports and some with security, replication on 9094 port):
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SASLPLAIN:SASL_PLAINTEXT,REPLICATION:SASL_PLAINTEXT
listeners=PLAINTEXT://server1:9092,SASLPLAIN://server1:9093,REPLICATION://server1:9094
inter.broker.listener.name=REPLICATION
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
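With several named listeners it is easy to end up with a listener name that has no entry in listener.security.protocol.map, or an entry without a port. A small check like the one below can catch that before a broker is restarted. It is only a sketch: parse_properties and listener_problems are names invented for this example, and the checks reflect my understanding of what the broker expects (NAME://host:port entries, every name present in the map when the map is set).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def listener_problems(props):
    """Return human-readable problems found in the listener settings."""
    mapping = props.get("listener.security.protocol.map", "")
    mapped = {pair.split(":", 1)[0].strip() for pair in mapping.split(",") if pair.strip()}
    problems, names = [], []
    for listener in props.get("listeners", "").split(","):
        if not listener.strip():
            continue
        name, _, address = listener.strip().partition("://")
        names.append(name)
        # Only check the map when one is configured explicitly.
        if mapped and name not in mapped:
            problems.append(f"listener {name} is missing from listener.security.protocol.map")
        _, sep, port = address.rpartition(":")
        if not sep or not port.isdigit():
            problems.append(f"listener {name} has no port")
    inter = props.get("inter.broker.listener.name")
    if inter and inter not in names:
        problems.append(f"inter.broker.listener.name {inter} is not a configured listener")
    return problems
```

Calling listener_problems(parse_properties(open("server.properties").read())) on each broker's file should return an empty list before you proceed.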
Progress:
The configuration above works well. But the only way I currently know to apply it without putting the cluster into an inconsistent state is to stop the cluster, introduce the new configuration as shown above, and start the cluster again. That is obviously not what the customer wants.
Grateful for any thoughts on how to proceed without having to stop and start the whole cluster.
I managed to get from the original single-listener configuration to the desired one with the steps below.
If anyone has an idea for simplifying the process, please add it.
Original config:
listeners=PLAINTEXT://server1:9092
1. Change server.properties and do a rolling restart
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SASLPLAIN:SASL_PLAINTEXT,REPLICATION:SASL_PLAINTEXT
listeners=PLAINTEXT://SERVER1:9092,SASLPLAIN://SERVER1:9093,REPLICATION://SERVER1:9094
sasl.enabled.mechanisms=PLAIN
Also pass the JAAS config as a JVM parameter.
-Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf
2. Modify server.properties and do a second rolling restart
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SASLPLAIN:SASL_PLAINTEXT,REPLICATION:SASL_PLAINTEXT
listeners=PLAINTEXT://SERVER1:9092,SASLPLAIN://SERVER1:9093,REPLICATION://SERVER1:9094
inter.broker.listener.name=REPLICATION
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
3. Modify server.properties one last time and do a third rolling restart
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SASLPLAIN:SASL_PLAINTEXT,REPLICATION:SASL_PLAINTEXT
listeners=PLAINTEXT://SERVER1:9092,SASLPLAIN://SERVER1:9093,REPLICATION://SERVER1:9094
inter.broker.listener.name=REPLICATION
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
allow.everyone.if.no.acl.found=true
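The three restarts above are cumulative: each one keeps the previous settings and adds a few keys. One way to keep the per-restart files consistent is to generate them from a list of increments, as in this sketch. The structure (PHASES, properties_for_restart, render) is invented for illustration, and the middle listener is named SASLPLAIN here as an assumption, so that it matches a key in listener.security.protocol.map.

```python
# Keys introduced by each rolling restart, in order.
PHASES = [
    {   # Restart 1: open the new listeners and enable SASL/PLAIN.
        "listener.security.protocol.map":
            "PLAINTEXT:PLAINTEXT,SASLPLAIN:SASL_PLAINTEXT,REPLICATION:SASL_PLAINTEXT",
        # Assumption: listener names match the keys of the map above.
        "listeners":
            "PLAINTEXT://SERVER1:9092,SASLPLAIN://SERVER1:9093,REPLICATION://SERVER1:9094",
        "sasl.enabled.mechanisms": "PLAIN",
    },
    {   # Restart 2: move inter-broker traffic to the REPLICATION listener.
        "inter.broker.listener.name": "REPLICATION",
        "sasl.mechanism.inter.broker.protocol": "PLAIN",
    },
    {   # Restart 3: switch on the authorizer.
        "authorizer.class.name": "kafka.security.auth.SimpleAclAuthorizer",
        "allow.everyone.if.no.acl.found": "true",
    },
]


def properties_for_restart(n):
    """Merge the increments for restarts 1..n into one dict."""
    if not 1 <= n <= len(PHASES):
        raise ValueError(f"restart number must be between 1 and {len(PHASES)}")
    merged = {}
    for phase in PHASES[:n]:
        merged.update(phase)
    return merged


def render(props):
    """Render a dict as server.properties lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items())
```

print(render(properties_for_restart(2))) gives the listener-related lines for the second restart; the rest of server.properties (broker.id, log.dirs and so on) is left out of this sketch.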
Is there a way to bind Orion Context Broker to a specific IP address when setting it up using any of the methods mentioned here? I'm currently running it as a Docker container alongside MongoDB. I tried modifying the docker-compose file, but I couldn't find any network settings for Orion.
I recently ran into many difficulties connecting Freeboard to OCB, possibly because OCB is running on the default loopback interface. The same thing happened when FIWARE's accumulator server started on that interface; after switching to another available interface, the connection was established.
You can use the -localIp CLI option to specify which IP interface the broker listens on. By default it listens on all interfaces.
I have three servers in my quorum. They are running ZooKeeper 3.4.5. Two of them appear to be running fine based on the output from mntr. One of them was restarted a couple days ago due to a deploy, and since then has not been able to join the quorum. Some lines in the logs that stick out are:
2014-03-03 18:44:40,995 [myid:1] - INFO [main:QuorumPeer#429] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
and:
2014-03-03 18:44:41,233 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager#190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-03-03 18:44:41,234 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumCnxManager#190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-03-03 18:44:41,235 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection#774] - Notification time out: 400
Googling for the first ('currentEpoch not found!') led me to JIRA ZOOKEEPER-1653 - zookeeper fails to start because of inconsistent epoch. It describes a bug fix but doesn't describe a way to resolve the issue without upgrading zookeeper.
Googling for the second ('Have smaller server identifier, so dropping the connection') led me to JIRA ZOOKEEPER-1506 - Re-try DNS hostname -> IP resolution if node connection fails. This makes sense because I am using AWS Elastic IPs for the servers. The fix for this issue seems to be to do a rolling restart, which would cause us to temporarily lose quorum.
It looks like the second issue is definitely in play because I see timeouts in the other ZooKeeper servers' logs (the ones still in the quorum) when they try to connect to the first server. What I'm not sure of is whether the first issue will disappear when I do a rolling restart. I would like to avoid upgrading and/or doing a rolling restart, but if I have to do a rolling restart I'd like to avoid doing it multiple times. Is there a way to fix the first issue without upgrading? Or even better: is there a way to resolve both issues without doing a rolling restart?
Thanks for reading and for your help!
This is a ZooKeeper bug: "Server is unable to join quorum after connection broken to other peers".
Restarting the leader solves this issue.
This problem shows up whenever pods or hosts rejoin the cluster with a different IP address but the same server ID. A host's IP can change if your config specifies 0.0.0.0 or DNS names. So follow these instructions:
1. Stop all servers and, in the config, use
server.1=10.x.x.x:1234:5678
server.2=10.x.x.y:1234:5678
server.3=10.x.x.z:1234:5678
not DNS names.
2. Use each server's LAN IP as its identifier.
3. Start the servers; it should work.
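To find entries that still use a DNS name or 0.0.0.0, the server.N lines can be scanned with a few lines of Python. This is a sketch under the assumption that hosts are IPv4 addresses or hostnames (an IPv6 literal would need different splitting); non_ip_servers is a name made up for this example.

```python
import ipaddress


def non_ip_servers(config_text):
    """Return {server_id: host} for server.N entries that are not a fixed IP."""
    flagged = {}
    for line in config_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or not key.startswith("server."):
            continue
        server_id = key[len("server."):]
        host = value.split(":", 1)[0].strip()
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            # Not an IP literal, so it is a hostname that has to be resolved.
            flagged[server_id] = host
            continue
        if address.is_unspecified:
            # 0.0.0.0 does not identify a particular machine.
            flagged[server_id] = host
    return flagged
```

An empty result for your zoo.cfg means every quorum member is addressed by a fixed IP.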
I'm interested in using Celery for an app I'm working on. It all seems pretty straightforward, but I'm a little confused about what I need to do if I have multiple load-balanced application servers. All of the documentation assumes that the broker will be on the same server as the application. Currently, all of my application servers sit behind an Amazon ELB and tasks need to be able to come from any one of them.
This is what I assume I need to do:
Run a broker server on a separate instance
Configure each application instance to connect to that broker server
Each application instance will also be a Celery worker (running celeryd)?
My only beef with that is: What happens if my broker instance dies? Can I run 2 broker instances somehow so I'm safe if one goes under?
Any tips or information on what to do in a setup like mine would be greatly appreciated. I'm sure I'm missing something or not understanding something.
For future reference, for those who do prefer to stick with RabbitMQ...
You can create a RabbitMQ cluster from 2 or more instances. Add those instances to your ELB and point your celeryd workers at the ELB. Just make sure you connect the right ports and you should be all set. Don't forget to allow your RabbitMQ machines to talk among themselves to run the cluster. This works very well for me in production.
One exception here: if you need to schedule tasks, you need a celerybeat process. For some reason, I wasn't able to connect the celerybeat to the ELB and had to connect it to one of the instances directly. I opened an issue about it and it is supposed to be resolved (didn't test it yet). Keep in mind that celerybeat by itself can only exist once, so that's already a single point of failure.
You are correct in all points.
To make the broker reliable, set up a clustered RabbitMQ installation, as described here:
http://www.rabbitmq.com/clustering.html
Celery beat also doesn't have to be a single point of failure if you run it on every worker node with:
https://github.com/ybrs/single-beat
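On the Celery side, a clustered broker can also be addressed without a load balancer: as far as I know, the broker URL setting accepts several URLs of the same transport, separated by semicolons, and the worker fails over between them. The sketch below builds such a value. The host names, credentials and the helper failover_broker_url are placeholders for illustration, and the setting is called broker_url in recent Celery versions (BROKER_URL in older ones).

```python
def failover_broker_url(hosts, user="guest", password="guest", port=5672, vhost=""):
    """Build a semicolon-delimited list of amqp:// URLs, one per cluster node."""
    if not hosts:
        raise ValueError("at least one broker host is required")
    return ";".join(
        f"amqp://{user}:{password}@{host}:{port}/{vhost}" for host in hosts
    )


# celeryconfig.py-style setting; the two host names are hypothetical.
broker_url = failover_broker_url(["rabbit1.internal", "rabbit2.internal"])
```

Every application instance and worker would load the same setting, so tasks can be published from any server behind the ELB.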