What causes MarshallingError when removing a server with reconfig command in Zookeeper - apache-zookeeper

I'm trying to remove one of the five servers using the reconfig command but get KeeperErrorCode = MarshallingError.
Here's the cluster information and the error message:
[zk: [ClientIP](CONNECTED) 2] get /zookeeper/config
server.0=[ip0]:port1:port2:participant
server.1=[ip1]:port1:port2:participant
server.2=[ip2]:port1:port2:participant
server.3=[ip3]:port1:port2:participant
server.4=[ip4]:port1:port2:participant
version=200000000
[zk: [ClientIP](CONNECTED) 3] reconfig -remove server.2=[ip2]:port1:port2:participant
KeeperErrorCode = MarshallingError
I'm not sure why this error occurred and how to solve it. How can I remove one of the servers?

You don't need to specify the full server spec (host and ports) on remove. The command syntax is:
reconfig [-s] [-v version] [[-file path] | [-members serverID=host:port1:port2;port3[,...]]] | [-add serverId=host:port1:port2;port3[,...]] [-remove serverId[,...]*]
For -remove, pass only the serverId.
Example of a working command:
[shahar.l]# /opt/kafka/bin/zookeeper-shell.sh localhost:2181 reconfig -remove 5
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
Committed new configuration:
server.4=IP:2888:3888:participant;0.0.0.0:2181
server.6=IP:2888:3888:participant;0.0.0.0:2181
server.7=IP:2888:3888:participant;0.0.0.0:2181
version=1700007e9d
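Applied to the cluster in the question, the fix is to pass only the server ID to -remove (a sketch, run in the same zkCli session shown in the question):
reconfig -remove 2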

Related

Not Able to Increase Replication Factor of Kafka Topic

We have a 3-node Kafka/ZooKeeper cluster set up, with Kafka and ZooKeeper communicating over SSL.
We are currently using Apache Kafka 2.5 and ZooKeeper 3.5.7. We are trying to increase the replication factor of Kafka topics using the method below:
To increase the number of replicas for a given topic you have to:
Specify the extra replicas in a custom reassignment json file. For example, you could create increase-replication-factor.json and put this content in it:
{"version":1,
"partitions":[
{"topic":"signals","partition":0,"replicas":[0,1,2]},
{"topic":"signals","partition":1,"replicas":[0,1,2]},
{"topic":"signals","partition":2,"replicas":[0,1,2]}
]}
Use the file with the --execute option of the kafka-reassign-partitions tool
For example:
$ kafka-reassign-partitions --zookeeper localhost:2182 --reassignment-json-file increase-replication-factor.json --execute --command-config zookeeper_client.properties
But we are facing a problem while running kafka-reassign-partitions: the connection to ZooKeeper fails with the error below:
Save this to use as the --reassignment-json-file option during rollback
Partitions reassignment failed due to KeeperErrorCode = NoAuth for
/admin/reassign_partitions
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
for /admin/reassign_partitions
at org.apache.zookeeper.KeeperException.create(KeeperException.java:120)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at kafka.zookeeper.AsyncResponse.maybeThrow(ZooKeeperClient.scala:564)
at kafka.zk.KafkaZkClient.createRecursive(KafkaZkClient.scala:1644)
at kafka.zk.KafkaZkClient.createPartitionReassignment(KafkaZkClient.scala:871)
We don't have any ACLs created on the topic, yet we still get this issue.
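One way to check whether any ACLs are actually set on the admin znodes is to inspect them from the ZooKeeper CLI (a diagnostic sketch; it assumes zkCli can reach ZooKeeper over the same secured connection the brokers use):
getAcl /admin
getAcl /brokers/topics
Anything stricter than the open world:anyone ACL means ZooKeeper ACLs are being enforced, and the reassignment tool then has to authenticate to ZooKeeper the same way the brokers do.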

KeeperErrorCode = Unimplemented for /kafka-manager/mutex

The following error is shown when trying to add a new cluster in 'CMAK' in the K8s cluster.
Yikes! KeeperErrorCode = Unimplemented for /kafka-manager/mutex Try again.
My cluster configurations are as follows,
zookeeper: wurstmeister/zookeeper
kafka-manager: kafkamanager/kafka-manager:3.0.0.4
kafka: wurstmeister/kafka:2.12-2.4.1
I was able to resolve it with the following steps.
Connect to the 'zookeeper' container in k8s
k exec -it podid -- bash
Connect with zookeeper cli,
./bin/zkCli.sh
Make sure that the 'kafka-manager' path has already been created. If it does not exist, try to create a cluster in 'kafka-manager' first.
ls /kafka-manager
Run the following commands to create the required paths,
create /kafka-manager/mutex ""
create /kafka-manager/mutex/locks ""
create /kafka-manager/mutex/leases ""
Now try to create the cluster again.
The output would be like this,
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /kafka-manager
[configs, deleteClusters, clusters]
[zk: localhost:2181(CONNECTED) 1] create /kafka-manager/mutex ""
Created /kafka-manager/mutex
[zk: localhost:2181(CONNECTED) 2] create /kafka-manager/mutex/locks ""
Created /kafka-manager/mutex/locks
[zk: localhost:2181(CONNECTED) 3] create /kafka-manager/mutex/leases ""
Created /kafka-manager/mutex/leases
[zk: localhost:2181(CONNECTED) 4]
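As a quick optional check in the same zkCli session, listing the new node should show the two sub-paths (output order may vary):
ls /kafka-manager/mutex
[locks, leases]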
The original answer is mentioned here,
https://github.com/yahoo/CMAK/issues/731#issuecomment-643880544

How to migrate kafka details to other server?

How do I get all the details of a Kafka server, like topic names, partitions, consumer groups, etc., before shutting the Kafka server down, and then use that information to prepare a new Kafka server?
Is there any option for this type of backup?
Kafka uses ZooKeeper to store metadata.
If you want an overview of all topics, partitions, or consumer groups, you can collect it from the ZooKeeper shell.
For example, to collect the consumer groups, use ls /consumers as follows:
kafka % bin/zookeeper-shell.sh localhost:2181 <<< "ls /consumers"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[console-consumer-66605, console-consumer-84350, console-consumer-9354, console-consumer-28182, console-consumer-61085, console-consumer-67016, console-consumer-81504, console-consumer-47711, console-consumer-87328, console-consumer-27998, console-consumer-73330, console-consumer-73529, console-consumer-17369, console-consumer-75626, console-consumer-6886, console-consumer-11693]
Similarly, to collect topic names, use:
ls /brokers/topics
To collect the number of partitions for a topic:
ls /brokers/topics/<topic name>/partitions
You can export these details to a file and use them when preparing the next server.
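For example, you could redirect the shell output to a file (a minimal sketch reusing the zookeeper-shell invocation shown above; topics.txt is just an arbitrary file name):
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/topics" > topics.txt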

Unable to delete kafka topic

I am using kafka, zookeeper and kafka-manager for managing clusters.
I have a 3-node cluster. On all brokers I have had delete.topic.enable=true set since the very beginning.
Now when I try to delete a topic, it shows the following:
topicxyz - marked for deletion
but it is not actually deleted.
I tried deleting from kafka-manager as well, and it says
Yikes! KeeperErrorCode = NodeExists for /admin/delete_topics/topicxyz
Error logs:
kafka-manager:
[error] k.m.ApiError$ - error : KeeperErrorCode = NodeExists for /admin/delete_topics/topicxyz
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /admin/delete_topics/topicxyz
at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) ~[org.apache.zookeeper.zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[org.apache.zookeeper.zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[org.apache.zookeeper.zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:721) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:704) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) ~[org.apache.curator.curator-client-2.10.0.jar:na]
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:701) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:477) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:467) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:447) ~[org.apache.curator.curator-framework-2.10.0.jar:na]
[info] k.m.a.KafkaManagerActor - Updating internal state...
Kafka has no error log. The ZooKeeper stdout log shows only warnings, and its stderr log says Invalid config, exiting abnormally.
kafka-version: kafka_2.12-0.10.2.0
Topic description:
$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topicxyz
Topic:topicxyz PartitionCount:1 ReplicationFactor:1 Configs:
Topic: topicxyz Partition: 0 Leader: -1 Replicas: 3 Isr:
Please help.
I am not sure which Kafka version you are using, but deleting a topic had a bug previously. Refer here & here.
This is sometimes caused by a corrupt ZooKeeper node found within /admin/delete_topics. Log into the ZK client and delete the misbehaving /admin/delete_topics/your_topic_name entry.
Depending on client version it will go something like this:
bin/zkCli.sh -server 127.0.0.1:2181
ls /admin/delete_topics
ls /brokers/topics
rmr /admin/delete_topics/your_topic_name
You should now be able to use Kafka Manager or Kafka-topics to delete your topics. You can also manually remove your topic by deleting the "/brokers/topics/your_topic_name" entry but I find that is unnecessary after removing the misbehaving "delete_topics" entry.
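For instance, after removing the stale znode, retrying the delete could look like this (a sketch assuming the same local ZooKeeper and the topic name from the question):
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic topicxyz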

Kafka QuickStart, advertised.host.name gives kafka.common.LeaderNotAvailableException

I am able to get a simple one-node Kafka (kafka_2.11-0.8.2.1) working locally on one linux machine, but when I try to run a producer remotely I'm getting some confusing errors.
I'm following the quickstart guide at http://kafka.apache.org/documentation.html#quickstart. I stopped the kafka processes and deleted all the zookeeper & kafka files in /tmp. I am on a local 10.0.0.0/24 network NAT-ed with an external IP address, so I modified server.properties to tell zookeeper how to broadcast my external address, as per https://medium.com/#thedude_rog/running-kafka-in-a-hybrid-cloud-environment-17a8f3cfc284:
advertised.host.name=MY.EXTERNAL.IP
Then I'm running this:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
--> ...
$ export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M" # small test server!
$ bin/kafka-server-start.sh config/server.properties
--> ...
I opened up the firewall for my producer on the remote machine, and created a new topic and verified it:
$ bin/kafka-topics.sh --create --zookeeper MY.EXTERNAL.IP:2181 --replication-factor 1 --partitions 1 --topic test123
--> Created topic "test123".
$ bin/kafka-topics.sh --list --zookeeper MY.EXTERNAL.IP:2181
--> test123
However, the producer I'm running remotely gives me errors:
$ bin/kafka-console-producer.sh --broker-list MY.EXTERNAL.IP:9092 --topic test123
--> [2015-06-16 14:41:19,757] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
My Test Message
--> [2015-06-16 14:42:43,347] WARN Error while fetching metadata [{TopicMetadata for topic test123 ->
No partition metadata for topic test123 due to kafka.common.LeaderNotAvailableException}] for topic [test123]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
--> (repeated several times)
(I disabled the whole firewall to make sure that wasn't the problem.)
The stdout errors in the kafka startup are repeated: [2015-06-16 20:42:42,768] INFO Closing socket connection to /MY.EXTERNAL.IP. (kafka.network.Processor)
And the controller.log gives me this, several times:
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
[2015-06-16 20:44:08,128] INFO [Controller-0-to-broker-0-send-thread], Controller 0 connected to id:0,host:MY.EXTERNAL.IP,port:9092 for sending state change requests (kafka.controller.RequestSendThread)
[2015-06-16 20:44:08,428] WARN [Controller-0-to-broker-0-send-thread], Controller 0 epoch 1 fails to send request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:1;CorrelationId:7;ClientId:id_0-host_null-port_9092;Leaders:id:0,host:MY.EXTERNAL.IP,port:9092;PartitionState:(test123,0) -> (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0) to broker id:0,host:MY.EXTERNAL.IP,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread)
Running this seems to indicate that there is a leader at 0:
$ ./bin/kafka-topics.sh --zookeeper MY.EXTERNAL.IP:2181 --describe --topic test123
--> Topic:test123 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: test123 Partition: 0 Leader: 0 Replicas: 0 Isr: 0
I reran this test and my server.log indicates that there is a leader at 0:
...
[2015-06-16 21:58:04,498] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2015-06-16 21:58:04,642] INFO Registered broker 0 at path /brokers/ids/0 with address MY.EXTERNAL.IP:9092. (kafka.utils.ZkUtils$)
[2015-06-16 21:58:04,670] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2015-06-16 21:58:04,736] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
I see this error in the logs when I send a message from the producer:
[2015-06-16 22:18:24,584] ERROR [KafkaApi-0] error when handling request Name: TopicMetadataRequest; Version: 0; CorrelationId: 7; ClientId: console-producer; Topics: test123 (kafka.server.KafkaApis)
kafka.admin.AdminOperationException: replication factor: 1 larger than available brokers: 0
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)
I assume this means that the broker can't be found for some reason? I'm confused about what this means...
For recent versions of Kafka (0.10.0 as of this writing), you don't want to use advertised.host.name at all. In fact, even the documentation states that advertised.host.name is already deprecated. Moreover, Kafka will use this not only as the "advertised" host name for the producers/consumers, but for other brokers as well (in a multi-broker environment)...which is kind of a pain if you're using a different (perhaps internal) DNS for the brokers...and you really don't want to get into the business of adding entries to the individual /etc/hosts of the brokers (ew!)
So, basically, you would want the brokers to use the internal name, but use the external FQDNs for the producers and consumers only. To do this, you will update advertised.listeners instead.
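A minimal server.properties sketch of that setup (the listener name and the external host name kafka1.example.com are assumptions for illustration):
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka1.example.com:9092
The broker binds on all interfaces, while clients are handed the externally resolvable name; for a strict internal/external split you can define two named listeners and map them with listener.security.protocol.map.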
Set advertised.host.name to a host name, not an IP address. The default is to return a FQDN using getCanonicalHostName(), but this is only best effort and falls back to an IP. See the java docs for getCanonicalHostName().
The trick is to get that host name to always resolve to the correct IP. For small environments I usually set up all of the hosts with all of their internal IPs in /etc/hosts. This way all machines know how to talk to each other over the internal network, by name. In fact, configure your Kafka clients by name now too, not by IP. If managing all the /etc/hosts files is a burden then set up an internal DNS server to centralize it, but internal DNS should return internal IPs. Either of these options should be less work than having IP addresses scattered throughout various configuration files on various machines.
Once everything is communicating by name all that's left is to configure external DNS with the external IPs and everything just works. This includes configuring Kafka clients with the server names, not IPs.
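A minimal /etc/hosts sketch for such a small environment (the host names and internal addresses are made up for illustration):
10.0.0.11  kafka1
10.0.0.12  kafka2
10.0.0.13  kafka3
Every broker and client gets the same entries, so the names always resolve to internal IPs inside the network, while external DNS maps the same names to the external IPs.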
So to summarize, the solution to this was to add a route via NAT so that the machine can access its own external IP address.
The broker registers the address it finds in advertised.host.name in ZooKeeper, and that address is used both to tell clients where to find the broker and to communicate with the broker itself. The error that gets reported doesn't make this very clear, and it's confusing because a client has no problem opening a TCP connection.
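One possible way to add such a route on the Kafka host itself is a local DNAT rule (a sketch only; MY.EXTERNAL.IP and the machine's internal address 10.0.0.X are placeholders, and the right fix depends on your NAT device):
sudo iptables -t nat -A OUTPUT -d MY.EXTERNAL.IP -j DNAT --to-destination 10.0.0.X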
Taking a cue from the above: for my single node (while still learning), I modified the server.properties entry advertised.host.name to 127.0.0.1. So finally it looks like this:
advertised.host.name=127.0.0.1
The producer still shows a warning on startup, but it is at least working now and I can see messages arriving perfectly on the consumer terminal.
On the machine where Kafka is installed, check whether it is up and running. The error states that 0 brokers are available, which means Kafka is not up and running.
On a Linux machine you can use the netstat command to check whether the service is listening:
netstat -an|grep port_kafka_is_Listening ( default is 9092)
For reference, the host.name property in conf/server.properties is documented as:
DEPRECATED: only used when listeners is not set. Use listeners instead. Hostname of the broker. If this is set, it will only bind to this address. If this is not set, it will bind to all interfaces.