Why does the root node not exist while nodes under the root exist in a ZooKeeper instance? - apache-zookeeper

I am running a ZooKeeper instance (IA) in standalone mode and am trying to upgrade it to quorum mode. I prepared two more ZooKeeper instances (IB and IC) with empty snapshot directories, updated zoo.cfg on all three instances, and created a myid file on each. I then restarted the standalone instance IA first and started the other two.
IB and IC end up with the data, but the root node is missing.
On both IB and IC:
[zk: localhost:2181(CONNECTED) 14] ls /
Node does not exist: /
[zk: localhost:2181(CONNECTED) 15] ls /zookeeper
[quota]
[zk: localhost:2181(CONNECTED) 16]
In addition, IB has lost data (compare numChildren with IA and IC below):
[zk: localhost:2181(CONNECTED) 16] get /demo/version
cZxid = 0x30000006c
ctime = Thu Dec 22 17:49:13 CST 2016
mZxid = 0x30000006c
mtime = Thu Dec 22 17:49:13 CST 2016
pZxid = 0x6003792a0
cversion = 12764622
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 135794
[zk: localhost:2181(CONNECTED) 17]
IA looks like:
[zk: localhost:2181(CONNECTED) 10] get /demo/version
cZxid = 0x30000006c
ctime = Thu Dec 22 17:49:13 CST 2016
mZxid = 0x30000006c
mtime = Thu Dec 22 17:49:13 CST 2016
pZxid = 0x6003792a0
cversion = 12312921
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 587495
[zk: localhost:2181(CONNECTED) 11]
IC looks like:
[zk: localhost:2181(CONNECTED) 10] get /demo/version
cZxid = 0x30000006c
ctime = Thu Dec 22 17:49:13 CST 2016
mZxid = 0x30000006c
mtime = Thu Dec 22 17:49:13 CST 2016
pZxid = 0x6003792a0
cversion = 12312921
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 587495
[zk: localhost:2181(CONNECTED) 11]
By the way, the status output looks fine on all three instances:
IA:
[shell#kernel /data/zookeeper/zookeeper-3.4.8/bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
IB:
[shell#kernel /data/zookeeper/zookeeper-3.4.8/bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
IC:
[shell#kernel /data/zookeeper/zookeeper-3.4.8/bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: leader
As shown above, the version is 3.4.8.
Thank you in advance.

I managed to fix this issue by raising initLimit to 100 and syncLimit to 50 (tickTime stays at 2000) and then migrating from standalone mode to quorum mode. After waiting a while for the new instances to sync, everything was fine.
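For context, initLimit and syncLimit are counted in ticks, so their wall-clock meaning depends on tickTime. The following is a minimal sketch of that arithmetic for the values in this answer; the class and method names are made up for illustration. A plausible reading (an assumption, not confirmed in the thread) is that the new followers needed more time than the previous limits allowed to receive a data set this large from the leader.

```java
import java.time.Duration;

// Sketch: convert ZooKeeper tick-based limits into wall-clock durations.
// QuorumTimeouts and limitToDuration are illustrative names, not ZooKeeper APIs.
class QuorumTimeouts {
    // initLimit and syncLimit are expressed in ticks; tickTime is in milliseconds.
    static Duration limitToDuration(int tickTimeMs, int limitTicks) {
        return Duration.ofMillis((long) tickTimeMs * limitTicks);
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000; // tickTime from the answer
        int initLimit = 100;   // ticks a follower may take to connect and sync to the leader
        int syncLimit = 50;    // ticks a follower may take to sync before it is dropped

        System.out.println("initLimit window: "
                + limitToDuration(tickTimeMs, initLimit).getSeconds() + " s");
        System.out.println("syncLimit window: "
                + limitToDuration(tickTimeMs, syncLimit).getSeconds() + " s");
    }
}
```

With these settings a follower gets 200 seconds to connect and sync at startup, and 100 seconds to stay in sync afterwards.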

Related

How can the ZooKeeper CLI create an empty node?

The create command has this syntax:
create [-s] [-e] path data
I want to leave the data field unspecified when creating a node.
This is possible using ZooInspector.
I used the following command: create /test "".
Running get in zkCli then shows:
[zk: localhost:2181(CONNECTED) 14] get /test
cZxid = 0x4
ctime = Fri Sep 07 09:38:31 IRDT 2018
mZxid = 0x4
mtime = Fri Sep 07 09:38:31 IRDT 2018
pZxid = 0x4
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
Finally, I downloaded ZooInspector to check how it works and created two znodes: fromcli and fromInspector. Both show dataLength = 0:
[zk: localhost:2181(CONNECTED) 20] ls /
[fromInspector, zookeeper, fromcli]
[zk: localhost:2181(CONNECTED) 21] get /fromcli
cZxid = 0x23
ctime = Fri Sep 07 11:11:39 IRDT 2018
mZxid = 0x23
mtime = Fri Sep 07 11:11:39 IRDT 2018
pZxid = 0x23
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 22] get /fromInspector
cZxid = 0x24
ctime = Fri Sep 07 11:12:01 IRDT 2018
mZxid = 0x24
mtime = Fri Sep 07 11:12:01 IRDT 2018
pZxid = 0x24
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0

How to remove inconsistent Kafka topic metadata from kafka_2.10-0.8.1.1

I am wondering how to recover from an unusual situation where ZooKeeper seems to have the metadata for a topic (T_60036), but the broker does not have the corresponding log file, which causes producers to fail with the exception kafka.common.FailedToSendMessageException.
Below is what we noticed:
In zookeeper both /brokers/topics/T_60036 and /config/topics/T_60036 paths exist.
kafka#kafka-3:~$ /opt/kafka/kafka_2.10-0.8.1.1/bin/zookeeper-shell.sh localhost:2181 get /brokers/topics/T_60036/partitions/0/state
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"controller_epoch":6,"leader":1,"version":1,"leader_epoch":0,"isr":[1,2]}
cZxid = 0x80013308c
ctime = Wed Jun 06 04:55:37 UTC 2018
mZxid = 0x80013308c
mtime = Wed Jun 06 04:55:37 UTC 2018
pZxid = 0x80013308c
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 74
numChildren = 0
kafka#kafka-3:~$ /opt/kafka/kafka_2.10-0.8.1.1/bin/zookeeper-shell.sh localhost:2181 get /config/topics/T_60036
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"version":1,"config":{}}
cZxid = 0x800132992
ctime = Wed Jun 06 04:55:13 UTC 2018
mZxid = 0x800132992
mtime = Wed Jun 06 04:55:13 UTC 2018
pZxid = 0x800132992
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 25
numChildren = 0
But there are no log files for this topic:
kafka#kafka-3:~$ ls -l /var/kafka/topics/T_60036*
ls: cannot access /var/kafka/topics/T_60036*: No such file or directory
I read the second comment on topic deletion here, but I am afraid it may destabilize the entire cluster. My question: is it safe to delete the orphaned ZooKeeper entries ("/config/topics/T_60036", "/brokers/topics/T_60036") without restarting or jeopardizing the cluster?
Here is our configuration:
Version: kafka_2.10-0.8.1.1
Cluster Configuration: 4 kafka brokers + 4 zookeeper
Topic Partitions: 1
Topic Replicas: 2
This is what seems to have worked without bringing down the cluster:
First, delete the corrupted topic using a hidden feature of 0.8.1.1:
kafka#kafka-3:~$ /opt/kafka/kafka_2.10-0.8.1.1/bin/kafka-run-class.sh kafka.admin.DeleteTopicCommand --zookeeper localhost:2181 --topic T_60036
Then re-create the topic:
kafka#kafka-3:~$ /opt/kafka/kafka_2.10-0.8.1.1/bin/kafka-topics.sh --create --topic T_60036 --zookeeper localhost:2181 --partitions 1 --replication-factor 2
A warning for anyone trying the proposed solution on a newer cluster version (tested on 2.8): running the hidden 0.8.1.1 feature, kafka-run-class.sh kafka.admin.DeleteTopicCommand, leads to an inconsistent state in the ZooKeeper topic configuration.
So I would recommend not doing it.
Maybe it worked for earlier versions, but it does not for 2.8.

ZooKeeper watches: no notification when a child node is modified

We are seeing an issue with ZooKeeper watches. We create a node "/newtest" and intend to add and modify nodes inside it. We set a watch on "/newtest". Our observation is that we get a notification when a child is added or deleted, but not when a child is modified.
Below is the output from zkCli.sh commands
========
[zk: localhost:2181(CONNECTED) 21] ls /newtest watch <=== to get the child nodes plus the watch
[1, 5, 4] <=== 1,5, 4 are the child nodes
[zk: localhost:2181(CONNECTED) 24] set /newtest/5 hello6 <=== updating the data for node “5”, no watch notification
cZxid = 0xc16
ctime = Fri Mar 11 01:03:29 UTC 2016
mZxid = 0xc78
mtime = Fri Mar 11 01:19:48 UTC 2016
pZxid = 0xc16
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
[zk: localhost:2181(CONNECTED) 25] create /newtest/6 hello6 <=== creating a new node
WATCHER::
Created /newtest/6
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/newtest <== watcher notification
[zk: localhost:2181(CONNECTED) 26] ls /newtest watch <=== Again watch
[1, 6, 5, 4]
[zk: localhost:2181(CONNECTED) 27] set /newtest/6 hello6 <== updating node “6”, no notification
cZxid = 0xc79
ctime = Fri Mar 11 01:19:59 UTC 2016
mZxid = 0xc86
mtime = Fri Mar 11 01:23:18 UTC 2016
pZxid = 0xc79
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
========
Please suggest a solution. Zookeeper version is zookeeper.version=3.4.6--1
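This is expected behavior: the watch set by ls fires only when a child of /newtest is created or deleted. A change to a child's data triggers only a data watch set on that child itself (for example with get /newtest/5 watch), and each watch fires once and must then be set again. I would suggest not using zkCli.sh for anything other than testing and small, quick operations. If you want notifications when a child node is modified, I would suggest writing your own watchers in Java using Apache Curator, and more specifically using a TreeCache.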
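To illustrate the behavior seen in the question, here is a small in-memory model of one-shot child watches and data watches. It is a toy sketch only: WatchModel and its methods are invented for illustration and are not the ZooKeeper or Curator API, and it leaves out deletes, exists watches, and sessions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory model of ZooKeeper's one-shot watches (NOT the real client API).
class WatchModel {
    private final Map<String, String> nodes = new HashMap<>();
    private final Set<String> childWatches = new HashSet<>(); // like "ls path watch"
    private final Set<String> dataWatches = new HashSet<>();  // like "get path watch"
    final List<String> events = new ArrayList<>();

    private static String parent(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    void watchChildren(String path) { childWatches.add(path); }
    void watchData(String path) { dataWatches.add(path); }

    void create(String path, String data) {
        nodes.put(path, data);
        // Creating a child fires (and consumes) the child watch on its parent.
        if (childWatches.remove(parent(path))) {
            events.add("NodeChildrenChanged:" + parent(path));
        }
    }

    void setData(String path, String data) {
        nodes.put(path, data);
        // A data change fires only a data watch on the node itself.
        if (dataWatches.remove(path)) {
            events.add("NodeDataChanged:" + path);
        }
    }

    public static void main(String[] args) {
        WatchModel zk = new WatchModel();
        zk.create("/newtest", "");
        zk.create("/newtest/5", "hello");

        zk.watchChildren("/newtest");
        zk.setData("/newtest/5", "hello6"); // no event: the child list did not change
        zk.create("/newtest/6", "hello6");  // fires the child watch on /newtest

        zk.watchData("/newtest/6");
        zk.setData("/newtest/6", "hello7"); // fires the data watch on /newtest/6

        System.out.println(zk.events);
    }
}
```

In this model, as in the zkCli session, updating /newtest/5 while only a child watch on /newtest is registered produces no event. A recursive cache such as Curator's TreeCache hides this by maintaining watches on every node in the subtree for you.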

Kafka Consumer startup error: Failed to add leader for partitions [calls,0] - NotLeaderForPartitionException

Note: this is for an older version. We are running kafka_2.9.2-0.8.1.
When attempting to run kafka-console-consumer.bat against an existing topic on Windows 7, we receive "Failed to add leader" and "NotLeaderForPartition" exceptions.
Here is the command line:
set GROUP=group1234
kafka-console-consumer.bat --group %GROUP% --zookeeper localhost:2181 --topic calls --from-beginning
Here are the errors:
[2014-05-26 15:02:12,997] WARN [group1234_S80035683-SC01-1401141732400-98745e28-leader-finder-thread], Failed to add leader for partitions [calls,0]; will retry (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread:89)
kafka.common.NotLeaderForPartitionException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at java.lang.Class.newInstance(Class.java:374)
at kafka.common.ErrorMapping$.exceptionFor(ErrorMapping.scala:73)
at kafka.consumer.SimpleConsumer.earliestOrLatestOffset(SimpleConsumer.scala:160)
at kafka.consumer.ConsumerFetcherThread.handleOffsetOutOfRange(ConsumerFetcherThread.scala:60)
at kafka.server.AbstractFetcherThread$$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:179)
at kafka.server.AbstractFetcherThread$$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:174)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
at kafka.server.AbstractFetcherThread.addPartitions(AbstractFetcherThread.scala:174)
at kafka.server.AbstractFetcherManager$$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:86)
at kafka.server.AbstractFetcherManager$$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:76)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
at kafka.server.AbstractFetcherManager.addFetcherForPartitions(AbstractFetcherManager.scala:76)
In the end we are unable to consume any messages.
Note: ZooKeeper does see the consumer's attempt to connect. In zkCli I can see the new group1234:
[zk: localhost:2181(CONNECTED) 2] ls2 /consumers/group1234
[owners, ids]
cZxid = 0x123
ctime = Mon May 26 15:02:12 PDT 2014
mZxid = 0x123
mtime = Mon May 26 15:02:12 PDT 2014
pZxid = 0x128
cversion = 2
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 2
And here is info on the requested calls topic in ZK:
[zk: localhost:2181(CONNECTED) 7] ls2 /brokers/topics/calls
[partitions]
cZxid = 0x18
ctime = Sat May 24 23:15:16 PDT 2014
mZxid = 0x18
mtime = Sat May 24 23:15:16 PDT 2014
pZxid = 0x1c
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 36
numChildren = 1
In case the topic was corrupted, I dropped it in ZK and then recreated it via kafka-topics.bat. Here is the new ZK output:
[zk: localhost:2181(CONNECTED) 15] ls2 /brokers/topics/calls
[]
cZxid = 0x136
ctime = Mon May 26 16:02:51 PDT 2014
mZxid = 0x136
mtime = Mon May 26 16:02:51 PDT 2014
pZxid = 0x136
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 36
numChildren = 0
A search shows that now, 7 years later, this is no longer a known problem in current versions.
Multiple patches have since resolved errors that may or may not be the same one, and it is almost certain that one of them fixed this issue.
As such, the only practical solution seems to be upgrading to a newer version (of Kafka and ZooKeeper, as well as Windows).
If the problem persists in currently relevant versions, please ask a new question, as the root cause is unlikely to be the same.

getData() in CuratorFramework not returning any data

When I run
get <path>
in the ZooKeeper CLI, I get the following:
192.168.0.102
cZxid = 0x2e93
ctime = Wed Feb 06 15:12:20 GMT+05:30 2013
mZxid = 0x2e93
mtime = Wed Feb 06 15:12:20 GMT+05:30 2013
pZxid = 0x2e93
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x13cae2a97ed001f
dataLength = 13
numChildren = 0
For the same path, I am trying to get the data as follows:
client.getData().forPath(path);
I then deserialize the data, but nothing is returned.
I also tried:
client.getData().inBackground().forPath(path);
client.getData().watched().inBackground().forPath(path);
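It's because you are using inBackground().
inBackground() causes the request to execute asynchronously: forPath() returns immediately without the data, and the result is delivered later to a background callback or listener. By removing inBackground(), forPath() blocks until the request completes and returns the node's data as a byte[], which should give you the desired outcome.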
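The difference between the two call styles can be sketched without Curator. FakeClient below is a hypothetical stand-in written only for illustration; it mimics the pattern (a foreground call that returns the bytes, and a background call that returns null and hands the bytes to a callback) and is not Curator's API. The stored value is the one shown in the CLI output above.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical stand-in for a client; it only mimics the foreground/background pattern.
class FakeClient {
    private final byte[] stored = "192.168.0.102".getBytes(StandardCharsets.UTF_8);

    // Foreground call: returns the data directly to the caller.
    byte[] getData(String path) {
        return stored.clone();
    }

    // Background call: returns null right away; the data arrives via the callback.
    byte[] getDataInBackground(String path, Consumer<byte[]> callback) {
        CompletableFuture.runAsync(() -> callback.accept(stored.clone()));
        return null;
    }

    public static void main(String[] args) throws Exception {
        FakeClient client = new FakeClient();

        // Foreground: decode the returned bytes.
        byte[] direct = client.getData("/some/path");
        System.out.println(new String(direct, StandardCharsets.UTF_8));

        // Background: the return value is null, the callback receives the bytes.
        CompletableFuture<String> result = new CompletableFuture<>();
        byte[] immediate = client.getDataInBackground("/some/path",
                bytes -> result.complete(new String(bytes, StandardCharsets.UTF_8)));
        System.out.println(immediate == null);
        System.out.println(result.get());
    }
}
```

In Curator itself, the background result is delivered to a BackgroundCallback passed to inBackground(...) or to a registered CuratorListener, so code that deserializes the return value of a background forPath() call has nothing to work with.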