How to access a protected znode from ZooKeeper using zkCli? - apache-zookeeper

I have created a znode using:
zookeeper-0:/opt/zookeeper/bin # ./zkCli.sh create /mynode content digest:user:pass:cdrwa
How do I access the znode using the zkCli.sh utility now?
zookeeper-0:/opt/zookeeper/bin # ./zkCli.sh get /mynode
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
Authentication is not valid : /mynode
zookeeper-0:/opt/zookeeper/bin #
The getAcl command shows the following:
zookeeper-0:/opt/zookeeper/bin # ./zkCli.sh getAcl /mynode
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
'digest,'user:pass
: cdrwa
zookeeper-0:/opt/zookeeper/bin #

You need to create the digest ACL using the hashed password.
ZooKeeper Programmer's Guide
digest uses a username:password string to generate MD5 hash which is then used as an ACL ID identity. Authentication is done by sending the username:password in clear text. When used in the ACL the expression will be the username:base64 encoded SHA1 password digest.
Generate the hashed password
$ java -cp "./zookeeper-3.4.13.jar:./lib/slf4j-api-1.7.25.jar" \
org.apache.zookeeper.server.auth.DigestAuthenticationProvider user:pass
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
user:pass->user:smGaoVKd/cQkjm7b88GyorAUz20=
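The same digest can be computed without the Java classpath invocation; here is a short Python sketch of what DigestAuthenticationProvider does (the base64-encoded SHA-1 of the `user:password` string):

```python
import base64
import hashlib

def zk_digest(user: str, password: str) -> str:
    """Mirror DigestAuthenticationProvider.generateDigest:
    base64-encoded SHA-1 of the "user:password" string."""
    sha1 = hashlib.sha1(f"{user}:{password}".encode("utf-8")).digest()
    return f"{user}:{base64.b64encode(sha1).decode('ascii')}"

print(zk_digest("user", "pass"))  # user:smGaoVKd/cQkjm7b88GyorAUz20=
```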
Create a node using the hashed password
[zk: zookeeper(CONNECTED) 0] create /mynode content digest:user:smGaoVKd/cQkjm7b88GyorAUz20=:cdrwa
Created /mynode
Accessing the protected node
[zk: zookeeper(CONNECTED) 1] get /mynode
Authentication is not valid : /mynode
[zk: zookeeper(CONNECTED) 2] addauth digest user:pass
[zk: zookeeper(CONNECTED) 3] get /mynode
content
cZxid = 0x14
ctime = Wed Sep 12 19:37:48 GMT 2018
mZxid = 0x14
mtime = Wed Sep 12 19:37:48 GMT 2018
pZxid = 0x14
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 0

If you look at the content of Solr's zkcli.sh script, you will see a commented-out block showing how to configure an environment variable with credentials:
#SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider \
# -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
# -DzkDigestUsername=admin-user -DzkDigestPassword=CHANGEME-ADMIN-PASSWORD \
# -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=CHANGEME-READONLY-PASSWORD"
You can configure the environment variable SOLR_ZK_CREDS_AND_ACLS on your local system with the correct credentials following this template, and then the zkcli.sh script will use them when communicating with ZooKeeper.

Related

How to retain the Kafka retention.bytes and retention segment settings even after a Kafka machine reboot [duplicate]

We set retention.bytes to 104857600 for the topic topic_test:
[root@confluent01 ~]# kafka-topics --zookeeper localhost:2181 --alter --topic topic_test --config retention.bytes=104857600
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
Going forward, please use kafka-configs.sh for this functionality
Updated config for topic "topic_test".
Now we verify the retention bytes from ZooKeeper:
[root@confluent01 ~]# zookeeper-shell confluent01:2181 get /config/topics/topic_test
Connecting to confluent1:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"version":1,"config":{"retention.bytes":"104857600"}}
cZxid = 0xb30a00000038
ctime = Mon Jun 29 11:42:30 GMT 2020
mZxid = 0xb31100008978
mtime = Wed Jul 22 19:22:20 GMT 2020
pZxid = 0xb30a00000038
cversion = 0
dataVersion = 7
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 54
numChildren = 0
Now we rebooted the Kafka confluent01 machine.
After the machine started and the Kafka service came up successfully, we checked the retention bytes in ZooKeeper again.
But now (after the machine reboot) we can see that the retention bytes value is no longer configured in ZooKeeper:
[root@confluent01 ~]# zookeeper-shell confluent01:2181 get /config/topics/topic_test
Connecting to confluent1:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"version":1,"config":{}}   <-- no retention.bytes value
cZxid = 0xb30a00000038
ctime = Mon Jun 29 11:42:30 GMT 2020
mZxid = 0xb3110000779b
mtime = Wed Jul 22 14:09:19 GMT 2020
pZxid = 0xb30a00000038
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 25
numChildren = 0
The question is: how do we retain the retention bytes even after a restart of the Kafka machine?
NOTE: we do not want to set the retention bytes in server.properties,
because we set different retention bytes for each topic.
ZooKeeper and Kafka default to storing their data in /tmp (in the sample configuration files shipped with the distribution).
If you reboot the machine, /tmp is cleared.
Likewise, if you use the confluent start command, the data is not permanent.
If you use Docker/Kubernetes without mounting any volumes, the data is also not permanent.
You should also be using the kafka-topics --describe command rather than zookeeper-shell, as ZooKeeper will be removed completely in upcoming Kafka releases.
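For a permanent setup, point both data directories at locations that survive a reboot. A sketch, assuming the stock sample config files ship with /tmp defaults and using example paths:

```properties
# config/zookeeper.properties -- sample default is dataDir=/tmp/zookeeper
dataDir=/var/lib/zookeeper

# config/server.properties -- sample default is log.dirs=/tmp/kafka-logs
log.dirs=/var/lib/kafka-logs
```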

How to capture the retention bytes values from ZooKeeper metadata

We are trying to capture the retention.bytes value for the topic topic_test.
We tried the following, but it seems this isn't the right path in ZooKeeper:
zookeeper-shell kafka1:2181,kafka2:2181,kafka3:2181 <<< "ls /brokers/topics/topic_test/partitions/88/state"
Connecting to kafka1:2181,kafka2:2181,kafka3:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[]
Any idea where the per-topic retention bytes values can be captured from ZooKeeper?
I did the following but do not see the retention bytes (what is wrong here?); we have Confluent version 0.1:
zookeeper-shell confluent1:2181 get /config/topics/test_topic
Connecting to kafka1:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"version":1,"config":{}}
cZxid = 0xb30a00000038
ctime = Mon Jun 29 11:42:30 GMT 2020
mZxid = 0xb30a00000038
mtime = Mon Jun 29 11:42:30 GMT 2020
pZxid = 0xb30a00000038
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 25
numChildren = 0
Configurations are stored in Zookeeper under the /config path.
For example, for the topic topic_test:
# Create topic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
--topic topic_test --partitions 1 --replication-factor 1 --config retention.bytes=12345
# Retrieve configs from Zookeeper
./bin/zookeeper-shell.sh localhost get /config/topics/topic_test
Connecting to localhost
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"version":1,"config":{"retention.bytes":"12345"}}
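If you are scripting against the znode payload, the JSON can be parsed directly. A minimal Python sketch, with the payload string copied from the output above:

```python
import json

# Znode payload as returned by `get /config/topics/topic_test` above
payload = '{"version":1,"config":{"retention.bytes":"12345"}}'

config = json.loads(payload)["config"]
# Per-topic overrides are stored as strings; a missing key means
# the topic falls back to the broker default
retention_bytes = int(config.get("retention.bytes", -1))
print(retention_bytes)  # 12345
```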
Note that in most cases you should not rely on direct access to ZooKeeper; instead, use the Kafka API to retrieve these values.
Using:
kafka-topics.sh:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic topic_test
kafka-configs.sh:
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type topics --entity-name topic_test
The Admin API using describeConfigs()

How to check Kafka server status or details?

Is there a command to show the details of Kafka server or the status of Kafka server? (I am not trying to find out if the kafka server is running.)
I can only find information on topic, partition, producer, and consumer CLI commands.
If you are looking for the Kafka cluster broker status, you can use zookeeper cli to find the details for each broker as given below:
ls /brokers/ids returns the list of active brokers IDs on the cluster.
get /brokers/ids/<id> returns the details of the broker with the given ID.
Example :
kafka_2.12-1.1.1 % ./bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[0]
kafka_2.12-1.1.1 % ./bin/zookeeper-shell.sh localhost:2181 get /brokers/ids/0
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://localhost:9092"],"jmx_port":-1,"host":"localhost","timestamp":"1558428038778","port":9092,"version":4}
cZxid = 0x116
ctime = Tue May 21 08:40:38 UTC 2019
mZxid = 0x116
mtime = Tue May 21 08:40:38 UTC 2019
pZxid = 0x116
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x16ad9830f16000b
dataLength = 188
numChildren = 0
You can put these steps in a shell script to get the details for all brokers.
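For scripting, the broker registration JSON can also be parsed directly. A sketch in Python, using the payload returned by `get /brokers/ids/0` above:

```python
import json

# Broker registration JSON as returned by `get /brokers/ids/0` above
znode = (
    '{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},'
    '"endpoints":["PLAINTEXT://localhost:9092"],"jmx_port":-1,'
    '"host":"localhost","timestamp":"1558428038778","port":9092,"version":4}'
)

broker = json.loads(znode)
print(broker["endpoints"])             # ['PLAINTEXT://localhost:9092']
print(broker["host"], broker["port"])  # localhost 9092
```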
You can activate JMX metrics by setting the JMX_PORT environment variable:
$ export JMX_PORT=9010
$ ./bin/kafka-server-start.sh ./config/server.properties
Then, you can use jconsole or Java Mission Control to display cluster metrics.

How to see a topic creation and alteration timestamp in Kafka

Or at least one of them? I don't see them when I use kafka-topics.sh --list or --describe; perhaps I'm missing a verbosity option, although I don't see them in the attribute list for topic configuration at all. Is this information not available in Kafka?
You can see the Kafka topic creation time (ctime) and last modified time (mtime) in the ZooKeeper stat.
Run the stat command from the ZooKeeper shell:
kafka % bin/zookeeper-shell.sh localhost:2181 stat /brokers/topics/test-events
It will return below details:
Connecting to localhost:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
cZxid = 0x1007ac74c
ctime = Thu Nov 01 10:38:39 UTC 2018
mZxid = 0x4000f6e26
mtime = Mon Jan 07 05:22:25 UTC 2019
pZxid = 0x1007ac74d
cversion = 1
dataVersion = 8
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 112
numChildren = 1
You can refer to this to understand the attributes: https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_zkStatStructure
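When read programmatically (rather than through the shell, which formats them), the stat's ctime/mtime are milliseconds since the Unix epoch. A small conversion sketch, using the broker registration timestamp 1558428038778 seen elsewhere on this page:

```python
from datetime import datetime, timezone

# ctime/mtime in a znode Stat are milliseconds since the Unix epoch
ctime_ms = 1558428038778  # example value from a broker registration znode
created = datetime.fromtimestamp(ctime_ms / 1000, tz=timezone.utc)
print(created.strftime("%a %b %d %H:%M:%S UTC %Y"))  # Tue May 21 08:40:38 UTC 2019
```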
Kafka does not publicly expose the date of topic creation/alteration.
The timing data itself is not required for Kafka to work. The current topic config values are kept by the ZooKeeper ensemble that the whole Kafka cluster requires to function, so they are kept in sync by the underlying ZooKeeper process. As for what Kafka itself must synchronize, only the offsets within a topic are needed to partially order messages as they arrive; the timestamp is not required information.
If you want to keep track of topic modifications, your best bet may be to write such modifications to a Kafka topic so that you can read them back later.

Appropriate settings to register broker/port information for Kafka Cluster

I read the following from the Confluence wiki for Kafka, quoted below:
Why do I see error "Should not set log end offset on partition" in the
broker log?
Typically, you will see errors like the following:
kafka.common.KafkaException: Should not set log end offset on partition [test,22]'s local replica 4
ERROR [ReplicaFetcherThread-0-6], Error for partition [test,22] to broker 6: class kafka.common.UnknownException (kafka.server.ReplicaFetcherThread)
A common problem is that more than one broker registered the same
host/port in Zookeeper. As a result, the replica fetcher is confused
when fetching data from the leader. To verify that, you can use a
Zookeeper client shell to list the registration info of each broker.
The Zookeeper path and the format of the broker registration is
described in Kafka data structures in Zookeeper. You want to make sure
that all the registered brokers have unique host/port.
According to the official documentation, if I set PLAINTEXT://:9092 then all interfaces will register using port 9092; 0.0.0.0 means the default interface will register using port 9092.
If this is true, then I don't see how a 0.0.0.0:9092 broker registration can avoid confusing ZooKeeper. I think that if I don't explicitly specify the hostname or IP address along with the port, ZooKeeper will always get confused, since all brokers register with the same interface and port number. I have confirmed that using zookeeper-shell.bat and running the get /brokers/ids/{id} command.
The following is from Zookeeper Client Shell enquiry on /brokers/ids
get /brokers/ids/1
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657734","port":9092,"version":4}
cZxid = 0xe0000000f
ctime = Fri Jul 21 14:17:37 UTC 2017
mZxid = 0xe0000000f
mtime = Fri Jul 21 14:17:37 UTC 2017
pZxid = 0xe0000000f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15d6582c70b0001
dataLength = 184
numChildren = 0
get /brokers/ids/2
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657006","port":9092,"version":4}
cZxid = 0xe0000000b
ctime = Fri Jul 21 14:17:37 UTC 2017
mZxid = 0xe0000000b
mtime = Fri Jul 21 14:17:37 UTC 2017
pZxid = 0xe0000000b
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15d6582c70b0000
dataLength = 184
numChildren = 0
get /brokers/ids/3
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646656895","port":9092,"version":4}
cZxid = 0xe00000008
ctime = Fri Jul 21 14:17:36 UTC 2017
mZxid = 0xe00000008
mtime = Fri Jul 21 14:17:36 UTC 2017
pZxid = 0xe00000008
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x35d6582c7800000
dataLength = 184
numChildren = 0
Has anyone got a better idea?
In the Kafka server.properties file, there are two property keys:
listeners
The address the socket server listens on. It will get the value returned from
java.net.InetAddress.getCanonicalHostName() if not configured.
FORMAT:
listeners = listener_name://host_name:port
EXAMPLE:
listeners = PLAINTEXT://your.host.name:9092
advertised.listeners
Hostname and port the broker will advertise to producers and
consumers. If not set, it uses the value for "listeners" if
configured. Otherwise, it will use the value returned from
java.net.InetAddress.getCanonicalHostName().
OK, pay attention to the details of advertised.listeners: if you don't configure this property, it defaults to the value of listeners. When you set listeners to 0.0.0.0:9092, Kafka will listen on all network interfaces of the server. But if advertised.listeners is also left at 0.0.0.0, then others will not know how to connect to your Kafka server; consumers, producers, and ZooKeeper will all fail to find it.
So, in a word, advertised.listeners should be set to a public address (hostname or IP) through which other machines can connect to your server.
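A sketch of a working combination (broker1.example.com is a placeholder for the broker's reachable hostname):

```properties
# server.properties
# Bind on all interfaces, but advertise a resolvable address to clients
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://broker1.example.com:9092
```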