Kafka zookeeper authentication not working - apache-kafka

I am trying to enable SASL username and password authentication for a Kafka cluster with no SSL. I followed the steps in this Stack Overflow question:
Kafka SASL zookeeper authentication
SASL authentication seems to be working for the Kafka brokers: consumers and producers have to authenticate before writing to or reading from a topic. So far so good.
The problem is with creating and deleting topics on Kafka. When I try to use the following command, for example:
~/kafka/bin/kafka-topics.sh --list --zookeeper 10.x.y.z:2181
I am able to list all topics in the Kafka cluster and create or delete any topic with no authentication at all.
I tried to follow the steps here:
Super User Authentication and Authorization
but nothing seems to work.
Any help in this matter is really appreciated.
Thanks & Regards,
Firas Khasawneh

You need to add zookeeper.set.acl=true to your Kafka server.properties so that Kafka creates everything in ZooKeeper with ACLs set. Topics that already exist will have no ACLs, so anyone can still remove them directly from ZooKeeper.
Actually, because of that mess, I had to delete everything from my ZooKeeper and Kafka and start from scratch.
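As a rough sketch, the relevant server.properties lines might look like this, assuming SASL/PLAIN with no SSL as described in the question (the listener address and port are placeholders):
listeners=SASL_PLAINTEXT://0.0.0.0:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
# make the brokers create znodes with ACLs
zookeeper.set.acl=true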
But once everything is set, you can open the ZooKeeper shell to verify that the ACLs are indeed set:
KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/your/jaas.conf" bin/zookeeper-shell.sh XXXXX:2181
From the shell you can run getAcl /brokers/topics and check that the world:anyone entry does not have cdrwa permissions.
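On a znode created by a secured broker, the output should look roughly like this (the sasl principal depends on your JAAS configuration; kafka here is just a placeholder):
getAcl /brokers/topics
'world,'anyone
: r
'sasl,'kafka
: cdrwa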
On a side note, the link you provided doesn't seem to reflect how the current version of Kafka stores information in ZooKeeper. I briefly looked at the code, and for those kafka-topics.sh commands the topic information comes from /brokers/topics instead of /config/topics.

Related

Kafka: Topic list discrepancy between zookeeper and bootstrap-server

On a couple of my clusters I'm seeing a discrepancy between the list of topics returned by ZooKeeper as compared to the broker, i.e. the following commands return different (fewer, in the case of the broker) results:
kafka-topics.sh --zookeeper $zookeeper --list
kafka-topics.sh --bootstrap-server $broker --command-config $clientProperties --list
I've seen this behaviour with multiple client versions which leads me to assume that the issue is on the server side, but I have no idea what the root cause is or how to fix it.
It causes an issue for me because I'm using some code that uses the brokers for GET operations like listing topics, and zookeeper for SET operations (create/updating topics). If the broker doesn't return a topic in a listing, then the code path leads to a CREATE action against zookeeper and that will be rejected (it will fail). Unfortunately, I don't control the code so I can't apply a fix there.
Nonetheless, surely the list of topics in zookeeper should be identical to the list in the broker?
I'm using Kafka (Amazon MSK) version 2.2.1
Thanks for the suggestions in this post. This is the explanation and solution:
The command "kafka-topics.sh --zookeeper" and "kafka-topics.sh --bootstrap-server" return two different outputs because the latter takes into account the configured ACLs which, in this case, prevent access to the topic metadata. Hence, the command through zookeeper provides the full list of topics, whereas the command through the broker provides only the topics for which ACLs are not configured.
In order to ensure the second command works as expected, you need to explicitly add to the ACL list of the affected topics access to the "DESCRIBE" operation
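A hedged sketch of that grant using the kafka-acls.sh tool that ships with Kafka (the principal and topic name are placeholders; $broker and $clientProperties are the same variables as in the commands above):
kafka-acls.sh --bootstrap-server $broker --command-config $clientProperties --add --allow-principal User:myclient --operation Describe --topic mytopic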
(^^ kudos to AWS Support for figuring this out)

Kafka topic not able to assign leaders after creation

I was using a Kafka topic, and its metadata as well, in my application. I hard deleted the topic from the ZooKeeper shell by deleting the directories corresponding to that topic. After creating the topic again, I described the topic and found that no leaders have been assigned to this newly created topic. In the consumer, I can see repeated logs printing LEADER_NOT_AVAILABLE. Any idea what I'm doing wrong? Or is there maybe a way to delete the metadata related to the Kafka topic as well that I'm unaware of? Thanks in advance!
Deleting topics in Kafka hasn't been straightforward until recently. In general, you shouldn't attempt to delete Kafka topics by deleting metadata in Zookeeper. You should always use the included command line utilities.
First you need to make sure that deleting topics is enabled in the server.properties file on all brokers, and do a rolling restart if needed:
delete.topic.enable=true
After you restart the brokers to enable topic deletion, you should issue the delete command using the command line utilities:
./kafka-topics.sh --zookeeper <zookeeper_host>:2181 --delete --topic <topic_name>
If at this point it's still stuck, run these two commands from the ZooKeeper shell to remove all remaining metadata for that particular topic:
rmr /brokers/topics/<topic_name>
rmr /admin/delete_topics/<topic_name>
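If your ZooKeeper shell doesn't recognize rmr (it was replaced by deleteall in newer ZooKeeper releases), the equivalent commands are:
deleteall /brokers/topics/<topic_name>
deleteall /admin/delete_topics/<topic_name>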
A few more details here:
https://medium.com/@contactsunny/manually-delete-apache-kafka-topics-424c7e016ff3

How to delete kafka topic from cluster version : 0.10.2.1

I am not able to delete a Kafka topic; it's marked for deletion but never gets deleted. I am running a Kafka cluster with a ZooKeeper cluster.
Kafka version: 0.10.2.1
Can anyone help me with the list of steps that one needs to follow in order to delete a topic in a Kafka cluster?
I went through various questions on Stack Overflow but could not find a valid, workable answer.
You need to enable this property in the config before starting the Kafka server; it is disabled by default. To enable topic deletion, first stop the Kafka server, then open server.properties in the config directory
and either uncomment #delete.topic.enable=true or add
delete.topic.enable=true
at the end of the file.
Now you can start the Kafka server, and then you can delete any topic you want via:
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic YOUR_TOPIC_NAME
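As a quick sanity check, you can then list the topics to confirm the one you deleted is gone:
bin/kafka-topics.sh --list --zookeeper localhost:2181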
You could also use Kafka Tool (a GUI client); a download link is available on its website.
Then connect to your Kafka server.
After that you can see the available topics on that server. From there you can select and delete the topics.

Not Kerberized Kafka broker connection to Kerberized Zookeeper

I couldn't find any info about this issue, so I'd be glad if someone could help me on this.
I have a Kerberized cluster with services such as Hbase, MapReduce, HDFS, Zookeeper,... all kerberized and working.
Let's imagine I want to add some kafka brokers to the cluster, but I do not want to Kerberize Kafka, since a shot in the testicles makes me feel better than the idea of a kerberized Kafka.
I don't know if I'm missing something, some parameter... probably I am... but can ZooKeeper be told that it also has to accept PLAINTEXT requests for some nodes, or for some specific paths, such as kafka in the example:
zookeeper:2181/kafka
To sum up, the question is:
Is there any option to include a non-Kerberized Kafka broker and make it work against the already Kerberized ZooKeeper in the cluster?
If you need configuration like:
[zookeeper] <----- SASL ----> [kafka] <----- non-authenticated request ---> [clients]
then yes, it's possible. You just need to:
1. Create a principal (with keytab) for the brokers, which will be used to communicate with ZooKeeper.
2. Configure ZooKeeper ACLs, granting cdrwa access on the node zookeeper:2181/kafka to that user.
3. Copy the keytab to the brokers and configure the Kafka JAAS file like this:
ZookeeperClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/path/to/keytab"
  principal="user@REALM";
};
Then set zookeeper.set.acl=true in the Kafka configuration, but do not set any authorizer.class.name (that would enable authorization for Kafka consumers and producers).
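A rough sketch of how the broker might then be started, assuming the JAAS file above is saved at the placeholder path /path/to/kafka_jaas.conf; because the login context is named ZookeeperClient rather than the default Client, the zookeeper.sasl.clientconfig system property has to point at it:
KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/kafka_jaas.conf -Dzookeeper.sasl.clientconfig=ZookeeperClient" bin/kafka-server-start.sh config/server.properties
And in server.properties, the chroot and ACL flag from the answer above:
zookeeper.connect=zookeeper:2181/kafka
zookeeper.set.acl=true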

Why do we need to mention Zookeeper details even though Apache Kafka configuration file already has it?

I have been using Apache Kafka in a (plain vanilla) Hadoop cluster for the past few months, and I am asking this question out of curiosity, just to gain additional knowledge about it.
The Kafka server.properties file already has the parameter below:
zookeeper.connect=localhost:2181
And I am starting the Kafka server/broker with the following command:
bin/kafka-server-start.sh config/server.properties
So I assume that Kafka automatically picks up the ZooKeeper details by the time we start the Kafka server itself. If that's the case, then why do we need to explicitly mention the ZooKeeper properties when we create Kafka topics? The syntax is given below for your reference:
bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic test
As per the Kafka documentation, we need to start ZooKeeper before starting the Kafka server. So I don't think Kafka can be started by commenting out the ZooKeeper details in Kafka's server.properties file.
But can we at least use Kafka to create topics and to start a Kafka producer/consumer without explicitly mentioning ZooKeeper in their respective commands?
The zookeeper.connect parameter in the Kafka properties file is needed so that each Kafka broker in the cluster can connect to the ZooKeeper ensemble.
ZooKeeper keeps information about connected brokers and handles the controller election. Other than that, it stores information about topics, quotas and ACLs, for example.
When you use the kafka-topics.sh tool, topic creation happens at the ZooKeeper level first; from there the information is propagated to the Kafka brokers, and topic partitions are created and assigned to them (thanks to the elected controller). This connection to ZooKeeper will not be needed in the future thanks to the new Admin Client API, which provides admin operations executed directly against the Kafka brokers. For example, there is an open JIRA (https://issues.apache.org/jira/browse/KAFKA-5561), which I'm working on, for having the tool use that API for topic admin operations.
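As an illustration, in newer Kafka versions (2.2 and later) kafka-topics.sh accepts a broker address instead of ZooKeeper, so the topic creation from the question can be done directly against a broker:
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test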
Regarding producer and consumer: the producer doesn't need to connect to ZooKeeper, and only the "old" consumer (before version 0.9.0) needs a ZooKeeper connection, because it saves topic offsets there; from version 0.9.0, the "new" consumer saves topic offsets in a real topic (__consumer_offsets). To use it, you have to pass the bootstrap-server option on the command line instead of the zookeeper one.
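For example, the console consumer shipped with Kafka connects to a broker directly (the topic name is just the one from the question above):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning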