I am using Kafka v0.9.0.1 (Scala v2.11) with com.101tec:zkclient v0.7. I am trying to use AdminUtils to create a Kafka topic. My code is the following.
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import kafka.admin.AdminUtils;
import kafka.utils.ZKStringSerializer$;
import kafka.utils.ZkUtils;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.ZkConnection;
import org.I0Itec.zkclient.serialize.ZkSerializer;

String zkServers = "node1:2181,node2:2181,node3:2181,node4:2181";
int sessionTimeout = (int) TimeUnit.SECONDS.toMillis(10L);
int connectionTimeout = (int) TimeUnit.SECONDS.toMillis(8L);
ZkSerializer zkSerializer = ZKStringSerializer$.MODULE$;
boolean isSecureKafkaCluster = false;
String topic = "test";
int partitions = 1;
int replication = 3;

ZkClient zkClient = new ZkClient(zkServers, sessionTimeout, connectionTimeout, zkSerializer);
ZkUtils zkUtils = new ZkUtils(zkClient, new ZkConnection(zkServers), isSecureKafkaCluster);
if (!AdminUtils.topicExists(zkUtils, topic)) {
    AdminUtils.createTopic(zkUtils, topic, partitions, replication, new Properties());
}
The topic is actually created as verified by the following command.
bin/kafka-topics.sh --describe --zookeeper node1:2181 --topic test
However, the output is not as expected.
Topic:test PartitionCount:1 ReplicationFactor:1 Configs:
Topic: test Partition: 0 Leader: -1 Replicas: 4 Isr:
If I use the script:
bin/kafka-topics.sh --create --zookeeper node1:2181 --replication-factor 3 --partitions 1 --topic test1
Then I see the following.
Topic:test1 PartitionCount:1 ReplicationFactor:3 Configs:
Topic: test1 Partition: 0 Leader: 2 Replicas: 2,3,4 Isr: 2
Any ideas on what I'm doing wrong? The effect is that if I use a Producer to send a ProducerRecord to the topic, nothing shows up on the topic.
I had the same issue.
Solution:
1. Clean the topic's metadata in ZooKeeper (/brokers/topics/<topic>).
2. Clean the data directories (log.dirs) to remove all topic-partition folders belonging to that topic.
3. Restart the whole Kafka cluster, all brokers at once.
4. Recreate the topic.
This solved my problem. I think the root cause was a defect in Kafka itself: it failed to handle topic removal cleanly (this has been fixed since v1.0.0).
Edit:
Even with Kafka >= v1.0.0, deleting a topic can sometimes get stuck, for instance if you are deleting an empty topic or if your Kafka cluster is under extreme load.
The solution can be as simple as restarting the controller broker: you can always find out which broker is the controller by reading the ZooKeeper node /controller (e.g. get /controller in the ZooKeeper shell). That way you restart just one broker instead of the whole Kafka cluster.
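If you want to check the controller from code rather than from the ZooKeeper shell, here is a minimal sketch using the same com.101tec zkclient library as in the question (the connection string is an assumption):

import java.nio.charset.StandardCharsets;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.serialize.BytesPushThroughSerializer;

public class FindController {
    public static void main(String[] args) {
        // assumption: local ZooKeeper; adjust the connection string to your cluster
        ZkClient zkClient = new ZkClient("localhost:2181", 10000, 8000, new BytesPushThroughSerializer());
        // /controller holds a small JSON blob that contains the current controller's brokerid
        byte[] data = zkClient.readData("/controller");
        System.out.println(new String(data, StandardCharsets.UTF_8));
        zkClient.close();
    }
}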
I want to view topic-level properties, something like
"message.timestamp.type": "LogAppendTime",
"cleanup.policy": "compact"
Is it possible to view which properties are set at the topic level?
Is there any command with which I can view my topic-level properties as mentioned above? I googled a lot and found a command, but it does not work for me.
The command is as follows:
kafka-configs.bat --describe --zookeeper localhost:2181 --entity-type topics --entity-name test
An alternative could be
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
which lists some basic information about the given Kafka topic, along with all non-default topic-level configurations (Configs). For example:
Topic:my-topic PartitionCount:1 ReplicationFactor:3 Configs: compression.type=gzip,segment.bytes=1073741824,retention.ms=100,max.message.bytes=100001200,delete.retention.ms=100000
Topic: my-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
It would be helpful to know why you think the kafka-configs script is not working.
Remember that the script only prints configurations with a non-default value.
If the script only prints something like:
Configs for topic 'test' are
then the topic configuration has probably never been changed.
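If you would rather read the topic configuration from code, a minimal sketch with the Java AdminClient (available since Kafka 0.11; the broker address and topic name are assumptions) might look like this:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class ShowTopicConfigs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "test");
            Config config = admin.describeConfigs(Collections.singleton(topic)).all().get().get(topic);
            // unlike the shell script, this prints every topic-level property, defaults included
            config.entries().forEach(e -> System.out.println(e.name() + " = " + e.value()));
        }
    }
}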
If you are familiar with Docker, I recommend running a container with the landoop/kafka-topics-ui image:
https://hub.docker.com/r/landoop/kafka-topics-ui/
It is a useful Docker image that shows all the data of your topics: messages, offsets, configurations, and so on.
These are my early days of learning Kafka, and I am checking out every Kafka property/concept on my local machine.
So I came across the property min.insync.replicas, and here is my understanding. Please correct me if I've misunderstood anything.
Once a message is sent to a topic, the message must be written to at least min.insync.replicas replicas.
min.insync.replicas also includes the leader.
If the number of available live brokers (indirectly, in-sync replicas) is less than the specified min.insync.replicas, then the producer raises an exception and fails to publish the message.
The following are the steps I followed to create the above scenario:
Started 3 brokers locally with broker ids 0, 1 and 2.
Created the topic insync and set min.insync.replicas to 2 using the following command:
sudo ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic insync --config min.insync.replicas=2
Describing the topic resulted in the following:
Topic:insync PartitionCount:1 ReplicationFactor:3 Configs:min.insync.replicas=2
Topic: insync Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 1,2,0
At this point, I made sure the property I provided was picked up by Kafka.
I started sending messages and consuming them from the terminal using the following commands:
Producer: ./kafka-console-producer.sh --broker-list localhost:9092 --topic insync --producer.config ../config/producer.properties
Consumer: ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic insync
At this point, I was able to send and receive messages successfully.
Brought down 2 brokers (0 and 2); describing the topic then resulted in the following:
Topic:insync PartitionCount:1 ReplicationFactor:3 Configs:min.insync.replicas=2
Topic: insync Partition: 0 Leader: 1 Replicas: 2,0,1 Isr: 1
At this point, only one replica is in sync (Isr: 1).
Then I tried to produce messages, and it worked: I was able to send messages from the console producer and I could see those messages in the console consumer.
My Kafka version: kafka_2.10-0.10.0.0
The following are the producer properties:
bootstrap.servers=localhost:9092
compression.type=none
batch.size=20
acks=all
I expected the producer to fail with a NotEnoughReplicasException, as documented:
public class NotEnoughReplicasException extends RetriableException
Number of insync replicas for the partition is lower than min.insync.replicas
but it worked normally.
Am I missing something? How can I create the scenario?
*************** EDIT **********************
Instead of producing messages from the console producer, I tried to generate them from Java code. This time, I got the expected exception in the Kafka broker, although I expected it in the producer (the Java code). As this experiment raised more questions, I've posted another question.
Is acks set to "all"? If not, try setting it to all.
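For instance, here is a minimal producer sketch with acks=all (the broker address and topic name are assumptions): with fewer than min.insync.replicas replicas in sync, the send should fail and surface a NotEnoughReplicasException through the callback.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class InsyncProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("acks", "all"); // min.insync.replicas is only enforced when acks=all
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("insync", "key", "value"), (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace(); // NotEnoughReplicasException shows up here
                }
            });
            producer.flush();
        }
    }
}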
I believe that error also applies to the transactional producer; you may need to add this config:
transactional.id=TID-TEST
If it is still not working, please check the replication factor and min insync replicas of the internal topic __transaction_state.
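A sketch of what that could look like in Java (note that transactions require Kafka >= 0.11, so this does not apply to the 0.10.0.0 setup above; the broker address, topic, and transactional id are assumptions):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;

public class TransactionalProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("acks", "all");
        props.put("transactional.id", "TID-TEST");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("insync", "key", "value"));
                producer.commitTransaction();
            } catch (KafkaException e) {
                // the transaction cannot be completed, so roll it back
                producer.abortTransaction();
                throw e;
            }
        }
    }
}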
Step 1: Create a topic with only one partition:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Step 2: Produce some messages to topic test.
Step 3: Start a consumer on topic test. It gets all the messages pushed in Step 2.
This works fine for a topic with 1 partition.
But when I try a topic with 2 partitions, the consumer only gets messages generated after the consumer came up.
Reproduce:
Step 1: Create a topic with two partitions:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test2
Step 2: Produce some messages to topic test2.
Step 3: Start a consumer on topic test2. It can't get the messages from Step 2.
Step 4: Keep the consumer running and produce some more messages to topic test2; now the consumer does get the messages.
Is this expected behavior, or am I missing something?
The default value of the auto.offset.reset option is 'latest'.
If you want to read messages that were sent before the consumer started, set auto.offset.reset=earliest. (With the console consumer, passing --from-beginning has a similar effect.)
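A minimal Java consumer sketch illustrating this (the broker address and group id are assumptions):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EarliestConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("group.id", "test2-reader");            // assumption
        // without this, a group with no committed offsets starts at the
        // end of the log and misses messages produced before it came up
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("test2"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
            }
        }
    }
}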
I have a partition with the following replicas:
Topic: topicname Partition: 10 Leader: 1 Replicas: 1,2,4,3 Isr: 1,2,3
where replica 4 is a non-existent broker that I accidentally added to the replica set as a typo.
I want to remove 4 from the replica set, but after running kafka-reassign-partitions.sh, the reassignment to remove replica 4 never finishes.
kafka-reassign-partitions.sh --zookeeper myzookeeperhost:2181 --reassignment-json-file remove4.txt --execute
where remove4.txt looks like:
{
  "version": 1,
  "partitions": [
    { "topic": "topicname", "partition": 10, "replicas": [1,2,3] }
  ]
}
The reassignment is stuck:
kafka-reassign-partitions.sh --zookeeper myzookeeperhost:2181 --reassignment-json-file remove4.txt --verify
Status of partition reassignment:
Reassignment of partition [topicname,10] is still in progress
I checked the controller log; it looks like the reassignment command was picked up, but nothing happens afterwards:
[2017-08-01 06:46:07,653] DEBUG [PartitionsReassignedListener on 101 (the controller broker)]: Partitions reassigned listener fired for path /admin/reassign_partitions. Record partitions to be reassigned {"version":1,"partitions":[{"topic":"topicname","partition":10,"replicas":[1,2,3]}]} (kafka.controller.PartitionsReassignedListener)
Any ideas on what I'm doing wrong? How do I remove broker 4 from the replica set?
Update: I'm running Kafka 0.10.
I was able to solve this issue by spinning up a new broker with a broker id matching the one that was accidentally added (in your case, 4).
The Kafka quickstart guide shows how to spin up a broker with a specific id. Once your node with id 4 is up, run:
./bin/kafka-topics.sh --zookeeper localhost:2181 --topic badbrokertest --describe
You should see that all the replicas are in the Isr column, like so:
Topic:badbrokertest PartitionCount:3 ReplicationFactor:3 Configs:
Topic: badbrokertest Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: badbrokertest Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: badbrokertest Partition: 2 Leader: 1 Replicas: 1,2,3,4 Isr: 1,2,3,4
Now you can reassign your partitions!
./bin/kafka-reassign-partitions.sh --reassignment-json-file badbroker2.json --zookeeper localhost:2181
Where badbroker2.json looks like:
{
"version":1,
"partitions":[
{"topic":"badbrokertest","partition":0,"replicas":[1,2,3]},
{"topic":"badbrokertest","partition":1,"replicas":[1,2,3]},
{"topic":"badbrokertest","partition":2,"replicas":[1,2,3]}
]
}
So, in short, once you've synced all your replicas by adding the missing broker, you can remove the unneeded broker from the replica set.
If you're working across several servers, be sure to set the listeners field in the config to make your temporary broker reachable by the other brokers. The quickstart guide doesn't cover that case.
listeners=PLAINTEXT://10.9.1.42:9093
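For example, the temporary broker's server.properties might contain something like the following (all values here are assumptions; adapt them to your environment):
broker.id=4
listeners=PLAINTEXT://10.9.1.42:9093
log.dirs=/tmp/kafka-logs-4
zookeeper.connect=localhost:2181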
I installed kafka on a linux server. I defined a topic with a few partitions. I know that each partition is mapped to a physical file on disk, but I don't know where it is.
Where are the partition files saved ?
In your config/server.properties you'll find a section on "Log Basics". The log.dirs property defines where your logs/partitions are stored on disk.
By default on Linux they are stored in /tmp/kafka-logs. If you navigate to this folder, you will see something like this:
recovery-point-offset-checkpoint
replication-offset-checkpoint
topic-0
msg-0
msg-1
This means that you have two topics: topic, which has 1 partition, and msg, which has 2.
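If you want to list those partition folders from code, here is a minimal Java sketch (the log directory path is an assumption, taken from the default above):

import java.io.File;

public class ListPartitionDirs {
    public static void main(String[] args) {
        // assumption: the default log.dirs location on Linux
        File logDir = new File("/tmp/kafka-logs");
        File[] dirs = logDir.listFiles(File::isDirectory);
        if (dirs != null) {
            for (File dir : dirs) {
                // partition folders are named <topic>-<partition>, e.g. msg-0, msg-1
                System.out.println(dir.getName());
            }
        }
    }
}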
As noted by Ludd, you can find the location in the config/server.properties file by looking for log.dirs.
Try running this command:
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test
You will get output like:
Topic:test Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Now open config/server.properties:
cat server.properties
and search for broker.id.
If broker.id matches the leader number, then that broker holds the leader replica of the topic partition.
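If you prefer finding the leader from code instead of reading server.properties, here is a minimal sketch with the Java AdminClient (the broker address and topic name are assumptions):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;

public class FindLeader {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singleton("test"))
                                         .all().get().get("test");
            // the leader's id corresponds to broker.id in that broker's server.properties
            desc.partitions().forEach(p ->
                    System.out.println("partition " + p.partition() + " leader " + p.leader().id()));
        }
    }
}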