How do I delete/clean Kafka queued messages without deleting Topic - apache-kafka

Is there any way to delete queue messages without deleting Kafka topics?
I want to delete the queued messages when the consumer is activated.
I know there are several ways like:
Resetting retention time
$ ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic MyTopic --config retention.ms=1000
Deleting kafka files
$ rm -rf /data/kafka-logs/<topic/Partition_name>

In Kafka 0.11 or higher you can run the bin/kafka-delete-records.sh command to mark messages for deletion.
https://github.com/apache/kafka/blob/trunk/bin/kafka-delete-records.sh
For example, publish 100 messages
seq 100 | ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic mytest
then delete the first 90 of those 100 messages with the new kafka-delete-records.sh command-line tool:
./bin/kafka-delete-records.sh --bootstrap-server localhost:9092 --offset-json-file ./offsetfile.json
where offsetfile.json contains
{"partitions": [{"topic": "mytest", "partition": 0, "offset": 90}], "version":1 }
and then consume the messages from the beginning to verify that the first 90 of the 100 messages are indeed marked as deleted:
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytest --from-beginning
91
92
93
94
95
96
97
98
99
100
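If you want to confirm the new log start offset without consuming, one option (a sketch using the GetOffsetShell class that ships with Kafka, reusing the broker and topic from the example above) is:
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic mytest --time -2
With --time -2 the tool prints the earliest available offset per partition, so after the deletion above it should report mytest:0:90, matching the offset you put in offsetfile.json.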

To delete all messages in a specific topic, you can run kafka-delete-records.sh.
For example, I have a topic called test, which has 4 partitions.
Create a JSON file, for example j.json:
{
  "partitions": [
    { "topic": "test", "partition": 0, "offset": -1 },
    { "topic": "test", "partition": 1, "offset": -1 },
    { "topic": "test", "partition": 2, "offset": -1 },
    { "topic": "test", "partition": 3, "offset": -1 }
  ],
  "version": 1
}
Now delete all messages with this command:
/opt/kafka/confluent-4.1.1/bin/kafka-delete-records --bootstrap-server 192.168.XX.XX:9092 --offset-json-file j.json
After executing the command, this message will be displayed:
Records delete operation completed:
partition: test-0 low_watermark: 7
partition: test-1 low_watermark: 7
partition: test-2 low_watermark: 7
partition: test-3 low_watermark: 7
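If the topic has many partitions, writing the JSON by hand gets tedious. One way to generate it (a sketch, assuming jq is installed and the topic is called test with partitions 0 through 3 as above) is:
seq 0 3 | jq -s '{partitions: map({topic: "test", partition: ., offset: -1}), version: 1}' > j.json
Adjust the seq range and the topic name to match your own topic.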

Related

Kafka console consumer to read Avro messages in HDP 3

I am trying to consume Kafka Avro messages with the console consumer and am not exactly sure how to deserialize the messages.
sh /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server localhost:6667 --topic test --consumer.config /home/user/kafka.consumer.properties --from-beginning --value-deserializer ByteArrayDeserializer
The Avro Schema in Schema Registry for the test topic is:
{
  "type": "record",
  "namespace": "test",
  "name": "TestRecord",
  "fields": [
    { "name": "Name", "type": "string", "default": "null" },
    { "name": "Age", "type": "int", "default": -1 }
  ]
}
I am using HDP 3.1 and kafka-clients 2.0.0.3.1.0.0-78.
Could someone help me with the deserializer required to read Avro messages from the console?
Use kafka-avro-console-consumer, e.g.:
sh /usr/hdp/current/kafka-broker/bin/kafka-avro-console-consumer.sh \
--bootstrap-server localhost:6667 \
--topic test \
--from-beginning \
--property schema.registry.url=http://localhost:8081
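Note that kafka-avro-console-consumer ships with the Confluent Schema Registry package rather than with Apache Kafka itself. If your HDP broker directory does not contain the script, the same command can be run from a Confluent Platform installation (a sketch; the /opt/confluent path is an assumption, adjust to wherever the platform is unpacked):
/opt/confluent/bin/kafka-avro-console-consumer \
--bootstrap-server localhost:6667 \
--topic test \
--from-beginning \
--property schema.registry.url=http://localhost:8081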

Clean Kafka topic in a cluster

I know I can clean a Kafka topic on a broker either by deleting the logs under /data/kafka-logs/topic/* or by setting the retention.ms config to 1000. I want to know how I can clean topics in a multi-node cluster. Should I stop the Kafka process on each broker, delete the logs, and start Kafka again, or would the leader broker alone suffice? And if I want to clean by setting retention.ms to 1000, do I need to set it on each broker?
To delete all messages in a specific topic without removing the topic itself, you can run kafka-delete-records.sh exactly as shown in the answer above: create a JSON file that lists every partition of the topic with "offset": -1, pass it with --offset-json-file, and the command will report the new low_watermark for each partition when it completes.
If you want to delete a topic entirely, you can use kafka-topics. For example, to delete the test topic:
/opt/kafka/confluent-4.0.0/bin/kafka-topics --zookeeper 109.XXX.XX.XX:2181 --delete --topic test
You do not need to restart Kafka.
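As for the retention-based approach mentioned in the question: retention.ms is a per-topic configuration stored in the cluster metadata, so you set it once from any host and it applies to every broker; there is no need to repeat it per broker. A sketch using kafka-configs (reusing the Confluent 4.x layout and the ZooKeeper address from above):
# temporarily shorten retention so the old segments get cleaned up
/opt/kafka/confluent-4.1.1/bin/kafka-configs --zookeeper 109.XXX.XX.XX:2181 --alter --entity-type topics --entity-name test --add-config retention.ms=1000
# once the data is gone, remove the override to fall back to the broker default
/opt/kafka/confluent-4.1.1/bin/kafka-configs --zookeeper 109.XXX.XX.XX:2181 --alter --entity-type topics --entity-name test --delete-config retention.ms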

Kafka consumer not able to consume messages using bootstrap server name

I am facing an issue while consuming messages using the --bootstrap-server option, i.e. connecting to the Kafka broker directly. Any idea why it is not able to consume messages without ZooKeeper?
Kafka Version: kafka_2.11-1.0.0
Zookeeper Version: kafka_2.11-1.0.0
Zookeeper Host and port: zkp02.mp.com:2181
Kafka Host and port: kfk03.mp.com:9092
Producing some messages:
[kfk03.mp.com ~]$ /bnsf/kafka/bin/kafka-console-producer.sh --broker-list kfk03.mp.com:9092 --topic test
>hi
>hi
The consumer is not able to consume messages if I give --bootstrap-server:
[kfk03.mp.com ~]$
/bnsf/kafka/bin/kafka-console-consumer.sh --bootstrap-server kfk03.mp.com:9092 --topic test --from-beginning
The consumer is able to consume messages when --zookeeper is given instead of --bootstrap-server:
[kfk03.mp.com ~]$ /bnsf/kafka/bin/kafka-console-consumer.sh --zookeeper zkp02.mp.com:2181 --topic test --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
hi
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
{"properties": {"messageType": "test", "sentDateTime": "2018-02-25T21:46:00.000+0000"}, "name": "Uttam Anand", "age": 29}
hi
hi
uttam
hi
hi
hi
hello
hi
^CProcessed a total of 17 messages
When consuming messages from Kafka with the --bootstrap-server parameter, the connection goes through the Kafka broker instead of ZooKeeper, and the broker stores consumer offsets in the internal __consumer_offsets topic.
Check whether __consumer_offsets is present in your topic list. If it's not there, check the Kafka logs to find the reason.
We faced a similar issue. In our case the __consumer_offsets topic was not created because of the following error:
ERROR [KafkaApi-1001] Number of alive brokers '1' does not meet the required replication factor '3' for the offsets topic (configured via 'offsets.topic.replication.factor').
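For a single-broker or small test cluster, one way to get past this (a sketch, reusing the Kafka 1.0.0 layout and hosts from the question; the config file path is an assumption) is to check whether the internal topic exists and, if not, lower the required replication factor before the broker auto-creates it:
# check whether the internal offsets topic exists (the 1.0.0 kafka-topics tool still talks to ZooKeeper)
/bnsf/kafka/bin/kafka-topics.sh --zookeeper zkp02.mp.com:2181 --describe --topic __consumer_offsets
# in config/server.properties, make the factor match the number of live brokers, e.g. for one broker:
offsets.topic.replication.factor=1
Then restart the broker so __consumer_offsets can be created.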

Increasing Replication Factor in Kafka gives error - "There is an existing assignment running"

I am trying to increase the replication factor of a topic in Apache Kafka. In order to do so I am using the command
kafka-reassign-partitions --zookeeper ${zookeeperid} --reassignment-json-file ${aFile} --execute
Initially my topic has a replication factor of 1 and 5 partitions; I am trying to increase its replication factor to 3. There are quite a few messages in the topic. When I run the above command I get the error "There is an existing assignment running".
My JSON file looks like this:
{
  "version": 1,
  "partitions": [
    { "topic": "IncreaseReplicationTopic", "partition": 0, "replicas": [2, 4, 0] },
    { "topic": "IncreaseReplicationTopic", "partition": 1, "replicas": [3, 2, 1] },
    { "topic": "IncreaseReplicationTopic", "partition": 2, "replicas": [4, 1, 0] },
    { "topic": "IncreaseReplicationTopic", "partition": 3, "replicas": [0, 1, 3] },
    { "topic": "IncreaseReplicationTopic", "partition": 4, "replicas": [1, 4, 2] }
  ]
}
I am not able to figure out where I am going wrong. Any pointers will be greatly appreciated.
This message means that another partition reassignment (for any topic) is already being executed.
Try again after it has finished; then you won't see this message.
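One way to check whether the earlier reassignment has finished (a sketch, assuming you still have the JSON file that started it) is to run the same tool with --verify instead of --execute:
kafka-reassign-partitions --zookeeper ${zookeeperid} --reassignment-json-file ${aFile} --verify
Once every partition is reported as completed, you can submit your new reassignment with --execute.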

Kafka high availability feature not working

I am following the quickstart in the Kafka documentation: https://kafka.apache.org/quickstart.
I have deployed 3 brokers and created a topic.
➜ kafka_2.10-0.10.1.0 bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,1,0
Then I use "bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic" to test the producer,
and "bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic" to test the consumer.
The producer and consumer work well.
If I kill server 1 or 2, the producer and consumer still work properly.
But if I kill server 0 and type a message in the producer terminal, the consumer can't read new messages.
When I kill server 0, the consumer prints warnings like the following, repeating roughly every 100 ms:
[2017-06-23 17:29:52,750] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:52,974] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
...
Then I restart server 0; the consumer prints the missed messages along with more of the same warnings:
hhhh
hello
[2017-06-23 17:32:32,795] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:32:32,902] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
This confuses me. Why is server 0 so special? Server 0 is not even the leader.
I also noticed that the server log printed by broker 0 has a lot of entries like the following:
[2017-06-23 17:32:33,640] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,23] in 38 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,641] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,26] (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,26] in 4 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,29] (kafka.coordinator.GroupMetadataManager)
but the logs of server 1 and server 2 don't have that content.
Can somebody explain this to me? Thanks very much!
Solved:
The replication factor of the __consumer_offsets topic is the root cause. It's a known issue: issues.apache.org/jira/browse/KAFKA-3959
Also, kafka-console-producer defaults to acks = 1, so that is not fault tolerant at all. Add the flag or config parameter to set acks = all; if your topic and the __consumer_offsets topic were both created with a replication factor of 3, your test will work.
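One way to set acks on the console producer used in the quickstart (a sketch; the exact option name varies a little across versions, so check --help for yours):
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic --request-required-acks -1
Here -1 is the wire value meaning acks=all; newer releases also accept the generic pass-through --producer-property acks=all.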
The servers share the load of managing consumer groups.
Usually each independent consumer has a unique consumer group ID, and you use the same group ID when you want to split the consuming work between multiple consumers.
That being said, being the leader broker for a partition is just about coordinating the other replicas; the leader has nothing to do (directly) with the server that is currently managing the group ID and the commits for a specific consumer.
So, whenever you subscribe, you are assigned a server that handles the offset commits for your group, and this has nothing to do with leader election.
Shut that server down and you may have issues with your group's consumption until the Kafka cluster stabilizes again (it reallocates your consumer by moving the group management to other servers, or waits for the node to respond again; I am not expert enough to tell you exactly how the failover happens).
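If you want to see which broker is currently acting as the coordinator for your group, newer Kafka releases (roughly 2.0 and later, so not the 0.10.1 used in the question) expose it through the consumer-groups tool; a sketch:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group console-consumer-97540 --state
The output includes a COORDINATOR column showing the broker that handles offset commits for that group.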
Probably, the topic __consumer_offsets has "Replicas" set to broker 0 only (replication factor 1).
To confirm this, verify the topic __consumer_offsets:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets
Topic: __consumer_offsets PartitionCount: 50 ReplicationFactor: 1 Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 3 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 4 Leader: 0 Replicas: 0 Isr: 0
...
Topic: __consumer_offsets Partition: 49 Leader: 0 Replicas: 0 Isr: 0
Notice the "Replicas: 0 Isr: 0". This is the reason why, when you stop broker 0, the consumer doesn't get the messages anymore.
To correct this, you need to alter the "Replicas" of the __consumer_offsets topic to include the other brokers.
Create a JSON file like this (config/inc-replication-factor-consumer_offsets.json):
{"version":1,
"partitions":[
{"topic":"__consumer_offsets", "partition":0, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":1, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":2, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":3, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":4, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":5, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":6, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":7, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":8, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":9, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":10, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":11, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":12, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":13, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":14, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":15, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":16, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":17, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":18, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":19, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":20, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":21, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":22, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":23, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":24, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":25, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":26, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":27, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":28, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":29, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":30, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":31, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":32, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":33, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":34, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":35, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":36, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":37, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":38, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":39, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":40, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":41, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":42, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":43, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":44, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":45, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":46, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":47, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":48, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":49, "replicas":[0, 1, 2]}
]
}
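Writing out all 50 entries by hand is error-prone; one way to generate the same file (a sketch, assuming jq is available) is:
seq 0 49 | jq -s '{version: 1, partitions: map({topic: "__consumer_offsets", partition: ., replicas: [0, 1, 2]})}' > config/inc-replication-factor-consumer_offsets.json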
Execute the following command:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper localhost:2181 --reassignment-json-file ../config/inc-replication-factor-consumer_offsets.json --execute
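You can optionally check that the reassignment has finished before describing the topic (a sketch, reusing the same JSON file):
kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file ../config/inc-replication-factor-consumer_offsets.json --verify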
Confirm the "Replicas":
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets
Topic: __consumer_offsets PartitionCount: 50 ReplicationFactor: 3 Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 1 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 3 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
...
Topic: __consumer_offsets Partition: 49 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Now you can stop only broker 0 again, produce some messages, and see the result on the consumer.