How to manipulate offsets of the source database for Debezium - apache-kafka

So I've been experimenting with Kafka and trying to manipulate/change the offsets of the source database, following the Debezium FAQ (https://debezium.io/documentation/faq/). I was able to do it successfully, but I'm wondering how to do the same with native Kafka commands instead of kafkacat.
These are the kafkacat commands I'm using:
kafkacat -b kafka:9092 -C -t connect_offsets -f 'Partition(%p) %k %s\n'
and
echo '["In-house",{"server":"optimus-prime"}]|{"ts_sec":1657643280,"file":"mysql-bin.000200","pos":2136,"row":1,"server_id":223344,"event":2}' | \
kafkacat -P -b kafka:9092 -t connect_offsets -K \| -p 2
This reverts the source system's offset back to a previous binlog position, so I can read the database from an earlier point in time. It works well, but I'm wondering what the equivalent native Kafka commands would be, since we don't have kafkacat on our dev/prod servers (although I do see its value, and maybe it will be installed in the future). This is the translation I have so far, but it's not quite doing what I expect.
./kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic connect_offsets \
  --property print.offset=true --property print.partition=true \
  --property print.headers=true --property print.timestamp=true \
  --property print.key=true --from-beginning
Running this shows the offset records as expected, so the consumer side works. But when I try to translate the producer command, I run into issues.
./kafka-console-producer.sh --bootstrap-server kafka:9092 --topic connect_offsets \
  --property parse.partition=true --property parse.key=true \
  --property key.separator=":"
I get a prompt after the producer command and I enter this
["In-house",{"server":"optimus-prime"}]:{"ts_sec":1657643280,"file":"mysql-bin.000200","pos":2136,"row":1,"server_id":223344,"event":2}:2
But it seems the message isn't being accepted, because the binlog position doesn't update when I run the consumer command again. Any ideas? Let me know.
EDIT: After applying OneCricketeer's changes I'm getting this stack trace.

key.separator=":" will be a problem, since it splits your input at the first ":", i.e. right after ["In-house",{"server":
So, basically you produced a bad event into the topic, and may have broken the connector...
If you want to use essentially the same command, keep your key separator as |, or any other character that will not appear in the key.
Also, parse.partition isn't a property the console producer uses, so remove the trailing :2... I'm not even sure kafka-console-producer can target a specific partition.
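Putting that advice together, here is a sketch of the native-producer translation that keeps `|` as the separator. The broker address, topic, and offset payload are the question's example values; the executed part below only demonstrates where the key/value boundary falls, while the actual produce command (which needs a reachable broker) is shown commented:

```shell
# The key itself contains ':' characters, so use '|' as key.separator instead.
MSG='["In-house",{"server":"optimus-prime"}]|{"ts_sec":1657643280,"file":"mysql-bin.000200","pos":2136,"row":1,"server_id":223344,"event":2}'

# Sanity-check the split: everything before the first '|' is the key,
# everything after it is the value.
KEY="${MSG%%|*}"
VALUE="${MSG#*|}"
echo "key:   $KEY"
echo "value: $VALUE"

# To actually produce (requires a reachable broker, 'kafka:9092' as in the question):
# echo "$MSG" | ./kafka-console-producer.sh --bootstrap-server kafka:9092 \
#   --topic connect_offsets --property parse.key=true --property key.separator='|'
```

Note there is no partition argument here: the console producer picks the partition itself, which is one reason the `:2` suffix in the original attempt had no effect.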

Related

Read Avro messages from Kafka in terminal - kafka-avro-console-consumer alternative

I'm trying to find the easiest way to read Avro messages from Kafka topics in a readable format. One option is Confluent's kafka-avro-console-consumer, used as follows:
./kafka-avro-console-consumer \
--topic topic \
--from-beginning \
--bootstrap-server bootstrap_server_url \
--max-messages 10 \
--property schema.registry.url=schema_registry_url
but for this I need to download the whole Confluent Platform (1.7 GB), which seems like overkill for my scenario.
Is there an easier alternative for reading Avro messages from Kafka topics in the terminal?
I was able to get the last Avro message in readable form with kcat:
kcat -C -b bootstrap_server \
-t topic \
-r schema_registry \
-p 0 -o -1 -s value=avro -e
You will need additional Kafka tooling that supports the Schema Registry and the Avro format, such as ksqlDB, Conduktor, AKHQ, or similar GUI tools.
kcat might support Avro now; I cannot recall.
You could write your own consumer script. The Python library from Confluent doesn't require much code to consume Avro records.
You could also clone the Schema Registry project from GitHub, build it on its own, and then use the CLI scripts there.
Another way to read Avro messages is to use Kafdrop. Try it by adding a section like this to your docker-compose.yml, alongside the broker, schema-registry, and other containers:
kafdrop:
  image: obsidiandynamics/kafdrop
  restart: "no"
  ports:
    - "9001:9000"
  environment:
    KAFKA_BROKERCONNECT: "broker:9092"
    JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
    CMD_ARGS: "--message.format=AVRO --schemaregistry.connect=http://schema-registry:8081"
  depends_on:
    - "broker"
After that, open Kafdrop at http://localhost:9001, click the topic the Avro messages are written to, and choose the AVRO message format in the drop-down.

kafka delete a topic using bootstrap server vs zookeeper

I want to know the difference between these commands.
-- With bootstrap server
kafka-topics \
--bootstrap-server b-1.bhuvi-cluster-secure.jj9mhr.c3.kafka.ap-south-1.amazonaws.com:9098,b-2.bhuvi-cluster-secure.jj9mhr.c3.kafka.ap-south-1.amazonaws.com:9098 \
--delete \
--topic debezium-my-topic \
--command-config /etc/kafka/client.properties
-- With zookeeper
kafka-topics \
--zookeeper z-3.bhuvi-cluster-secure.jj9mhr.c3.kafka.ap-south-1.amazonaws.com:2182,z-1.bhuvi-cluster-secure.jj9mhr.c3.kafka.ap-south-1.amazonaws.com:2181,z-2.bhuvi-cluster-secure.jj9mhr.c3.kafka.ap-south-1.amazonaws.com:2181 \
--delete \
--topic debezium-my-topic
The reason behind this is that the Kafka ACL for deleting the topic is restricted. If I run the first command, it gives an error like Topic authorization failed, which is correct (due to the ACL), but the second command didn't check the ACLs at all and deleted the topic directly.
The authorizer.class.name configured on your brokers is only consulted on the broker path; the --zookeeper option writes directly to ZooKeeper, so broker-side ACL checks are bypassed entirely.
The CLI --zookeeper option is deprecated and is removed entirely as part of KIP-500 (ZooKeeper removal).
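For reference, the `--command-config` file in the first command is what carries the client-side authentication that the broker's authorizer then evaluates. A minimal sketch of `/etc/kafka/client.properties`, assuming the 9098 listener is MSK IAM auth (substitute whatever mechanism your cluster actually uses):

```properties
# Assumed MSK IAM authentication; adjust to your cluster's actual mechanism.
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
```

Nothing equivalent exists for the `--zookeeper` path, which is exactly why it sidesteps the ACLs.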

Kafka console Consume Command not accepting bootstrap-servers

I have installed Kafka on my windows machine. My Kafka version is kafka_2.12-2.4.0.
I started the ZooKeeper server, then the Kafka server, then created a topic and produced a message to it. Up to here everything is fine.
But when I run the consume command, it gives me the error below.
'--bootstrap-servers' is not recognized as an internal or external
command, operable program or batch file.
I am using the below command.
.\bin\windows\kafka-console-consumer.bat --bootstrap-servers localhost:9092 --topic TEST_TOPIC --from-beginning
Please suggest what the problem could be.
You should use --bootstrap-server instead of --bootstrap-servers (note: no 's' at the end):
Try:
kafka/bin/kafka-console-consumer.bat \
--bootstrap-server localhost:9092 \
--topic TEST_TOPIC \
--from-beginning
I resolved the issue on my side. The issue was that in the command prompt, for some reason, I had a double arrow >>:
C:\kafka_2.13-2.4.0>> .\bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic myTopic
'--bootstrap-server' is not recognized as an internal or external command,
operable program or batch file.
Once I removed the double arrow, the error went away. Now I appear to have other issues where Kafka is not running on the port I thought it was, but that is a separate issue.

Check message has been written to kafka topic using command line

Firstly, please note that using the java consumer API is not an option. Why it is not an option I am unable to disclose, but I must be able to do the following using a shell command.
I have a topic that I have written a message to, and I can confirm this if I run ./kafka-console-consumer.sh with the --from-beginning option. But since this starts a consumer, the command blocks and requires manual intervention with a SIGINT. I have come close using --timeout-ms, but this is not ideal: unless I pick a high value, there is a chance the dump of the data is unreliable.
I would like to dump the output of console-consumer in such a manner that it can be grepped, or a suitable alternative method.
When you write to Kafka, you can set acks on the producer, which is the level of guarantee you want from the broker that the message has been received and written by the local broker and/or all replicas.
If you use this, you have no need to consume from the topic to determine whether the record was written. Trying to verify it that way sounds like a really bad idea.
If you absolutely must use a command-line tool to do this (which is not a good idea), then use kafkacat, which can consume from any offset for any number of messages, e.g.:
Consume (-C) five messages (-c 5) from the beginning (-o beginning), or exit (-e) when end of partition is reached
kafkacat -b localhost:9092 -t mytopic -o beginning -e -C -c 5
Consume (-C) ten messages (-c 10) from the end (-o -10), or exit (-e) when end of partition is reached
kafkacat -b localhost:9092 -t mytopic -o -10 -e -C -c 10
Consume (-C) one message (-c 1) at offset 42 (-o 42), or exit (-e) when end of partition is reached
kafkacat -b localhost:9092 -t mytopic -o 42 -e -C -c 1
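If kafkacat is not available either, a rough native equivalent is the console consumer with --max-messages, which exits on its own once the count is reached (--timeout-ms kept only as a safety net). The broker address and topic are illustrative; the command itself needs a running broker, so only the grep step is executed here against a simulated dump:

```shell
# Native sketch (requires a broker, so shown commented):
# ./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytopic \
#   --from-beginning --max-messages 5 --timeout-ms 10000 > dump.txt

# Once captured, the dump is plain text and can be grepped; simulated here:
printf 'order:1\norder:2\npayment:9\n' > dump.txt
MATCHES=$(grep -c '^order' dump.txt)
echo "$MATCHES"
```

The key difference from --timeout-ms alone is that --max-messages makes the exit condition deterministic when you know how many records to expect.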

How to resolve "Leader not available" Kafka error when trying to consume

I'm playing around with Kafka, using my own local single instance of ZooKeeper + Kafka, and running into an error that I don't understand how to resolve.
I started a simple server per the Apache Kafka Quickstart Guide
$ bin/zookeeper-server-start.sh config/zookeeper.properties
$ bin/kafka-server-start.sh config/server.properties
Then, using kafkacat (installed via Homebrew), I started a producer that just echoes the messages I type into the console:
$ kafkacat -P -b localhost:9092 -t TestTopic -T
test1
test1
But when I try to consume those messages I get an error:
$ kafkacat -C -b localhost:9092 -t TestTopic
% ERROR: Topic TestTopic error: Broker: Leader not available
And similarly when I try to list its metadata:
$ kafkacat -L -b localhost:9092 -t TestTopic
Metadata for TestTopic (from broker -1: localhost:9092/bootstrap):
0 brokers:
1 topics:
topic "TestTopic" with 0 partitions: Broker: Leader not available (try again)
My questions:
Is this an issue with my running instance of ZooKeeper and/or kafkacat? I ask because I've been constantly shutting them down and restarting them, after deleting the /tmp/zookeeper and /tmp/kafka-logs directories.
Is there some simple setting I need to try? I tried adding auto.leader.rebalance.enable=true to Kafka's server.properties file, but that didn't fix this particular issue.
How do I do a fresh restart of ZooKeeper/Kafka? Is shutting them down, deleting the /tmp/zookeeper and /tmp/kafka-logs directories, and then restarting ZooKeeper and then Kafka the way to go? (Maybe the real answer is to build a Docker container I can stand up and tear down. I was going to use the spotify/docker-kafka container, but it's not on Kafka 0.9.0.0 and I haven't taken the time to build my own.)
It might be, but probably is not. My guess is the topic isn't created, so kafkacat echoes the message on screen but doesn't really send it to Kafka. All the topics are probably deleted after you delete /tmp/kafka-logs.
No. I don't think this is the way to look for a solution.
Having a Docker container is definitely the way to go - you'll soon end up running Kafka on multiple brokers, examining replication behavior, high-availability scenarios, etc. Having it dockerized helps a lot.
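Following point 1, it may help to rule out auto-creation: if auto.create.topics.enable has been set to false in server.properties (the default is true), the topic must be created explicitly before producing. A sketch, where the server.properties content is simulated and the kafka-topics invocation (shown in the 0.9-era --zookeeper form, which needs a running cluster) is commented:

```shell
# Check whether the broker auto-creates topics (simulated server.properties here):
cat > /tmp/server.properties.sample <<'EOF'
broker.id=0
auto.create.topics.enable=false
EOF
SETTING=$(grep '^auto.create.topics.enable' /tmp/server.properties.sample)
echo "$SETTING"

# If it is false, create the topic before producing:
# bin/kafka-topics.sh --create --zookeeper localhost:2181 \
#   --topic TestTopic --partitions 1 --replication-factor 1
```

Even with auto-creation enabled, the very first metadata request for a new topic can transiently return "Leader not available" while the leader is being elected, which matches the "(try again)" hint in the -L output.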