How to view Kafka headers - apache-kafka

We are sending messages with headers to Kafka using
org.apache.kafka.clients.producer.ProducerRecord
public ProducerRecord(String topic, Integer partition, K key, V value, Iterable<Header> headers) {
    this(topic, partition, (Long) null, key, value, headers);
}
How can I actually see these headers from the command line? kafka-console-consumer.sh only shows me the payload and no headers.
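For reference, a minimal sketch of how we produce such records (broker address, topic name, and header values here are placeholders):
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.internals.RecordHeader;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerWithHeaders {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            List<Header> headers = List.of(
                    new RecordHeader("id", "1".getBytes(StandardCharsets.UTF_8)));
            // null partition lets the default partitioner choose
            producer.send(new ProducerRecord<>("my_topic_name", null, "my-key", "{\"foo\":\"bar\"}", headers));
        }
    }
}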

You can use the excellent kafkacat tool.
Sample command:
kafkacat -b kafka-broker:9092 -t my_topic_name -C \
-f '\nKey (%K bytes): %k
Value (%S bytes): %s
Timestamp: %T
Partition: %p
Offset: %o
Headers: %h\n'
Sample output:
Key (-1 bytes):
Value (13 bytes): {foo:"bar 5"}
Timestamp: 1548350164096
Partition: 0
Offset: 34
Headers: __connect.errors.topic=test_topic_json,__connect.errors.partition=0,__connect.errors.offset=94,__connect.errors.connector.name=file_sink_03,__connect.errors.task.id=0,__connect.errors.stage=VALUE_CONVERTER,__connect.errors.class.name=org.apache.kafka.connect.json.JsonConverter,__connect.errors.exception.class.name=org.apache.kafka.connect.errors.DataException,__connect.errors.exception.message=Converting byte[] to Kafka Connect data failed due to serialization error: ,__connect.errors.exception.stacktrace=org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error:
The kafkacat header option (%h) is only available in recent builds of kafkacat; you may want to build from the master branch yourself if your current version doesn't include it.
You can also run kafkacat from Docker:
docker run --rm edenhill/kafkacat:1.5.0 \
-b kafka-broker:9092 \
-t my_topic_name -C \
-f '\nKey (%K bytes): %k
Value (%S bytes): %s
Timestamp: %T
Partition: %p
Offset: %o
Headers: %h\n'
If you use Docker, bear in mind the network implications of how to reach the Kafka broker.

Starting with Kafka 2.7.0 you can enable printing headers in the console consumer by providing the property print.headers=true:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic quickstart-events --property print.key=true --property print.headers=true --property print.timestamp=true

You can also use kafkactl for this, e.g. with output as YAML:
kafkactl consume my-topic --print-headers -o yaml
Sample output:
partition: 1
offset: 22
headers:
  key1: value1
  key2: value2
value: my-value
Disclaimer: I am a contributor to this project.

From the kafka-console-consumer.sh script:
exec $(dirname $0)/kafka-run-class.sh kafka.tools.ConsoleConsumer "$@"
src: https://github.com/apache/kafka/blob/2.1.1/bin/kafka-console-consumer.sh
In kafka.tools.ConsoleConsumer the headers are passed to the Formatter, but none of the existing Formatters makes use of them:
formatter.writeTo(new ConsumerRecord(msg.topic, msg.partition, msg.offset, msg.timestamp,
    msg.timestampType, 0, 0, 0, msg.key, msg.value, msg.headers),
  output)
src: https://github.com/apache/kafka/blob/2.1.1/core/src/main/scala/kafka/tools/ConsoleConsumer.scala
At the bottom of the above link you can see existing Formatters.
If you want to print headers you need to implement your own kafka.common.MessageFormatter, in particular its writeTo method:
def writeTo(consumerRecord: ConsumerRecord[Array[Byte], Array[Byte]], output: PrintStream): Unit
and then run your console consumer with --formatter pointing at your own formatter (it must also be present on the classpath).
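For illustration, a minimal sketch of such a formatter, assuming Kafka 2.6+ where org.apache.kafka.common.MessageFormatter is a Java interface (on older versions such as 2.1 you would implement the Scala trait kafka.common.MessageFormatter instead):
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.MessageFormatter;
import org.apache.kafka.common.header.Header;

// Hypothetical formatter that prints each record's headers before its value.
public class HeaderFormatter implements MessageFormatter {
    @Override
    public void writeTo(ConsumerRecord<byte[], byte[]> record, PrintStream output) {
        for (Header header : record.headers()) {
            byte[] v = header.value();
            output.printf("%s=%s ", header.key(),
                    v == null ? "null" : new String(v, StandardCharsets.UTF_8));
        }
        output.println(record.value() == null ? "null"
                : new String(record.value(), StandardCharsets.UTF_8));
    }
}
Compile it, make the class visible to the tools (e.g. by adding it to the CLASSPATH environment variable, which kafka-run-class.sh typically honors), and run:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic_name --formatter HeaderFormatter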
Another, simpler and faster way is to write your own mini-program using KafkaConsumer and inspect the headers there (or in the debugger).
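A sketch of such a mini-program (broker address, group id, and topic name are placeholders):
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.serialization.StringDeserializer;

public class HeaderDumper {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "header-dumper");           // placeholder group id
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my_topic_name"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    for (Header h : record.headers()) {
                        byte[] v = h.value();
                        System.out.printf("  header %s=%s%n", h.key(),
                                v == null ? "null" : new String(v));
                    }
                }
            }
        }
    }
}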

kcat -C -b $brokers -t $topic -f 'key: %k Headers: %h: Message value: %s\n'

Related

How to force log compaction of a Kafka topic?

Using Kafka 2.7.0 (in K8s), I create a test topic with cleanup.policy=compact:
./kafka-topics.sh --create --bootstrap-server kafka.core-kafka.svc.cluster.local:9092 --topic _test_quick_compaction_2021_12_02 --partitions 1 --replication-factor 3 --config cleanup.policy=compact
Write some messages to it:
kafkacat -b kafka.core-kafka.svc.cluster.local:9092 -P -t _test_quick_compaction_2021_12_02 -K:
1:a
2:b
3:c
1:d
2:e
Change the topic settings in a way such that compaction should kick in after 10 seconds:
./kafka-topics.sh --alter --zookeeper zookeeper.core-kafka.svc.cluster.local --topic _test_quick_compaction_2021_12_02 --config max.compaction.lag.ms=10000 --config min.cleanable.dirty.ratio=0.0 --config segment.ms=10000 --config delete.retention.ms=10000
Wait a minute, just to be sure:
sleep 60
Check the topic content:
kafkacat -C -e -o beginning -b kafka.core-kafka.svc.cluster.local:9092 -t _test_quick_compaction_2021_12_02 -K:
And to my surprise, the content is still
1:a
2:b
3:c
1:d
2:e
instead of the
3:c
1:d
2:e
which I expected.
Why is the topic not compacted, and what can I do to force it?
Since active segments are not eligible for compaction, the trick is to write something to the topic again to force the creation of a new segment. The full working sequence:
# Create a test topic.
./kafka-topics.sh --create --bootstrap-server kafka.core-kafka.svc.cluster.local:9092 --topic _test_quick_compaction_2021_12_02 --partitions 1 --replication-factor 3 --config cleanup.policy=compact
# Write some messages to it.
echo "1:a\n2:b\n3:c" | kafkacat -b kafka.core-kafka.svc.cluster.local:9092 -P -t _test_quick_compaction_2021_12_02 -K:
# Check the topic content.
kafkacat -C -e -o beginning -b kafka.core-kafka.svc.cluster.local:9092 -t _test_quick_compaction_2021_12_02 -K:
# Change the topic settings in a way such that compaction should kick in after 10 seconds.
./kafka-topics.sh --alter --zookeeper zookeeper.core-kafka.svc.cluster.local --topic _test_quick_compaction_2021_12_02 --config max.compaction.lag.ms=10000 --config min.cleanable.dirty.ratio=0.0 --config segment.ms=10000 --config delete.retention.ms=10000
# Wait for the last segment to become outdated.
sleep 11
# Write new messages.
echo "1:d\n2:e" | kafkacat -b kafka.core-kafka.svc.cluster.local:9092 -P -t _test_quick_compaction_2021_12_02 -K:
# Check the topic content.
kafkacat -C -e -o beginning -b kafka.core-kafka.svc.cluster.local:9092 -t _test_quick_compaction_2021_12_02 -K:
# Wait for this segment to become outdated.
sleep 11
# Write new messages again.
echo "1:d\n2:e" | kafkacat -b kafka.core-kafka.svc.cluster.local:9092 -P -t _test_quick_compaction_2021_12_02 -K:
# Check the topic content.
kafkacat -C -e -o beginning -b kafka.core-kafka.svc.cluster.local:9092 -t _test_quick_compaction_2021_12_02 -K:
# Wait for compaction to happen.
sleep 11
# Check the topic content to validate that it has been compacted.
kafkacat -C -e -o beginning -b kafka.core-kafka.svc.cluster.local:9092 -t _test_quick_compaction_2021_12_02 -K:
# Revert the setting changes.
./kafka-topics.sh --alter --zookeeper zookeeper.core-kafka.svc.cluster.local --topic _test_quick_compaction_2021_12_02 --delete-config max.compaction.lag.ms --delete-config min.cleanable.dirty.ratio --delete-config segment.ms --delete-config delete.retention.ms
# Delete the topic
# /home/th/kafka_2.13-2.7.0/bin/kafka-topics.sh --delete --bootstrap-server kafka.core-kafka.svc.cluster.local:9092 --topic _test_quick_compaction_2021_12_02

kafkacat: Produce messages from a file with headers

I need to produce messages to Kafka in batches, so I have a file that I feed to kafkacat:
kafkacat -b localhost:9092 -t <my_topic> -T -P -l /tmp/msgs
The content of /tmp/msgs is as follows
-H "id=1"
{"key" : "value0"}
-H "id=2"
{"key" : "value1"}
When I run the kafkacat command above, it inserts four messages into Kafka, one message per line of /tmp/msgs.
I need to instruct kafkacat to parse the file correctly, so that -H "id=1" is the header for the message {"key" : "value0"}.
How do I achieve this?
Thanks
You need to pass the headers as follows:
kcat -b localhost:9092 -t topic-name -P -H key1=value1 -H key2=value2 /temp/payload.json
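Note that -H applies the same header(s) to every message produced in that invocation. If each message needs its own headers, one option, sketched below with the Java client, is to read the file yourself; this assumes a simplified file format of alternating lines, a header line like id=1 followed by the JSON value (topic name and broker are placeholders):
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchProduceWithHeaders {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Assumed file format: "id=1" on one line, the message value on the next.
        List<String> lines = Files.readAllLines(Paths.get("/tmp/msgs"));
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i + 1 < lines.size(); i += 2) {
                String[] kv = lines.get(i).split("=", 2);
                ProducerRecord<String, String> record =
                        new ProducerRecord<>("my_topic", lines.get(i + 1));
                record.headers().add(kv[0], kv[1].getBytes());
                producer.send(record);
            }
        }
    }
}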

How can I produce a Kafka record with a null value using the Kafka tool set

I'm using the following command:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test.topic --property parse.key=true --property key.separator=#
This allows me to start typing key#value entries.
However, no matter what I try, I'm not able to produce a null value.
If I send myKey# and press Enter, the topic shows an empty value for that key, but not null.
I need to produce a true null value.
kafkacat allows you to produce tombstones through the -Z option.
Produce a tombstone (a "delete" for compacted topics) for key "abc" by providing an empty message value, which -Z interprets as NULL:
$ echo "abc:" | kafkacat -b mybroker -t mytopic -Z -K:
The console producer cannot produce a null record; it parses the input as UTF-8 strings.
Personally, I would write a simple Python or Ruby script to do so.
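For instance, a minimal sketch with the Java client (the same idea works in Python or Ruby): a record whose value is null, rather than an empty string, is a true tombstone. Broker and topic are placeholders:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value (unlike an empty string) is written as a real NULL,
            // i.e. a tombstone on compacted topics.
            producer.send(new ProducerRecord<>("test.topic", "myKey", (String) null));
        }
    }
}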

Message value in Confluent Cloud not decoding properly

I'm very new to Kafka and Confluent. I wrote a producer nearly identical to the one in the tutorial at https://www.confluent.fr/blog/schema-registry-avro-in-spring-boot-application-tutorial/ with my own dummy model; the application.yaml is the same as well. When I send the message to ccloud, the messages being received are gibberish.
Any idea how to fix this? When I do a System.out.println of the Avro POJO before sending it to Kafka, the object looks good, with all the proper values:
{
"locationId": 1,
"time": 1575950400,
"temperature": 9.45,
"summary": "Overcast",
"icon": "cloudy",
"precipitationProbability": 0.24,
...
Whereas when I download the message from ccloud, the value looks like this
[
{
"topic":"Weather",
"partition":0,
"offset":14,
"timestamp":1576008230509,
"timestampType":"CREATE_TIME",
"headers":[],
"key":"dummyKey",
"value":"\u0000\u0000\u0001��\u0002\u0002����\u000b\u0002fffff�\"#\
...
}
You're actually doing everything right :) What you're hitting is just a current limitation in the Confluent Cloud GUI in rendering Avro messages.
If you consume the message as Avro you'll see that everything is fine. Here's an example of consuming the message from Confluent Cloud using kafkacat:
$ source .env
$ docker run --rm edenhill/kafkacat:1.5.0 \
    -X security.protocol=SASL_SSL -X sasl.mechanisms=PLAIN \
    -X ssl.ca.location=./etc/ssl/cert.pem -X api.version.request=true \
    -b ${CCLOUD_BROKER_HOST}:9092 \
    -X sasl.username="${CCLOUD_API_KEY}" \
    -X sasl.password="${CCLOUD_API_SECRET}" \
    -r https://"${CCLOUD_SCHEMA_REGISTRY_API_KEY}":"${CCLOUD_SCHEMA_REGISTRY_API_SECRET}"@${CCLOUD_SCHEMA_REGISTRY_HOST} \
    -s avro \
    -t mssql-04-mssql.dbo.ORDERS \
    -f 'Topic %t[%p], offset: %o (Time: %T)\nHeaders: %h\nKey: %k\nPayload (%S bytes): %s\n' \
    -C -o beginning -c1
Topic mssql-04-mssql.dbo.ORDERS[2], offset: 110 (Time: 1576056196725)
Headers:
Key:
Payload (53 bytes): {"order_id": {"int": 1345}, "customer_id": {"int": 11}, "order_ts": {"int": 18244}, "order_total_usd": {"double": 2.4399999999999999}, "item": {"string": "Bread - Corn Muffaleta Onion"}}
This is the same topic shown above, with the binary Avro value field. (Screenshot of the Confluent Cloud GUI omitted.)

Is there a way to add headers in kafka-console-producer.sh

I'd like to use the kafka-console-producer.sh to fire a few JSON messages with Kafka headers.
Is this possible?
docker exec -it kafka_1 /opt/kafka_2.12-2.3.0/bin/kafka-console-producer.sh --broker-list localhost:9093 --topic my-topic --producer.config /opt/kafka_2.12-2.3.0/config/my-custom.properties
No, but you can with kafkacat's -H argument:
Produce:
echo '{"col_foo":1}'|kafkacat -b localhost:9092 -t test -P -H foo=bar
Consume:
kafkacat -b localhost:9092 -t test -C -f '-----\nTopic %t[%p]\nOffset: %o\nHeaders: %h\nKey: %k\nPayload (%S bytes): %s\n'
-----
Topic test[0]
Offset: 0
Headers: foo=bar
Key:
Payload (9 bytes): col_foo:1
% Reached end of topic test [0] at offset 1
Starting from Kafka 3.1.0 there is an option to turn on header parsing, parse.headers=true, and then you just place the headers before your record value. From the docs:
| parse.headers=true:
| "h1:v1,h2:v2...\tvalue"
So your command will look like
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic_name --property parse.headers=true
and then you pass each record as
header_name:header_value\trecord_value
(the headers are separated from the value by a tab character by default, per the format quoted above).
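For example, piping a record in with printf so the tab is interpreted (assuming the default delimiters):
printf 'h1:v1,h2:v2\tmy record value\n' | \
  kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic_name --property parse.headers=true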