What is the best way to publish messages to kafka? - apache-kafka

I have a Kafka producer that basically does the work below. I have a topic with at least 10 partitions, and I don't care about the order in which messages are consumed (my backend will handle that). I will also start at least 10 consumers (assuming each clings onto one partition). If I start publishing messages (using the code below), will Kafka handle the load and spread the messages evenly across all partitions, or should I introduce a key (which really doesn't matter to my app) and implement round robin myself?
KeyedMessage<String, String> data = new KeyedMessage<>(topic, txt);
producer.send(data);
producer.close();
Any thoughts?

By default, org.apache.kafka.clients.producer.internals.DefaultPartitioner will be used:
if (keyBytes == null) {
    int nextValue = counter.getAndIncrement();
    List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
    if (availablePartitions.size() > 0) {
        int part = DefaultPartitioner.toPositive(nextValue) % availablePartitions.size();
        return availablePartitions.get(part).partition();
    } else {
        // no partitions are available, give a non-available partition
        return DefaultPartitioner.toPositive(nextValue) % numPartitions;
    }
} else {
    // hash the keyBytes to choose a partition
    return DefaultPartitioner.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
link to source code
According to this code, when no key is given, Kafka will round-robin the messages across all available partitions, spreading them evenly; you don't need to implement round robin yourself.
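For reference, here is a minimal sketch of the same keyless send with the current Java producer API (the topic name and broker address are assumptions):
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeylessSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // No key is given, so the default partitioner spreads records
            // across the topic's available partitions.
            producer.send(new ProducerRecord<>("my-topic", "some message"));
        }
    }
}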

Related

Reading last Kafka partition's record that's transactionally written

I'm trying to read the last record of a topic's partition. The topic's producer is writing transactionally. The consumer is set up with isolation.level=read_committed. I manage the offsets manually. Here's my consumer code:
// Fetch the end offset of a partition
Map<TopicPartition, Long> endOffsets = consumer.endOffsets(Collections.singleton(topicPartition));
Long endOffset = endOffsets.get(topicPartition);

// Try to read the last record
if (endOffset > 0) {
    consumer.seek(topicPartition, Math.max(0, endOffset - 5));
    List records = consumer.poll(10 * 1000).records(topicPartition);
    if (records.isEmpty()) {
        throw new IllegalStateException("How can this be?");
    } else {
        // Return the last record
        return records.get(records.size() - 1);
    }
}
So, to read the last record, I ask for the end offset, then seek to endOffset - 5 (because Kafka skips offsets in exactly-once mode, as I've seen in this question: Kafka Streams does not increment offset by 1 when producing to topic), and start polling.
It had been working well before, but now I've got an exception telling me zero records were polled. What can be the reason for that? I can't reproduce it and I think I'm lost.
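(For what it's worth, a more defensive variant of that look-back is sketched below, assuming the same setup with a KafkaConsumer<String, String>; the doubling step is an illustration, not part of the original code. It keeps widening the window until a data record appears:)
long lookBack = 5;
while (true) {
    long start = Math.max(0, endOffset - lookBack);
    consumer.seek(topicPartition, start);
    List<ConsumerRecord<String, String>> records =
            consumer.poll(Duration.ofSeconds(10)).records(topicPartition);
    if (!records.isEmpty()) {
        return records.get(records.size() - 1); // last data record
    }
    if (start == 0) {
        break; // scanned from the beginning: the partition holds no data records
    }
    lookBack *= 2; // only transaction markers in this range; widen the window
}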

How to read kafka statestore in processor which was updated from a different processor?

I have created 2 Kafka state stores corresponding to 2 different topics, as below.
StateStoreSupplier tempStore1 = Stores.create("tempStore1")
        .withKeys(Serdes.String())
        .withValues(valueSerde)
        .persistent()
        .build();
StateStoreSupplier tempStore2 = Stores.create("tempStore2")
        .withKeys(Serdes.String())
        .withValues(valueSerde)
        .persistent()
        .build();
streamsBuilder.addSource("Source", "tempTopic1", "tempTopic2")
        .addProcessor("Process", () -> new Processor(), "Source")
        .connectProcessorAndStateStores("Process", "tempStore1", "tempStore2")
        .addStateStore(tempStore1, "Process")
        .addStateStore(tempStore2, "Process");
In the Processor class, I'm able to read from and write records to the state stores when there's a message in the respective topic. But I'm unable to read the store tempStore1 when a message comes from tempTopic2, and vice versa.
I have to compare the messages from one state store to the other, which means I need to read both state stores.
Here's a sample code snippet from the process method. I believe the ProcessorContext (the context variable) is different for the respective topics, and hence the other store is not accessible.
tempKeyValueStore1 = (KeyValueStore) context.getStateStore("tempStore1");
tempKeyValueStore2 = (KeyValueStore) context.getStateStore("tempStore2");
if (context.topic().equals("tempTopic1")) {
    tempKeyValueStore1.put(value.getHeader().getCorrelationId(), value);
} else if (context.topic().equals("tempTopic2")) {
    tempKeyValueStore2.put(value.getHeader().getCorrelationId(), value);
    // returns 0 although there are records in that state store
    System.out.println("Size: " + tempKeyValueStore1.approximateNumEntries());
}
Thanks in advance.

How to check if Kafka Consumer is ready

I have the Kafka offset reset policy set to latest, and I am missing the first few messages. If I add a sleep of 20 seconds before starting to send messages to the input topic, everything works as desired. I am not sure if the problem is the consumer taking a long time for partition rebalancing. Is there a way to know whether the consumer is ready before starting to poll?
You can use consumer.assignment(): it returns the set of partitions currently assigned, so you can verify whether all of the topic's available partitions have been assigned.
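A minimal sketch of that check (assuming a plain KafkaConsumer; note that assignment() is only updated from inside poll()):
consumer.subscribe(Collections.singletonList(topic));
// poll() drives the group join; assignment() stays empty until it completes
while (consumer.assignment().isEmpty()) {
    consumer.poll(Duration.ofMillis(100));
}
// Partitions are now assigned; it is safe to start sending input messages.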
If you are using the spring-kafka project, you can include the spring-kafka-test dependency and use the method below to wait for topic assignment, but you need to have a container.
ContainerTestUtils.waitForAssignment(Object container, int partitions);
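A hedged usage example (the container and embedded-broker variables are assumptions from a typical spring-kafka-test setup):
// Block until the listener container has been assigned all partitions.
ContainerTestUtils.waitForAssignment(container, embeddedKafka.getPartitionsPerTopic());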
You can do the following. I have a test that reads data from a Kafka topic. You can't use a KafkaConsumer from multiple threads, but you can pass an "AtomicReference assignment" parameter, update it in the consumer thread, and read it from another thread.
For example, snipped of working code in project for testing:
private void readAvro(String readFromKafka,
                      AtomicBoolean needStop,
                      List<Event> events,
                      String bootstrapServers,
                      int readTimeout) {
    // shared assignment snapshot, updated by the consumer thread started below
    AtomicReference<Set<TopicPartition>> assignment = new AtomicReference<>();
    new Thread(() -> readAvro(bootstrapServers, readFromKafka, needStop, events, readTimeout, assignment)).start();

    long startTime = System.currentTimeMillis();
    long maxWaitingTime = 30_000;
    for (long time = System.currentTimeMillis(); System.currentTimeMillis() - time < maxWaitingTime; ) {
        Set<TopicPartition> assignments = Optional.ofNullable(assignment.get()).orElse(new HashSet<>());
        System.out.println("[!kafka-consumer!] Assignments [" + assignments.size() + "]: "
                + assignments.stream().map(v -> String.valueOf(v.partition())).collect(Collectors.joining(",")));
        if (assignments.size() > 0) {
            break;
        }
        try {
            Thread.sleep(1_000);
        } catch (InterruptedException e) {
            e.printStackTrace();
            needStop.set(true);
            break;
        }
    }
    System.out.println("Subscribed! Wait summary: " + (System.currentTimeMillis() - startTime));
}
private void readAvro(String bootstrapServers,
                      String readFromKafka,
                      AtomicBoolean needStop,
                      List<Event> events,
                      int readTimeout,
                      AtomicReference<Set<TopicPartition>> assignment) {
    KafkaConsumer<String, byte[]> consumer = (KafkaConsumer<String, byte[]>) queueKafkaConsumer(bootstrapServers, "latest");
    System.out.println("Subscribed to topic: " + readFromKafka);
    consumer.subscribe(Collections.singletonList(readFromKafka));
    long started = System.currentTimeMillis();
    while (!needStop.get()) {
        assignment.set(consumer.assignment());
        ConsumerRecords<String, byte[]> records = consumer.poll(1_000);
        events.addAll(CommonUtils4Tst.readEvents(records));
        if (readTimeout == -1) {
            if (events.size() > 0) {
                break;
            }
        } else if (System.currentTimeMillis() - started > readTimeout) {
            break;
        }
    }
    needStop.set(true);
    synchronized (MainTest.class) {
        MainTest.class.notifyAll();
    }
    consumer.close();
}
P.S.
needStop - a global flag to stop all running threads in case of failure or success
events - the list of objects that I want to check
readTimeout - how long we will wait to read all the data; if readTimeout == -1, stop as soon as we have read anything
Thanks to Alexey (I have also upvoted); I seem to have resolved my issue by essentially following the same idea.
Just want to share my experience... In our case we use Kafka in a request & response way, somewhat like RPC: a request is sent on one topic, and then we wait for the response on another topic. We ran into a similar issue, i.e. missing the first response.
I tried calling KafkaConsumer.assignment() repeatedly (with Thread.sleep(100)), but it didn't seem to help. Adding a KafkaConsumer.poll(50) seems to have primed the consumer (group), so the first response is received too. I tested a few times and it works consistently now.
BTW, testing requires stopping the application and deleting the Kafka topics and, for good measure, restarting Kafka too.
PS: Just calling poll(50) without the assignment() fetching logic, as Alexey mentioned, may not guarantee that the consumer (group) is ready.
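A minimal sketch of that priming step (the topic name is assumed; combining it with the assignment() check above gives a stronger guarantee):
consumer.subscribe(Collections.singletonList(responseTopic));
// A short poll joins the group and triggers partition assignment.
consumer.poll(50);
// ... now send the request and start the real poll loop.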
You can modify an AlwaysSeekToEndListener (which listens only to new messages) to include a callback:
public class AlwaysSeekToEndListener<K, V> implements ConsumerRebalanceListener {
    private final Consumer<K, V> consumer;
    private Runnable callback;

    public AlwaysSeekToEndListener(Consumer<K, V> consumer) {
        this.consumer = consumer;
    }

    public AlwaysSeekToEndListener(Consumer<K, V> consumer, Runnable callback) {
        this.consumer = consumer;
        this.callback = callback;
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        consumer.seekToEnd(partitions);
        if (callback != null) {
            callback.run();
        }
    }
}
and subscribe with a latch callback:
CountDownLatch initLatch = new CountDownLatch(1);
consumer.subscribe(singletonList(topic), new AlwaysSeekToEndListener<>(consumer, () -> initLatch.countDown()));
initLatch.await(); // blocks until consumer is ready and listening
then proceed to start your producer.
If your policy is set to latest (which only takes effect when there are no previously committed offsets), then you should not worry about 'missing' messages, because you're telling Kafka not to care about messages that were sent before your consumers were ready.
If you care about 'previous' messages, you should set the policy to earliest.
In any case, whatever the policy, the behaviour you are seeing is transient: once committed offsets are saved in Kafka, on every restart the consumers will pick up where they left off previously.
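For reference, this policy is the auto.offset.reset consumer property; a one-line sketch of setting it:
Properties props = new Properties();
// Start from the beginning when the group has no committed offsets yet.
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");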
I needed to know whether a Kafka consumer was ready before doing some testing, so I tried consumer.assignment(). It returned the set of partitions assigned, but there was a problem: I could not see whether the partitions assigned to the group had offsets set, so later, when I tried to use the consumer, the offsets were not set correctly.
The solution was to use committed(), which gives you the last committed offsets of the partitions that you pass as arguments.
So you can do something like: consumer.committed(consumer.assignment())
If no partitions are assigned yet, it will return:
{}
If partitions are assigned but have no committed offsets yet:
{name.of.topic-0=null, name.of.topic-1=null}
And if partitions have committed offsets:
{name.of.topic-0=OffsetAndMetadata{offset=5197881, leaderEpoch=null, metadata=''}, name.of.topic-1=OffsetAndMetadata{offset=5198832, leaderEpoch=null, metadata=''}}
With this information you can use checks like:
consumer.committed(consumer.assignment()).isEmpty();
consumer.committed(consumer.assignment()).containsValue(null);
and be sure that the Kafka consumer is ready.
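Putting the two checks together, a readiness helper might look like this (a sketch; committed() with a Set argument requires a reasonably recent kafka-clients):
// Ready once partitions are assigned and each one has a committed offset.
static boolean isReady(KafkaConsumer<?, ?> consumer) {
    Map<TopicPartition, OffsetAndMetadata> committed =
            consumer.committed(consumer.assignment());
    return !committed.isEmpty() && !committed.containsValue(null);
}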

KafkaSpout (idle) generates huge network traffic

After developing and executing my Storm (1.0.1) topology with a KafkaSpout and a couple of bolts, I noticed huge network traffic even when the topology is idle (no messages on Kafka, no processing being done in the bolts). So I started to comment out my topology piece by piece to find the cause, and now I have only the KafkaSpout in my main:
....
final SpoutConfig spoutConfig = new SpoutConfig(
        new ZkHosts(zkHosts, "/brokers"),
        "files-topic",          // topic
        "/kafka",               // ZK chroot
        "consumer-group-name");
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
spoutConfig.startOffsetTime = OffsetRequest.LatestTime();
topologyBuilder.setSpout(
        "kafka-spout-id",
        new KafkaSpout(spoutConfig),
        1);
....
When this (useless) topology executes, even in local mode, even the very first time, the network traffic always grows a lot. In my Activity Monitor I see:
an average of 432 KB of data received/sec;
after a couple of hours with the topology running (idle), data received is 1.26 GB and data sent is 1 GB.
(Important: Kafka is not running in a cluster; it's a single instance running on the same machine, with a single topic and a single partition. I just downloaded Kafka onto my machine, started it, and created a simple topic. When I put a message in the topic, everything in the topology works without any problem at all.)
Obviously, the reason is in the KafkaSpout.nextTuple() method (below), but I don't understand why, without any messages in Kafka, I should have such traffic. Is there something I didn't consider? Is this the expected behaviour? I had a look at the Kafka logs and the ZK logs: nothing. I cleaned up the Kafka and ZK data: nothing, still the same behaviour.
@Override
public void nextTuple() {
    List<PartitionManager> managers = _coordinator.getMyManagedPartitions();
    for (int i = 0; i < managers.size(); i++) {
        try {
            // in case the number of managers decreased
            _currPartitionIndex = _currPartitionIndex % managers.size();
            EmitState state = managers.get(_currPartitionIndex).next(_collector);
            if (state != EmitState.EMITTED_MORE_LEFT) {
                _currPartitionIndex = (_currPartitionIndex + 1) % managers.size();
            }
            if (state != EmitState.NO_EMITTED) {
                break;
            }
        } catch (FailedFetchException e) {
            LOG.warn("Fetch failed", e);
            _coordinator.refresh();
        }
    }
    long diffWithNow = System.currentTimeMillis() - _lastUpdateMs;
    /*
       As far as the System.currentTimeMillis() is dependent on System clock,
       additional check on negative value of diffWithNow in case of external changes.
    */
    if (diffWithNow > _spoutConfig.stateUpdateIntervalMs || diffWithNow < 0) {
        commit();
    }
}
Put a sleep of one second (1000 ms) in the nextTuple() method and observe the traffic now. For example:
@Override
public void nextTuple() {
    try {
        Thread.sleep(1000);
    } catch (Exception ex) {
        log.error("Exception while sleeping...", ex);
    }
    List<PartitionManager> managers = _coordinator.getMyManagedPartitions();
    for (int i = 0; i < managers.size(); i++) {
        ...
    }
}
The reason is that Kafka consumers work on a pull model: consumers pull data from the Kafka brokers. So, from the consumer's (KafkaSpout's) point of view, it continuously issues fetch requests to the Kafka broker, each of which is a TCP network request. That is why you see such large numbers in the sent/received statistics. Even though the consumer doesn't consume any messages, the pull requests and empty responses still count toward the network data packets sent/received. Your network traffic will be lower if your sleep time is higher. There are also some network-related configurations for the brokers and for the consumer; researching those configurations may help you. Hope this helps.
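For illustration, with the plain Java consumer the relevant knobs are fetch.min.bytes and fetch.max.wait.ms (the values below are arbitrary examples); the broker then holds each fetch response until some data arrives instead of answering empty fetches immediately:
Properties props = new Properties();
// Wait for at least 1 KB of data per fetch...
props.put("fetch.min.bytes", "1024");
// ...but no longer than 500 ms.
props.put("fetch.max.wait.ms", "500");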
Is your bolt receiving messages? Does your bolt inherit BaseRichBolt?
Comment out the line m.fail(id.offset) in KafkaSpout and check again. If your bolt doesn't ack, your spout assumes the message has failed and tries to replay the same message.
public void fail(Object msgId) {
    KafkaMessageId id = (KafkaMessageId) msgId;
    PartitionManager m = _coordinator.getManager(id.partition);
    if (m != null) {
        // m.fail(id.offset);
    }
}
Also try halting nextTuple() for a few milliseconds and check again.
Let me know if it helps

Kafka partition key not working properly

I'm struggling with how to use the partition key mechanism properly. My logic is: set the partition count to 3, then create three partition keys "0", "1", "2", and use those keys to create three KeyedMessages, such as:
KeyedMessage(topic, "0", message)
KeyedMessage(topic, "1", message)
KeyedMessage(topic, "2", message)
After this, I create a producer instance and send out all the KeyedMessages.
I expect each KeyedMessage to enter a different partition according to its partition key, which means:
KeyedMessage(topic, "0", message) go to Partition 0
KeyedMessage(topic, "1", message) go to Partition 1
KeyedMessage(topic, "2", message) go to Partition 2
I'm using Kafka-web-console to watch the topic status, but the result is not what I expect. The KeyedMessages still go to partitions randomly; sometimes two KeyedMessages end up in the same partition even though they have different partition keys.
To make my question clearer, I'd like to post some Scala code I currently have. I'm using Kafka 0.8.2-beta and Scala 2.10.4.
Here is the producer code; I didn't use a custom partitioner.class:
val props = new Properties()
val codec = if (compress) DefaultCompressionCodec.codec else NoCompressionCodec.codec
props.put("compression.codec", codec.toString)
props.put("producer.type", if (synchronously) "sync" else "async")
props.put("metadata.broker.list", brokerList)
props.put("batch.num.messages", batchSize.toString)
props.put("message.send.max.retries", messageSendMaxRetries.toString)
props.put("request.required.acks", requestRequiredAcks.toString)
props.put("client.id", clientId.toString)

val producer = new Producer[AnyRef, AnyRef](new ProducerConfig(props))

def kafkaMesssage(message: Array[Byte], partition: Array[Byte]): KeyedMessage[AnyRef, AnyRef] = {
  if (partition == null) {
    new KeyedMessage(topic, message)
  } else {
    new KeyedMessage(topic, partition, message)
  }
}

def send(message: String, partition: String = null): Unit =
  send(message.getBytes("UTF8"), if (partition == null) null else partition.getBytes("UTF8"))

def send(message: Array[Byte], partition: Array[Byte]): Unit = {
  try {
    producer.send(kafkaMesssage(message, partition))
  } catch {
    case e: Exception =>
      e.printStackTrace
      System.exit(1)
  }
}
And here is how I use the producer: create a producer instance, then use it to send three messages. Currently I create the partition key as an Integer, then convert it to a byte array:
val testMessage = UUID.randomUUID().toString
val testTopic = "sample1"
val groupId_1 = "testGroup"

print("starting sample broker testing")
val producer = new KafkaProducer(testTopic, "localhost:9092")

val numList = List(0, 1, 2)
for (a <- numList) {
  // Create a partition key as a byte array
  val key = java.nio.ByteBuffer.allocate(4).putInt(a).array()
  // Here I give an Array[Byte] key,
  // so the second "send" function of the producer will be called
  producer.send(testMessage.getBytes("UTF8"), key)
}
I'm not sure whether my logic is incorrect or I don't understand the partition key mechanism correctly. Could anyone provide some sample code or an explanation?
People often assume partitioning is a way to separate business data into business categories, but this is not the right way to view partitions.
Partitioning directly influences these subjects:
- performance (each partition can be consumed in parallel with other partitions)
- message order (the order of messages is guaranteed only at the partition level)
I will give an example of how we create partitions:
You have a topic, say MyMessagesToWorld.
You would like to transfer this topic (all MyMessagesToWorld) to some consumer.
You "weigh" the whole "mass" of MyMessagesToWorld and find that it is 10 kg.
You have the following "business" categories in MyMessagesToWorld:
- messages to dad (D)
- messages to mom (M)
- messages to sis (S)
- messages to grandma (G)
- messages to teacher (T)
- messages to girlfriend (F)
You think about who your consumers are, and you find that your consumers are gnomes, each able to consume about 1 kg of messages per hour.
You can employ up to 2 such gnomes.
1 gnome needs 10 hours to consume 10 kg of messages; 2 gnomes need 5 hours.
So you decide to employ all available gnomes to save time.
To create 2 "channels" for these 2 gnomes, you create 2 partitions of this topic on Kafka. If you envision more gnomes, create more partitions.
You have 6 business categories inside and 2 sequential independent consumers, the gnomes (consumer threads).
What to do?
Kafka's approach is the following:
Suppose you have 2 Kafka instances in a cluster. (The same example works if you have more instances in the cluster.)
You set the partition number to 2 on Kafka (this example uses Kafka 0.8.2.1).
You define your topic in Kafka, telling it that you have 2 partitions for that topic:
kafka-topics.sh(.bat) --create --zookeeper localhost:2181 --replication-factor 2 --partitions 2 --topic MyMessagesToWorld
Now the topic MyMessagesToWorld has 2 partitions: P(0) and P(1).
You chose the number 2 (partitions) because you know you have (envision) only 2 consuming gnomes.
You may add more partitions later, when more consumer gnomes are employed.
Do not confuse a Kafka consumer with such a gnome.
A Kafka consumer can employ N gnomes (N parallel threads).
Now you create KEYs for your messages.
You need KEYs to distribute your messages among the partitions.
The keys will be the letters of the "business categories" you defined before:
D, M, S, G, T, F; you think such letters are OK to be IDs.
But in the general case anything may be used as a key:
complex objects, byte arrays, anything...
If you create NO partitioner, the default one will be used.
The default partitioner is a bit stupid: it takes the hashcode of each KEY and divides it by the number of available partitions; the remainder defines the partition number for that key.
Example:
KEY M, hashcode = 12345, partition for M = 12345 % 2 = 1
As you can imagine, with such a partitioner, in the best case you have 3 business categories landing in each partition.
In the worst case you can have all business categories landing in 1 partition.
If you had 100,000 business categories, such an algorithm would distribute them statistically well.
But with only a few categories, you may not get a very fair distribution.
So you can rewrite the partitioner and distribute your business categories a bit more wisely.
Here is an example:
This partitioner distributes business categories equally between the available partitions.
public class CustomPartitioner {
    private static Map<String, Integer> keyDistributionTable = new HashMap<String, Integer>();
    private static AtomicInteger sequence = new AtomicInteger();
    private ReentrantLock lock = new ReentrantLock();

    public int partition(ProducerRecord<String, Object> record, Cluster cluster) {
        String key = record.key();
        int seq = figureSeq(key);
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(record.topic());
        if (availablePartitions.size() > 0) {
            int part = seq % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
            int numPartitions = partitions.size();
            // no partitions are available, give a non-available partition
            return seq % numPartitions;
        }
    }

    private int figureSeq(String key) {
        int sequentualNumber = 0;
        if (keyDistributionTable.containsKey(key)) {
            sequentualNumber = keyDistributionTable.get(key);
        } else { // synchronized region
            // used only for new keys, so a long wait on the monitor is expected only at start
            lock.lock();
            try {
                if (keyDistributionTable.containsKey(key)) {
                    sequentualNumber = keyDistributionTable.get(key);
                } else {
                    int seq = sequence.incrementAndGet();
                    keyDistributionTable.put(key, seq);
                    sequentualNumber = seq;
                }
            } finally {
                lock.unlock();
            }
        }
        return sequentualNumber;
    }
}
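Note that to plug a custom partitioner into the Java producer, the class has to implement the org.apache.kafka.clients.producer.Partitioner interface and be registered through the partitioner.class property (the package name below is hypothetical):
props.put("partitioner.class", "com.example.CustomPartitioner");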
The default partitioner hashes the key (as a byte array) and uses the result modulo numPartitions to get an integer between 0 and numPartitions - 1 inclusive. That resulting integer is what determines the partition the message is written to, not the literal value of the key as you're assuming.
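A small illustration of that mapping, as a sketch using the newer Java client's Utils helpers (in some client versions toPositive lives on DefaultPartitioner instead):
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

byte[] keyBytes = "1".getBytes(StandardCharsets.UTF_8);
int numPartitions = 3;
// The same key always maps to the same partition, but nothing ties
// the key "1" to partition number 1.
int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;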
Had the same issue; just switch to the ByteArrayPartitioner:
props.put("partitioner.class", "kafka.producer.ByteArrayPartitioner")