Ok so I understand that you only get order guarantee per partition.
Just random thought/question.
Assuming that the partitioning strategy is correct and the messages are routed to the proper partition (or even that we are using a single partition),
I suppose the producing application must send messages to Kafka one by one and make sure each message has been acked before sending the next one, right?
Yes, you are correct that the order the producing application sends the message dictates the order they are stored in the partition.
Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.
http://kafka.apache.org/documentation.html#intro_guarantees
However, if you have multiple messages in flight simultaneously I am not sure how order is determined.
You might want to think about the acks config for your producer as well. There are failure conditions where a message may be missing if the leader goes down after M1 is published and a new leader receives M2. In this case you won't have an out of order condition, but a missing message so it's slightly orthogonal to your original question but something to consider if message guarantees and order are critical to your application.
http://kafka.apache.org/documentation.html#producerconfigs
Overall, designing a system where small differences in order are not that important can really simplify things.
Either send synchronously, one message at a time (definitely slow!),
or send asynchronously in batches with max.in.flight.requests.per.connection = 1.
Yes, the producer should be single-threaded. If multiple producer threads write to the same partition, the order in which their messages interleave is not defined, so the ordering guarantee seen by the consumer is lost. An ordering guarantee on a partition therefore implicitly means a single producer thread.
There are two strategies for sending messages in Kafka: synchronous and asynchronous.
For the synchronous type, the producer intuitively sends messages one by one to the target partition, so the message order is guaranteed.
For the asynchronous type, messages are sent in batches: if M1 is sent before M2, M1 is accumulated in memory first, followed by M2. When the producer then sends a batch of messages in a single request, the order within that batch is preserved. Across batches, the order can still change if a failed batch is retried while a later one is already in flight, which is what max.in.flight.requests.per.connection controls.
Setting the stage..
Here's a diagram to help explain my problem better:
Now, keep in mind the following points:
I have a producer sending messages to 8 partitions of My topic.
On the other side, I have 8 consumers, one for each partition.
The legacy system has limited resources, and can process at most 8 simultaneous requests.
To make sure I don't overwhelm the legacy system, a consumer will only send one request at a time. Any new message will wait for the current message to finish processing.
Explaining the problem..
Since messages are blocked until the previous message is processed, I want to minimize the time a message waits before it's processed. To do that, I need messages to be distributed equally over the partitions. A message must not be consumed by a busy consumer when another one is free.
For example, if 8 messages are produced simultaneously, each message should be sent to one partition. Therefore, each message will be consumed by one consumer, ensuring the messages are processed concurrently without any lag.
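As an illustration of the distribution described above, here is a minimal Python sketch that assigns each message to the next partition in turn. The `round_robin_assign` helper is hypothetical and not a Kafka API; in a real producer the chosen number would have to be passed as the explicit partition of each record.

```python
from itertools import cycle


def round_robin_assign(messages, num_partitions):
    """Pair each message with the next partition number in turn."""
    partitions = cycle(range(num_partitions))
    return [(next(partitions), msg) for msg in messages]


# Eight messages produced together land on eight different partitions.
assignment = round_robin_assign([f"msg-{i}" for i in range(8)], 8)
assert sorted(p for p, _ in assignment) == list(range(8))
```

With fewer partitions than messages the assignment simply wraps around, so no partition receives two messages before every other partition has received one.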
What I tried so far
Since the partitions are assigned correctly to the consumers, I had to assume the producer wasn't evenly delivering messages to the partitions. Which turned out to be the case. Here's what I tried so far to resolve the issue...
Using null keys
The most intuitive solution was to produce records without keys, which should basically make the DefaultPartitioner behave like the RoundRobinPartitioner. Unfortunately, this solution did not work.
Using null keys and batch.size=0
Since using null keys didn't work, it seemed likely that messages were being sent in batches, breaking the even distribution. Setting the batch size to 0 should have caused the producer to send messages one by one. That didn't work either.
Using RoundRobinPartitioner
This one was weird. The RoundRobinPartitioner distributed messages evenly, but it only used 4 out of the 8 partitions.
Using RoundRobinPartitioner and batch.size=0
This made no difference.
Finally, my question:
I need the producer to send messages in Round Robin fashion one by one without batching. How can I do that?
TL;DR
I need the producer to send messages in Round Robin fashion without batching. How can I do that?
I came across two statements with respect to ordering:
Messages sent by a producer to a particular topic partition will be
appended in the order they are sent. That is, if a record M1 is sent
by the same producer as a record M2, and M1 is sent first, then M1
will have a lower offset than M2 and appear earlier in the log.
Another
(config param) max.in.flight.requests.per.connection - The maximum number of
unacknowledged requests the client will send on a single connection
before blocking. Note that if this setting is set to be greater than
1 and there are failed sends, there is a risk of message re-ordering
due to retries (i.e., if retries are enabled).
The question is: will the order within a particular partition still be retained if there are failed sends, as mentioned in #2? If there is an issue with one message, will all the following messages be dropped "to retain the order" per partition, or will the "correct" messages be sent and the failed ones be reported to the application?
"will the order still be retained to a particular partition if there are failed sends like mentioned #2?"
As written in the documentation part you have copied, there is a risk that the ordering is changed.
Imagine, you have a topic with e.g. one partition. You set the retries to 100 and the max.in.flight.requests.per.connection to 5 which is greater than one. As a note, retries will only make sense if you set the acks to 1 or "all".
If you plan to produce the following messages in the order K1, K2, K3, K4, K5 and it takes your producer some time to
actually create the batch,
make a request to the broker, and
wait for the acknowledgement of the broker,
you could have up to 5 requests in parallel (based on the setting of max.in.flight.requests.per.connection). Now suppose producing "K3" runs into an issue and goes into the retry loop. The messages K4 and K5 can still be produced, because their requests were already in flight.
Your topic would end up with messages in that order: K1, K2, K4, K5, K3.
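The scenario above can be reproduced with a small toy simulation. This is only a sketch under simplifying assumptions that are mine, not Kafka's actual implementation: each message is its own request, the broker answers requests in the order they were sent, and a failed request is put back at the front of the send queue. The `simulate` function and its parameters are hypothetical names.

```python
from collections import deque


def simulate(messages, max_in_flight, fail_once):
    """Return the partition log after sending `messages` in order.

    Messages listed in `fail_once` fail on their first attempt and
    succeed when retried.
    """
    pending = deque(messages)  # not yet on the wire
    in_flight = deque()        # sent, waiting for a broker response
    already_failed = set()
    log = []
    while pending or in_flight:
        # Fill the connection up to the in-flight limit.
        while pending and len(in_flight) < max_in_flight:
            in_flight.append(pending.popleft())
        # The broker answers the oldest in-flight request.
        msg = in_flight.popleft()
        if msg in fail_once and msg not in already_failed:
            already_failed.add(msg)
            pending.appendleft(msg)  # retried before any unsent message
        else:
            log.append(msg)
    return log


keys = ["K1", "K2", "K3", "K4", "K5"]
assert simulate(keys, 5, {"K3"}) == ["K1", "K2", "K4", "K5", "K3"]
assert simulate(keys, 1, {"K3"}) == keys
```

With five requests in flight, K4 and K5 are already on the wire when K3 fails, so the retry lands behind them. With a limit of one, nothing else is sent until K3 has succeeded.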
In case you enable idempotency in the Kafka Producer, the ordering would still be guaranteed as explained in Ordering guarantees when using idempotent Kafka Producer
I was once asked in an interview how I would deal with messages arriving out of order in a message queue. It has been a while, I have not found a definitive answer, and I was wondering if an expert in the field could help me answer it to satisfy my own curiosity.
I understand that some message queues provide exactly-once and FIFO guarantees. Also I am aware of the notion of event time and processing time in streaming systems. For instance, in log based message queues like Kafka, mixed up ordering may be less likely to happen due to the presence of offsets and message durability (I may be wrong). I have also thought about using timestamps requiring each message sender to record the time of message before sending it but this is fraught with inconsistency due to clock skew.
Given all of that, I am wondering how one can address mixed-up ordering in a traditional messaging system like AMQP, JMS or RabbitMQ, where a dozen IoT devices may be sending messages and I, as a consumer, want to reconcile them in the correct order.
If the queue your system is using provides an ordered-message guarantee, then simply use that channel (like a single Kafka partition, or AMQP under some settings).
But if the queue does not provide strict ordering, the general idea is that the client attaches a monotonically increasing[1] number (or timestamp) to each message it sends to the queue. This forms the basis of the sequence the producer intends its receivers to see.
How to get a monotonically increasing value:
Using timestamp:
The POSIX clock_gettime() function with CLOCK_MONOTONIC[2] returns a monotonically increasing timestamp, which the producer can put on each message. The receiver can identify an out-of-order message when its timestamp is older than that of the latest message it has seen. Note that a monotonic clock is only meaningful on the machine that reads it, so this works per producer, not across producers.
Using sequence number:
Before sending each message, you can simply increment an atomic counter and attach its value to the message, so that the receiver knows the intended ordering. This forms a strictly increasing sequence. The approach is very similar to Lamport's logical clock[3], which provides a virtual clock for the producer.
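A minimal Python sketch of the counter approach follows. The `SequenceStamper` class and the `seq`/`payload` field names are hypothetical, chosen only for illustration; a lock stands in for the atomic counter so that several threads can share one stamper.

```python
import itertools
import threading


class SequenceStamper:
    """Attach a strictly increasing sequence number to each message."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def stamp(self, payload):
        # The lock makes "read the next value" one atomic step.
        with self._lock:
            seq = next(self._counter)
        return {"seq": seq, "payload": payload}


stamper = SequenceStamper()
first = stamper.stamp("temperature=21")
second = stamper.stamp("temperature=22")
assert first["seq"] < second["seq"]
```

The counter only orders messages from one producer. With several producers, each would need its own sequence plus a producer ID in the message.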
Dealing with out of order messages on receiver side:
This is pretty much application-specific, but in general you have 2 options when messages arrive out of order:
a) Discard the older message, as in cases where the receiver only has to show the latest value of a stock.
b) Keep a buffer to restore the sequence, as is done within a TCP connection (e.g. ZooKeeper uses TCP as a queue for FIFO ordering [4-5]).
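Both options can be sketched in a few lines of Python, assuming each message carries a sequence number that starts at 1 and increases by 1. The class names `LatestValue` and `ReorderBuffer` are hypothetical.

```python
class LatestValue:
    """Option (a): keep only the newest value and drop anything older."""

    def __init__(self):
        self.last_seq = 0
        self.value = None

    def offer(self, seq, value):
        if seq <= self.last_seq:
            return False  # stale or duplicate message: discard it
        self.last_seq, self.value = seq, value
        return True


class ReorderBuffer:
    """Option (b): hold early arrivals until the gap before them is filled."""

    def __init__(self, first_seq=1):
        self.next_seq = first_seq
        self._held = {}

    def offer(self, seq, value):
        """Accept one message; return the messages now deliverable in order."""
        if seq >= self.next_seq:
            self._held.setdefault(seq, value)  # ignore duplicates
        ready = []
        while self.next_seq in self._held:
            ready.append(self._held.pop(self.next_seq))
            self.next_seq += 1
        return ready


buf = ReorderBuffer()
assert buf.offer(2, "b") == []          # held back, 1 is still missing
assert buf.offer(1, "a") == ["a", "b"]  # gap filled, both released in order
```

The buffer in this sketch grows without bound if a message is lost for good, so a real receiver would also need a timeout or a size limit after which it gives up on the gap.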
Tools:
If you are not adding a timestamp to messages, then send all messages in sequence from the producer to a single Apache Kafka partition, as this ensures the receiver gets them in sequence.
If you are using a messaging system that does not guarantee ordered delivery (like AMQP under some settings[6]), then consider attaching an additional monotonically increasing number or clock value to each message.
[1] https://en.wiktionary.org/wiki/monotonic_increasing#targetText=Adjective,contrast%20this%20with%20strictly%20increasing
[2] https://linux.die.net/man/2/clock_gettime
[3] https://en.wikipedia.org/wiki/Lamport_timestamps#Lamport's_logical_clock_in_distributed_systems
[4] https://cwiki.apache.org/confluence/download/attachments/24193445/zookeeper-internals.pdf?version=1&modificationDate=1295034038000&api=v2
[5] http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf
[6] RabbitMQ - Message order of delivery
I can answer with respect to Apache Kafka.
Apache Kafka guarantees strict ordering per partition of a topic: each partition is an append-only sequence of messages kept in a strict order.
So with more than one partition, a consumer may read messages from several partitions, and across partitions there is no strict order. Consider the 2 options below to achieve strict order.
If you need the messages of 1 producer in order, use only 1 partition per topic. The producer will then publish to the same partition in sequence, and consumers will consume the messages in strict order.
If the producer publishes to multiple partitions, use multiple consumers in a consumer group, but assign each consumer to a specific partition. Consuming from a specific partition guarantees strict order per partition per consumer.
I have been studying Apache Kafka for a while now.
Let's consider the following example.
Consider I have a topic with 3 partitions. I have a single producer and a single consumer. I am producing my messages without specifying the key attribute.
So I know that on the producer side, when I publish a message, the strategy Kafka uses to assign it to one of those partitions would be round-robin.
Now, what I want to know is: when I start a single consumer belonging to a certain consumer group and listening to that same topic, what strategy will it use to pull the messages from the different partitions (as there are 3)?
Would it follow a similar round-robin model, where it sends a fetch request to the leader of partition 1, waits for a response, gets the response, and returns the records to process, then sends a fetch request to the leader of partition 2, and so on?
If it follows some other strategy/algorithm, I would love to know what it is.
Thank you in advance.
There is no ordering guarantee outside of a partition, so in a way the algorithm used is moot to the end user and subject to change.
Today, there is nothing terribly complex that happens in this instance. The protocol shows you that a fetch request includes a partition so you get a fetch per partition. That means the order depends on the consumer. A partition won't be starved because fetch requests will happen for all partitions assigned to the consumer.
I was going through kafka documentation and came across
Guarantees
At a high-level, Kafka gives the following guarantees:
Messages sent by a producer to a particular topic partition will be
appended in the order they are sent. That is, if a record M1 is sent
by the same producer as a record M2, and M1 is sent first, then M1
will have a lower offset than M2 and appear earlier in the log. A
consumer instance sees records in the order they are stored in the
log. For a topic with replication factor N, we will tolerate up to N-1
server failures without losing any records committed to the log.
I have a few questions.
Is it always guaranteed that M1 will have a lower offset than M2? What if M1 is retried later than M2?
I also understood from various documentation that ordering is not guaranteed, and that the consumer has to deal with it.
A possible scenario even with a single partition is:
Producer sends M1
Producer sends M2
M1 is not ack'ed on the first try due to some failure
M2 is delivered
M1 is delivered in a subsequent try.
One easy way to avoid this is through the producer config max.in.flight.requests.per.connection=1.
This of course has performance implications, so it should be used with caution.
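As a rough sketch, the rule from the producer documentation can be written as a check on a producer configuration. The `reordering_risk` function is hypothetical and every property must be given explicitly; the property names are the ones listed in the Kafka producer configs, and `enable.idempotence` is included on the assumption that an idempotent producer keeps order despite retries.

```python
def reordering_risk(config):
    """Return True if retries could reorder messages within a partition."""
    in_flight = config["max.in.flight.requests.per.connection"]
    retries = config["retries"]
    idempotent = config["enable.idempotence"]
    # Reordering needs more than one request in flight AND a retry,
    # and is ruled out here when the producer is idempotent.
    return in_flight > 1 and retries > 0 and not idempotent


strict = {
    "max.in.flight.requests.per.connection": 1,
    "retries": 100,
    "enable.idempotence": False,
}
pipelined = dict(strict, **{"max.in.flight.requests.per.connection": 5})

assert not reordering_risk(strict)
assert reordering_risk(pipelined)
```

The check only covers the retry scenario listed above. It says nothing about messages lost because of the acks setting.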
Please note that ordering guarantees apply at the partition level. So, if you have more than one partition in the topic, you'll need to set the same partition key for messages that you require to appear in order.
For example, if you want to collect messages from various sensors and each sensor has its own ID, then using this ID as the message key guarantees the ordering of messages from every sensor on the consumers (as no sensor will write messages to more than 1 partition).
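The sensor example can be sketched in Python as follows. This is an illustration only: `partition_for` and `route` are hypothetical helpers, and CRC32 is a stand-in hash chosen because it is in the standard library, not the hash Kafka's own partitioner uses.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition; equal keys always map to the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(readings, num_partitions):
    """Append (sensor_id, value) readings to per-partition logs."""
    logs = defaultdict(list)
    for sensor_id, value in readings:
        logs[partition_for(sensor_id, num_partitions)].append((sensor_id, value))
    return logs


readings = [("sensor-a", 1), ("sensor-b", 1), ("sensor-a", 2),
            ("sensor-b", 2), ("sensor-a", 3)]
logs = route(readings, 4)

# Within every partition, each sensor's readings keep their send order.
for log in logs.values():
    for sensor in {s for s, _ in log}:
        values = [v for s, v in log if s == sensor]
        assert values == sorted(values)
```

Two sensors may share a partition, which is fine: the guarantee is per key, since all messages with one key end up in one partition log.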
To answer your questions:
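Yes, if M1 is appended first it will always have a lower offset than M2. The offsets are set by the broker, so what matters is the order in which messages arrive at the broker, not the order in which the producer first tried to send them.
It is only at the topic level that ordering is not guaranteed; within a partition it is.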
I have written an article that goes deeper into the ordering guarantees provided by Kafka.
You can check it out in my Medium post.