I was going through the Kafka documentation and came across:
Guarantees
At a high-level, Kafka gives the following guarantees:
Messages sent by a producer to a particular topic partition will be
appended in the order they are sent. That is, if a record M1 is sent
by the same producer as a record M2, and M1 is sent first, then M1
will have a lower offset than M2 and appear earlier in the log. A
consumer instance sees records in the order they are stored in the
log. For a topic with replication factor N, we will tolerate up to N-1
server failures without losing any records committed to the log.
I had a few questions.
Is it always guaranteed that M1 will have a lower offset than M2? What if M1 is retried later than M2?
I also understood from various sources that ordering is not guaranteed and that the consumer has to deal with it.
A possible scenario even with a single partition is:
Producer sends M1
Producer sends M2
M1 is not ack'ed on the first try due to some failure
M2 is delivered
M1 is delivered in a subsequent try.
One easy way to avoid this is through the producer config max.in.flight.requests.per.connection=1.
This of course has performance implications, so it should be used with caution.
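For reference, a minimal sketch of a producer configured this way (Java client; the broker address and serializers are placeholders, and the exact acks/retries values are only an example):

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.serialization.StringSerializer;
    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");            // placeholder broker address
    props.put("key.serializer", StringSerializer.class.getName());
    props.put("value.serializer", StringSerializer.class.getName());
    props.put("acks", "all");                                     // wait for the in-sync replicas
    props.put("retries", "2147483647");                           // retry transient failures
    props.put("max.in.flight.requests.per.connection", "1");      // no parallel requests => retries cannot reorder
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);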
Please note that ordering guarantees apply at the partition level. So if you have more than one partition in the topic, you'll need to set the same partition key for messages that you require to appear in order.
For example, if you want to collect messages from various sensors and each sensor has its own ID, then using that ID as the message key guarantees the ordering of messages from every sensor on the consumer side (since no sensor will write messages to more than one partition).
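As an illustration, a keyed send for that sensor example could look roughly like this (topic name, sensor ID and payload are made up; producer is a configured KafkaProducer<String, String>):

    import org.apache.kafka.clients.producer.ProducerRecord;

    // Records with the same key always hash to the same partition,
    // so per-sensor ordering is preserved on the consumer side.
    String sensorId = "sensor-42";                                // hypothetical sensor ID
    producer.send(new ProducerRecord<>("sensor-readings", sensorId, "{\"temperature\": 21.5}"));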
To answer your questions:
Yes, M1 will always have a lower offset than M2. Offsets are assigned by the broker, so the time a message arrives at the broker is what matters here.
Ordering is not guaranteed at the topic level, only at the partition level.
I have written an article that takes a deeper look at the ordering guarantees Kafka provides; you can check it out in my Medium post.
I came across two phrases with respect to ordering:
Messages sent by a producer to a particular topic partition will be
appended in the order they are sent. That is, if a record M1 is sent
by the same producer as a record M2, and M1 is sent first, then M1
will have a lower offset than M2 and appear earlier in the log.
Another
(config param) max.in.flight.requests.per.connection - The maximum number of
unacknowledged requests the client will send on a single connection
before blocking. Note that if this setting is set to be greater than
1 and there are failed sends, there is a risk of message re-ordering
due to retries (i.e., if retries are enabled).
The question is: will the order still be retained within a particular partition if there are failed sends as mentioned in #2? If there is a potential issue with one message, will all the following messages be dropped "to retain the order" per partition, or will the "correct" messages be sent and the failed messages be reported to the application?
"will the order still be retained to a particular partition if there are failed sends like mentioned #2?"
As written in the documentation excerpt you quoted, there is a risk that the ordering is changed.
Imagine you have a topic with, say, one partition. You set retries to 100 and max.in.flight.requests.per.connection to 5, which is greater than one. As a note, retries only make sense if you set acks to 1 or "all".
If you plan to produce the following messages in the order K1, K2, K3, K4, K5 and it takes your producer some time to
actually create the batch and
make a request to the broker and
wait for the acknowledgement of the broker
you could have up to 5 requests in parallel (based on the setting of max.in.flight.requests.per.connection). Now, if producing K3 runs into an issue and goes into the retry loop, the messages K4 and K5 can still be produced, as their requests were already in flight.
Your topic would end up with the messages in this order: K1, K2, K4, K5, K3.
If you enable idempotence in the Kafka producer, the ordering is still guaranteed, as explained in Ordering guarantees when using idempotent Kafka Producer.
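A minimal sketch of enabling idempotence on the producer (Kafka >= 0.11; the broker address is a placeholder):

    import org.apache.kafka.clients.producer.KafkaProducer;
    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    // Implies acks=all and retries > 0; in recent client versions this preserves
    // per-partition ordering even with several (up to 5) in-flight requests.
    props.put("enable.idempotence", "true");
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);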
When exclusively reading messages from a single partition of a Kafka topic where timestamps are configured for ingestion (broker) time, can I assume that all messages retrieved from the partition will always be in strict timestamp order?
Kafka provides ordering guarantees while storing as well as retrieving messages, i.e. messages are stored and retrieved in the order they are sent.
Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a record M1 is sent by the same producer as a record M2, and M1 is sent first, then M1 will have a lower offset(as well as lower Timestamp) than M2 and appear earlier in the log.
A consumer instance sees records in the order they are stored in the log.
However, Kafka only provides a total order over records within a partition, not between different partitions in a topic. If you require a total order over records, this can be achieved with a topic that has only one partition, though this means only one consumer process per consumer group (not recommended). By that logic: if you have only one partition, the answer for your use case is yes; with more partitions, it is still a yes on a per-partition basis, but ordering cannot be guaranteed across the topic (multiple partitions).
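If you really do need a total order over the whole topic, a single-partition topic can be created programmatically, for example with the AdminClient (topic name and replication factor are arbitrary here; exception handling omitted):

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;
    import java.util.Collections;
    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");                  // placeholder broker address
    try (AdminClient admin = AdminClient.create(props)) {
        // 1 partition => total order over the topic,
        // but only one consumer in each group will make progress
        NewTopic topic = new NewTopic("totally-ordered-events", 1, (short) 3);
        admin.createTopics(Collections.singleton(topic)).all().get();  // blocks until created
    }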
Yes, I was talking about a Kafka topic that is explicitly configured for log append time.
I'm assuming that, since the broker determines the timestamp and the broker owns a particular partition, the timestamps in that partition will follow the partition's offset order.
Rephrasing the question: within a single partition configured for log append time, is it always true that offset x < offset y implies timestamp x <= timestamp y?
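A rough way I could check this empirically (not a proof, just an observation sketch; the topic name, partition number and group id are placeholders):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
    props.put("group.id", "ts-order-check");            // placeholder group id
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
        TopicPartition tp = new TopicPartition("my-topic", 0);   // hypothetical topic/partition
        consumer.assign(Collections.singleton(tp));
        consumer.seekToBeginning(Collections.singleton(tp));
        long lastTimestamp = Long.MIN_VALUE;
        for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(5))) {
            // With LogAppendTime the broker stamps each record as it is appended, so within
            // one partition timestamps should never decrease as offsets grow.
            if (rec.timestamp() < lastTimestamp) {
                System.out.println("Timestamp went backwards at offset " + rec.offset());
            }
            lastTimestamp = rec.timestamp();
        }
    }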
Thanks.
While trying to implement exactly-once semantics, I found this in the official Kafka documentation:
Exactly-once delivery requires co-operation with the destination
storage system but Kafka provides the offset which makes implementing
this straight-forward.
Does this mean that I can use the (topic, partition, offset) tuple as a unique primary identifier to implement deduplication?
An example implementation would be to use an RDBMS with this tuple as the primary key for an insert performed within a larger processing transaction, where the transaction fails if the insert is no longer possible because the primary key already exists.
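To make that concrete, here is a rough sketch of the kind of insert I have in mind (table and column names are made up; it assumes a primary key on (topic, kafka_partition, msg_offset), and, as discussed in the answer below, it only deduplicates consumer-side redeliveries):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.SQLIntegrityConstraintViolationException;

    // Assumes a table like:
    //   CREATE TABLE processed_messages (
    //     topic           VARCHAR(255) NOT NULL,
    //     kafka_partition INT          NOT NULL,
    //     msg_offset      BIGINT       NOT NULL,
    //     payload         TEXT,
    //     PRIMARY KEY (topic, kafka_partition, msg_offset)
    //   );
    void insertOnce(Connection connection, ConsumerRecord<String, String> record) throws SQLException {
        String sql = "INSERT INTO processed_messages "
                   + "(topic, kafka_partition, msg_offset, payload) VALUES (?, ?, ?, ?)";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setString(1, record.topic());
            ps.setInt(2, record.partition());
            ps.setLong(3, record.offset());
            ps.setString(4, record.value());
            ps.executeUpdate();   // run inside the same DB transaction as the rest of the processing
        } catch (SQLIntegrityConstraintViolationException e) {
            // duplicate (topic, kafka_partition, msg_offset) => already processed, skip it
        }
    }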
I think the question is equivalent to:
Does a producer use the same offset for a message when retrying to send it after detecting a possible failure or does every retry attempt get its own offset?
If the offset is reused when retrying, consumers obviously see multiple messages with the same offset.
Other question, maybe somehow related:
With single or multiple producers producing to the same topic, can there be "gaps" in the offset number sequence seen by one consumer?
Another possibility could be that the offset is determined solely by, and not before, the message reaching the leader that does the job (implying that, unless the leader listens to something like a producer-suggested offset, there are probably no gaps/offset jumps, but also different offsets for duplicate messages, so I would have to use my own unique identifier within the application's message at the application level).
To answer my own question:
The offset is generated solely by the server (more precisely: by the leader of the corresponding partition), not by the producing client. It is then sent back to the producer in the produce response. So:
Does a producer use the same offset for a message when retrying to
send it after detecting a possible failure or does every retry attempt
get its own offset?
No. (See the update below!) The producer does not determine offsets, and two identical/duplicate application messages can have different offsets. So the offset cannot be used to identify messages for producer-side deduplication purposes, and a custom UID has to be defined in the application message. (Source)
With single or multiple producers producing to the same topic, can there be "gaps" in the offset number sequence seen by one consumer?
Since there is only a single leader for every partition, which maintains the current offset, and since (with the default configuration) leadership is only transferred to an active in-sync replica in case of a failure, I assume that the latest used offset is always communicated correctly when a new leader is elected for a partition, so initially there should not be any offset gaps or jumps. However, because of the log compaction feature, there are cases (assuming log compaction is enabled) where there can indeed be gaps in the stream of offsets when consuming already committed messages of a partition again after compaction has kicked in. (Source)
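As a small illustration, this is roughly how a consumer assigned to a single partition could notice such gaps (just a logging sketch; consumer is assumed to be an already-configured KafkaConsumer<String, String> assigned to exactly one partition):

    // uses org.apache.kafka.clients.consumer.ConsumerRecord and java.time.Duration
    long expectedNext = -1L;
    while (true) {
        for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
            if (expectedNext >= 0 && rec.offset() > expectedNext) {
                // offsets jumped forward, e.g. because log compaction removed older records
                System.out.println("Offset gap: expected " + expectedNext + ", got " + rec.offset());
            }
            expectedNext = rec.offset() + 1;
        }
    }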
Update (Kafka >= 0.11.0)
Starting with Kafka version 0.11.0, producers additionally send a sequence number with their requests, which the leader then uses to deduplicate requests by that number and the producer's ID. So with 0.11.0, the precondition on the producer side for implementing exactly-once semantics is provided by Kafka itself, and there is no need to send another unique ID or sequence number within the application's message.
Therefore, the answer to question 1 could now, in a sense, also be yes.
However, note that exactly-once semantics are still only possible as long as the consumer never fails. Once the consumer can fail, one still has to watch out for duplicate message processing on the consumer side.
OK, so I understand that you only get an ordering guarantee per partition.
Just random thought/question.
Assuming that the partitioning strategy is correct and the messages are grouped into the proper partition (or say we are even using a single partition),
I suppose the producing application must send each message to Kafka one by one and make sure each message has been acked before sending the next one, right?
Yes, you are correct that the order in which the producing application sends the messages dictates the order they are stored in the partition.
Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.
http://kafka.apache.org/documentation.html#intro_guarantees
However, if you have multiple messages in flight simultaneously, I am not sure how order is determined.
You might want to think about the acks config for your producer as well. There are failure conditions where a message may go missing if the leader goes down after M1 is published and a new leader receives M2. In this case you won't have an out-of-order condition but a missing message, so it's slightly orthogonal to your original question, but something to consider if message guarantees and order are critical to your application.
http://kafka.apache.org/documentation.html#producerconfigs
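For completeness, the durability settings that point alludes to might look roughly like this on the producer side (values are only an example; min.insync.replicas is a broker/topic setting, not a producer one):

    props.put("acks", "all");            // leader waits for the full in-sync replica set before acknowledging
    props.put("retries", "2147483647");  // keep retrying transient send failures
    // On the broker/topic side, pair this with min.insync.replicas=2
    // so that "all" actually means more than one replica has the record.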
Overall, designing a system where small differences in order are not that important can really simplify things.
Either send messages synchronously, one by one (definitely slow!),
or send messages asynchronously in batches with max.in.flight.requests.per.connection = 1.
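Roughly, those two options in code (Java client; producer, topic name, key and payload are placeholders):

    // Option 1: synchronous, one record at a time (simple but slow):
    // block on the returned Future until the broker acknowledges the write
    producer.send(new ProducerRecord<>("my-topic", "key", "value")).get();

    // Option 2: asynchronous sends, with max.in.flight.requests.per.connection=1
    // set in the producer config so that retries cannot reorder records in a partition
    producer.send(new ProducerRecord<>("my-topic", "key", "value"),
            (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();   // handle/log the failed send
                }
            });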
Yes, the producer should be single-threaded. If you use multiple producer threads to produce to the same partition, the ordering guarantee on the consumer side will still be lost. So an ordering guarantee on the same partition implicitly also means a single producer thread.
There are two strategies for sending messages in Kafka: synchronous and asynchronous.
With the synchronous approach, the producer sends messages one by one to the target partition, so the message order is guaranteed by construction.
With the asynchronous approach, messages are sent in batches: if M1 is sent before M2, then M1 is accumulated in memory first and M2 after it. When the producer sends a batch of messages in a single request, the order of the messages within that batch is therefore preserved.
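The batching behaviour described above is mainly controlled by two producer settings; a sketch (the values are arbitrary):

    // Accumulate records in memory per partition and ship them as one request.
    props.put("batch.size", "32768");   // maximum batch size in bytes per partition
    props.put("linger.ms", "20");       // wait up to 20 ms to fill a batch before sending
    // Records appended to the same batch keep their send order,
    // and the whole batch is written to the partition sequentially.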
I have a use case where I would be reading a set of key/value pairs, where the key is just a string and the value is JSON. I have to expose these values as JSON to a REST endpoint, which I would do using a Kafka streaming consumer.
Now my questions are:
How do I deal with Kafka partitions? I'm planning to use spark-streaming for the consumer
How about the producer? I would like to poll data from an external service at a constant interval and write the resulting key/value pairs to the Kafka topic. Is there a streaming producer?
Is this even a valid use case to employ Kafka? I mean, I could have another consumer group that just logs the incoming key / value pairs to a database. This is exactly what attracts me to use Kafka, the possibility to have multiple consumer groups to do different things!
Partitioning the topic I suppose is to increase parallelism, thereby increasing consumer throughput. How does this throughput compare with no partitioning? I have a use case where I have to ensure ordering, so I cannot partition the topic, but at the same time I would like to have a very high throughput for my consumer. How do I go about doing this?
Any suggestions?
Just trying to share a few thoughts on this.
The partition is the main unit of parallelism in Kafka: a topic with N partitions can be consumed by up to N threads in parallel. But having multiple partitions mainly creates problems for the ordering of the data. E.g. if you have N partitions and you configure your producer to publish messages randomly (the default behaviour), then message M1 produced at time T1 might go to partition P1, message M2 at T2 to P2, M3 at T3 to P2 again, and then M4 back to P1. You can configure a custom rule to produce messages to specific partitions (using something called a key), but that has to be handled on your end; one way to do it is sketched below.
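One way to implement such a custom rule, beyond relying on the default hashing of the key, is a custom Partitioner (everything here, including the "audit" rule and the class name, is just an invented example):

    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import java.util.Map;

    // Hypothetical rule: all records keyed "audit" go to partition 0, everything else is hashed.
    public class SensorPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionCountForTopic(topic);
            if ("audit".equals(key)) {
                return 0;
            }
            // fall back to a simple hash of the key (records with null keys land in partition 0 here)
            return key == null ? 0 : Math.abs(key.hashCode() % numPartitions);
        }

        @Override public void configure(Map<String, ?> configs) { }
        @Override public void close() { }
    }
    // Enabled via the producer config partitioner.class, set to this class's fully qualified name.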
I'm not sure what exactly you mean regarding the producer. In general you can create observers that listen for those events and invoke a producer when they arrive. You can choose to send messages in batches as well.
One of the key reasons for choosing Kafka is its compatibility with different computation engines like Apache Storm, Apache Spark, etc. But as far as my understanding goes, the main thing Kafka aims for is high throughput, expecting data to be published very frequently. If in your case the interval between events is high, it might be worth considering other options before settling on Kafka, as maintaining an idle cluster is not a good idea.