How do you address messages coming out of order in a message queue? - apache-kafka

I was once asked in an interview how I would deal with messages arriving out of order in a message queue. It has been a while and I have not found a definitive answer, so I was wondering if an expert in the field could help me answer it to satisfy my own curiosity.
I understand that some message queues provide exactly-once and FIFO guarantees, and I am aware of the distinction between event time and processing time in streaming systems. For instance, in log-based message queues like Kafka, mixed-up ordering may be less likely due to the presence of offsets and message durability (I may be wrong). I have also thought about using timestamps, requiring each sender to record the time of a message before sending it, but this is fraught with inconsistency due to clock skew.
Given all of that, I am wondering how one can address mixed-up ordering in a traditional messaging system like AMQP, JMS, or RabbitMQ, where a dozen IoT devices may be sending messages and I, as a consumer, want to reconcile them in the correct order.

If the queue your system is using provides an ordered-message guarantee, then simply use that channel (like a single Kafka partition, or AMQP under some settings).
But if the queue your system is using does not provide strict ordering, then the general idea is that the client can attach a monotonically increasing [1] number (or timestamp) to each message it sends to the queue. This forms the basis of the sequence the producer intends to convey to its receivers.
How to get a monotonically increasing value:
Using timestamp:
The POSIX clock_gettime() function with CLOCK_MONOTONIC [2] provides a monotonically increasing timestamp, which the producer can use to stamp each message. The receiver can identify out-of-order messages when it sees that a received message has a timestamp older than the latest one.
Using sequence number:
Before sending each message, you can simply increment an atomic counter and attach the counter value to the message, so that the receiver knows the intended ordering. This forms a strictly increasing sequence. The approach is very similar to Lamport's logical clock [3], which provides a virtual clock for the producer. See the Java sketch below.
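As a minimal sketch in Java (the SequencedMessage wrapper is hypothetical, not any particular queue's API): stamp each outgoing message from an AtomicLong; System.nanoTime() plays the role of CLOCK_MONOTONIC if you prefer a timestamp instead.

    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical wrapper: payload plus the metadata the receiver will
    // use to reconstruct the intended order.
    class SequencedMessage {
        final long sequence;        // strictly increasing counter value
        final long monotonicNanos;  // Java's analogue of CLOCK_MONOTONIC
        final String payload;

        SequencedMessage(long sequence, long monotonicNanos, String payload) {
            this.sequence = sequence;
            this.monotonicNanos = monotonicNanos;
            this.payload = payload;
        }
    }

    class SequencingSender {
        private final AtomicLong counter = new AtomicLong();

        // Stamp each message before handing it to whatever queue client you use.
        SequencedMessage stamp(String payload) {
            return new SequencedMessage(counter.incrementAndGet(),
                    System.nanoTime(), payload);
        }
    }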
Dealing with out of order messages on receiver side:
This is largely application-specific, but in general you have two options when messages arrive out of order:
a) Discard the older message, as in cases where the receiver only has to show the latest value of a stock.
b) Keep a buffer to reorder the sequence, as within a TCP connection (e.g. ZooKeeper uses TCP as a queue for FIFO ordering [4][5]); see the sketch below.
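Here is a minimal sketch of option b) in Java (the ReorderBuffer class is made up for illustration), assuming messages carry the strictly increasing sequence number from above: anything that arrives early is buffered, and messages are released only when the next expected sequence number shows up.

    import java.util.SortedMap;
    import java.util.TreeMap;
    import java.util.function.Consumer;

    class ReorderBuffer {
        private final SortedMap<Long, String> pending = new TreeMap<>();
        private final Consumer<String> deliver;  // downstream handler
        private long nextExpected = 1;

        ReorderBuffer(Consumer<String> deliver) { this.deliver = deliver; }

        // Called for every message as it arrives, possibly out of order.
        void onMessage(long sequence, String payload) {
            if (sequence < nextExpected) return;  // stale or duplicate: drop
            pending.put(sequence, payload);
            // Release the longest contiguous run starting at nextExpected.
            while (pending.containsKey(nextExpected)) {
                deliver.accept(pending.remove(nextExpected));
                nextExpected++;
            }
        }
    }

In practice you would also bound the buffer and time out missing sequence numbers, which is essentially the trade-off a TCP receive window makes.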
Tools:
If you are not adding a timestamp to messages, then send all messages from the producer to a single Apache Kafka partition in sequence, as this ensures the receiver can consume them in sequence.
If you are using a messaging system that does not guarantee ordered delivery (like AMQP under some settings [6]), then consider attaching an additional monotonically increasing number/clock to each message.
[1] https://en.wiktionary.org/wiki/monotonic_increasing
[2] https://linux.die.net/man/2/clock_gettime
[3] https://en.wikipedia.org/wiki/Lamport_timestamps#Lamport's_logical_clock_in_distributed_systems
[4] https://cwiki.apache.org/confluence/download/attachments/24193445/zookeeper-internals.pdf?version=1&modificationDate=1295034038000&api=v2
[5] http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf
[6] RabbitMQ - Message order of delivery

I can answer with respect to Apache Kafka.
Apache Kafka guarantees strict order on a topic per partition: each partition is an immutable sequence of messages, appended in strict order.
So with more than one partition, a consumer may consume messages from several partitions, and those cannot be in strict order overall. Consider the two options below to achieve strict order.
If you want one producer's messages in order, use only one partition per topic, so the producer publishes to the same partition in sequence and consumers consume in strict order.
If the producer publishes to multiple partitions, use multiple consumers in a consumer group, but assign a specific partition to each consumer; consuming from a specific partition guarantees strict order per partition per consumer (see the sketch below).
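A sketch of the second option in Java, using the consumer's assign() API (topic name and broker address are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class SinglePartitionConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Pin this consumer to partition 0 of "my-topic"; records from
                // a single partition arrive in strict offset order.
                consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records)
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }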

Related

What atomicity guarantees - if any - does Kafka have regarding batch writes?

We're now moving one of our services from pushing data through legacy communication tech to Apache Kafka.
The current logic is to send a message to IBM MQ and retry if errors occur. I want to replicate that, but I don't have any idea what guarantees the broker provides in that scenario.
Let's say I send 100 messages in a batch via the producer in the Java client library. Assuming the batch reaches the cluster, is there a possibility that only part of it is accepted (e.g. a disk is full, or some partitions I touch in my write are under-replicated)? Can I detect that problem from my producer and retry only the messages that weren't accepted?
I searched for a Kafka atomicity guarantee but came up empty; maybe there's a well-known term for it.
When you say you send 100 messages in one batch, do you mean you want to control this number of messages, or are you OK with letting the producer batch a certain number of messages and then send the batch?
I'm not sure you can control the number of produced messages in one producer batch: the API will queue them and batch them for you, but with no guarantee of batching them all together (I'll check that, though).
If you're OK with letting the API batch a certain number of messages for you, here are some clues about how they are acknowledged.
When dealing with the producer, Kafka comes with some reliability guarantees regarding writes (including "batch writes").
As stated in this SlideShare post (slide 83): https://www.slideshare.net/miguno/apache-kafka-08-basic-training-verisign
The original list of messages is partitioned (randomly if the default partitioner is used) based on their destination partitions/topics, i.e. split into smaller batches.
Each post-split batch is sent to the respective leader broker/ISR (the individual send()’s happen sequentially), and each is acked by its respective leader broker according to request.required.acks
So regarding atomicity: I am not sure the whole batch will be treated atomically, given the above behavior. You could try sending your batch with the same key for every message, so they all go to the same partition and thus might be written together.
If you need more clarity about acknowledgment rules when producing, here is how it works, as stated at https://docs.confluent.io/current/clients/producer.html:
You can control the durability of messages written to Kafka through the acks setting.
The default value of "1" requires an explicit acknowledgement from the partition leader that the write succeeded.
The strongest guarantee that Kafka provides is with "acks=all", which guarantees that not only did the partition leader accept the write, but it was successfully replicated to all of the in-sync replicas.
You can also look at the producer's enable.idempotence setting if you aim to have no duplicates while producing; a configuration sketch follows below.
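Putting those settings together, a minimal configuration sketch in Java (broker address and topic name are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ReliableProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all");                // wait for all in-sync replicas
            props.put("enable.idempotence", "true"); // broker drops duplicate retries

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // The callback reports per-record failures, so you can log or
                // re-send only the records that were not accepted.
                producer.send(new ProducerRecord<>("my-topic", "key", "value"),
                        (metadata, exception) -> {
                            if (exception != null)
                                System.err.println("send failed: " + exception);
                        });
            }
        }
    }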
Yannick

Kafka ordering with multiple producers on same topic and partition

Let's say I have two producers (ProducerA and ProducerB) writing to the same topic with a single partition. Each producer is writing its own unique events serially. So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producers' events like this:
ProducerA_event_1
ProducerA_event_2
ProducerA_event_3
ProducerB_event_1
ProducerB_event_2
ProducerB_event_3
due to acking, retrying, etc.
However, will an individual producer's events still be in order? For example:
ProducerA_event_1
ProducerB_event_2
ProducerB_event_1
ProducerA_event_2
ProducerA_event_3
ProducerB_event_3
This is of course a simplified version of what I am doing, but I just want to guarantee that if I am reading from a topic for a specific producer's events, those events will be in order even if other producers' events interleave with them.
The short answer to this one is yes: an individual producer's events are guaranteed to be in order.
Messages in Kafka are appended to a topic partition in the order they are sent and the consumers read the messages in the same order they are stored in the topic partition.
So assuming you are interested in the messages from Producer A and are filtering everything else out, then in the given scenario you can expect events 1, 2, and 3 from Producer A to be read in that order.
PS: I am however curious to understand the motivation behind using just one partition. Also, on your statement:
So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producer's events like this:
You are correct in saying that the overall ordering is something that cannot be guaranteed but ordering within a partition can be guaranteed.
I hope this helps.
There is a nice article on Medium which states that Kafka does not always guarantee message ordering, even for the same producer. It all depends on the Kafka configuration: in particular, max.in.flight.requests.per.connection has to be set to 1. The reason is that if there are multiple requests (say, 2) in flight and the first one fails, the second one can get appended to the log first, breaking the ordering.
A producer's messages will be stored, per partition, in the order they are received. If you can guarantee message ordering on the producer, then consumers can assume ordering when polling. Retry logic, multiple KafkaProducer instances, and other asynchronous implementation details can complicate ordered message production. Often these can be mitigated by including a unique event identifier, an identifier of the producer, and a timestamp of sufficient granularity in either the key or the value of the message. Relying on ordering in an asynchronous framework is a best-case flow; there should be some way to compensate when things come in out of order. A configuration sketch for the retry caveat follows below.
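As a hedged sketch, these are the producer settings the above refers to; the combination shown trades throughput for ordering, and is one reasonable choice rather than the only one:

    import java.util.Properties;

    public class OrderedProducerSettings {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            // Retry on transient failures...
            props.put("retries", Integer.toString(Integer.MAX_VALUE));
            // ...but allow only one in-flight request per connection, so a
            // retried batch can never leapfrog a batch sent after it.
            props.put("max.in.flight.requests.per.connection", "1");
            // On newer clients, enable.idempotence=true preserves ordering
            // even with several in-flight requests:
            // props.put("enable.idempotence", "true");
            System.out.println(props);  // hand these to new KafkaProducer<>(props)
        }
    }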

Repeatedly produced to Apache Kafka, different offsets? (Exactly once semantics)

While trying to implement exactly-once semantics, I found this in the official Kafka documentation:
Exactly-once delivery requires co-operation with the destination storage system but Kafka provides the offset which makes implementing this straight-forward.
Does this mean that I can use the (topic, partition, offset) tuple as a unique primary identifier to implement deduplication?
An example implementation would be to use an RDBMS with this tuple as the primary key for an insert within a larger processing transaction, where the transaction fails if the insert is no longer possible because the primary key already exists. A sketch of this follows below.
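A sketch of that approach in Java with JDBC (table and column names are made up for illustration; assumes PRIMARY KEY (topic, kafka_partition, kafka_offset)). Note, per the answer below, that this deduplicates consumer-side redeliveries, not producer retries:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OffsetDedupWriter {
        // The insert violates the primary key on a redelivered record, so
        // the transaction rolls back instead of applying the record twice.
        static void writeOnce(Connection conn, String topic, int partition,
                              long offset, String payload) throws SQLException {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO processed_events"
                    + " (topic, kafka_partition, kafka_offset, payload)"
                    + " VALUES (?, ?, ?, ?)")) {
                ps.setString(1, topic);
                ps.setInt(2, partition);
                ps.setLong(3, offset);
                ps.setString(4, payload);
                ps.executeUpdate();   // throws on duplicate primary key
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();      // duplicate delivery: skip this record
                throw e;
            }
        }
    }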
I think the question is equivalent to:
Does a producer use the same offset for a message when retrying to send it after detecting a possible failure or does every retry attempt get its own offset?
If the offset is reused when retrying, consumers obviously see multiple messages with the same offset.
Another question, maybe somehow related:
With single or multiple producers producing to the same topic, can there be "gaps" in the offset number sequence seen by one consumer?
Another possibility could be that the offset is determined only once the message reaches the leader (implying that, unless the leader honors something like a producer-suggested offset, there are probably no gaps or offset jumps, but duplicate messages get different offsets, and I would have to use my own unique identifier within the application's message at the application level).
To answer my own question:
The offset is generated solely by the server (more precisely: by the leader of the corresponding partition), not by the producing client. It is then sent back to the producer in the produce response. So:
Does a producer use the same offset for a message when retrying to send it after detecting a possible failure or does every retry attempt get its own offset?
No. (See update below!) The producer does not determine offsets and two identical/duplicate application messages can have different offsets. So the offset cannot be used to identify messages for producer deduplication purposes and a custom UID has to be defined in the application message. (Source)
With single or multiple producers producing to the same topic, can there be "gaps" in the offset number sequence seen by one consumer?
Since there is only a single leader for every partition, which maintains the current offset, and since (with the default configuration) leadership is only transferred to an active in-sync replica in case of failure, I assume that the latest used offset is always communicated correctly when a new leader is elected, and therefore there should not be any offset gaps or jumps initially. However, because of the log compaction feature, there are cases (assuming log compaction is enabled) where there can indeed be gaps in the stream of offsets when consuming already committed messages of a partition again after compaction has kicked in. (Source)
Update (Kafka >= 0.11.0)
Starting from Kafka version 0.11.0, producers now additionally send a sequence number with their requests, which is then used by the leader to deduplicate requests by this number and the producer's ID. So with 0.11.0, the precondition on the producer side for implementing exactly once semantics is given by Kafka itself and there's no need to send another unique ID or sequence number within the application's message.
Therefore, the answer to question 1 could now also be yes, in a sense.
However, note that exactly-once semantics are still only possible if the consumer never fails. Once the consumer can fail, one still has to watch out for duplicate message processing on the consumer side.

How to guarantee order in Kafka partition

Ok so I understand that you only get order guarantee per partition.
Just random thought/question.
Assuming that the partition strategy is correct and the messages are grouped correctly to the proper partition (or even say we are using 1 partition)
I suppose that the producing application must send each message one by one to Kafka and make sure each message has been acked before sending the next one, right?
Yes, you are correct that the order the producing application sends the message dictates the order they are stored in the partition.
Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.
http://kafka.apache.org/documentation.html#intro_guarantees
However, if you have multiple messages in flight simultaneously I am not sure how order is determined.
You might want to think about the acks config for your producer as well. There are failure conditions where a message may go missing if the leader goes down after M1 is published and a new leader receives M2. In this case you won't have an out-of-order condition but a missing message, so it's slightly orthogonal to your original question, yet something to consider if message guarantees and order are critical to your application.
http://kafka.apache.org/documentation.html#producerconfigs
Overall, designing a system where small differences in order are not that important can really simplify things.
Either send messages synchronously one by one (definitely slow!),
or send messages asynchronously in batches with max.in.flight.requests.per.connection = 1.
Yes, the producer should be single-threaded. If multiple producer threads produce to the same partition, the ordering guarantee on the consumer side is lost. So an ordering guarantee on the same partition implicitly also means a single producer thread.
There are two strategies for sending messages in Kafka: synchronous and asynchronous.
With the synchronous type, the producer sends messages one by one to the target partition, so the message order is guaranteed.
With the asynchronous type, messages are sent using batching: if M1 is sent prior to M2, then M1 is accumulated in memory first, and M2 after it. So when the producer sends a batch of messages in a single request, the order within the batch is preserved. A minimal sketch of both styles follows below.
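A minimal Java sketch of both styles (topic name and broker address are placeholders):

    import java.util.Properties;
    import java.util.concurrent.ExecutionException;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SyncVsAsyncSend {
        public static void main(String[] args)
                throws ExecutionException, InterruptedException {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            // For the async style, keep order across retries:
            props.put("max.in.flight.requests.per.connection", "1");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Synchronous: block on the Future, so M1 is acked before M2 is sent.
                producer.send(new ProducerRecord<>("my-topic", "k", "M1")).get();

                // Asynchronous: hand M2 to the client's internal batching.
                producer.send(new ProducerRecord<>("my-topic", "k", "M2"));
            }
        }
    }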

Apache Kafka order of messages with multiple partitions

As per the Apache Kafka documentation, message ordering can only be achieved within a partition (or with a single partition per topic). In that case, what parallelism benefit are we getting? Isn't it then equivalent to traditional MQs?
In Kafka the parallelism is equal to the number of partitions for a topic.
For example, assume that your messages are partitioned based on user_id, and consider 4 messages having user_ids 1, 2, 3, and 4. Assume that you have a "users" topic with 4 partitions.
Since partitioning is based on user_id, assume that message having user_id 1 will go to partition 1, message having user_id 2 will go to partition 2 and so on..
Also assume that you have 4 consumers for the topic. Since you have 4 consumers, Kafka will assign each consumer to one partition. So in this case as soon as 4 messages are pushed, they are immediately consumed by the consumers.
If you had 2 consumers for the topic instead of 4, then each consumer would handle 2 partitions and the consuming throughput would be almost halved. A sketch of such a consumer-group member follows below.
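For illustration, a sketch of one member of that 4-consumer group in Java (run four instances with the same group.id; broker address and group name are placeholders); the group coordinator spreads the partitions of "users" across the running instances:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class UserEventsConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("group.id", "users-consumers"); // same group => partitions are split
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("users"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> r : records)
                        System.out.printf("partition=%d user=%s%n", r.partition(), r.key());
                }
            }
        }
    }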
To completely answer your question,
Kafka only provides a total order over messages within a partition, not between different partitions in a topic.
i.e., if consumption is very slow in partition 2 and very fast in partition 4, then a message with user_id 4 will be consumed before a message with user_id 2. This is how Kafka is designed.
I decided to move my comment to a separate answer as I think it makes sense to do so.
While John is 100% right about what he wrote, you may consider rethinking your problem. Do you really need ALL messages to stay in order? Or do you need all messages for specific user_id (or whatever) to stay in order?
If the first, then there's not much you can do: you should use one partition and lose all parallelism.
But in the second case, you can partition your messages by some key, so that all messages for that key arrive at one partition (they might actually go to a different partition if you resize the topic, but that's a different case), which guarantees that all messages for that key are in order.
In Kafka, messages with the same key, from the same producer, are delivered to the consumer in order.
On top of that, data within a partition is stored in the order in which it is written; therefore, data read from a partition is read in order for that partition.
So if you want your messages in order across multiple partitions, you really need to group your messages by a key, so that messages with the same key go to the same partition and, within that partition, the messages are ordered.
In a nutshell, you need to design a two-level solution like the above to get messages ordered across multiple partitions; see the keyed-send sketch below.
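A sketch of the keyed send in Java (user IDs as keys are just an assumed example): the default partitioner hashes the key, so all events for the same user land in the same partition and keep their relative order.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class KeyedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Same key => same partition, so user-42's events stay in order
                // relative to each other, while other users ride other partitions.
                producer.send(new ProducerRecord<>("users", "user-42", "login"));
                producer.send(new ProducerRecord<>("users", "user-42", "purchase"));
                producer.send(new ProducerRecord<>("users", "user-7", "login"));
            }
        }
    }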
You may consider having a field that carries the timestamp/date from the time the data was created at the source.
Once the data is consumed, you can load it into a database. The data then needs to be sorted at the database level before the dataset is used for any use case. Well, this is an attempt to help you think in multiple ways.
Let's say the message key is a timestamp generated at the time the data was created, and the value is the actual message string.
As each message is picked up by the consumer, it is written into HBase with the row key set to the Kafka key and the value set to the Kafka value.
Since HBase is a sorted map, having the timestamp as the key automatically keeps the data in order. You can then serve the data from HBase to the downstream apps; a minimal sketch follows below.
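A hedged sketch of that consumer-side write in Java (table name, column family, and qualifier are placeholders; fixed-width epoch-millis strings are assumed so that lexicographic row-key order matches time order):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseOrderedSink {
        // Row key = Kafka message key (a creation timestamp); HBase keeps rows
        // sorted by row key, so scans return the data in time order.
        static void write(Connection conn, String kafkaKey, String kafkaValue)
                throws java.io.IOException {
            try (Table table = conn.getTable(TableName.valueOf("events"))) {
                Put put = new Put(Bytes.toBytes(kafkaKey));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("msg"),
                        Bytes.toBytes(kafkaValue));
                table.put(put);
            }
        }

        public static void main(String[] args) throws java.io.IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf)) {
                write(conn, "1700000000000", "hello");  // hypothetical record
            }
        }
    }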
This way you are not losing the parallelism of Kafka, and you also gain the ability to sort and apply multiple processing steps to the data at the database level.
Note: no distributed message broker guarantees overall ordering. If you insist on that, you may need to rethink and use another message broker, or use a single partition in Kafka, which is not a good idea. Kafka is all about parallelism through more partitions and more consumer instances.
Traditional MQs work in such a way that once a message has been processed, it gets removed from the queue. A message queue allows a bunch of subscribers to pull a message, or a batch of messages, from the end of the queue, and usually allows some level of transaction when pulling a message off, to ensure that the desired action was executed before the message is removed.
With Kafka on the other hand, you publish messages/events to topics, and they get persisted. They don’t get removed when consumers receive them. This allows you to replay messages, but more importantly, it allows a multitude of consumers to process logic based on the same messages/events.
You can still scale out to get parallel processing in the same domain, but more importantly, you can add different types of consumers that execute different logic based on the same event. In other words, with Kafka, you can adopt a reactive pub/sub architecture.
ref: https://hackernoon.com/a-super-quick-comparison-between-kafka-and-message-queues-e69742d855a8
Well, this is an old thread, but still relevant, so I decided to share my view.
I think this question is a bit confusing.
If you need strict ordering of messages, then the same strict ordering should be maintained while consuming the messages; there is no point in ordering messages in the queue but not while consuming them. Kafka allows the best of both worlds: it orders messages within a partition from production through consumption while allowing parallelism across multiple partitions. Hence, if you need:
Absolute ordering of all events published on a topic: use a single partition. You will not have parallelism, nor do you need it (again, parallelism and strict ordering don't go together).
Relative ordering only: go for multiple partitions and consumers, and use consistent hashing to ensure that all messages which need to preserve relative order go to a single partition.