Will Kafka allow "unbalanced" partitions? - apache-kafka

One question raised during system design: if the message key is chosen such that one value occurs very often in the stream of data, does that mean a single topic partition will receive all of those messages, even if that leaves the partitions unevenly filled?
Does Kafka have a mechanism to "split" messages with the same key among several partitions, sacrificing order in this case?
Or are there no exceptions to the key -> partition allocation, regardless of how that affects the size of the partitions?

To answer the question in the title: yes, Kafka will allow unbalanced partitions.
By default the producer uses the murmur2 algorithm to hash each key and decide where to send it, so records with the same key end up in the same partition. You can define your own partitioner class to decide where the messages are sent. If your use case does not require ordering between events, you might not need to send a key at all, and then the messages are distributed across the partitions. In recent versions the producer also "batches" such keyless messages to the same partition for even better throughput.
To be clear, Kafka does not require you to send a key with a message.
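To make the skew concrete, here is a minimal sketch of hash-based key routing with one very frequent key. It is only an illustration: `zlib.crc32` is a stand-in for the murmur2 hash mentioned above, and the key names and record counts are made up.

```python
import zlib
from collections import Counter

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key) % num_partitions

def partition_counts(keys, num_partitions):
    # Start every partition at zero so empty partitions are visible too.
    counts = Counter({p: 0 for p in range(num_partitions)})
    for key in keys:
        counts[partition_for(key, num_partitions)] += 1
    return counts

# 900 records share one "hot" key; 100 records have distinct keys.
keys = [b"hot-user"] * 900 + [f"user-{i}".encode() for i in range(100)]
counts = partition_counts(keys, num_partitions=4)
hot = partition_for(b"hot-user", 4)

print("records per partition:", dict(counts))
print("partition holding the hot key:", hot)
```

Whatever the hash, all 900 records of the hot key land in one partition; nothing in key-based routing spreads a single key over several partitions.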


How to change partitioner logic in a live system

In a Kafka deployment, custom topic partitioner logic is used to route all messages that belong to the same root entity (for example, all messages for a particular user) to the same partition.
Can anyone recommend a strategy for dealing with a partitioning logic change in such a live system?
One example that affects the partitioning is the obvious one: a change of the partitioner implementation. Another example would be a change in the number of partitions for a given topic.
In both cases, we would end up in a situation where some of the messages for user A, which entered Kafka before the change, are in partition 1, while after the change in partitioning logic or number of partitions, messages for that same user A go to partition 2.
This can lead to a problem where messages for user A are processed out of order. The consumer reading messages from partition 2 could process them before the consumer reading from partition 1 has processed the older ones.
Has anyone faced this issue in a live system? How did you, or would you, solve it?
This seems like a very common scenario, but I was not able to find anything about it.
By partitioning logic, if you mean the partitioning algorithm, I do not understand how that would just change like that. As for increasing partitions, it is in theory not possible to increase the number of partitions while guaranteeing the order of messages. There is a KIP for that, but its status is still "under discussion".
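A quick way to see why adding partitions breaks per-key ordering is to count how many keys map to a different partition once the count changes. This is a sketch under assumptions: `zlib.crc32` stands in for the real key hash, and the key names and partition counts (4 and 6) are made up.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; the real partitioner may hash differently.
    return zlib.crc32(key.encode()) % num_partitions

def moved_keys(keys, old_count, new_count):
    # Keys whose target partition differs between the two partition counts.
    return [k for k in keys
            if partition_for(k, old_count) != partition_for(k, new_count)]

keys = [f"user-{i}" for i in range(1000)]
moved = moved_keys(keys, old_count=4, new_count=6)
print(f"{len(moved)} of {len(keys)} keys change partition when going from 4 to 6")
```

Every key in `moved` has older records in one partition and newer records in another, which is exactly the situation in which two consumers can process them out of order.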
What I do usually when I increase partitions is to accept a small downtime.
The playbook is like this:
1) Stop the producers
2) Monitor the lag for the consumer group
3) Once lag is zero, shut down the consumers
4) Increase the number of partitions
5) Start the consumers
6) Start the producers
This way, you can be sure that there are no message losses and no out of order message consumption.
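The "lag is zero" gate in the playbook can be sketched as follows. The function names and the offset dictionaries are hypothetical; in a real deployment the numbers would come from something like the output of `kafka-consumer-groups.sh --describe`, which reports the committed offset, log end offset and lag per partition.

```python
def total_lag(end_offsets: dict, committed_offsets: dict) -> int:
    # Lag per partition = log end offset minus the group's committed offset.
    lag = 0
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag += max(end - committed, 0)
    return lag

def safe_to_repartition(producers_stopped: bool,
                        end_offsets: dict,
                        committed_offsets: dict) -> bool:
    # Only proceed once nothing new is being written and everything is consumed.
    return producers_stopped and total_lag(end_offsets, committed_offsets) == 0

# Hypothetical snapshot of a 3-partition topic.
end_offsets = {0: 120, 1: 98, 2: 143}
still_behind = {0: 120, 1: 90, 2: 143}
caught_up = {0: 120, 1: 98, 2: 143}

print(total_lag(end_offsets, still_behind))                 # lag remaining on partition 1
print(safe_to_repartition(True, end_offsets, still_behind))
print(safe_to_repartition(True, end_offsets, caught_up))
```

The check only means something after the producers are stopped; otherwise the end offsets keep moving and a lag of zero can be stale a moment later.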
If you want to avoid downtime, you may have to rely on an external system that can temporarily hold the data per partition, in order, and then publish it, but that solution depends on a few things.
The best way to change how records are partitioned is to use the default Apache Kafka® partitioner, and change the record keys. If all records from a user need to go to the same partition, then make sure they all have the same key.
If you'd like to change the keys for a whole set, you can use KSQL to re-key the data (republish it to a new topic with new keys) using the PARTITION BY clause.

Kafka message partitioning by key

We have a business process/workflow that starts when an initial event message is received and closes when the last message is processed. We have up to 100,000 processes executed each day. My problem is that the messages belonging to a specific process have to be processed in the same order they were received. If one of the messages fails, that process has to freeze until the problem is fixed, while all other processes have to continue. For this kind of situation I am thinking of using Kafka. The first solution that came to my mind was to use topic partitioning by message key. The key of the message would be the ProcessId. This way I could be sure that all messages of a process would go to the same partition and Kafka would guarantee the order. As I am new to Kafka, what I managed to figure out is that partitions have to be created in advance, and that makes everything too difficult. So my questions are:
1) When I produce a message to a Kafka topic that does not exist, the topic is created at runtime. Is it possible to have the same behavior for topic partitions?
2) There can be more than 100,000 active partitions on the topic; is that a problem?
3) Can a partition be deleted after all messages from it have been read?
4) Maybe you can suggest other approaches to my problem?
When I produce a message to a Kafka topic that does not exist, the topic is created at runtime. Is it possible to have the same behavior for topic partitions?
You need to specify the number of partitions when creating the topic. New partitions won't be created automatically (as is the case with topic creation); you have to change the number of partitions using the topic tool.
More Info: https://kafka.apache.org/documentation/#basic_ops_modify_topi
After you increase the number of partitions, producers and consumers learn about the new partitions on their next metadata refresh, and the consumer group rebalances. Once that has happened, producers and consumers start producing to and consuming from the new partitions.
there can be more than 100,000 active partitions on the topic, is that a problem?
Yes, having this many partitions will increase overall latency.
Go through how-choose-number-topics-partitions-kafka-cluster for how to decide on the number of partitions.
can partition be deleted after all messages from that topic were read?
Deleting a partition would lead to data loss, and the keys of the remaining data would no longer be distributed correctly, so new messages would not be directed to the same partitions as existing messages with the same key. That's why Kafka does not support decreasing the partition count of a topic.
Also, the Kafka docs state that:
Kafka does not currently support reducing the number of partitions for a topic.
I suppose you chose the wrong feature to solve your task.
In general, partitioning is used for load balancing.
Incoming messages are distributed across the given number of partitions according to the partitioning strategy, which is configured on the producer. In short, the default strategy just calculates i = key_hash mod number_of_partitions and puts the message into the i-th partition. More about strategies you could read here
Message ordering is guaranteed only within a partition. With two messages from different partitions, you have no guarantee which one reaches the consumer first.
You would probably use groups instead. The group is a consumer-side option.
Each group consumes all messages from the topic independently.
A group can consist of one consumer, or more if you need them.
You can have many groups and add a new group (in fact, add a new consumer with a new groupId) dynamically.
Since you can stop/pause any consumer, you can manually stop all consumers belonging to a given group. I suppose there is no single command to do that, but I'm not sure. Anyway, if you have a single consumer in each group, you can stop it easily.
If you want to remove a group, you just shut down and drop the related consumers. No action on the broker side is needed.
As a drawback, you'll get 100,000 consumers reading a (single) topic. That is a heavy network load, at the very least.

kafka topics and partitions decisions

I need to understand something about kafka:
When I have a single Kafka broker on a single host, does it make sense to have more than one partition per topic? I mean, even if my data can be distinguished by some key (say tenant id), what is the benefit of doing that on a single Kafka broker? Does it give any parallelism, and if so, how?
When a key is used, does this mean that each key is mapped to a given partition? Must the number of partitions for a topic be equal to the number of possible values of the key I specified? Or is this just a hash, so the number of partitions doesn't have to be equal?
From what I read, topics are created according to the types of messages placed in Kafka. But in my case, I have created 2 topics because I have 2 types of consumption: one for reading messages one by one, and a second for when a bulk of messages comes into the queue (for application reasons) and is then put into the second topic. Is that a good design even though the message type is the same? Is there any other practice for such a scenario?
Yes, it definitely makes sense to have more than one partition for a topic even when you have a single Kafka broker. A scenario when you can benefit from this is pretty simple:
you need to guarantee in-order processing by tenant id
processing logic for each message is rather complex and takes some time. Especially the case when the Kafka message itself is simple, but the logic behind processing this message takes time (simple example - message is an URL, and the processing logic is downloading the file from there and doing some processing)
Given these 2 conditions you may get into a situation where one consumer is not able to keep up processing all the messages if all the data goes to a single partition. Remember, you can process one partition with exactly one consumer (well, you can use 2 consumers if using different consumer groups, but that's not your case), so you'll start getting behind over time. But if you have more than one partition you'll either be able to use one consumer and process data in parallel (this could help to speed things up in some cases) or just add more consumers.
By default, Kafka uses hash-based partitioning. This is configurable by providing a custom Partitioner; for example, you can use random partitioning if you don't care which partition your message ends up in.
It's totally up to you what purposes you use topics for.
UPD, answers to questions in the comment:
Adding more consumers is usually done for adding more computing power, not for achieving desired parallelism. To add parallelism add partitions. Most consumer implementations process different partitions on different threads, so if you have enough computing power, you might just have a single consumer processing multiple partitions in parallel. Then, if you start bumping into situations where one consumer is not enough, you just add more consumers.
When you create a topic you just specify the number of partitions (and the replication factor for this topic, but that's a different thing). The key and the partition to send to are completely up to the producer. In fact, you could configure your producer to use a random partitioner and it won't even care about keys, it will just pick the partition randomly. There's no fixed relation between key -> partition; it's just convenient to have things set up like this.
Can you elaborate on this one? Not sure I understand this, but I guess your question is whether you can send just a value and Kafka will infer a key somehow itself. If so, then the answer is no - Kafka does not apply any transformation to messages and stores them as is, so if you want your message to contain a key, the producer must explicitly send the key.

Is there any way to maintain message ordering between partitions of a kafka topic with a single consumer?

We are developing a kafka based streaming system in which the producer would produce to multiple partitions within its topic and a single consumer would consume from the topic. I know that kafka maintains message order within partitions, but can we maintain a global message order between partitions within a topic?
Short answer:
no, Kafka does not provide any ordering guarantees between partitions.
Long answer:
I don't quite understand your problem. If you are saying you have only one consumer consuming your topic, why would you have more than 1 partition in that topic and reinvent the wheel trying to maintain order between partitions? If you want to leave some space for future growth, e.g. adding another consumer to consume a part of partitions, then you'll have to rethink your "global message order" idea.
Do you really need ALL messages to be processed in order? Or maybe you could partition by client/application/whatever and maintain order per partition? In most cases you don't really need that global message order, but just have to partition your data properly.
Maintaining order between multiple consumers is a really tough problem to solve, and even if you solve it correctly you'll just negate all of Kafka's benefits.
You can't benefit from Kafka if you want global ordering across more than one partition. Kafka only supports message ordering within a single partition. In our company, we only need messages of the same category to be sent to the same partition, which is easy to do by partitioning on partitionId.
The purpose of partitions in Kafka is to create a partial order of messages in a broader topic, where the messages follow a strict total order in any given partition. So the answer is 'no', it would defeat the purpose of partitions if any notion of cross-partition order were to be introduced.
I would suggest instead focusing on how messages (records, in Kafka parlance) are keyed, which effectively determines how they are mapped to a partition. Which partition specifically doesn't matter, as long as the mapping is deterministic and repeatable — all you should care about is that identically keyed records will always appear on the same partition and, hence, will not be assigned to multiple consumers at the same time (within the same consumer group).
If you are publishing updates to persisted entities, the primary key of the entity is typically a good starting point for a Kafka record key. If there needs to be some order of updates across a connected graph of entities, then taking the ID of the root of the graph and making it the key will likely satisfy your ordering needs.
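As a rough illustration of keying by the root of an entity graph, the sketch below routes updates for child entities by the ID of their root. Everything here is hypothetical: the "order"/"line" names, the sequence numbers, the partition count, and `zlib.crc32` as a stand-in for the real key hash.

```python
import zlib
from collections import defaultdict

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration only.
    return zlib.crc32(key.encode()) % num_partitions

# (root_id, entity_id, sequence) in the order the updates were produced.
events = [
    ("order-1", "line-a", 1),
    ("order-2", "line-x", 1),
    ("order-1", "line-b", 2),
    ("order-1", "line-a", 3),
    ("order-2", "line-y", 2),
]

def route(events, num_partitions):
    # The record key is the root ID, so a whole graph shares one partition.
    partitions = defaultdict(list)
    for root_id, entity_id, seq in events:
        partitions[partition_for(root_id, num_partitions)].append(
            (root_id, entity_id, seq))
    return partitions

partitions = route(events, num_partitions=3)
for p, records in sorted(partitions.items()):
    print(p, records)
```

Updates to `line-a` and `line-b` stay in their produced order relative to each other because both are keyed by `order-1`; keyed by their own entity IDs, they could end up in different partitions.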

Apache Kafka order of messages with multiple partitions

As per the Apache Kafka documentation, message order is guaranteed only within a partition (or in a topic with a single partition). In that case, what parallelism benefit are we getting? Isn't it equivalent to traditional MQs?
In Kafka the parallelism is equal to the number of partitions for a topic.
For example, assume that your messages are partitioned based on user_id and consider 4 messages having user_ids 1,2,3 and 4. Assume that you have an "users" topic with 4 partitions.
Since partitioning is based on user_id, assume that the message with user_id 1 goes to partition 1, the message with user_id 2 goes to partition 2, and so on.
Also assume that you have 4 consumers for the topic. Since you have 4 consumers, Kafka will assign one partition to each consumer. So in this case, as soon as the 4 messages are pushed, they are immediately consumed by the consumers.
If you had 2 consumers for the topic instead of 4, then each consumer would handle 2 partitions and the consuming throughput would be almost half.
To completely answer your question,
Kafka only provides a total order over messages within a partition, not between different partitions in a topic.
i.e., if consumption is very slow in partition 2 and very fast in partition 4, then the message with user_id 4 will be consumed before the message with user_id 2. This is how Kafka is designed.
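The 4-partition example above can be sketched with a simplified assignment function. This is a plain round-robin written for illustration, not one of Kafka's actual assignor implementations, and the consumer names are made up.

```python
def assign(partitions, consumers):
    # Deal partitions out to the consumers of one group, one at a time.
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [1, 2, 3, 4]

four = assign(partitions, ["c1", "c2", "c3", "c4"])
two = assign(partitions, ["c1", "c2"])
five = assign(partitions, ["c1", "c2", "c3", "c4", "c5"])

print(four)  # one partition per consumer
print(two)   # two partitions per consumer
print(five)  # the fifth consumer has nothing to read
```

The last case shows why the partition count caps the parallelism within a group: a fifth consumer on a 4-partition topic stays idle.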
I decided to move my comment to a separate answer as I think it makes sense to do so.
While John is 100% right about what he wrote, you may consider rethinking your problem. Do you really need ALL messages to stay in order? Or do you need all messages for specific user_id (or whatever) to stay in order?
If the first, then there's not much you can do: you should use 1 partition and lose all the parallelism.
But if the second case, you might consider partitioning your messages by some key and thus all messages for that key will arrive to one partition (they actually might go to another partition if you resize topic, but that's a different case) and thus will guarantee that all messages for that key are in order.
In Kafka, messages with the same key, from the same producer, are delivered to the consumer in order.
On top of that, data within a partition is stored in the order in which it is written; therefore, data read from a partition is read in order for that partition.
So if you want ordered messages while using multiple partitions, you really need to group your messages by a key, so that messages with the same key go to the same partition, and within that partition the messages are ordered.
In a nutshell, you will need to design a two-level solution like the one above to get the messages ordered across multiple partitions.
You may consider having a field that holds the timestamp/date at which the data was created at the source.
Once the data is consumed, you can load it into a database. The data needs to be sorted at the database level before the dataset is used for any use case. Well, this is an attempt to help you think in multiple ways.
Let's say the message key is the timestamp generated when the data was created, and the value is the actual message string.
As and when a message is picked up by the consumer, it is written into HBase with the Kafka key as the RowKey and the Kafka value as the value.
Since HBase is a sorted map, having the timestamp as the key automatically sorts the data in order. Then you can serve the data from HBase to the downstream apps.
This way you are not losing the parallelism of Kafka. You also have the option of sorting and running multiple processing steps on the data at the database level.
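A rough sketch of the sort-at-the-sink idea, with a plain dict standing in for the HBase table. One detail is an assumption worth stating: HBase orders row keys lexicographically by bytes, so the sketch encodes timestamps at a fixed width; the 13-digit millisecond format and the sample records are made up.

```python
def row_key(ts_millis: int) -> str:
    # Fixed-width, zero-padded so that string order equals numeric order.
    return f"{ts_millis:013d}"

# Records as they might arrive from consumers of different partitions.
consumed = [
    (1700000000300, "c"),
    (1700000000100, "a"),
    (1700000000200, "b"),
]

# A dict stands in for the sorted store; sorting the keys mimics a scan.
store = {row_key(ts): value for ts, value in consumed}
ordered = [store[k] for k in sorted(store)]
print(ordered)
```

Two records with the same timestamp would overwrite each other under this key scheme, so a real row key would likely need a tie-breaker appended to the timestamp.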
Note: No distributed message broker guarantees overall ordering. If you insist on it, you may need to consider another message broker, or use a single partition in Kafka, which is not a good idea. Kafka is all about parallelism, by increasing partitions or increasing consumer groups.
Traditional MQ works in a way such that once a message has been processed, it gets removed from the queue. A message queue allows a bunch of subscribers to pull a message, or a batch of messages, from the end of the queue. Queues usually allow for some level of transaction when pulling a message off, to ensure that the desired action was executed, before the message gets removed, but once a message has been processed, it gets removed from the queue.
With Kafka on the other hand, you publish messages/events to topics, and they get persisted. They don’t get removed when consumers receive them. This allows you to replay messages, but more importantly, it allows a multitude of consumers to process logic based on the same messages/events.
You can still scale out to get parallel processing in the same domain, but more importantly, you can add different types of consumers that execute different logic based on the same event. In other words, with Kafka, you can adopt a reactive pub/sub architecture.
ref: https://hackernoon.com/a-super-quick-comparison-between-kafka-and-message-queues-e69742d855a8
Well, this is an old thread, but still relevant, hence decided to share my view.
I think this question is a bit confusing.
If you need strict ordering of messages, then the same strict ordering has to be maintained while consuming them. There is absolutely no point in ordering messages in the queue but not while consuming them. Kafka allows the best of both worlds. It allows ordering of messages within a partition, right from generation to consumption, while allowing parallelism across multiple partitions. Hence, if you need:
1) Absolute ordering of all events published on a topic, use a single partition. You will not have parallelism, nor do you need it (again, parallelism and strict ordering don't go together).
2) Parallelism, go for multiple partitions and consumers, and use consistent hashing to ensure that all messages which need to follow a relative order go to a single partition.