Kafka message partitioning by key - apache-kafka

We have a business process (workflow) that starts when an initial event message is received and closes when the last message is processed. Up to 100,000 processes are executed each day. My problem is that the messages for a specific process have to be processed in the same order in which they were received. If one of the messages fails, that process has to freeze until the problem is fixed, while all other processes have to continue. For this kind of situation I am thinking of using Kafka. The first solution that came to my mind was to partition a topic by message key, with the ProcessId as the key. This way I could be sure that all messages of a process would land in the same partition, and Kafka would guarantee their order. As I am new to Kafka, what I have managed to figure out is that partitions have to be created in advance, and that makes everything too difficult. So my questions are:
1) When I produce a message to a Kafka topic that does not exist, the topic is created at runtime. Is it possible to have the same behavior for topic partitions?
2) There can be more than 100,000 active partitions on the topic. Is that a problem?
3) Can a partition be deleted after all messages from that topic were read?
4) Maybe you can suggest other approaches to my problem?

When I produce a message to a Kafka topic that does not exist, the topic is created at runtime. Is it possible to have the same behavior for topic partitions?
You need to specify the number of partitions when creating the topic. New partitions won't be created automatically (as is the case with topic creation); you have to change the number of partitions using the topic tool.
More Info: https://kafka.apache.org/documentation/#basic_ops_modify_topi
As soon as you increase the number of partitions, producers and consumers learn about the new partitions when they refresh their metadata, and consumer groups rebalance. Once rebalanced, producers and consumers start producing to and consuming from the new partitions.
There can be more than 100,000 active partitions on the topic. Is that a problem?
Yes, having this many partitions will increase overall latency.
Go through how-choose-number-topics-partitions-kafka-cluster to see how to decide on the number of partitions.
Can a partition be deleted after all messages from that topic were read?
Deleting a partition would lead to data loss. The keys of the remaining data would also no longer be distributed correctly, so new messages would not be directed to the same partitions as existing messages with the same key. That's why Kafka does not support decreasing the partition count of a topic.
Also, the Kafka documentation states:
Kafka does not currently support reducing the number of partitions for a topic.

I suppose you chose the wrong feature to solve your task.
In general, partitioning is used for load balancing.
Incoming messages are distributed over the given number of partitions according to the partitioning strategy, which is configured on the producer side (the partitioner). In short, the default strategy for keyed messages just calculates i = key_hash mod number_of_partitions and puts the message into the i-th partition. More about strategies you could read here
Message ordering is guaranteed only within a partition. For two messages from different partitions, you have no guarantee which one reaches the consumer first.
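To make the "key_hash mod number_of_partitions" rule concrete, here is a minimal sketch in plain Python. It does not talk to Kafka at all. The function name partition_for and the sample keys are hypothetical, and zlib.crc32 is only a deterministic stand-in for whatever hash the real client uses.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index: hash(key) mod number_of_partitions."""
    # Assumption: crc32 stands in for the client's real hash function.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

NUM_PARTITIONS = 4
messages = [
    ("process-1", "start"),
    ("process-2", "start"),
    ("process-1", "step"),
    ("process-2", "end"),
    ("process-1", "end"),
]

# Simulated partitions: each one is an append-only list.
partitions = {i: [] for i in range(NUM_PARTITIONS)}
for key, value in messages:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

for index, records in partitions.items():
    print(index, records)
```

All messages of one ProcessId end up in a single partition, so the number of partitions can stay far below the number of processes.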
You would probably use consumer groups instead. A group is a consumer-side option.
Each group consumes all messages from the topic independently.
A group can consist of one consumer, or more if you need it.
You can have many groups and add a new group dynamically (in fact, you add a new consumer with a new groupId).
As you can stop or pause any consumer, you can manually stop all consumers belonging to a specific group. I suppose there is no single command to do that, but I'm not sure. Anyway, if you have a single consumer in each group, you can stop it easily.
If you want to remove a group, you just shut down and drop its consumers. No action on the broker side is needed.
As a drawback, you'll get 100,000 consumers reading a (single) topic. That is a heavy network load at least.

Related

If I use Kafka as simple message. Does it really worth

=== Assume everything from consumer point of view ===
I was reading a couple of Kafka articles and saw that the number of partitions is coupled to the number of microservice instances. For example, say I have 1 topic with 1 partition for my ServiceA. The producer pushes messages to topic T1, partition P1, and on the consumer side (ServiceA1) I can read from T1, P1. If I spin up a new pod (ServiceA2) to get higher throughput, the second instance will never receive any message, because Kafka/ZooKeeper assigns an id to each consumer and partition P1 is already taken by ServiceA1. So ServiceA2 and any further instances stay idle. To avoid this, Kafka recommends adding more partitions, so that the number of consumers can be increased or decreased based on need.
I was also able to test this through the command line, and service2 never consumed any message. If I shut down service1, then service2 was able to pick up new messages. So if I spin up more pods, fail-safety/availability increases, but throughput always stays the same.
Is my assumption correct? Am I missing anything? Now I feel like any standard messaging system will have the same problem. How do you scale message-oriented systems themselves?
Every topic has at least one partition; by default it comes with only one partition if you don't define the partition count. In your case, you have a consumer group that consists of two consumers. Every consumer reads the log from a partition. The first consumer reads from the first (and only) partition, and for the second consumer there is no partition left to read from, so it becomes idle. Only once the first consumer goes down does the second consumer start reading from that partition, beginning at the last committed offset.
Please check the blogs and videos below. They explain topics, consumers, and consumer groups in Kafka.
https://www.javatpoint.com/apache-kafka-consumer-and-consumer-groups
http://cloudurable.com/blog/kafka-architecture-consumers/index.html
https://docs.confluent.io/platform/current/clients/consumer.html
https://www.youtube.com/watch?v=lAdG16KaHLs
I hope this will give you an idea about consumers and consumer groups.
A broad solution to this is to decouple consumption of a message (i.e. receiving a message from Kafka and perhaps deserializing it and validating that it conforms to the schema) from processing it (interpreting the message). If the consumption step is simple enough, being limited to no more consuming instances than there are partitions need not be a bottleneck.
One way to accomplish this is to have a Kafka consumption service which sends an HTTP request (perhaps through a load balancer or whatever) to a processing service which has arbitrarily many members.
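As a rough illustration of that split, the sketch below keeps a single cheap consumption step and fans the expensive processing step out to a pool of workers. Note the swap: an in-process thread pool stands in for the HTTP-backed processing service, and the input is a plain list of bytes rather than a Kafka consumer. The names consume and process are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def consume(raw_messages):
    """Consumption step: decode and validate, nothing more."""
    for raw in raw_messages:
        text = raw.decode("utf-8")
        if text:  # drop messages that fail the (trivial) validation
            yield text

def process(message):
    """Processing step: placeholder for the expensive work."""
    return message.upper()

# Assumption: these bytes stand in for records read from one partition.
raw = [b"a", b"b", b"", b"c"]

# One consumer feeds many workers; the worker count is independent of
# the number of partitions.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process, consume(raw)))

print(results)
```

The workers run concurrently, so side effects inside process are not guaranteed to happen in message order, even though map returns the results in input order.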
Note that depending on what you're using Kafka for, there may be a requirement that certain messages always be in the same partition as one another in order to ensure that they get handled in a deterministic order (since ordering across partitions is not guaranteed). A typical example of this would be if the messages are change events for a particular record. If you're accomplishing this via some hash of the message key (or a portion of the key if using a custom partitioner), then simply changing the number of partitions might not be viable (you would need to introduce some sort of migration or have the producers know which records have to be routed to the old partitions and only route to the new partitions if the record has never been seen before).
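To see why changing the partition count is a problem for key-hashed routing, this small sketch counts how many keys map to a different partition when a topic grows from 4 to 5 partitions. It is plain Python with no Kafka involved. The helper names are hypothetical, and crc32 is a stand-in for the real partitioner's hash.

```python
import zlib

def partition_for(key, num_partitions):
    # Assumption: crc32 stands in for the real partitioner's hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def moved_keys(keys, old_count, new_count):
    """Return the keys whose partition changes when the count changes."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]

keys = [f"record-{i}" for i in range(1000)]
moved = moved_keys(keys, 4, 5)
print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

Every moved key has its older events in one partition and its newer events in another, which is exactly where the per-key ordering guarantee is lost.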
We just started replacing our messaging system with Kafka.
In a traditional MQ setup there is a cluster with one or more MQs inside it.
The MQ cluster/coordinator service delivers the messages to clients.
There can be 10 services/clients consuming messages from a single MQ.
So if there are 10 messages in the MQ, each service/consumer/client can read and process 1 message.
As I now understand, this is not possible in Kafka by design.
To achieve similar functionality in Kafka, I have to add at least as many partitions as there are clients/consumers/pods.

Kafka repartitioning

From my understanding, partitions and consumers are tied in a 1:1 relationship in which a single consumer processes a partition. However, is there a way to repartition in the middle of processing?
We are currently trying to optimize a process in which the topic gets consumed across a group, but there are cases in which the data processing takes longer on a certain consumer while others are already idle. It's like data cleansing, where a certain partition might no longer need cleansing while others require fuzzy matching, thereby adding complexity to the task a consumer performs.
Your understanding with regards to partitions and consumers is not quite right.
If you have N partitions, then you can have up to N active consumers within the same consumer group, each of which reads from a single partition. When you have fewer consumers than partitions, some of the consumers will read from more than one partition. Also, if you have more consumers than partitions, some of the consumers will be inactive and will receive no messages at all.
If you have one consumer per partition, some of the partitions might receive more messages than others, and this is why some of your consumers might be idle while others are still processing messages. Note that messages are not always inserted into topic partitions in a round-robin fashion, as messages with the same key are placed into the same partition.
In Kafka, topics are partitioned, and even though you can add partitions to a topic, there is no repartitioning: all the data already written to a partition stays there, and new data is distributed among the partitions that now exist (in a round-robin fashion if you do not define keys; otherwise one key will always land in the same partition as long as you do not add partitions).
But if you have a consumer group and you add consumers to or remove consumers from this group, there is a group rebalance in which each consumer receives its share of partitions to consume from exclusively.
So if you have 3 partitions (with messages evenly distributed among them) and 2 consumers (in the same group), one consumer will have twice as many messages to handle as the other; with 3 consumers each one will consume one partition; with 4 consumers one will stay idle...
So as you already have evenly distributed messages (which is good), you should have as many consumers as you have partitions, and if it is still not fast enough you may add n partitions and n consumers. (For sure you could also try to optimize the consumer, but that is another story...)
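The 3-partition arithmetic above can be sketched in a few lines of Python. This is a simplified round-robin style assignment written for illustration only. It is not the code of any of Kafka's real assignors, and the consumer names are made up.

```python
def assign(partitions, consumers):
    """Give every partition exactly one owner within the group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment

partitions = [0, 1, 2]
two = assign(partitions, ["c1", "c2"])
three = assign(partitions, ["c1", "c2", "c3"])
four = assign(partitions, ["c1", "c2", "c3", "c4"])

for label, result in (("2 consumers", two), ("3 consumers", three), ("4 consumers", four)):
    print(label, result)
```

With 2 consumers one of them owns two partitions, with 3 each owns one, and with 4 the last consumer gets an empty list and stays idle.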
Added to answer comment:
Once a consumer -- from a given group -- is consuming a partition, it will continue to do so and will be the only one from the group consuming this partition, even if a lot of other consumers from the same group are idle. In one group a partition is never shared between consumers. (If the consumer crashes, another one will continue the work, and if a new consumer enters the group a rebalance will occur, but anyway only one consumer will work on one partition at a given time).
So one approach, as said in your comment, would be to distribute the load evenly over the partitions. Another approach would be to have a topic dedicated to expensive jobs, with a lot of partitions and a lot of consumers, and let the topic for non-expensive jobs have fewer consumers.
A last approach, which I would not recommend, would be to not use the consumer group features and to manage how you consume from Kafka yourself, using the assign and seek methods of the consumer. (See the KafkaConsumer JavaDoc for more information.) Spark Structured Streaming, for example, uses that approach, but it is much more complex...

How can Apache Kafka send messages to multiple consumer groups?

In the Kafka documentation:
Kafka handles this differently. Our topic is divided into a set of
totally ordered partitions, each of which is consumed by one consumer
at any given time. This means that the position of consumer in each
partition is just a single integer, the offset of the next message to
consume. This makes the state about what has been consumed very small,
just one number for each partition. This state can be periodically
checkpointed. This makes the equivalent of message acknowledgements
very cheap.
Yet, following their quick start guide in that same document, I was easily able to:
Create a topic with a single partition
Start a console-producer
Push a few messages
Start a consumer to consume --from-beginning
Start another consumer --from-beginning
And have both consumers successfully consume from the same partition.
But this seems at odds with the documentation above?
When using different consumer groups, consumers can consume the same partitions easily. You may think of group ids as different applications consuming a Kafka topic. Multiple applications might want to use the data in a Kafka topic differently, without conflicting with each other. That's why two consumers may consume one partition (in fact, using different groups is the only way two consumers can consume the same partition).
And when you start a console consumer, it randomly generates a group id for itself, so these consumers are doing exactly what I just described.
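A toy model may help here: the log keeps every record, and the only per-group state is the offset of the next record to read. The class below is an in-memory stand-in written for this answer, not a Kafka API, and the names PartitionLog and poll are hypothetical.

```python
class PartitionLog:
    """In-memory stand-in for one partition with per-group offsets."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # group id -> offset of the next record to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group_id):
        """Return the records this group has not seen yet and advance its offset."""
        start = self.offsets.get(group_id, 0)
        batch = self.records[start:]
        self.offsets[group_id] = len(self.records)
        return batch

log = PartitionLog()
for record in ("m1", "m2", "m3"):
    log.append(record)

first_a = log.poll("group-a")   # group-a reads everything
first_b = log.poll("group-b")   # group-b independently reads everything too
second_a = log.poll("group-a")  # nothing new for group-a

print(first_a, first_b, second_a)
```

Two console consumers with different generated group ids behave like group-a and group-b here: each one sees the full partition.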

Apache Kafka order of messages with multiple partitions

As per the Apache Kafka documentation, message order is only guaranteed within a partition, or for a topic with a single partition. In that case, what parallelism benefit are we getting? It is equivalent to traditional MQs, isn't it?
In Kafka the parallelism is equal to the number of partitions for a topic.
For example, assume that your messages are partitioned based on user_id and consider 4 messages having user_ids 1, 2, 3 and 4. Assume that you have a "users" topic with 4 partitions.
Since partitioning is based on user_id, assume that the message with user_id 1 goes to partition 1, the message with user_id 2 goes to partition 2, and so on.
Also assume that you have 4 consumers for the topic. Since you have 4 consumers, Kafka will assign one partition to each consumer. So in this case, as soon as the 4 messages are pushed, they are immediately consumed by the consumers.
If you had 2 consumers for the topic instead of 4, then each consumer would handle 2 partitions and the consuming throughput would be almost half.
To completely answer your question,
Kafka only provides a total order over messages within a partition, not between different partitions in a topic.
i.e., if consumption is very slow in partition 2 and very fast in partition 4, then the message with user_id 4 will be consumed before the message with user_id 2. This is how Kafka is designed.
I decided to move my comment to a separate answer as I think it makes sense to do so.
While John is 100% right about what he wrote, you may consider rethinking your problem. Do you really need ALL messages to stay in order? Or do you need all messages for specific user_id (or whatever) to stay in order?
If the first, then there's not much you can do: you should use 1 partition and lose all the parallelism.
But if the second case, you might consider partitioning your messages by some key and thus all messages for that key will arrive to one partition (they actually might go to another partition if you resize topic, but that's a different case) and thus will guarantee that all messages for that key are in order.
In Kafka, messages with the same key from the same producer are delivered to the consumer in order.
On top of that, data within a partition is stored in the order in which it is written; therefore, data read from a partition is read in order for that partition.
So if you want your messages in order while using multiple partitions, you need to group them with a key, so that messages with the same key go to the same partition, and within that partition the messages are ordered.
In a nutshell, you need to design a two-level solution like the above to get messages ordered across multiple partitions.
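Here is a small simulation of that two-level idea, in plain Python with no Kafka involved. Messages are routed by key, the partitions are drained in a random interleaving (as if consumers ran at different speeds), and the per-key order is checked afterwards. All names are hypothetical and crc32 is a stand-in hash.

```python
import random
import zlib

def partition_for(key, num_partitions):
    # Assumption: crc32 stands in for the real partitioner's hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def produce(messages, num_partitions):
    """Append each (key, seq) message to the partition chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, seq in messages:
        partitions[partition_for(key, num_partitions)].append((key, seq))
    return partitions

def consume_interleaved(partitions, rng):
    """Drain partitions in random order, always taking the head of the chosen one."""
    queues = [list(p) for p in partitions]
    consumed = []
    while any(queues):
        queue = rng.choice([q for q in queues if q])
        consumed.append(queue.pop(0))
    return consumed

messages = [(f"user-{u}", seq) for seq in range(5) for u in range(4)]
consumed = consume_interleaved(produce(messages, 3), random.Random(0))

per_key = {}
for key, seq in consumed:
    per_key.setdefault(key, []).append(seq)
print(per_key)
```

The global order of consumed differs from the order of production, but each key's sequence numbers still come out ascending.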
You may consider having a field which has the Timestamp/Date at the time of creation of the dataset at the source.
Once the data is consumed, you can load it into a database. The data needs to be sorted at the database level before the dataset is used for any use case. Well, this is an attempt to help you think in multiple ways.
Let's say the message key is the timestamp generated when the data is created, and the value is the actual message string.
As and when a message is picked up by the consumer, it is written into HBase with the Kafka key as the RowKey and the Kafka value as the value.
Since HBase is a sorted map, having the timestamp as the key automatically sorts the data in order. Then you can serve the data from HBase to the downstream apps.
This way you are not losing the parallelism of Kafka. You also get to sort the data and apply multiple processing steps to it at the database level.
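A minimal sketch of the same idea, with a Python dict plus sorted() standing in for the sorted HBase table. The sample records are made up. The sketch assumes timestamps are unique and written in a format that sorts lexicographically (ISO-8601 in UTC here); two records with the same timestamp would overwrite each other.

```python
# Records as they might arrive from several partitions, out of order.
consumed = [
    ("2013-04-12T23:20:55Z", "viewed /page2.html"),
    ("2013-04-12T23:20:50Z", "viewed /page1.html"),
    ("2013-04-12T23:21:05Z", "viewed /page3.html"),
]

# Stand-in for the table: row key = message key (timestamp), value = message value.
store = {}
for timestamp, value in consumed:
    store[timestamp] = value

# Reading back in row-key order restores creation order.
ordered = [(timestamp, store[timestamp]) for timestamp in sorted(store)]
for row in ordered:
    print(row)
```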
Note: A distributed message broker does not guarantee overall ordering. If you insist on that, you may need to rethink and use another message broker, or use a single partition in Kafka, which is not a good idea. Kafka is all about parallelism, by increasing partitions or consumer groups.
Traditional MQ works in a way such that once a message has been processed, it gets removed from the queue. A message queue allows a bunch of subscribers to pull a message, or a batch of messages, from the end of the queue. Queues usually allow for some level of transaction when pulling a message off, to ensure that the desired action was executed, before the message gets removed, but once a message has been processed, it gets removed from the queue.
With Kafka on the other hand, you publish messages/events to topics, and they get persisted. They don’t get removed when consumers receive them. This allows you to replay messages, but more importantly, it allows a multitude of consumers to process logic based on the same messages/events.
You can still scale out to get parallel processing in the same domain, but more importantly, you can add different types of consumers that execute different logic based on the same event. In other words, with Kafka, you can adopt a reactive pub/sub architecture.
ref: https://hackernoon.com/a-super-quick-comparison-between-kafka-and-message-queues-e69742d855a8
Well, this is an old thread, but still relevant, hence decided to share my view.
I think this question is a bit confusing.
If you need strict ordering of messages, then the same strict ordering should be maintained while consuming the messages. There is absolutely no point in ordering messages in the queue but not while consuming them. Kafka allows the best of both worlds. It keeps messages ordered within a partition from generation to consumption, while allowing parallelism across multiple partitions. Hence, if you need:
Absolute ordering of all events published on a topic: use a single partition. You will not have parallelism, nor do you need it (again, parallelism and strict ordering don't go together).
Relative ordering only among related messages: go for multiple partitions and consumers, and use consistent hashing to ensure all messages that need to follow a relative order go to a single partition.

Data Modeling with Kafka? Topics and Partitions

One of the first things I think about when using a new service (such as a non-RDBMS data store or a message queue) is: "How should I structure my data?".
I've read and watched some introductory materials. In particular, take, for example, Kafka: a Distributed Messaging System for Log Processing, which writes:
"a Topic is the container with which messages are associated"
"the smallest unit of parallelism is the partition of a topic. This implies that all messages that ... belong to a particular partition of a topic will be consumed by a consumer in a consumer group."
Knowing this, what would be a good example that illustrates how to use topics and partitions? When should something be a topic? When should something be a partition?
As an example, let's say my (Clojure) data looks like:
{:user-id 101 :viewed "/page1.html" :at #inst "2013-04-12T23:20:50.22Z"}
{:user-id 102 :viewed "/page2.html" :at #inst "2013-04-12T23:20:55.50Z"}
Should the topic be based on user-id? viewed? at? What about the partition?
How do I decide?
When structuring your data for Kafka, it really depends on how it's meant to be consumed.
In my mind, a topic is a grouping of messages of a similar type that will be consumed by the same type of consumer, so in the example above I would just have a single topic, and if you decide to push some other kind of data through Kafka, you can add a new topic for that later.
Topics are registered in ZooKeeper which means that you might run into issues if trying to add too many of them, e.g. the case where you have a million users and have decided to create a topic per user.
Partitions, on the other hand, are a way to parallelize the consumption of messages. The total number of partitions in a broker cluster needs to be at least the same as the number of consumers in a consumer group for the partitioning feature to make sense. Consumers in a consumer group split the burden of processing the topic between themselves according to the partitioning, so that each consumer is only concerned with messages in the partitions it is assigned to.
Partitioning can either be set explicitly using a partition key on the producer side or, if no key is provided, a random partition will be selected for every message.
Once you know how to partition your event stream, the topic name will be easy, so let's answer that question first.
@Ludd is correct - the partition structure you choose will depend largely on how you want to process the event stream. Ideally you want a partition key which means that your event processing is partition-local.
For example:
If you care about users' average time-on-site, then you should partition by :user-id. That way, all the events related to a single user's site activity will be available within the same partition. This means that a stream processing engine such as Apache Samza can calculate average time-on-site for a given user just by looking at the events in a single partition. This avoids having to perform any kind of costly partition-global processing
If you care about the most popular pages on your website, you should partition by the :viewed page. Again, Samza will be able to keep a count of a given page's views just by looking at the events in a single partition
Generally, we are trying to avoid having to rely on global state (such as keeping counts in a remote database like DynamoDB or Cassandra), and instead be able to work using partition-local state. This is because local state is a fundamental primitive in stream processing.
If you need both of the above use-cases, then a common pattern with Kafka is to first partition by say :user-id, and then to re-partition by :viewed ready for the next phase of processing.
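To illustrate what partition-local means here, the following sketch partitions page-view events by a chosen field and keeps one counter per partition. It is plain Python rather than Samza or Kafka code. The helper names are hypothetical and crc32 is a stand-in hash.

```python
import zlib
from collections import Counter

def partition_for(key, num_partitions):
    # Assumption: crc32 stands in for the real partitioner's hash.
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

events = [
    {"user-id": 101, "viewed": "/page1.html"},
    {"user-id": 102, "viewed": "/page2.html"},
    {"user-id": 101, "viewed": "/page2.html"},
    {"user-id": 103, "viewed": "/page1.html"},
]

def local_counts(events, key_field, num_partitions):
    """One Counter per partition; each only sees the events routed to it."""
    counts = [Counter() for _ in range(num_partitions)]
    for event in events:
        key = event[key_field]
        counts[partition_for(key, num_partitions)][key] += 1
    return counts

by_page = local_counts(events, "viewed", 3)   # for "most popular pages"
by_user = local_counts(events, "user-id", 3)  # for per-user statistics
print(by_page)
print(by_user)
```

Because the routing key matches the aggregation key, the complete count for any page (or user) sits in exactly one partition's counter, and no cross-partition merge is needed.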
On topic names - an obvious one here would be events or user-events. To be more specific, you could go with events-by-user-id and/or events-by-viewed.
This is not exactly related to the question, but in case you already have decided upon the logical segregation of records based on topics, and want to optimize the topic/partition count in Kafka, this blog post might come handy.
Key takeaways in a nutshell:
In general, the more partitions there are in a Kafka cluster, the higher the throughput one can achieve. Let the max throughput achievable on a single partition for production be p and for consumption be c. Let’s say your target throughput is t. Then you need to have at least max(t/p, t/c) partitions.
Currently, in Kafka, each broker opens a file handle of both the index and the data file of every log segment. So, the more partitions, the higher that one needs to configure the open file handle limit in the underlying operating system. E.g. in our production system, we once saw an error saying too many files are open, while we had around 3600 topic partitions.
When a broker is shut down uncleanly (e.g., kill -9), the observed unavailability could be proportional to the number of partitions.
The end-to-end latency in Kafka is defined by the time from when a message is published by the producer to when the message is read by the consumer. As a rule of thumb, if you care about latency, it’s probably a good idea to limit the number of partitions per broker to 100 x b x r, where b is the number of brokers in a Kafka cluster and r is the replication factor.
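The two rules of thumb from the takeaways can be written down as small helper functions. The function names and the sample numbers are made up for illustration; only the formulas max(t/p, t/c) and 100 x b x r come from the takeaways above.

```python
import math

def min_partitions(target, per_partition_produce, per_partition_consume):
    """Smallest whole partition count satisfying max(t/p, t/c)."""
    needed = max(target / per_partition_produce, target / per_partition_consume)
    return math.ceil(needed)

def latency_partition_limit(brokers, replication_factor):
    """Rule-of-thumb limit of 100 x b x r from the takeaway above."""
    return 100 * brokers * replication_factor

# Hypothetical numbers, all in the same throughput unit (e.g. MB/s).
print(min_partitions(target=1000, per_partition_produce=10, per_partition_consume=20))
print(min_partitions(target=100, per_partition_produce=30, per_partition_consume=50))
print(latency_partition_limit(brokers=3, replication_factor=2))
```

The slower of the two sides (producing or consuming) determines the count, which is why the formula takes the max.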
I think a topic name summarizes a kind of message: producers publish messages to the topic, and consumers receive them by subscribing to the topic.
A topic can have many partitions. Partitions are good for parallelism. The partition is also the unit of replication, so in Kafka, leader and follower are also defined at the partition level. A partition is actually an ordered queue whose order is the order in which messages arrived. Simply put, a topic is composed of one or more such queues. This is useful when modeling our structure.
Kafka was developed by LinkedIn for log aggregation and delivery. This scenario makes a very good example.
The user's events on your website or app can be logged by your web server and then sent to a Kafka broker through the producer. In the producer, you can specify the partitioning method, for example: by event type (different events are saved in different partitions), by event time (split a day into different periods according to your app logic), by user type, or with no logic at all, just balancing all logs across the partitions.
For the case in your question, you can create one topic called "page-view-event" and create N partitions, using hashed keys to distribute the logs evenly across all partitions. Or you could choose your own partitioning logic to distribute the logs as you see fit.