Scenario:
Running a spring-boot project consuming from a topic named 'test' that has 10 partitions. Partition assignment happens at 13:00:00.
At ~13:00:30 adding partitions to the topic using: ./kafka-topics.sh --alter --zookeeper zookeeper:2181 --topic test --partitions 100
At ~13:05:30 partition reassignment is triggered.
I ran these steps a few times, and it looks like reassignment happens every ~5 minutes.
Is there a way to change the frequency of this reassignment check?
We would like it to run every few seconds. Is this operation heavy, and is that the reason it happens only every 5 minutes? Or is it pretty negligible?
EDIT:
My use case is the following: we have integration tests which boot our microservices. When a consumer of a topic boots first, it creates the topic if it does not exist, and the number of partitions it creates equals its configured concurrency (10, for example). Then the producer of this topic boots, and its configured partitionCount (20, for example) is bigger than the number of created partitions, so spring-cloud-stream adds the missing partitions. In the meantime the consumer's assigned partitions haven't changed, and it keeps consuming from the first 10 partitions (1-10). The problem is that the producer is publishing messages to all 20 partitions, so messages sent to the last 10 partitions (11-20) will not be consumed until the consumer is assigned the new partitions. This behavior causes problems for our tests, and we cannot wait 5 minutes until all of the partitions are assigned to consumers. We also would not want to create the topic with the desired number of partitions in advance; we would like this to still be handled by spring-cloud-stream.
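For context, a minimal sketch of the kind of configuration involved (the binding names "input"/"output" and all values here are illustrative, not our exact setup):

    # Consumer side: concurrency drives the partition count when the
    # binder auto-creates the topic.
    spring.cloud.stream.bindings.input.destination=test
    spring.cloud.stream.bindings.input.group=test-group
    spring.cloud.stream.bindings.input.consumer.concurrency=10

    # Producer side: a larger partitionCount makes the Kafka binder add
    # the missing partitions, provided autoAddPartitions is enabled.
    spring.cloud.stream.bindings.output.destination=test
    spring.cloud.stream.bindings.output.producer.partitionCount=20
    spring.cloud.stream.kafka.binder.autoAddPartitions=true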
EDIT 2:
It seems that the relevant property controlling the "reassignment" is metadata.max.age.ms.
The period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.
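If a faster refresh is acceptable for a test environment, the property can be set directly on the client. A minimal sketch with the plain Java consumer (broker address, group id, and the 5-second value are illustrative; in spring-cloud-stream the same property can presumably be passed through the Kafka binder's configuration map):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class FastMetadataConsumer {
        public static KafkaConsumer<String, String> create() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group");              // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            // metadata.max.age.ms defaults to 300000 (5 minutes); lower it so
            // newly added partitions are discovered within seconds.
            props.put(ConsumerConfig.METADATA_MAX_AGE_CONFIG, "5000");
            return new KafkaConsumer<>(props);
        }
    }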
So there are a couple of concerns here.
First, spring-cloud-stream and/or spring-kafka are NOT doing any type of rebalancing, partition reassignment, etc. This is all done inside Kafka. There is a client-side property in Kafka that (I believe) defaults to 5 minutes: if a consumer didn't poll for that much time, it is considered dead, etc. In any event, I would refer you to the apache-kafka channel to get more information on Kafka internals.
Also, adding partitions, reassignments, and rebalancing are expensive operations and should not be attempted without serious consideration of their impact. So I'd be curious to know: what is your use case for constantly adding partitions?
Related
My topic has many partitions, many producers, but only one consumer. The consumer can run only for a short period of time, let's say one minute. During this period, I need to make sure it will consume all the records from all the partitions that were produced before the consumer was initialized, but ALSO the records produced during the minute the consumer was running.
The problem is that I can't find the correct partition assignment strategy that guarantees I will get all the partitions. If I use consumer.Subscribe(topic), I will only get some partitions, not all. If I use consumer.Assign(partition) during initialization, I WILL get all the active partitions, but if a new partition comes along and receives records, I will miss those.
The only solution I have so far is to re-do the assignments periodically (every 10 seconds).
If I use consumer.Subscribe(topic), I will only get some partitions, not all
It should get all. If you don't get all, then that most likely means you have some unclosed consumer instance in the same consumer group that has already been assigned the other partitions.
You can periodically run the kafka-consumer-groups --describe command to inspect this.
Using assignment doesn't use the consumer group protocol, which is why it works.
There is no guarantee Kafka can provide that your consumer will read any/all data between two time intervals. You'd need to track this on your own, and it may require your consumer instance to run longer than you expect.
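For reference, the periodic re-assignment workaround from the question might look like the following with the Java client (the consumer.Subscribe/consumer.Assign spelling in the question suggests the .NET client; the class name, topic handling, and 10-second interval here are illustrative):

    import java.time.Duration;
    import java.util.List;
    import java.util.stream.Collectors;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class RefreshingAssignment {
        public static void run(KafkaConsumer<String, String> consumer, String topic) {
            long lastRefresh = 0;
            while (true) {
                // Every 10 seconds, re-read the partition list and assign all of it.
                if (System.currentTimeMillis() - lastRefresh > 10_000) {
                    List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                            .map(info -> new TopicPartition(topic, info.partition()))
                            .collect(Collectors.toList());
                    // assign() bypasses the group protocol; newly added partitions
                    // start from the committed offset or per auto.offset.reset.
                    consumer.assign(partitions);
                    lastRefresh = System.currentTimeMillis();
                }
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("p%d@%d: %s%n", record.partition(), record.offset(), record.value());
                }
            }
        }
    }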
I am investigating Kafka to assess its suitability for our use case. Can you please help me understand how flexible Kafka is with changing the number of partitions for an existing topic?
Specifically,
Is it possible to change the number of partitions without tearing down the cluster?
And is it possible to do that without bringing down the topic?
Will adding/removing partitions automatically take care of redistributing messages across the new partitions?
Ideally, I would want the change to be transparent to the producers and consumers. Does Kafka ensure this?
Update:
From my understanding so far, it looks like Kafka's design cannot allow this, because the mapping of consumer groups to partitions would have to be altered. Is that correct?
1. Is it possible to change the number of partitions without tearing down the cluster?
Yes, Kafka supports increasing the number of partitions at runtime, but doesn't support decreasing the number of partitions due to its design.
2. And is it possible to do that without bringing down the topic?
Yes, provided you are increasing partitions.
3. Will adding/removing partitions automatically take care of redistributing messages across the new partitions?
As mentioned earlier, removing partitions is not supported.
When you increase the number of partitions, the existing messages will remain in the same partitions as before; only new messages will be considered for the new partitions (also depending on your partitioner logic). Increasing the partitions for a topic will trigger a rebalance, wherein the consumers and producers get notified with the updated metadata of the topic. Producers will start sending messages to the new partitions after receiving the updated metadata, and the consumer rebalancer will redistribute the partitions among the consumers in the group and resume consumption from the last committed offset. All this happens under the hood, so you won't have to make any changes on the client side.
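For completeness, the same partition increase can be done programmatically with the Java AdminClient (broker address, topic name, and target count below are placeholders):

    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewPartitions;

    public class IncreasePartitions {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                // Raise the total partition count of "test" to 100.
                // Requests to decrease the count are rejected by the broker.
                admin.createPartitions(Map.of("test", NewPartitions.increaseTo(100)))
                     .all().get();
            }
        }
    }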
Yes, it is perfectly possible. You just execute the following command against the topic of your choice: bin/kafka-topics.sh --zookeeper zk_host:port --alter --topic <your_topic_name> --partitions <new_partition_count>. Remember, Kafka only allows increasing the number of partitions, because decreasing it would cause data loss.
There's a catch here. Kafka doc says the following:
Be aware that one use case for partitions is to semantically partition data, and adding partitions doesn't change the partitioning of existing data so this may disturb consumers if they rely on that partition. That is if data is partitioned by hash(key) % number_of_partitions then this partitioning will potentially be shuffled by adding partitions but Kafka will not attempt to automatically redistribute data in any way.
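To illustrate the point, here is a toy stand-in for that formula (the real Java client partitioner uses murmur2 over the serialized key bytes rather than hashCode, but the effect is the same):

    public class PartitionShuffle {
        // Stand-in for hash(key) % number_of_partitions; masked to stay non-negative.
        static int partitionFor(String key, int numPartitions) {
            return (key.hashCode() & 0x7fffffff) % numPartitions;
        }

        public static void main(String[] args) {
            String key = "device-42"; // illustrative key
            // Before the alter: the key maps to one partition...
            System.out.println(partitionFor(key, 10));
            // ...after growing the topic, the same key can map elsewhere,
            // while records already written stay in their old partitions.
            System.out.println(partitionFor(key, 100));
        }
    }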
Yes, if by bringing down the topic you mean deleting the topic.
Once you've increased the partition count, Kafka triggers a rebalance for consumers subscribed to that topic, and on subsequent polls the partitions get distributed across the consumers. It's transparent to the client code; you don't have to worry about it.
NOTE: As I mentioned before, you can only add partitions, removing is not possible.
One more thing: if you are using stateful operations in clients, like aggregations (making use of a state store), a change in partitions will kill all the streams threads in the consumer. This is expected, as an increase in partitions may corrupt stateful applications. So beware of changing the partition count; it may break stateful consumers connected to the topic.
Good read: Why do Kafka Streams threads die when the source topic partitions change? Can anyone point to reading material around this?
I'm confused about to what degree partition assignment is a client-side concern (partition.assignment.strategy) and what part is handled by Kafka.
For example, say I have one Kafka topic with 100 partitions.
If I make 1 app that runs 5 consumer threads, with a partition.assignment.strategy of RangeAssignor, then I should get 5 consumers each consuming 20 partitions.
Now if I scale this app by deploying it 4 times, using the same consumer group: will Kafka first divide 25 partitions to each of these apps on its side, and only then have those 25 partitions further subdivided by the app using the PartitionStrategy?
Which would result neatly in 4 apps with 5 consumers each, consuming 5 partitions each.
The behavior of the default Assignors is well documented in the Javadocs.
RangeAssignor is the default Assignor, see its Javadoc for example of assignment it generates: http://kafka.apache.org/21/javadoc/org/apache/kafka/clients/consumer/RangeAssignor.html
If you have 20 consumers using RangeAssignor that are consuming from a topic with 100 partitions, each consumer will be assigned 5 partitions.
Because RangeAssignor assigns partitions topic by topic, it can create really unbalanced assignments if you have topics with very few partitions. In that case, RoundRobinAssignor works better.
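The strategy is a consumer-side setting; a minimal sketch of switching it (bootstrap servers and group id are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.RoundRobinAssignor;

    public class AssignorConfig {
        public static Properties props() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
            // Default is RangeAssignor; RoundRobinAssignor spreads the partitions
            // of all subscribed topics evenly across the group's consumers.
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                      RoundRobinAssignor.class.getName());
            return props;
        }
    }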
As part of group management, the consumer will keep track of the list of consumers that belong to a particular group and will trigger a rebalance operation if any one of the following events are triggered:
Number of partitions change for any of the subscribed topics
A subscribed topic is created or deleted
An existing member of the consumer group is shutdown or fails.
A new member is added to the consumer group.
Most likely point no. 4 is your case, and the strategy used will be the same (partition.assignment.strategy). Note that this is not applicable if you have explicitly specified the partition to be consumed by your consumer.
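If you want to observe these rebalance triggers from the client, you can pass a ConsumerRebalanceListener when subscribing; a minimal sketch:

    import java.util.Collection;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class RebalanceLogger {
        public static void subscribe(KafkaConsumer<String, String> consumer, String topic) {
            consumer.subscribe(Collections.singletonList(topic), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    // Called before a rebalance takes partitions away (commit offsets here).
                    System.out.println("Revoked: " + partitions);
                }

                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    // Called after each rebalance with this consumer's new assignment.
                    System.out.println("Assigned: " + partitions);
                }
            });
        }
    }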
We started to use Apache Kafka to persist Timeseries data into a Timeseries database. What we started with was to just have a single topic, a producer writing to this topic and a single consumer reading from this topic and dumping the data to the Timeseries database.
We had 3 broker instances and what we noticed in the first try was that the producer was pretty fast in writing messages to the topic. Within a matter of 30 minutes, we had around 1.5 million messages. The consumer was just doing 300 messages per second.
Our next approach was to partition the topic and have more consumer instances (equal to the number of partitions). This definitely improved on the consumer write speed. Now my questions are:
What happens if I set my topic's partition count to 6, but I have only 3 broker instances? Which broker instance would be the leader for partitions 1 to 6?
Is there a formula to determine how many partitions I would need? Since this was our test environment, we could play with it and scale it. We might not be able to do the same in our production environment. So how do we determine the partition count?
The partitions get distributed amongst your brokers. It's impossible to know which broker will be elected leader of a given partition -- and it can change over time. Depending on which version of Kafka and which Consumer API you use, your consumer may or may not discover partition leaders on its own. With the SimpleConsumer you have to find partition leaders on your own, and respond to new leader election in your code (instead of having it handled by the API automatically).
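With a modern Java client you can inspect the current leader of each partition via the AdminClient; a minimal sketch (broker address and topic name are placeholders):

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class ShowLeaders {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                Map<String, TopicDescription> descs =
                        admin.describeTopics(List.of("test")).all().get();
                // One line per partition: current leader broker plus replica list.
                descs.get("test").partitions().forEach(p ->
                        System.out.printf("partition %d -> leader %s, replicas %s%n",
                                p.partition(), p.leader(), p.replicas()));
            }
        }
    }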
As to the number of partitions -- there's no real "formula" other than this: you can have no more parallelism than you have partitions. If you have 4 partitions and 5 consumers, one of the consumers will starve. I usually use numbers like 12 or 60 or multiples thereof for the number of partitions for large topics. Something that divides easily and cleanly among variable numbers of consumers.
Also, note that you can later on change the number of partitions, with some caveats. See this answer for how and what the caveats are.
We are using a Kafka cluster with 3 servers, all of them running ZooKeeper as well. We have 7 topics, each with 6 partitions, and 3 Java consumers for each topic. When I start a consumer, it takes almost 3-5 minutes to assign partitions to consumers. The same behavior occurs when we stop one of the consumers and start it again. How can I control or reduce this?
Please note, I am using Kafka 0.9 with the new consumer.
I have added the below properties in server.properties on each Kafka broker:
auto.leader.rebalance.enable=true
leader.imbalance.check.interval.seconds=10
Let me know if you need more information.
Thanks
Check the value your consumer is using for 'session.timeout.ms'.
The default is 30 seconds, and the coordinator won't trigger a rebalance until this time has passed, e.g. no heartbeat for 30 seconds.
The danger in making this lower is that if you take too long to process messages, a rebalance might occur because the coordinator will think your consumer is dead.
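For reference, both timeouts are plain consumer properties; a minimal sketch (the values are illustrative, not recommendations):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    public class SessionTimeoutConfig {
        public static Properties props() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
            // The coordinator declares the consumer dead after this long without a heartbeat.
            props.put(ConsumerConfig.SESSION_TIMEOUT_CONFIG, "10000");
            // Heartbeats should be sent well within the session timeout (typically one third).
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_CONFIG, "3000");
            return props;
        }
    }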