Can one consumer thread against many partitions per topic in Kafka cause latency? - apache-kafka

Our Kafka setup is as follows:
30 partitions per topic
1 consumer thread
We configured it this way to be able to scale up in the future.
We wanted to minimize how often we rebalance when we need to scale up by adding partitions, because latency is very important to us and messages can be stuck during a rebalance until the coordination phase is done.
Can having 1 consumer thread with many partitions per topic somehow affect the overall message-consumption latency?

More partitions in a Kafka cluster lead to higher throughput; however, you need to be aware that the number of partitions has an impact on availability and latency as well.
In general, more partitions:
Lead to Higher Throughput
Require More Open File Handles
May Increase Unavailability
May Increase End-to-end Latency
May Require More Memory In the Client
You need to study the trade-offs and make sure that you've picked the number of partitions that satisfies your requirements regarding throughput, latency and required resources.
For further details refer to this blog post from Confluent.
My opinion: run some tests and write down your findings. For example, try to run a single consumer over a topic with 5, 10, 15, ... partitions, measure the impact and pick the configuration that meets your requirements. Finally, ask yourself whether you will ever need that many partitions. At the end of the day, if you need more partitions you should not worry about rebalancing etc.; Kafka was designed to be scalable.
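As a starting point for such a test, here is a minimal latency probe, assuming a broker at localhost:9092, a topic called probe-topic and String payloads (all of those are placeholders). It runs one consumer thread over every partition of the topic and prints, per record, the gap between the record's timestamp and the moment the consumer saw it:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SingleConsumerLatencyProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "latency-probe");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("probe-topic"));       // placeholder topic
            while (true) {
                // A single thread polls all partitions it is assigned.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                long now = System.currentTimeMillis();
                for (ConsumerRecord<String, String> record : records) {
                    // record.timestamp() is the record's CreateTime (or LogAppendTime, per topic config).
                    long latencyMs = now - record.timestamp();
                    System.out.printf("partition=%d offset=%d latencyMs=%d%n",
                            record.partition(), record.offset(), latencyMs);
                }
            }
        }
    }
}

Run it against topics created with different partition counts while producing at your real message rate, and compare the latency distributions.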

Related

Do 3k Kafka topics decrease performance?

I have a Kafka cluster (using Aiven on AWS):
Kafka Hardware
Startup-2 (2 CPU, 2 GB RAM, 90 GB storage, no backups) 3-node high availability set
Ping between my consumers and the Kafka Broker is 0.7ms.
Background
I have a topic such that:
It contains data about 3000 entities.
Entity lifetime is a week.
Each week there will be a different 3,000 entities (on average).
Each entity may have between 15k to 50k messages in total.
There can be at most 500 messages per second.
Architecture
My team built an architecture such that there will be a group of consumers. They will parse this data, perform some transformations (without any filtering!!) and then send the final messages back to Kafka, to topic=<entity-id>.
It means I write the data back to Kafka, to a topic that contains only the data of a specific entity.
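For illustration only, a rough sketch of the described pipeline (consume, transform without filtering, produce back to a per-entity topic). The source topic name raw-entities, the assumption that the record key is the entity id, and the transform() helper are all hypothetical:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class EntityFanOut {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "entity-fan-out");
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(Collections.singletonList("raw-entities"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                    String entityId = record.key();                  // one output topic per entity
                    String transformed = transform(record.value());  // transformation only, no filtering
                    producer.send(new ProducerRecord<>(entityId, entityId, transformed));
                }
            }
        }
    }

    // Placeholder for the application-specific transformation.
    private static String transform(String value) {
        return value;
    }
}

Every distinct entity id becomes an output topic, which is exactly what produces the 3-4k topics discussed in the questions below.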
Questions
At any given time, there can be up to 3-4k topics in kafka (1 topic for each unique entity).
Can my kafka handle it well? If not, what do I need to change?
Do I need to delete topics, or is it fine to have (a lot of!!) unused topics accumulate over time?
Each consumer that consumes the final messages will consume 100 topics at the same time. I know Kafka clients can consume multiple topics concurrently, but I'm not sure what the best practices are for that.
Please share your concerns.
Requirements
Please focus on the potential problems of this architecture and try not to talk about alternative architectures (fewer topics, more consumers, etc.).
The number of topics is not so important in itself, but each Kafka topic is partitioned and the total number of partitions could impact performance.
The general recommendation from the Apache Kafka community is to have no more than 4,000 partitions per broker (this includes replicas). The linked KIP article explains some of the possible issues you may face if the limit is breached, and with 3,000 topics it would be easy to do so unless you choose a low partition count and/or replication factor for each topic.
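As a rough, illustrative calculation: 3,000 topics with 3 partitions each and replication factor 3 amount to 3,000 × 3 × 3 = 27,000 partition replicas, i.e. about 9,000 per broker on a 3-node cluster, well above that guideline. The same 3,000 topics with 1 partition each and replication factor 2 give 6,000 replicas, i.e. 2,000 per broker, which fits within it.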
Choosing a low partition count for a topic is sometimes not a good idea, because it limits the parallelism of reads and writes, leading to performance bottlenecks for your clients.
Choosing a low replication factor for a topic is also sometimes not a good idea, because it increases the chance of data loss upon failure.
Generally it's fine to have unused topics on the cluster but be aware that there is still a performance impact for the cluster to manage the metadata for all these partitions and some operations will still take longer than if the topics were not there at all.
There is also a per-cluster limit but that is much higher (200,000 partitions). So your architecture might be better served simply by increasing the node count of your cluster.

Ideal number of partitions for Kafka topic

I am currently working on a setup that has 6 Kafka brokers. Data is being pushed into my topic from two producers at a rate of about 4,000 messages per second, and I have 5 consumers for this topic working as a group. What should be the ideal number of partitions for my Kafka topic?
Please feel free to tell me if any change is required in brokers/consumers/producers as well.
In general, the more partitions, the higher the throughput. However, there are other considerations too, like the limits of the hardware you are running on, whether you are using compression, etc. There is good information from Confluent here which provides insight into the rough calculation you can use to arrive at the number of partitions.
A rough formula for picking the number of partitions is based on
throughput. You measure the throughput that you can achieve on a
single partition for production (call it p) and consumption (call it
c). Let’s say your target throughput is t. Then you need to have at
least max(t/p, t/c) partitions. The per-partition throughput that one
can achieve on the producer depends on configurations such as the
batching size, compression codec, type of acknowledgement, replication
factor, etc.
Moreover, for the consumer:
The consumer throughput is often application dependent since it
corresponds to how fast the consumer logic can process each message
So the best way is to measure and benchmark for your own use case.
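As an illustration with made-up numbers: if a single partition can absorb p = 2,000 messages/s from the producers and a single consumer can process c = 500 messages/s, then a target of t = 4,000 messages/s needs at least max(4000/2000, 4000/500) = max(2, 8) = 8 partitions, and the consumer side is the bottleneck. Under those assumptions 5 consumers could only handle 2,500 messages/s, so you would either add consumers (up to the partition count) or speed up per-message processing.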

Does kafka support millions of partitions?

Will we have any problems if we have millions of partitions for one topic?
Due to our business requirements, we are considering creating a partition for every user in Kafka.
We have millions of users.
Any insight would be appreciated!
Yes, I think you will end up having problems if you have millions of partitions for several reasons:
(Most importantly!!) Customers come and go, so you will have the requirement to constantly change the number of partitions or have plenty of unused partitions (because you cannot reduce the number of partitions within a topic).
More Partitions Requires More Open File Handles: More Partitions means more directories and segment files on disk.
More Partitions May Increase Unavailability: Planned failures move Leaders off of a Broker one at a time, with minimal downtime per partition. In a hard failure all the leaders are immediately unavailable.
More Partitions May Increase End-to-end Latency: For the message to be seen by a Consumer it must be committed. The Broker replicates data from the leader with a single thread, resulting in overhead per Partition.
More Partitions May Require More Memory In the Client
More details are provided in the blog from Confluent on How to choose the number of topics/partitions in a Kafka cluster?.
In addition, according to Confluent's training material for Kafka developers it is recommended:
"The current limits (2-4K Partitions/Broker, 100s K Partitions per cluster) are maximums. Most environments are well below these values (typically in the 1000-1500 range or less per Broker)."
This blog explains that "Apache Kafka Supports 200K Partitions Per Cluster".
This might change with the replacement of ZooKeeper (KIP-500), but, again, looking at the first bullet point above, this would still be an unhealthy software design.

Kafka - Best practices in case of slow processing consumer. How to achieve more parallelism?

I'm aware that the maximum number of active consumers in a consumer group is the number of partitions of a topic.
What's the best practice in case of slow processing consumers? How to achieve more parallelism?
An example: a topic with 6 partitions and thousands of messages per second produced by the producers. So I have at most 6 consumers in the group. Consider that processing those messages is complex and the consumers are much slower than the producers. The result is that the consumers are always behind the last offset and the lag is increasing.
In a traditional MQ system, we simply add more and more consumers to stay up to date.
How can I achieve this with Kafka, since the number of consumers in a group is at most the number of partitions? Should I:
Configure the topic to have more partitions allowing more consumers per group?
Route the message from the consumer to a traditional MQ Queue (but lose the ordering)?
What's the best practice for this situation?
In Kafka, partitions are the unit of parallelism.
Without knowing your exact use case and requirements it's hard to come up with precise recommendations, but there are a few options.
First, you should really consider having more partitions. 6 partitions is relatively small; you could easily have 60, 120 or even more partitions (and the corresponding number of consumers). Suddenly the amount of work each consumer has to do is significantly reduced.
Also, if your requirements allow, you can consume at a fast rate and spread the processing of records across many workers, as in the sketch below. In solutions like this it's harder to maintain ordering, but if you don't need ordering then you can consider it.
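A minimal sketch of that pattern, assuming ordering is not required and that it is acceptable for offsets to be committed before processing finishes (so records can be skipped after a crash). The topic name jobs, the pool size and the group id are made up for illustration:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ParallelProcessingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "slow-processing-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Auto-commit means offsets may be committed before the workers finish.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");

        ExecutorService workers = Executors.newFixedThreadPool(16);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("jobs"));
            while (true) {
                // The poll loop stays fast; the expensive work is handed to the pool.
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                    workers.submit(() -> process(record));
                }
            }
        } finally {
            workers.shutdown();
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // Placeholder for the slow, application-specific processing.
    }
}

In practice you would also add back-pressure, for example by bounding the work queue or pausing the consumer while the pool is busy, otherwise a slow pool just moves the backlog from Kafka into the application's memory.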
I'm not sure how routing messages through an MQ queue would really help in this scenario. If you are still processing slower than you are writing, the amount of data in the queue will grow until you run out of disk space.
Kafka is better designed to serve as a buffer between your producers and consumers, so just ensure you have retention limits on your topics that allow some flexibility on the consumer side without losing data.

Kafka: Is our number of partitions insane?

We have a 3 host Kafka cluster. We have 136 topics, each of which has 100 partitions, with a replication factor of 3. This makes for 13,600 partitions across our cluster.
Is this a sane configuration of our topics?
It's too many. You should ask yourself if you have (or plan to have soon) enough consumer instances to need that many partitions. Then, if you do plan to have 13k consumer instances, what sort of hardware are you running these brokers on such that they would be able to serve that many consumers? That's even before you consider the additional impact of many partitions on pre-1.1 brokers: https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
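To put numbers on it: 136 topics × 100 partitions × replication factor 3 is 40,800 partition replicas, i.e. 13,600 per broker on a 3-host cluster, far beyond the commonly cited guideline of roughly 4,000 partitions (replicas included) per broker.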
To me this looks like 100 was picked as a round number that seemed future-proof. I'd suggest starting at a much lower number per topic (say 2 or 10) and seeing if you actually hit scaling issues that demand more partitions before trying to jump to expert mode. You can always add more partitions later.
The short answer to your question is 'It depends'.
More partitions in a Kafka cluster lead to higher throughput; however, you need to be aware that the number of partitions has an impact on availability and latency.
In general, more partitions:
Lead to Higher Throughput
Require More Open File Handles
May Increase Unavailability
May Increase End-to-end Latency
May Require More Memory In the Client
You need to study the trade-offs and make sure that you've picked the number of partitions that satisfies your requirements regarding throughput, latency and required resources.
For further details refer to this blog post from Confluent.
Partitions = max(NP, NC)
where:
NP is the number of partitions required to sustain the producer throughput, calculated as TT/TP.
NC is the number of partitions required to sustain the consumer throughput, calculated as TT/TC.
TT is the total expected throughput for our system.
TP is the max throughput of a single producer to a single partition.
TC is the max throughput of a single consumer from a single partition.