Kafka and Event Streaming On Client Side?

I need to consume messages from an event source (represented as a single Kafka topic) producing about 50k to 250k events per second. It provides only a single partition, and the ping is quite high (90-100 ms).
As far as I have learned by reading the Kafka client code, during polling a fetch request is issued; once the response is fully read, the events/messages are parsed and deserialized, and once enough events/messages are available, consumer.poll() provides the subset to the calling application.
In my case this makes the whole thing not worthwhile. The best throughput I achieve is with a fetch duration of about 2 s per request (about 2.5 MB fetch.max.bytes). Smaller fetch durations increase the idle time (the time the consumer does not receive any bytes) between the last byte of the previous response, parsing, deserialization, sending the next request, and waiting for the first byte of the next request's response.
Using a fetch duration of about 2 s results in a maximum latency of 2 s, which is highly undesirable. What I would like to see is that, while the fetch response is being received, the transmitted messages are already available to the consumer as soon as an individual message is fully transmitted.
Since every message has an individual id and the messages are sent in a particular order, while only a single consumer (and thread) is active for the single partition, it is not a problem to suppress retransmitted messages in case a fetch response is aborted/fails and its messages were partially processed and later retransmitted.
So the big question is whether the Kafka client provides a way to consume messages from a not-yet-completed fetch response.

That is a pretty large number of messages coming in through a single partition. Since you can't control anything on the Kafka server, the best you can do is configure your client to be as efficient as possible, assuming you have access to the Kafka client configuration parameters. You didn't mention anything about needing to consume the messages as fast as they're generated, so I'm assuming you don't need that. I also didn't see any info about the average message size or how much message sizes vary, but unless those are extreme values, the suggestions below should help.
The first thing you need to do is set max.poll.records on the client side to a smallish number; say, start with 10000 and see how much throughput that gets you. Make sure to consume without doing anything with the messages, just drop them on the floor, and then call poll() again. This is just to benchmark how much performance you can get with your fixed server setup. Then increase or decrease that number depending on whether you need better throughput or latency. You should be able to find a best case after playing with this for a while.
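A minimal benchmark sketch along these lines, assuming the standard Java client; the bootstrap address, group id, topic name, and sample size are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class PollBenchmark {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "bench");                  // placeholder
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 10_000);           // start here, then tune
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);

        long count = 0;
        long start = System.nanoTime();
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("events"));                           // placeholder topic
            while (count < 1_000_000) {                                      // sample size for the benchmark
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                count += records.count();                                    // drop the records on the floor
            }
        }
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d records in %.1f s (%.0f msg/s)%n", count, secs, count / secs);
    }
}
```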
After having done the above, the next step is to change your code so it dumps all received messages to an internal in-memory queue and then calls poll() again. This is especially important if processing each message requires DB access, hitting external APIs, etc. If you take even 100 ms to process 1K messages, that can cut your throughput in half in your case (100 ms to poll/receive, and then another 100 ms to process the received messages before you start the next poll()).
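A rough sketch of that hand-off, assuming a bounded ArrayBlockingQueue and a single worker thread; the queue capacity and the process() placeholder are illustrative:

```java
import java.time.Duration;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class QueuedConsumer {
    // Bounded queue: applies back-pressure if the worker falls behind.
    private final BlockingQueue<ConsumerRecord<byte[], byte[]>> queue = new ArrayBlockingQueue<>(100_000);

    public void run(KafkaConsumer<byte[], byte[]> consumer) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    process(queue.take());              // DB access, external APIs, etc.
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        // The poll loop stays "hot": it only enqueues and immediately polls again.
        while (true) {
            for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofMillis(100))) {
                try {
                    queue.put(record);                  // blocks only when the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    private void process(ConsumerRecord<byte[], byte[]> record) {
        // placeholder for the real per-message work
    }
}
```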
Without having access to Kafka configuration parameters on the server side, I believe the above should get you pretty close to an optimal throughput for your configuration.
Feel free to post more details in your question, and I'd be happy to update my answer if that doesn't help.

To deal with such a high throughput, these are the community recommendations for the number of partitions on a source topic. It is worth considering all of these factors when choosing the number of partitions:
• What is the throughput you expect to achieve for the topic?
• What is the maximum throughput you expect to achieve when consuming from a single partition?
• If you are sending messages to partitions based on keys, adding partitions later can be very challenging, so calculate throughput based on your expected future usage, not the current usage.
• Consider the number of partitions you will place on each broker and the available disk space and network bandwidth per broker.
So if you want to be able to write and read 1 GB/sec from a topic, and each consumer can only process 50 MB/s, then you need at least 20 partitions. This way, you can have 20 consumers reading from the topic and achieve 1 GB/sec.
Also, regarding fetch.max.bytes, I am sure you have already had a glance at this one: Kafka fetch max bytes doesn't work as expected.

Related

Single distributed system to handle large and small transactions

I have a Kafka topic. The producer publishes 2 kinds of messages to this topic: large messages, which take more time to process, and small or fast-processing messages. The small messages are of large volume (80%). The consumer receives these messages and sends them to our processing system. Our processing system has a set of microservices deployed in a Kubernetes environment as pods (which provides the option of scaling).
I have to achieve an overall processing time of 200 ms per transaction and a system processing speed (with scaling) of 10000 TPS.
Now, what is the better way to design this system such that small messages are processed with no blockage from large messages? Or is there a way to isolate the large messages in the same channel without impacting the processing of small messages? Looking for your valuable inputs.
I have attached a sample control flow of our system as an image.
One option I have is for the consumer to divert the large messages to one system and the small messages to another. But this doesn't seem like a good design, and it would be a nightmare to maintain 2 systems with the same functionality. It could also lead to improper resource allocation.
I will assume large messages and small messages can be processed out of order. Otherwise small messages will have to wait for large messages and no parallelization is possible.
I will also assume you cannot change the producer to write large messages to another topic. Otherwise, you could just ask producers to send large messages to a different topic, with a smaller number of consumers, so large messages would not block small messages.
OK, with the above two assumptions, the following is the simplest solution:
On the consumer, if you read a small message, forward it to the message parser as you are doing today.
On the consumer, if you read a large message, instead of forwarding it to the message parser, send it to another topic. Let's call it the "Large Message Topic".
Configure a limited number of consumers on the "Large Message Topic" to read and process large messages.
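A rough sketch of that routing consumer, assuming large messages can be recognized by their payload size; the threshold, topic name, and forwardToMessageParser() placeholder are illustrative:

```java
import java.time.Duration;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SizeRouter {
    private static final int LARGE_THRESHOLD_BYTES = 100 * 1024;      // illustrative cut-off

    public static void route(KafkaConsumer<String, byte[]> consumer,
                             KafkaProducer<String, byte[]> producer) {
        while (true) {
            for (ConsumerRecord<String, byte[]> record : consumer.poll(Duration.ofMillis(100))) {
                if (record.value().length >= LARGE_THRESHOLD_BYTES) {
                    // Divert large messages; a small, separate consumer group drains this topic.
                    producer.send(new ProducerRecord<>("large-message-topic", record.key(), record.value()));
                } else {
                    forwardToMessageParser(record);                    // existing fast path
                }
            }
        }
    }

    private static void forwardToMessageParser(ConsumerRecord<String, byte[]> record) {
        // placeholder for the existing small-message path
    }
}
```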
Alternatively, you will have to take control of the commit offset and add a little more complexity to your consumer code. You can use the solution below:
Disable auto commit, and don't call commit on the consumer after reading each batch.
If you read a small message, forward it to the message parser as you are doing today.
If you read large messages, send them to another thread/thread pool in your consumer process, which will forward them to the message parser. This thread pool processes incoming messages in sequence and keeps track of the last offset completed.
Once in a while, you call commit with offset = min (consumer offset, large message offset)
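A rough sketch of that offset bookkeeping, assuming a single assigned partition and a single-threaded pool for the large messages; isLarge(), the threshold, and the process*() placeholders are illustrative, and the min() is simplified (it assumes both paths have completed at least one message):

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class MixedSizeConsumer {
    private final ExecutorService largePool = Executors.newSingleThreadExecutor();
    private final AtomicLong lastLargeDone = new AtomicLong(-1);      // offset of last completed large message

    public void run(KafkaConsumer<String, byte[]> consumer, TopicPartition tp) {
        long lastSmallDone = -1;
        while (true) {
            for (ConsumerRecord<String, byte[]> record : consumer.poll(Duration.ofMillis(100))) {
                if (isLarge(record)) {
                    // Processed asynchronously, in order, by the single-threaded pool.
                    largePool.submit(() -> {
                        processLarge(record);
                        lastLargeDone.set(record.offset());
                    });
                } else {
                    processSmall(record);
                    lastSmallDone = record.offset();
                }
            }
            // Commit only what both paths have fully processed.
            // Simplified: a real implementation would track outstanding offsets explicitly.
            long safe = Math.min(lastSmallDone, lastLargeDone.get());
            if (safe >= 0) {
                consumer.commitSync(Map.of(tp, new OffsetAndMetadata(safe + 1)));
            }
        }
    }

    private boolean isLarge(ConsumerRecord<String, byte[]> r) { return r.serializedValueSize() > 100 * 1024; }
    private void processLarge(ConsumerRecord<String, byte[]> r) { /* slow path */ }
    private void processSmall(ConsumerRecord<String, byte[]> r) { /* fast path */ }
}
```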

How to implement fair scheduling between multiple tenants writing to 1 stream

As of now I have a single Kafka topic with 10 partitions. We have 10000 clients who keep dumping uncontrolled data into the streams. The current problem is that:
A tenant floods the topic without any notice (or with little notice)
The messages from other tenants then suffer: their (handful of) messages are queued behind and take several hours to get their turn for processing
Question:
Can I somehow read, say, 1k messages per tenant and round-robin, essentially like the fair scheduler in Hadoop YARN?
Can Apache Pulsar help me here? If yes, is there any example you can point me to?
I have already been through https://www.confluent.io/blog/prioritize-messages-in-kafka/, but given the volume of clients it may not be practical to have 100k partitions, etc.
I'm not aware of any way to get what you want out of the box. You could probably have the consumer pause some partitions to prioritize consumption from the ones with more messages (for example, by checking the lag per partition after every few poll iterations).
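A rough sketch of that pause/resume idea, assuming per-partition lag can be approximated as endOffsets() minus position(); the lag threshold is illustrative, and the method would be called every few poll iterations, after the consumer has an assignment and established positions:

```java
import java.util.Map;
import java.util.Set;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LagBasedPauser {
    /** Pause partitions that are nearly caught up so the fetch budget goes to the backlogged ones. */
    public static void rebalanceAttention(KafkaConsumer<String, byte[]> consumer, long lagThreshold) {
        Set<TopicPartition> assigned = consumer.assignment();
        Map<TopicPartition, Long> end = consumer.endOffsets(assigned);
        for (TopicPartition tp : assigned) {
            long lag = end.get(tp) - consumer.position(tp);
            if (lag < lagThreshold) {
                consumer.pause(Set.of(tp));     // temporarily stop fetching from this partition
            } else {
                consumer.resume(Set.of(tp));
            }
        }
    }
}
```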
I'm not familiar enough with Apache Pulsar to have a clear answer.
I have a similar problem: a single customer can monopolize the resources and delay execution for all other customers, just because their events arrived first.
In a different application with a low volume of messages, we just load all the events in memory, creating an in-memory queue for every customer, then dequeue up to N events from each customer queue and re-queue them into a different queue; let's call it the re-ordered queue. The re-ordered queue has a capacity limit (let's say 100*N), so no additional elements are queued until there is space. This guarantees equal treatment for all customers.
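A minimal sketch of that re-ordering structure, assuming events carry a customer id; N, the capacity, and the class/method names are illustrative:

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FairReorderer<E> {
    private final Map<String, Queue<E>> perCustomer = new LinkedHashMap<>();
    private final BlockingQueue<E> reordered;          // the bounded "re-ordered queue"
    private final int batchPerCustomer;                // N

    public FairReorderer(int batchPerCustomer, int capacity) {
        this.batchPerCustomer = batchPerCustomer;
        this.reordered = new ArrayBlockingQueue<>(capacity);   // e.g. 100 * N
    }

    public void enqueue(String customerId, E event) {
        perCustomer.computeIfAbsent(customerId, k -> new ArrayDeque<>()).add(event);
    }

    /** Move up to N events per customer into the bounded queue; put() blocks when it is full. */
    public void drainRoundRobin() throws InterruptedException {
        for (Queue<E> q : perCustomer.values()) {
            for (int i = 0; i < batchPerCustomer && !q.isEmpty(); i++) {
                reordered.put(q.poll());
            }
        }
    }

    public BlockingQueue<E> reorderedQueue() {
        return reordered;
    }
}
```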
I am now facing the same problem with an application that processes billions of messages. The solution above is impossible; there is just not enough RAM, and we can't keep all the data in memory. Creating a topic for each customer also sounds like overkill, especially if you have a variable set of active customers at any given point in time. Nevertheless, Pulsar seems to handle thousands, even millions, of topics well.
So the technique above may work well for you (and for me).
Just read from thousands of topics, write a limited number of messages to another topic, and then wait for it to have "space" to continue enqueuing.

Streaming audio streams through MQ (scalability)

My question is rather specific, so I will be OK with a general answer that points me in the right direction.
Description of the problem:
I want to deliver specific task data from multiple producers to a particular consumer working on the task (both are Docker containers run in k8s). The relation is many-to-many: any producer can create a data packet for any consumer. Each consumer is processing ~10 streams of data at any given moment, while each data stream consists of 100 messages of 160 b per second (from different producers).
Current solution:
In our current solution, each producer has a cache of task: (IP:PORT) pairs for consumers and uses UDP data packets to send the data directly. It is nicely scalable but rather messy in deployment.
Question:
Could this be realized in the form of a message queue of sorts (Kafka, Redis, RabbitMQ...)? E.g., having a channel for each task where producers send data while the consumer, well, consumes them? How many streams would be feasible for the MQ to handle (I know it would differ; suggest your best).
Edit: Would 1000 streams, which equals 100 000 messages per second, be feasible? (The throughput for 1000 streams is 16 Mb/s.)
Edit 2: Fixed packet size to 160b (typo)
Unless you need disk persistence, do not even look in the message broker direction; you are just adding one problem to another. Direct network code is the proper way to solve audio broadcast. Now, if your code is messy and you want a simplified programming model, a good alternative to raw sockets is the ZeroMQ library. It will give you all the message-broker functionality you care about, a) discrete messaging instead of streams, b) client discoverability, without going overboard with another software layer.
When it comes to "feasible": 100 000 messages per second with 160 kb messages is a lot of data; it comes to 1.6 Gb/sec even without any messaging protocol on top of it. In general Kafka shines at message throughput of small messages, as it batches messages on many layers. Knowing this, the sustained performance of Kafka is usually constrained by disk speed, as Kafka is intentionally written this way (the slowest component is the disk). However, your messages are very large and you need to both write and read messages at the same time, so I don't see it happening without a large cluster installation, as your problem is actual data throughput, not the number of messages.
Because you are data-limited, even other classic MQ software like ActiveMQ, IBM MQ, etc. is actually able to cope very well with your situation. In general, classic brokers are much more "chatty" than Kafka and are not able to hit Kafka's message throughput when handling small messages. But as long as you are using large non-persistent messages (and proper broker configuration), you can expect decent performance in MB/sec from those too. Classic brokers will, with proper configuration, directly connect the socket of a producer to the socket of a consumer without hitting the disk; in contrast, Kafka will always persist to disk first. So they even have some latency advantages over Kafka.
However, this direct socket-to-socket "optimisation" is just a full circle back to the start of this answer. Unless you need audio stream persistence, all you are doing with a broker in the middle is finding an indirect way of binding producing sockets to consuming ones and then sending discrete messages over this connection. If that is all you need, ZeroMQ is made for this.
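A minimal JeroMQ (Java ZeroMQ bindings) sketch of that PUB/SUB idea, assuming each task id is used as the subscription prefix; the addresses, port, task id, and placeholder methods are illustrative, and the two loops would run in different containers:

```java
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

public class AudioPubSub {

    /** Producer container: publish each 160-byte packet under its task id. */
    static void producerLoop() {
        try (ZContext ctx = new ZContext()) {
            ZMQ.Socket pub = ctx.createSocket(SocketType.PUB);
            pub.bind("tcp://*:5556");                        // placeholder port
            while (true) {
                byte[] packet = nextPacket();                // 160-byte audio payload
                pub.sendMore("task-42");                     // routing prefix = task id
                pub.send(packet);
            }
        }
    }

    /** Consumer container: subscribe only to the tasks this worker handles. */
    static void consumerLoop() {
        try (ZContext ctx = new ZContext()) {
            ZMQ.Socket sub = ctx.createSocket(SocketType.SUB);
            sub.connect("tcp://producer-host:5556");         // placeholder host
            sub.subscribe("task-42".getBytes(ZMQ.CHARSET));
            while (true) {
                byte[] topic = sub.recv();                   // "task-42"
                byte[] payload = sub.recv();                 // the audio packet
                handle(topic, payload);
            }
        }
    }

    static byte[] nextPacket() { return new byte[160]; }     // placeholder
    static void handle(byte[] topic, byte[] payload) { }     // placeholder
}
```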
There is also a messaging protocol called MQTT, which may be of interest to you if you choose to pursue a broker solution, as it is meant to be an extremely scalable solution with low overhead.
A basic approach
From a Kafka perspective, each stream in your problem can map to one topic, and therefore there is one producer-consumer pair per topic.
Con: If you have lots of streams, you will end up with a lot of topics, and IMO the solution can get messier here too as you increase the number of topics.
An alternative approach
Alternatively, the best way is to map multiple streams to one topic, where each stream is separated by a key (like the IP:Port combination you use), and then have multiple consumers, each subscribing to a specific set of partition(s) as determined by the key. Partitions are the point of scalability in Kafka.
Con: Though you can increase the number of partitions, you cannot decrease it.
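A minimal sketch of such a keyed producer, assuming the default partitioner (same key always lands in the same partition); the broker address, topic name, and key are illustrative:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class StreamKeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");       // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            String streamKey = "10.0.0.7:4242";                                   // e.g. the stream's IP:Port
            byte[] packet = new byte[160];
            // All records with the same key land in the same partition,
            // so one consumer in the group sees the whole stream in order.
            producer.send(new ProducerRecord<>("audio-packets", streamKey, packet));
        }
    }
}
```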
Type of data matters
If your streams are heterogeneous, in the sense that it would not be apt for all of them to share a common topic, you can create more topics.
Usually, topics are determined by the data they host and/or what their consumers do with the data in the topic. If all of your consumers do the same thing i.e. have the same processing logic, it is reasonable to go for one topic with multiple partitions.
Some points to consider:
Unlike in your current solution (I suppose), a message doesn't get lost once it is received and processed; rather, it stays in the topic until the configured retention period expires.
Take proper care in determining the keying strategy, i.e. which messages land in which partitions. As said earlier, if all of your consumers do the same thing, all of them can be in one consumer group to share the workload.
Consumers belonging to the same group perform a common task and will subscribe to a set of partitions determined by the partition assignor. Each consumer will then get a set of keys, in other words a set of streams, or, as per your current solution, a set of one or more IP:Port pairs.

Kafka Random Access to Logs

I am trying to implement a way to randomly access messages from Kafka, using KafkaConsumer.assign(partition) and KafkaConsumer.seek(partition, offset).
And then poll for a single message.
Yet I can't get past 500 messages per second in this case. In comparison, if I "subscribe" to the partition I get 100,000+ msg/sec (with a 1000-byte message size).
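For reference, a minimal sketch of the random-access pattern described above, assuming the standard Java client; the readAt() helper and its parameters are illustrative:

```java
import java.time.Duration;
import java.util.List;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RandomAccessReader {
    /** Assign the partition directly, seek to the wanted offset, then poll until that record arrives. */
    public static ConsumerRecord<byte[], byte[]> readAt(KafkaConsumer<byte[], byte[]> consumer,
                                                        String topic, int partition, long offset) {
        TopicPartition tp = new TopicPartition(topic, partition);
        consumer.assign(List.of(tp));
        consumer.seek(tp, offset);
        while (true) {
            ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<byte[], byte[]> r : records) {
                if (r.offset() == offset) {
                    return r;
                }
            }
        }
    }
}
```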
I've tried:
• Broker, ZooKeeper, and consumer on the same host and on different hosts (no replication is used)
• 1 and 15 partitions
• The default thread configuration in server.properties and increased to 20 (io and network)
• A single consumer assigned to a different partition each time, and one consumer per partition
• A single thread to consume, and multiple threads to consume (calling multiple different consumers)
• Adding two brokers and a new topic with its partitions on both brokers
• Starting multiple Kafka consumer processes
• Changing message sizes: 5k, 50k, 100k
In all cases the minimum I get is ~200 msg/sec, and the maximum is 500 if I use 2-3 threads. But going above that makes the .poll() call take longer and longer (starting from 3-4 ms on a single thread to 40-50 ms with 10 threads).
My naive Kafka understanding is that the consumer opens a connection to the broker and sends a request to retrieve a small portion of its log. While all of this has some latency involved, and retrieving a batch of messages would be much better, I would imagine that it would scale with the number of receivers involved, at the expense of increased server usage on both the VM running the consumers and the VM running the broker. But both of them are idling.
So apparently there is some synchronization happening on the broker side, but I can't figure out if it is due to my usage of Kafka or some inherent limitation of using .seek.
I would appreciate some hints on whether I should try something else, or whether this is all I can get.
Kafka is a streaming platform by design. This means many, many things have been developed to accelerate sequential access; storing messages in batches is just one of them. When you use poll() you utilize Kafka in that way, and Kafka does its best. Random access is something Kafka was not designed for.
If you want fast random access to distributed big data, you would want something else, for example a distributed DB like Cassandra or an in-memory system like Hazelcast.
Also, you could transform the Kafka stream into another one that allows you to consume it in a sequential way.

Testing Kafka producer throughput

We have a Kafka cluster consisting of 3 nodes, each with 32 GB of RAM and a 6-core 2.5 GHz CPU.
We wrote a Kafka producer that receives tweets from Twitter and sends them to Kafka in batches of 5000 tweets.
In the producer we use the producer.send(List<KeyedMessage>) method.
The average size of a tweet is 7 KB.
Printing the time in milliseconds before and after the send statement to measure the time taken to send 5000 messages, we found that it takes about 3.5 seconds.
Questions
Is the way we test the Kafka performance correct?
Is using the send method that takes list of keyed messages the correct way to send batch of messages to Kafka? Is there any other way?
What are the important configurations that affects the producer performance?
You're measuring only the producer side? That metric tells you only how much data you can store in a unit of time.
Maybe that's what you wanted to measure, but since the title of your question is "Kafka performance", I would think that you'd actually want to measure how long it takes for a message to go through Kafka (usually referred to as end-to-end latency).
You'd achieve that by measuring the difference in time between sending a message and receiving it on the other side, with a consumer.
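A rough sketch of that measurement, assuming the producer embeds its send timestamp in the payload and the consumer's clock is in sync with it (record.timestamp() could be used instead); the class and method names are illustrative:

```java
import java.nio.ByteBuffer;
import java.time.Duration;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EndToEndLatency {

    /** Producer side: the payload is just the send time in millis. */
    static void sendProbe(KafkaProducer<byte[], byte[]> producer, String topic) {
        byte[] value = ByteBuffer.allocate(Long.BYTES).putLong(System.currentTimeMillis()).array();
        producer.send(new ProducerRecord<>(topic, value));
        producer.flush();
    }

    /** Consumer side: latency = receive time minus the embedded send time. */
    static void measure(KafkaConsumer<byte[], byte[]> consumer) {
        while (true) {
            for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofMillis(100))) {
                long sentAt = ByteBuffer.wrap(record.value()).getLong();
                System.out.printf("end-to-end latency: %d ms%n", System.currentTimeMillis() - sentAt);
            }
        }
    }
}
```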
If the cluster is configured correctly (the default configuration will do), you should see latency ranging from only a couple of ms (less than 10 ms) up to 50 ms (a few tens of milliseconds).
Kafka is able to do that because messages read by the consumer don't even touch the disk, because they are still in RAM (page cache and socket buffer cache). Keep in mind that this only works while your consumers are able to "catch up", i.e. don't have a large consumer lag. If a consumer lags behind the producers, the messages will eventually be purged from the cache (depending on the rate of messages, i.e. how long it takes for the cache to fill up with new messages) and will thus have to be read from disk. Even that is not the end of the world (an order of magnitude slower, in the range of low 100s of ms), because messages are written consecutively, one by one in a straight line, which means a single disk seek.
BTW you'd want to give Kafka only a small percentage of those 32 GB, e.g. 5 to 8 GB (even the G1 garbage collector slows down with bigger heap sizes), and leave everything else unassigned so the OS can use it for the page and buffer cache.