Delaying all messages by 30 minutes in Kafka Streams

We have a use case where we need to write all messages of topic a into topic b, but with a delay of 30 minutes for each message. Why, you ask? Because time is of critical importance for this stream of data: paying customers get the real-time feed, while freeloaders get the delayed stream.
I guess it would be relatively easy to do in a KafkaConsumer poll() loop, by comparing system time and message time (using an ordered message time like producer time or ingestion time), then pause()ing the partitions in question and resume()ing them after the appropriate time interval of up to 30 minutes (all the while continuing to poll() to avoid being failed over).
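A rough sketch of that plain-consumer approach is below. It assumes the record timestamps are usable as the ordered message time, and it leaves out offset handling, exactly-once, rebalance listeners and error handling; the consumerProps()/producerProps() helpers and the topic names topic-a/topic-b are placeholders.

    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.clients.producer.*;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.*;

    public class DelayedForwarder {
        private static final long DELAY_MS = Duration.ofMinutes(30).toMillis();

        public static void main(String[] args) {
            KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps());   // placeholder helper
            KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps());   // placeholder helper
            consumer.subscribe(Collections.singletonList("topic-a"));

            Map<TopicPartition, Long> resumeAt = new HashMap<>();  // partition -> wall-clock time to resume it

            while (true) {
                long now = System.currentTimeMillis();

                // Resume partitions whose oldest pending record has become due.
                resumeAt.entrySet().removeIf(e -> {
                    if (now >= e.getValue()) {
                        consumer.resume(Collections.singleton(e.getKey()));
                        return true;
                    }
                    return false;
                });

                // Keep polling even while everything is paused, so the group
                // coordinator does not consider this member dead.
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));

                for (TopicPartition tp : records.partitions()) {
                    for (ConsumerRecord<byte[], byte[]> rec : records.records(tp)) {
                        long dueAt = rec.timestamp() + DELAY_MS;
                        if (dueAt <= now) {
                            producer.send(new ProducerRecord<>("topic-b", rec.key(), rec.value()));
                        } else {
                            // Not due yet: rewind to this record, pause the partition
                            // and remember when to resume it.
                            consumer.seek(tp, rec.offset());
                            consumer.pause(Collections.singleton(tp));
                            resumeAt.put(tp, dueAt);
                            break;
                        }
                    }
                }
            }
        }

        static Properties consumerProps() { /* bootstrap servers, group id, byte[] deserializers, auto-commit off */ return new Properties(); }
        static Properties producerProps() { /* bootstrap servers, byte[] serializers */ return new Properties(); }
    }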
As the data, though delayed, still needs to be delivered in a streaming fashion, the delay between the ingestion time of each message in topic a and its copy in topic b should be as close to 30 minutes as possible.
But is this also easily possible in Kafka Streams, so that we can use its built-in exactly-once guarantees? I wonder if "it's ok to call Thread.sleep() in Kafka Streams" also applies to longer sleeps of up to 30 minutes. (Of course we don't want a partition rebalance to occur because Kafka thinks something's wrong with our process.)
Assuming we get this to work, is there a way to get proper lag monitoring for this? If we just delay messages, I would think the consumer group lag would always amount to at least 30 minutes' worth of messages. So is it possible to have the lag monitor count only unprocessed messages older than 30 minutes?
(Point 2, the lag monitoring, is less important to us than getting point 1, the delaying itself, to work.)
Edit: https://stackoverflow.com/a/59261274/709537 proposes a solution to a somewhat related problem, but that involves state stores and thus looks more complicated than would seem necessary for our simple (?) "delay all messages by x minutes" task.

Regarding 2., I assume we will have to roll our own lag monitoring for this.
A simple way to do something related - measuring latency instead of the number of lagging messages - would be to do the following periodically, for every partition:
get the first unread input message
currentLatency = max(0, systemTime - ingestionTime(firstUnreadMessage) - 30min)
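For reference, a hedged sketch of that latency probe: it assumes the delaying forwarder runs in a consumer group whose committed offsets we can read via an AdminClient; the group id "delayed-feed" and the adminProps()/probeProps() helpers are made-up placeholders.

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.Map;
    import java.util.Properties;

    public class DelayLatencyProbe {
        public static void main(String[] args) throws Exception {
            long delayMs = Duration.ofMinutes(30).toMillis();

            try (AdminClient admin = AdminClient.create(adminProps());                       // placeholder helper
                 KafkaConsumer<byte[], byte[]> probe = new KafkaConsumer<>(probeProps())) {  // placeholder helper, own group
                // Where the delaying forwarder currently is, per partition.
                Map<TopicPartition, OffsetAndMetadata> committed =
                        admin.listConsumerGroupOffsets("delayed-feed")
                             .partitionsToOffsetAndMetadata().get();

                // Look at the first unread record of every partition.
                probe.assign(committed.keySet());
                committed.forEach((tp, om) -> probe.seek(tp, om.offset()));

                long now = System.currentTimeMillis();
                ConsumerRecords<byte[], byte[]> batch = probe.poll(Duration.ofSeconds(2));
                for (TopicPartition tp : batch.partitions()) {
                    ConsumerRecord<byte[], byte[]> first = batch.records(tp).get(0);
                    // currentLatency = max(0, systemTime - ingestionTime(firstUnreadMessage) - 30min)
                    long latencyMs = Math.max(0, now - first.timestamp() - delayMs);
                    System.out.printf("%s currentLatency: %d ms%n", tp, latencyMs);
                }
            }
        }

        static Properties adminProps() { /* bootstrap servers */ return new Properties(); }
        static Properties probeProps() { /* bootstrap servers, byte[] deserializers */ return new Properties(); }
    }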
If we wanted to monitor the number of lagging messages, something a little more involved would be needed:
read input messages backwards from the latest offset, until there is one with ingestionTime + 30min <= systemTime
the number of unread messages up to and including that one (its offset minus the consumer's current position) would be the lag
However, reading messages backwards is not exactly one of Kafka's core competencies... A clever binary-search-style probe could be devised to get the exact value, but nobody really wants to know whether the message lag is 43123 or 40513; what they want to know is the order of magnitude. That keeps the number of seeks down to a handful per partition, and no binary-search back and forth is necessary. The output could e.g. be
lag < 10
lag < 100
lag < 1000
lag < 10000
...
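If the topic's timestamps really are ingestion time (log.message.timestamp.type=LogAppendTime) or a sufficiently monotonic producer time, there is also a shortcut that avoids reading backwards altogether: offsetsForTimes() returns, per partition, the earliest offset whose timestamp is at or after a given time, which is exactly the boundary between due and not-yet-due messages. A sketch, reusing the probe consumer and the committed-offset map from the latency example above (additional imports: java.util.HashMap, java.util.Collections):

    static void reportLaggingMessages(KafkaConsumer<byte[], byte[]> probe,
                                      Map<TopicPartition, OffsetAndMetadata> committed) {
        long cutoff = System.currentTimeMillis() - Duration.ofMinutes(30).toMillis();

        Map<TopicPartition, Long> query = new HashMap<>();
        committed.keySet().forEach(tp -> query.put(tp, cutoff));

        // Earliest offset per partition whose timestamp is >= cutoff, i.e. the first
        // message that is NOT yet due. Everything between the committed offset and
        // that boundary is lag.
        Map<TopicPartition, OffsetAndTimestamp> boundary = probe.offsetsForTimes(query);

        for (Map.Entry<TopicPartition, OffsetAndMetadata> e : committed.entrySet()) {
            TopicPartition tp = e.getKey();
            OffsetAndTimestamp notYetDue = boundary.get(tp);
            // A null entry means every remaining message is already due, so fall back
            // to the end offset of the partition.
            long boundaryOffset = (notYetDue != null)
                    ? notYetDue.offset()
                    : probe.endOffsets(Collections.singleton(tp)).get(tp);
            long lag = Math.max(0, boundaryOffset - e.getValue().offset());
            System.out.printf("%s lagging messages: %d%n", tp, lag);
        }
    }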

Related

Kafka and Event Streaming On Client Side?

I need to consume messages from an event source (represented as a single Kafka topic) producing about 50k to 250k events per second. It only provides a single partition and the ping is quite high (90-100 ms).
As far as I have learned by reading the Kafka client code, a fetch request is issued during polling; once the response is fully read, the events/messages are parsed and deserialized, and once enough events/messages are available, consumer.poll() provides the subset to the calling application.
In my case this makes the whole thing not worthwhile. The best throughput I achieve is with a fetch duration of about 2 s per request (about 2.5 MB fetch.max.bytes). Smaller fetch durations increase the idle time (time the consumer does not receive any bytes) between the last byte of the previous response, parsing, deserialization, sending the next request and waiting for the first byte of the next response.
Using a fetch duration of about 2 s results in a max latency of 2 s, which is highly undesirable. What I would like is that, while a fetch response is still being received, the transmitted messages already become available to the consumer as soon as an individual message has been fully transmitted.
Since every message has an individual id and the messages are sent in a particular order while only a single consumer (+ thread) for a single partition is active, it is not a problem to suppress retransmitted messages in case a fetch response is aborted/fails and its messages were partially processed and later retransmitted.
So the big question is whether the Kafka client provides a possibility to consume messages from a not-yet-completed fetch response.
That is a pretty large amount of messages coming in through a single partition. Since you can't control anything on the Kafka server, the best you can do is configure your client to be as efficient as possible, assuming you have access to the Kafka client configuration parameters. You didn't mention anything about needing to consume the messages as fast as they're generated, so I'm assuming you don't need that. I also didn't see any info about average message size or how much message sizes vary, but unless those are extreme values, the suggestions below should help.
The first thing you need to do is set max.poll.records on the client side to a smallish number, say, start with 10000, and see how much throughput that gets you. Make sure to consume without doing anything with the messages, just dump them on the floor, and then call poll() again. This is just to benchmark how much performance you can get with your fixed server setup. Then increase or decrease that number depending on whether you need better throughput or latency. You should be able to find a best case after playing with this for a while.
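A minimal sketch of such a benchmark, with the broker address, topic name and sample size as placeholders:

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class ThroughputBench {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");              // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "throughput-bench");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 10_000);                      // the knob to tune up or down
            props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 2_500_000);                    // roughly the 2.5 MB from the question

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("events"));                    // placeholder topic name
                long count = 0;
                long start = System.currentTimeMillis();
                while (count < 1_000_000) {                                                 // fixed sample size
                    count += consumer.poll(Duration.ofMillis(500)).count();                 // drop the records on the floor
                }
                double seconds = (System.currentTimeMillis() - start) / 1000.0;
                System.out.printf("%.0f messages/s%n", count / seconds);
            }
        }
    }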
After having done the above, the next step is to change your code so it dumps all received messages into an internal in-memory queue and then calls poll() again. This is especially important if processing each message requires DB access, hitting external APIs, etc. If you take even 100 ms to process 1K messages, that can cut your throughput in half in your case (100 ms to poll/receive, and then another 100 ms to process the received messages before you start the next poll()).
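A hedged sketch of that hand-off, using a bounded in-memory queue and a single worker thread (topic name and the consumerProps() helper are placeholders):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class QueueingConsumer {
        public static void main(String[] args) throws InterruptedException {
            // Bounded queue: the poll thread blocks (back-pressure) instead of
            // exhausting memory when processing falls behind.
            BlockingQueue<ConsumerRecord<byte[], byte[]>> queue = new ArrayBlockingQueue<>(100_000);

            ExecutorService worker = Executors.newSingleThreadExecutor();
            worker.submit(() -> {
                try {
                    while (true) {
                        process(queue.take());   // DB access, external APIs etc. happen here, off the poll thread
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps())) {
                consumer.subscribe(Collections.singletonList("events"));                    // placeholder topic name
                while (true) {
                    ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));
                    for (ConsumerRecord<byte[], byte[]> rec : records) {
                        queue.put(rec);          // hand off and get back to poll() as quickly as possible
                    }
                }
            }
        }

        static void process(ConsumerRecord<byte[], byte[]> rec) { /* the real work goes here */ }

        static Properties consumerProps() { /* bootstrap servers, group id, byte[] deserializers */ return new Properties(); }
    }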
Without having access to Kafka configuration parameters on the server side, I believe the above should get you pretty close to an optimal throughput for your configuration.
Feel free to post more details in your question, and I'd be happy to update my answer if that doesn't help.
To deal with such high throughput, here are the community recommendations for the number of partitions on a source topic. It is worth considering all of these factors when choosing the number of partitions:
• What is the throughput you expect to achieve for the topic?
• What is the maximum throughput you expect to achieve when consuming from a single partition?
• If you are sending messages to partitions based on keys, adding partitions later can be very challenging, so calculate throughput based on your expected future usage, not the current usage.
• Consider the number of partitions you will place on each broker and the available disk space and network bandwidth per broker.
So if you want to be able to write and read 1 GB/sec from a topic, and each consumer can only process 50 MB/s, then you need at least 20 partitions. This way, you can have 20 consumers reading from the topic and achieve 1 GB/sec.
Also, regarding fetch.max.bytes, I am sure you have already had a glance at this one: Kafka fetch max bytes doesn't work as expected.

Apache Kafka: Send Messages to another Topic after a period of time

I am new to Apache Kafka, so it might be that this is basic knowledge.
At the moment I am trying to figure out some possibilities and functions that Kafka offers, and I was wondering whether it is possible to move a message to another topic after a specified period of time.
Scenario:
Producer 1 writes Message (M1) into Topic 1 where Consumer 1 handles the messages.
After a period of time, let's say 1 hour, M1 is moved into Topic 2 to which the Consumer 2 is subscribed.
Is it possible to do something like that with Kafka? I know that there is a way to delete a message after a period of time, but I don't know if there is a way to change its topic or to catch the delete action.
I thought about running a timer in a producer, but with a huge amount of data I don't think that is feasible.
Thanks in advance
EDIT:
Thanks to @OneCricketeer I know that my first assumption with the several producers wasn't that bad.
I know that the throughput with one producer is really good and that one producer won't take the system down.
But I'm still concerned about the second producer.
In my imagination it works like the following rough sketch:
When I take 30 messages per minute, that would mean I would have 31 producer instances: 1 that handles the messages asap and 30 others waiting for their timer to run out so that they can work with their message.
Counting that up to an hour, it would be roughly 1800 instances. That is what I'm concerned about. Or is there a better way to handle this?
I found a solution that might work for my case.
I accidentally stumbled over a consumer method which allows you to read messages based on a timestamp.
The method is called offsetsForTimes and has been available since version 0.10.
See the Kafka API docs or the following post, which I found while researching that method.
Maybe this is useful for others, so I decided to publish it.
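To make the idea concrete, here is a hedged sketch of such a periodic mover built around offsetsForTimes; the topic names topic-1/topic-2, the one-hour cutoff and the consumer/producer setup (including offset commits) are assumptions or left out:

    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.HashMap;
    import java.util.Map;

    public class HourlyMover {
        // Run periodically (e.g. once a minute); the consumer is assumed to be
        // subscribed to "topic-1" already.
        public static void moveDueMessages(KafkaConsumer<byte[], byte[]> consumer,
                                           KafkaProducer<byte[], byte[]> producer) {
            long cutoff = System.currentTimeMillis() - Duration.ofHours(1).toMillis();

            Map<TopicPartition, Long> query = new HashMap<>();
            for (PartitionInfo p : consumer.partitionsFor("topic-1")) {
                query.put(new TopicPartition(p.topic(), p.partition()), cutoff);
            }

            // Per partition: the first offset whose timestamp is >= cutoff, i.e. the
            // first message that is still too young to be moved (null = all are due).
            Map<TopicPartition, OffsetAndTimestamp> boundary = consumer.offsetsForTimes(query);

            ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
            for (TopicPartition tp : records.partitions()) {
                OffsetAndTimestamp tooYoung = boundary.get(tp);
                for (ConsumerRecord<byte[], byte[]> rec : records.records(tp)) {
                    if (tooYoung == null || rec.offset() < tooYoung.offset()) {
                        // Older than one hour: move it to topic-2.
                        producer.send(new ProducerRecord<>("topic-2", rec.key(), rec.value()));
                    } else {
                        // First message that is still too young: rewind so it is picked
                        // up again on a later run, and stop for this partition.
                        consumer.seek(tp, rec.offset());
                        break;
                    }
                }
            }
        }
    }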

Kafka Streams commits offset when producer throws an exception

In my Kafka Streams application I have a single processor that is scheduled to produce output messages every 60 seconds. The output message is built from messages that come from a single input topic. Sometimes the output message is bigger than the limit configured on the broker (1 MB by default). An exception is thrown and the application shuts down. The commit interval is set to the default (60 s).
In such a case I would expect that on the next run all messages that were consumed during the 60 s preceding the crash would be re-consumed. But in reality the offset of those messages is committed and the messages are not processed again on the next run.
Reading answers to similar questions, it seems to me that the offset should not be committed. When I increase the commit interval to 120 s (the processor still punctuates every 60 s), then it works as expected and the offset is not committed.
I am using the default processing guarantee but have also tried exactly_once. Both have the same result. Calling context.commit() from the processor seems to have no effect on the issue.
Am I doing something wrong here?
The contract of a Processor in Kafka Streams is that you have fully processed an input record and forward()ed all corresponding output messages before process() returns. This contract implies that Kafka Streams is allowed to commit the corresponding offset after process() returns.
It seems you "buffer" messages within process() in memory to emit them later. This violates the contract. If you want to "buffer" messages, you should attach a state store to the Processor and put all those messages into the store (cf. https://kafka.apache.org/25/documentation/streams/developer-guide/processor-api.html#state-stores). The store is managed by Kafka Streams for you and it is fault-tolerant. This way, after an error the state will be recovered and you don't lose any data (even if the input messages are not reprocessed).
I doubt that setting the commit interval to 120 seconds actually works as expected for all cases, because there is no alignment between when a commit happens and when punctuation is called.
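To illustrate the suggested approach, here is a hedged sketch against the pre-2.6 Processor API that the linked 2.5 documentation describes. The store name "buffer-store", the String key/value types and the trivial "emit everything" logic are assumptions; the store also has to be registered on the topology with addStateStore() and connected to this processor.

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.KeyValueStore;
    import java.time.Duration;

    public class BufferingProcessor implements Processor<String, String> {
        private ProcessorContext context;
        private KeyValueStore<String, String> buffer;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            this.buffer = (KeyValueStore<String, String>) context.getStateStore("buffer-store");

            // Emit the buffered output once per minute. Because the buffer lives in a
            // changelog-backed store, a crash between commits no longer loses the
            // messages seen since the last emit.
            context.schedule(Duration.ofSeconds(60), PunctuationType.WALL_CLOCK_TIME, timestamp -> {
                try (KeyValueIterator<String, String> it = buffer.all()) {
                    while (it.hasNext()) {
                        KeyValue<String, String> kv = it.next();
                        context.forward(kv.key, kv.value);
                        buffer.delete(kv.key);
                    }
                }
                context.commit();
            });
        }

        @Override
        public void process(String key, String value) {
            // Record the input in the fault-tolerant store instead of an in-memory field.
            buffer.put(key, value);
        }

        @Override
        public void close() { }
    }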
Some of this will depend on the client you are using and whether it's based on librdkafka.
Some of the answer will also depend on how you are "looping" over the "poll" method. A typical example will look like the code under "Automatic Offset Committing" at https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
But this assumes quite a rapid poll loop (100 ms + processing time) and an auto.commit.interval.ms of 1000 ms (the default is usually 5000 ms).
If I read your question correctly, you seem to be consuming messages once per 60 seconds?
Something to be aware of is that the behavior of the Kafka client is quite tied to how frequently poll is called (some libraries will wrap poll inside something like a "Consume" method). Calling poll frequently is important in order to appear "alive" to the broker. You will get other exceptions if you do not poll at least every max.poll.interval.ms (default 5 min), and it can lead to clients being kicked out of their consumer groups.
Anyway, to the point... auto.commit.interval.ms is just a maximum. If a message has been accepted/acknowledged or StoreOffset has been used, then, on poll, the client can decide to update the offset on the broker, maybe due to a client-side buffer size being hit or some other semantic.
Another thing to look at (especially if using a librdkafka-based client; others have something similar) is enable.auto.offset.store (default true). It will "automatically store the offset of the last message provided to the application", so every time you poll/consume a message from the client it will StoreOffset. If you also use auto commit, then your offset may move in ways you might not expect.
See https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md for the full set of config for librdkafka.
There are many, many ways of consuming/acknowledging. I think for your case the comment for max.poll.interval.ms on the config page might be relevant:
"Note: It is recommended to set enable.auto.offset.store=false for long-time processing applications and then explicitly store offsets (using offsets_store()) after message processing"
Sorry that this "answer" is a bit long winded. I hope there are some threads for you to pull on.

Kafka as a message queue for long running tasks

I am wondering if there is something I am missing about my set up to facilitate long running jobs.
For my purposes it is OK to have at-most-once message delivery; this means it is not required to think about committing offsets (or at least it is OK to commit each message's offset upon receiving it).
I have the following in order to achieve the competing consumer pattern:
A topic
X consumers in the same group
P partitions in a topic (where P >= X always)
My problem is that I have messages that can take ~15 minutes (but this may fluctuate by up to 50% lets say) in order to process. In order to avoid consumers having their partition assignments revoked I have increased the value of max.poll.interval.ms to reflect this.
However this comes with some negative consequences:
if some message exceeds this length of time then, in a worst-case scenario, the consumer processing this message will have to wait up to the value of max.poll.interval.ms for a rebalance
if I need to scale and increase the number of consumers based on load then any new consumers might also have to wait the value of max.poll.interval.ms for a rebalance to occur in order to process any new messages
As it stands at the moment I see that I can proceed as follows:
Set max.poll.interval.ms to be a small value and accept that every consumer processing every message will time out and go through the process of having assignments revoked and waiting a small amount of time for a rebalance
However I do not like this, and am considering looking at alternative technology for my message queue as I do not see any obvious way around this.
Admittedly I am new to Kafka, and it is just a gut feeling that the above is not desirable.
I have used RabbitMQ in the past for these scenarios, however we need Kafka in our architecture for other purposes at the moment and it would be nice not to have to introduce another technology if Kafka can achieve this.
I appreciate any advice that anybody can offer on this subject.
Using Kafka as a job queue for scheduling long-running processes is not a good idea, as Kafka is not a queue in the strictest sense and its semantics for failure handling and retries are limited. Though you might be able to achieve a compromise by playing around with certain configuration for rebalances or timeouts, it is likely to remain a brittle design. The simple answer is that Kafka was not designed for this kind of use case.
The idea of max.poll.interval.ms is to prevent a livelock situation (see), but in your case the consumer will send a false positive to the Kafka broker and trigger a rebalance, as there is no way to distinguish between a livelock and a legitimately long process.
You should think about the tradeoffs between living with the negative consequences you mentioned vs. introducing a new technology which helps you to model a job queue in a better way. For a more complex use case, check out how Slack is doing it.
The way we got around the issues we were having was as suggested in the comments.
We decided to decouple the message processing from the consumer polling.
On each worker/consumer there were 2 threads, one for doing the actual processing and the other for phoning home to Kafka periodically.
We also did some work with trying to reduce the processing times for messages.
However some messages still take time that can be measured in minutes.
This has worked for us now for some time with no issues.
Thanks for the suggestions in the comments, @Donal.
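For anyone looking for the shape of that solution, here is a hedged sketch of the two-thread split: the poll thread keeps phoning home while an executor does the long processing. The topic name, the consumerProps() helper and the job code are placeholders; it assumes max.poll.records=1 so each poll hands over at most one job, and rebalance handling is left out.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class LongRunningWorker {
        public static void main(String[] args) {
            ExecutorService executor = Executors.newSingleThreadExecutor();

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps())) {
                consumer.subscribe(Collections.singletonList("jobs"));       // placeholder topic name
                Future<?> inFlight = null;
                Collection<TopicPartition> paused = Collections.emptyList();

                while (true) {
                    // Polling continues even while a job is running, so the broker keeps
                    // seeing this member well inside max.poll.interval.ms.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));

                    if (inFlight == null && !records.isEmpty()) {
                        ConsumerRecord<String, String> job = records.iterator().next();
                        // Pause everything so subsequent polls return no new work until
                        // the current (possibly ~15 minute) job is finished.
                        paused = consumer.assignment();
                        consumer.pause(paused);
                        inFlight = executor.submit(() -> processJob(job));
                    }

                    if (inFlight != null && inFlight.isDone()) {
                        inFlight = null;
                        consumer.resume(paused);
                        consumer.commitSync();   // at-most-once is acceptable here per the question
                    }
                }
            }
        }

        static void processJob(ConsumerRecord<String, String> job) { /* minutes of work */ }

        static Properties consumerProps() { /* bootstrap servers, group id, deserializers, max.poll.records=1, auto-commit off */ return new Properties(); }
    }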

Testing Kafka producer throughput

We have a Kafka cluster consisting of 3 nodes, each with 32 GB of RAM and a 6-core 2.5 GHz CPU.
We wrote a Kafka producer that receives tweets from Twitter and sends them to Kafka in batches of 5000 tweets.
In the producer we use the producer.send(list<KeyedMessages>) method.
The average size of a tweet is 7 KB.
Printing the time in milliseconds before and after the send statement, to measure the time taken to send 5000 messages, we found that it takes about 3.5 seconds.
Questions
Is the way we test the Kafka performance correct?
Is using the send method that takes list of keyed messages the correct way to send batch of messages to Kafka? Is there any other way?
What are the important configurations that affects the producer performance?
You're measuring only the producer side? That metric tells you only how much data you can store in a unit of time.
Maybe that's what you wanted to measure, but since the title of your question is "Kafka performance", I would think that you'd actually want to measure how long it takes for a message to go through Kafka (usually referred to as end-to-end latency).
You'd achieve that by measuring the difference in time between sending a message and receiving that message on the other side, by a consumer.
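A bare-bones way to collect that number with the current Java clients, assuming the producer and consumer clocks are synchronized (or run on the same machine) and the topic uses the default CreateTime timestamps, so the record timestamp is already the send time (the topic name "tweets" and the recordLatency() sink are placeholders):

    // Producer side: nothing extra to embed, the record timestamp is set at send time.
    producer.send(new ProducerRecord<>("tweets", key, value));

    // Consumer side, running concurrently:
    for (ConsumerRecord<byte[], byte[]> rec : consumer.poll(Duration.ofMillis(100))) {
        long endToEndMs = System.currentTimeMillis() - rec.timestamp();
        recordLatency(endToEndMs);   // placeholder: feed a histogram / percentile tracker
    }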
If the cluster is configured correctly (default configuration will do), you should see latency ranging from only a couple of ms (less than 10ms), up to 50ms (few tens of milliseconds).
Kafka is able to do that because messages read by the consumer don't even touch the disk, because they are still in RAM (page cache and socket buffer cache). Keep in mind that this only works while your consumers are able to "catch up", i.e. don't have a large consumer lag. If a consumer lags behind the producers, the messages will eventually be purged from the cache (depending on the rate of messages, i.e. how long it takes for the cache to fill up with new messages), and will then have to be read from disk. Even that is not the end of the world (an order of magnitude slower, in the range of low 100s of ms), because messages are written consecutively, one after another in a straight line, so reading them back requires only a single disk seek.
BTW, you'd want to give Kafka's JVM only a small percentage of those 32 GB as heap, e.g. 5 to 8 GB (even the G1 garbage collector slows down with bigger heap sizes), and leave everything else unassigned so the OS can use it for the page and buffer cache.