Does a retention period of zero make sense in a Kafka broker?
We want to quickly forward messages from producer to consumer via a Kafka broker, serving them from the buffer cache/page cache on the broker machine without flushing to disk. We do not need replication and assume our broker will never crash.
When a message is produced to a Kafka topic, it is written to disk. Once the message has been consumed, the consumer commits the offset of this message (if you are using the high-level consumer API). However, there is no functionality that deletes only the messages that have been consumed: many consumers may subscribe to the same topic, and some of them might have consumed that message while others have not.
What I would suggest in your case is to set a short retention period (the default is 7 days) that still leaves your consumer a reasonable amount of time to consume the messages. To do this, configure the following parameter in server.properties:
log.retention.ms=X
Note that there is no guarantee that the deleted message(s) have been successfully consumed by your consumer(s). For example, if you set the retention period to 2 seconds (i.e. log.retention.ms=2000) and your consumer crashes, then every message which is sent to the topic while the consumer is down will be lost.
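To make the risk concrete, here is a minimal sketch of which messages would already be gone when a crashed consumer comes back. It is a simplified model, not broker code: the function name lost_messages and the timestamps are made up for illustration, and it assumes a message disappears the moment its age exceeds the retention period, whereas a real broker deletes whole log segments and may remove data later than that.

```python
def lost_messages(produced_at_ms, consumer_back_at_ms, retention_ms):
    """Return timestamps of messages already expired when the consumer returns.

    Simplified model (assumption): a message counts as deleted as soon as
    its age exceeds retention_ms. Real brokers delete per log segment, so
    actual deletion can happen later than this model predicts.
    """
    return [
        ts for ts in produced_at_ms
        if consumer_back_at_ms - ts > retention_ms
    ]


# Messages produced at 0 s, 1 s, 5 s and 9 s; the consumer is back at 10 s.
produced = [0, 1_000, 5_000, 9_000]
lost = lost_messages(produced, consumer_back_at_ms=10_000, retention_ms=2_000)
print(lost)  # only the message produced at 9 s is still within retention
```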
Related
I have a service that connects to Kafka as a message consumer. For every message I read, I commit that message's offset, so that if my service shuts down and restarts it will resume reading from the last read message onwards. My understanding is that the committed offset is maintained by Kafka.
Now my question is: do I have to worry about the offset? Can Kafka somehow lose that information, so that when the service restarts it starts reading from the beginning or the end of the topic, depending on my initial offset config? Or, if Kafka loses my offset, will it also have lost all messages in the topic, so that it is all right to read from the beginning?
Note: I use spring-kafka on the service, but not sure if that is relevant to the question.
In most cases where you have an active consumer (with manual or auto-committing), you don't need to worry about it.
The case where you do need to consider the auto.offset.reset setting is when the offsets.retention.minutes time on the broker has elapsed while your consumer group(s) are inactive. When this happens, Kafka compacts the __consumer_offsets topic and removes any offsets stored for those inactive groups.
Losing offsets doesn't affect the source topic. Your client topic(s) have their own independent retention settings, and their messages can be removed as well (or not), depending on how you've configured them.
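As a rough illustration of that rule, the sketch below decides where a restarted consumer would resume. It is a simplified model and not how the broker is implemented: resume_offset and its parameters are hypothetical names, and the exact expiry semantics differ between Kafka versions.

```python
def resume_offset(committed, group_inactive_minutes,
                  offsets_retention_minutes, reset_offset):
    """Return the offset a restarted consumer would resume from.

    Simplified model (assumption): the committed offset survives as long as
    the group has not been inactive for longer than
    offsets.retention.minutes. Otherwise the consumer falls back to
    whatever position auto.offset.reset resolves to (reset_offset here).
    """
    if committed is not None and group_inactive_minutes <= offsets_retention_minutes:
        return committed
    return reset_offset


# Example values only: a 7-day offsets retention (10080 minutes).
print(resume_offset(42, group_inactive_minutes=60,
                    offsets_retention_minutes=10_080, reset_offset=0))
print(resume_offset(42, group_inactive_minutes=20_000,
                    offsets_retention_minutes=10_080, reset_offset=0))
```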
I have a scenario where I want to send a message to an alert service that processes the message and sends it to HipChat.
But I want the message to be active only for a minute. If HipChat is down (hypothetically), then the message should not be sent to HipChat.
I am using Kafka: one service sends the message to Kafka, and the alert service (a Kafka consumer, which polls) consumes and processes it. While processing, it checks that the difference between the current time and the time of the message is not greater than one minute. If it is not, it sends the message to HipChat asynchronously.
Enhancement:
I want a way to construct a self-destructing message, so that it automatically disappears after one minute. Is there a way to do this with Kafka? Or is there a better alternative to Kafka (Flink/SQS)? If yes, how?
You can make use of the Kafka topic configurations retention.ms and delete.retention.ms as described in the Topic Level Configs.
The retention.ms should be set to 1 minute (60000 ms) and delete.retention.ms to 0 in your case (note that delete.retention.ms only affects log-compacted topics). That way, the messages will stay in the Kafka topic for about one minute before they become eligible for deletion. However, that also means that you might lose messages if your consumer takes more than one minute to consume all messages (especially when reading a topic from the beginning).
Details on those configurations are:
delete.retention.ms: The amount of time to retain delete tombstone markers for log compacted topics. This setting also gives a bound on the time in which a consumer must complete a read if they begin from offset 0 to ensure that they get a valid snapshot of the final stage (otherwise delete tombstones may be collected before they complete their scan).
retention.ms: This configuration controls the maximum time we will retain a log before we will discard old log segments to free up space if we are using the "delete" retention policy. This represents an SLA on how soon consumers must read their data. If set to -1, no time limit is applied.
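Because retention is applied to whole log segments, the consumer-side age check described in the question is still useful alongside these settings. A minimal sketch of that check, assuming the message carries a millisecond timestamp; is_fresh and TTL_MS are hypothetical names:

```python
import time

TTL_MS = 60_000  # one minute, as required in the question


def is_fresh(message_timestamp_ms, now_ms=None, ttl_ms=TTL_MS):
    """Return True if the message is at most ttl_ms old."""
    if now_ms is None:
        # Default to the current wall-clock time in milliseconds.
        now_ms = int(time.time() * 1000)
    return now_ms - message_timestamp_ms <= ttl_ms


# A message stamped at t=0 is forwarded at exactly one minute,
# and dropped one millisecond later.
print(is_fresh(0, now_ms=60_000))
print(is_fresh(0, now_ms=60_001))
```

The alert service would call is_fresh with the record timestamp before sending to HipChat and simply skip stale records.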
Playing around with Apache Kafka and its retention mechanism, I'm thinking about the following situation:
A consumer fetches first batch of messages with offsets 1-5
The cleaner deletes the first 10 messages, so the topic now has offsets 11-15
In the next poll, the consumer fetches the next batch with offsets 11-15
As you can see the consumer lost the offsets 6-10.
Question: is such a situation possible at all? In other words, will the cleaner execute while there is an active consumer? If yes, is the consumer able to somehow recognize that gap?
Yes, such a scenario can happen. The exact steps will be a bit different:
Consumer fetches messages 1-5
Messages 1-10 are deleted
Consumer tries to fetch message 6 but this offset is out of range
Consumer uses its offset reset policy auto.offset.reset to find a new valid offset.
If set to latest, the consumer moves to the end of the partition
If set to earliest the consumer moves to offset 11
If set to none, the consumer throws an exception (if the setting is left unset, it defaults to latest)
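The steps above can be sketched as a small function. This models the decision only and is not the client's actual code: resolve_position and the OffsetOutOfRange exception are hypothetical names, and log_end is taken to mean the offset of the next message to be written.

```python
class OffsetOutOfRange(Exception):
    """Hypothetical stand-in for the error raised when the policy is none."""


def resolve_position(requested, log_start, log_end, policy):
    """Return the offset the consumer continues from.

    log_start is the first offset still present in the partition;
    log_end is the offset the next produced message will get.
    """
    if log_start <= requested <= log_end:
        return requested  # still valid, no reset needed
    if policy == "earliest":
        return log_start
    if policy == "latest":
        return log_end
    if policy == "none":
        raise OffsetOutOfRange(f"offset {requested} is out of range")
    raise ValueError(f"unknown policy: {policy}")


# Messages 1-10 were deleted, 11-15 remain; the consumer asks for offset 6.
print(resolve_position(6, log_start=11, log_end=16, policy="earliest"))
print(resolve_position(6, log_start=11, log_end=16, policy="latest"))
```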
To avoid such scenarios, you should monitor the lead of your consumer group. It's similar to the lag, but the lead indicates how far from the start of the partition the consumer is. Being near the start has the risk of messages being deleted before they are consumed.
If consumers are near the limits, you can dynamically add more consumers or increase the topic retention size/time if needed.
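Lead and lag are simple differences between the consumer position and the two ends of the partition. A minimal sketch, with hypothetical function names and an arbitrary example threshold:

```python
def lead_and_lag(position, log_start_offset, log_end_offset):
    """Return (lead, lag) for one partition.

    lead: distance from the start of the partition to the consumer position.
    lag:  distance from the consumer position to the end of the partition.
    """
    lead = position - log_start_offset
    lag = log_end_offset - position
    return lead, lag


def at_risk(lead, min_lead):
    """True if the consumer is closer to the log start than min_lead."""
    return lead < min_lead


# Partition currently holds offsets 11-15; the consumer is at offset 12.
lead, lag = lead_and_lag(position=12, log_start_offset=11, log_end_offset=16)
print(lead, lag, at_risk(lead, min_lead=5))
```

A small lead means the consumer is reading close to the point where retention removes data, which is the signal to add consumers or extend retention.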
Setting auto.offset.reset to none will throw an exception if this happens; the other values only log it.
Question: is such a situation possible at all? Will the cleaner execute while there is an active consumer?
Yes, if the messages have passed their TTL (time to live) period before they are consumed, this situation is possible.
Is the consumer able to somehow recognize that gap?
If you suspect that your configuration (high consumer lag, low TTL) might lead to this, the consumer should track offsets. The kafka-consumer-groups.sh command shows the position of every consumer in a consumer group, as well as how far behind the end of the log each one is.
Setting the autocommit.enable option for the Kafka consumer causes consumed messages to be committed, which means that if a consumer crashes, it will start reading offsets from the last committed position.
But what if we restart the Kafka server? Will the consumer re-read already committed offsets, or does this option work in that case as well, so that after a server reboot only unread messages are consumed?
You asked (sort of):
Does the consumer re-read messages before the committed offset?
The answer is no. Once your offset is committed on the server, the consumers won't re-read any message (unless they manually want to).
But you might want to ask yourself this question:
Is it possible for consumers to consume the same message multiple times even if they enable autocommit?
The answer is yes, and preventing it is "not without effort". To understand why, read section 4.6 of the Kafka design documentation. Kafka doesn't provide exactly-once delivery guarantees on the consumer side. To make sure that multiple consumers don't consume the same messages, you need to coordinate between your consumer clients.
The other option is to make all your messages idempotent. That way it doesn't matter if multiple consumers process the same message several times.
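One common way to get idempotent handling is to remember which message IDs have already been applied and skip repeats. The sketch below assumes every message carries a unique ID and keeps the seen IDs in memory only; IdempotentHandler is a hypothetical name, and a real service would need a durable store shared by all consumers.

```python
class IdempotentHandler:
    """Apply each message at most once, keyed by a unique message ID."""

    def __init__(self, apply):
        self._apply = apply
        self._seen = set()  # in-memory only (assumption for this sketch)

    def handle(self, message_id, payload):
        """Return True if the payload was applied, False if it was a repeat."""
        if message_id in self._seen:
            return False
        self._apply(payload)
        # Record the ID only after the side effect succeeded.
        self._seen.add(message_id)
        return True


applied = []
handler = IdempotentHandler(applied.append)
results = [
    handler.handle("m1", "alert A"),
    handler.handle("m2", "alert B"),
    handler.handle("m1", "alert A"),  # redelivery of m1 is ignored
]
print(results, applied)
```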
The committed offset works across a Kafka server reboot, because:
When a producer publishes a message, it gets an offset that is immutable and is retained across server restarts
In Kafka 0.9 and later, the committed offset is stored in the topic __consumer_offsets, which is retained across server restarts
In Kafka before 0.9, the committed offset is stored in ZooKeeper, which persists it in its own log files
A typical kafka consumer looks like the following:
kafka-broker ---> kafka-consumer ----> downstream-consumer like Elastic-Search
And according to the documentation for Kafka High Level Consumer:
The ‘auto.commit.interval.ms’ setting is how often updates to the consumed offsets are written to ZooKeeper
It seems that there can be message loss if the following two things happen:
Offsets are committed just after some messages are retrieved from kafka brokers.
Downstream consumers (say Elastic-Search) fail to process the most recent batch of messages OR the consumer process itself is killed.
It would perhaps be ideal if the offsets were not committed automatically based on a time interval but were committed through an API. This would ensure that the kafka-consumer signals the committing of offsets only after it receives an acknowledgement from the downstream-consumer that the messages have been successfully consumed. There could be some replay of messages (if the kafka-consumer dies before committing offsets), but there would at least be no message loss.
Please let me know if such an API exists in the High Level Consumer.
Note: I am aware of the Low Level Consumer API in 0.8.x version of Kafka but I do not want to manage everything myself when all I need is just one simple API in High Level Consumer.
Ref:
AutoCommitTask.run(), look for commitOffsetsAsync
SubscriptionState.allConsumed()
There is a commitOffsets() API in the High Level Consumer API that can be used to solve this.
Also set the option "auto.commit.enable" to "false" so that the Kafka consumer never commits offsets automatically.
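The commit-after-acknowledgement pattern can be illustrated without a broker. The sketch below uses an in-memory FakeConsumer (a hypothetical stand-in, not the Kafka API) whose poll always reads from the last committed offset, which models a consumer restarting after a failure. The offset is committed only after the whole batch has been delivered downstream, so a failure causes a replay rather than a loss.

```python
class FakeConsumer:
    """In-memory stand-in for a consumer with manual commits (hypothetical)."""

    def __init__(self, messages):
        self._messages = list(messages)
        self.committed = 0  # offset of the next message to read

    def poll(self, max_records):
        # Always read from the committed offset, as after a restart.
        start = self.committed
        return list(enumerate(self._messages))[start:start + max_records]

    def commit(self, next_offset):
        self.committed = next_offset


def run_once(consumer, deliver, batch_size=2):
    """Deliver one batch downstream, then commit. Returns False when idle."""
    batch = consumer.poll(batch_size)
    if not batch:
        return False
    for _offset, value in batch:
        deliver(value)  # raises on failure, so the commit below is skipped
    consumer.commit(batch[-1][0] + 1)
    return True


delivered = []
failures = {"d": 1}  # downstream rejects "d" once


def deliver(value):
    if failures.get(value, 0) > 0:
        failures[value] -= 1
        raise RuntimeError(f"downstream failed on {value}")
    delivered.append(value)


consumer = FakeConsumer(["a", "b", "c", "d"])
while True:
    try:
        if not run_once(consumer, deliver):
            break
    except RuntimeError:
        continue  # nothing was committed, so the batch is polled again

print(delivered)  # "c" appears twice: replayed, but nothing is lost
```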