Apache Storm with Kafka offset management - apache-kafka

I have built a sample topology with Storm using Kafka as a source. Here is a problem for which I need a solution.
Every time I kill a topology and start it again, the topology starts processing from the beginning.
Suppose Message A in Topic X was processed by the topology, and then I kill the topology.
Now when I submit the topology again and Message A is still there in Topic X, it is processed again.
Is there a solution, maybe some sort of offset management, to handle this situation?

You shouldn't use storm-kafka for new code. It is deprecated, since the underlying client API is deprecated in Kafka and removed as of 2.0.0. Instead, use storm-kafka-client.
With storm-kafka-client you want to set a group id and a first poll offset strategy.
// UNCOMMITTED_EARLIEST is statically imported from the FirstPollOffsetStrategy enum in storm-kafka-client.
KafkaSpoutConfig<String, String> spoutConfig = KafkaSpoutConfig.builder(bootstrapServers, "your-topic")
    .setProp(ConsumerConfig.GROUP_ID_CONFIG, "kafkaSpoutTestGroup")
    .setFirstPollOffsetStrategy(UNCOMMITTED_EARLIEST)
    .build();
The above will make your spout start at the earliest offset the first time you start it, and then it will pick up where it left off if you restart it. The group id is used by Kafka to recognize the spout when it restarts, so it can get its stored offset checkpoint back. Other offset strategies will behave differently; you can check the javadoc for the FirstPollOffsetStrategy enum.
The spout will periodically checkpoint how far it got, and there is also a setting in the config to control this. The checkpointing is controlled by the setProcessingGuarantee setting in the config, and can be set to at-least-once (only checkpoint acked offsets), at-most-once (checkpoint before the spout emits the message), or no-guarantee (checkpoint periodically, ignoring acks).
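For illustration, here is a hedged sketch of how those settings fit on the same builder (assuming storm-kafka-client 2.x, where ProcessingGuarantee is a nested enum of KafkaSpoutConfig; the commit period value is made up):
// Sketch only: adjust class/enum locations to your storm-kafka-client version.
KafkaSpoutConfig<String, String> config = KafkaSpoutConfig.builder(bootstrapServers, "your-topic")
    .setProp(ConsumerConfig.GROUP_ID_CONFIG, "kafkaSpoutTestGroup")
    .setFirstPollOffsetStrategy(UNCOMMITTED_EARLIEST)
    // Only offsets of acked tuples are committed back to Kafka.
    .setProcessingGuarantee(KafkaSpoutConfig.ProcessingGuarantee.AT_LEAST_ONCE)
    // How often the spout commits offsets back to Kafka (illustrative value).
    .setOffsetCommitPeriodMs(10_000)
    .build();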
Take a look at one of the example topologies included with Storm https://github.com/apache/storm/blob/dc56e32f3dcdd9396a827a85029d60ed97474786/examples/storm-kafka-client-examples/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutTopologyMainNamedTopics.java#L93.

Make sure when creating your SpoutConfig that it has a fixed spout id by which it can identify itself after a restart.
From the official Storm site:
Important: When re-deploying a topology make sure that the settings
for SpoutConfig.zkRoot and SpoutConfig.id were not modified, otherwise
the spout will not be able to read its previous consumer state
information (i.e. the offsets) from ZooKeeper -- which may lead to
unexpected behavior and/or to data loss, depending on your use case.
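For the older storm-kafka spout this advice applies to, a minimal sketch (the ZooKeeper address, zkRoot, and id are placeholders; the classes come from the deprecated storm-kafka module):
// Deprecated storm-kafka API, shown only to illustrate keeping zkRoot and id fixed.
BrokerHosts hosts = new ZkHosts("zookeeper-host:2181");
SpoutConfig spoutConfig = new SpoutConfig(hosts, "your-topic", "/kafka-spout", "my-fixed-spout-id");
// Keep zkRoot ("/kafka-spout") and id ("my-fixed-spout-id") unchanged across redeploys
// so the spout can find its previously committed offsets in ZooKeeper.
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);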

Since I am facing a similar issue, I will take advantage of this thread and ask. I have code like this:
KafkaTridentSpoutConfig.Builder kafkaSpoutConfigBuilder = KafkaTridentSpoutConfig.builder(bootstrapServers, topic);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, fetchSizeBytes);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.GROUP_ID_CONFIG, clientId);
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
kafkaSpoutConfigBuilder.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
return new KafkaTridentSpoutOpaque(kafkaSpoutConfigBuilder.build());
But every time I restart the Storm LocalCluster, messages are read from the beginning. If I check the offsets directly in Kafka for that particular group, there is no lag. It's as if the offsets from Kafka are not being read.
Using Kafka 2.8 and Storm 2.2.0. I didn't have this issue with Storm 0.9.x.
Any idea?
Thanks!

Related

How to skip kafka history data in flink job if certain lag is encountered?

Sometimes we encounter lag in the Kafka consumer due to external issues.
The Flink job will always consume the Kafka history (delayed data) with exactly-once semantics, but here is our scenario:
We want to skip the delayed data when the Kafka consumer lag is too large, so that our downstream service gets the latest data in time.
I am thinking of setting a window period to do it. How should I code this?
I'd say your least painful option is to always read all the messages but not process the ones you want to skip (discard them as early as possible). Just reading and discarding, without any further processing, is really fast.
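As a rough sketch of that idea outside Flink (a plain Kafka consumer loop in Java; the topic name, group id, and lag threshold are made up for illustration):
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LagSkippingConsumer {
    // Hypothetical threshold: anything older than 5 minutes is read and dropped.
    private static final long MAX_LAG_MS = 5 * 60 * 1000L;

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "lag-skipping-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("your-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    if (System.currentTimeMillis() - record.timestamp() > MAX_LAG_MS) {
                        continue; // too old: read it, but skip all further processing
                    }
                    process(record);
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // Placeholder for the real downstream processing.
        System.out.println(record.value());
    }
}
The records are still fetched over the network, but since the skipped ones are dropped immediately, the consumer catches up quickly while spending essentially no time on them.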
You could stop the Flink job and use the kafka-consumer-groups CLI from Kafka to seek the consumer group forward (assuming Flink is using one, rather than maintaining offsets itself).
When the job restarts, it'll start from the new offset location.

kafka offset management auto vs manual

I'm working on a Spring Boot application which uses Kafka Streams. In my application, I want to manage the Kafka offsets and commit an offset only after the message has been processed successfully. This is important so I can be certain I won't lose messages even if Kafka is restarted or ZooKeeper is down. My current situation is that when my Kafka goes down and comes back up, my consumer starts from the beginning and consumes all the previous messages.
Also, I need to know what the difference is between managing the Kafka offsets automatically using autoCommitOffset and managing them manually using HBase, ZooKeeper, or checkpoints.
Also, what are the benefits of managing them manually if there is an automatic config we can use?
You have no guarantee of durability with auto commit
Older Kafka clients did use ZooKeeper for offset storage, but now it is all in the broker to minimize dependencies. The Kafka Streams API has no way to integrate offset storage outside of Kafka itself, so you would have to use the plain Consumer API to look up and seek/commit offsets to external storage, if you choose to do so; however, you can still end up with less than optimal message processing.
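A minimal sketch of that "look up and seek" pattern with the plain Consumer API (OffsetStore and its methods are hypothetical names standing in for HBase, a database, etc., not part of any Kafka API):
import java.util.Collection;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Hypothetical abstraction over the external store (HBase, a database, ...).
interface OffsetStore {
    long loadOffset(TopicPartition partition);           // last processed offset, or -1 if none
    void saveOffset(TopicPartition partition, long offset);
}

class ExternalOffsetSeeker {
    static void subscribeWithExternalOffsets(KafkaConsumer<String, String> consumer,
                                             OffsetStore store) {
        consumer.subscribe(List.of("your-topic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // Persist the position reached so far for each revoked partition.
                partitions.forEach(p -> store.saveOffset(p, consumer.position(p)));
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Resume from the externally stored offset rather than the broker-committed one.
                for (TopicPartition p : partitions) {
                    long offset = store.loadOffset(p);
                    if (offset >= 0) {
                        consumer.seek(p, offset + 1); // next record after the last processed one
                    }
                }
            }
        });
    }
}
The key point is that onPartitionsAssigned gives you a hook to seek to wherever your external store says you left off, instead of relying on the offsets committed to Kafka.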
my current situation is when my Kafka is down and up my consumer starts from the beginning and consumes all the previous messages
Sounds like you set auto.offset.reset=earliest and you never commit any offsets at all...
The auto commit setting does a periodic commit, not "automatic after reading any message".
If you want to guarantee delivery, then you need to set at least acks=1 in the producer and actually do a commitSync in the consumer.
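A minimal sketch of committing only after successful processing (assumes a KafkaConsumer<String, String> created with enable.auto.commit=false and already subscribed; handle is an illustrative placeholder for your processing):
// consumer was created with enable.auto.commit=false and subscribed to the topic.
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
    for (ConsumerRecord<String, String> record : records) {
        handle(record); // if processing throws, we never reach commitSync below
    }
    // Commit only after every record in the batch was processed successfully;
    // on a crash, the uncommitted records are redelivered (at-least-once).
    consumer.commitSync();
}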

Why do the offsets of the consumer-group (app-id) of my Kafka Streams Application get reset after application restart?

I have a Kafka Streams application for which, whenever I restart it, the offsets for the topic it is consuming get reset. Hence, for all partitions, the lags increase and the app needs to reprocess all the data.
UPDATE:
After the app gets restarted, the output topic receives a burst of events that were already processed; it is not that the input topic offsets are getting reset, as I said in the previous paragraph. However, the internal topic (KTABLE-SUPPRESS-STATE-STORE) offsets are getting reset, see the comments below.
I have ensured the lag is 1 for every partition before the restart (this is for the output topic).
All consumers that belong to that consumer-group-id (app-id) are active.
The restart is immediate, it takes around 30 secs.
The app is using exactly-once as the processing guarantee.
I have read this answer How does an offset expire for an Apache Kafka consumer group? .
I have tried with auto.offset.reset = latest and auto.offset.reset = earliest.
It seems like the offsets for these topics are not effectively committed, (but I am not sure about this).
I assume that after the restart the app should pick up from the latest committed offset for that consumer group.
UPDATE:
I assume this for the internal topic (KTABLE-SUPPRESS-STATE-STORE).
Does the Kafka Streams API ensure that all consumed offsets are committed before shutting down (i.e. after calling streams.close())?
I would really appreciate any clue about this.
UPDATE:
This is the code the App execute:
final StreamsBuilder builder = new StreamsBuilder();

final KStream<..., ...> events = builder
    .stream(inputTopicNames, Consumed.with(..., ...)
        .withTimestampExtractor(...));

events
    .filter((k, v) -> ...)
    .flatMapValues(v -> ...)
    .flatMapValues(v -> ...)
    .selectKey((k, v) -> v)
    .groupByKey(Grouped.with(..., ...))
    .windowedBy(
        TimeWindows.of(Duration.ofSeconds(windowSizeInSecs))
            .advanceBy(Duration.ofSeconds(windowSizeInSecs))
            .grace(Duration.ofSeconds(windowSizeGraceInSecs)))
    .reduce((agg, newValue) -> {
        ...
        return agg;
    })
    .suppress(Suppressed.untilWindowCloses(
        Suppressed.BufferConfig.unbounded()))
    .toStream()
    .to(outPutTopicNameOfGroupedData, Produced.with(..., ...));
The offset reset always, and only, happens (after restarting) for the KTABLE-SUPPRESS-STATE-STORE internal topic created by the Kafka Streams API.
I have tried with the processing guarantees exactly-once and at-least-once.
Once again, I would really appreciate any clue about this.
UPDATE:
This has been solved in the release 2.2.1 (https://issues.apache.org/jira/browse/KAFKA-7895)
The offset reset just and always happens (after restarting) with the KTABLE-SUPPRESS-STATE-STORE internal topic created by the Kafka Stream API.
This is currently (version 2.1) expected behavior, because the suppress() operator works in-memory only. Thus, on restart, the suppress buffer must be recreated from the changelog topic before processing can start.
Note that it is planned to let suppress() write to disk in future releases (cf. https://issues.apache.org/jira/browse/KAFKA-7224). This will avoid the overhead of recreating the buffer from the changelog topic.
I think @Matthias J. Sax's reply covers most of the internals of suppress. One thing I need to clarify though: when you say "restart the application", what exactly did you do? Did you shut down the whole application gracefully and then restart it?
Commit frequency is controlled by the parameter commit.interval.ms. Check whether your offsets are indeed committed; by default, offsets are committed every 100 ms (with exactly-once) or every 30 secs (with at-least-once), depending on your processing guarantee config.
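For example, a minimal configuration sketch showing those settings via org.apache.kafka.streams.StreamsConfig (the application id, bootstrap servers, and interval value are placeholders):
Properties streamsProps = new Properties();
streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// With exactly-once, commit.interval.ms defaults to 100 ms; with at-least-once it defaults to 30 s.
streamsProps.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
streamsProps.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100);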

Kafka producer buffering

Suppose there is a producer which is running and I run a consumer a few minutes later. I noticed that the consumer will consume old messages that have been produced by the producer, but I don't want that to happen. How can I do that? Is there any config parameter in the broker to be set to solve this problem?
It really depends on the use case; you didn't really provide much information about the architecture. For instance, once the consumer is up, is it a long-running consumer, or does it just wake up for a short while and consume the newly arriving messages?
You can take any of the following approaches:
Filter each ConsumerRecord by timestamp, so you automatically throw away messages that were produced more than a configurable time ago.
In my team we're using ephemeral groups. That is, each time the service goes up, we generate a new group id for the consumer group and set auto.offset.reset to latest.
Seek to timestamp: since Kafka 0.10 you can seek to a certain position. Use consumer.offsetsForTimes to get the offset of each topic partition for the desired time, and then use consumer.seek to move to the given offset (see the sketch after this answer).
If you use a consumer group but never commit to Kafka, then each time a consumer is assigned to a topic partition, it will start consuming according to the auto.offset.reset policy...
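A hedged sketch of the seek-to-timestamp approach (assumes an already-configured KafkaConsumer<String, String> named consumer whose partitions are already assigned, i.e. after assign() or after the first poll() when subscribing; the lookback window is a placeholder):
// Start from "5 minutes ago"; offsetsForTimes returns, per partition, the earliest
// offset whose timestamp is >= the requested time (or null if no such message exists).
long startFrom = System.currentTimeMillis() - Duration.ofMinutes(5).toMillis();

Map<TopicPartition, Long> query = new HashMap<>();
for (TopicPartition tp : consumer.assignment()) {
    query.put(tp, startFrom);
}

Map<TopicPartition, OffsetAndTimestamp> result = consumer.offsetsForTimes(query);
for (Map.Entry<TopicPartition, OffsetAndTimestamp> entry : result.entrySet()) {
    OffsetAndTimestamp oat = entry.getValue();
    if (oat != null) {
        consumer.seek(entry.getKey(), oat.offset());
    }
}
(TopicPartition and OffsetAndTimestamp come from the Kafka client libraries; Duration, Map, and HashMap from java.time and java.util.)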

kafka consumer sessions timing out

We have an application in which a consumer reads a message and the thread does a number of things, including database accesses, before a message is produced to another topic. The time between consuming and producing the message on the thread can take several minutes. Once the message is produced to the new topic, a commit is done to indicate we are done with the work on the consumer queue message. Auto commit is disabled for this reason.
I'm using the high-level consumer, and what I'm noticing is that the ZooKeeper and Kafka sessions time out because it is taking too long before we do anything on the consumer queue, so Kafka ends up rebalancing every time the thread goes back to read more from the consumer queue, and after a while it starts to take a long time before a consumer reads a new message.
I can set the ZooKeeper session timeout very high so that it is not a problem, but then I have to adjust the rebalance parameters accordingly, and Kafka won't pick up a new consumer for a while, among other side effects.
What are my options to solve this problem? Is there a way to heartbeat to Kafka and ZooKeeper to keep both happy? Would I still have these same issues if I were to use a simple consumer?
It sounds like your problems boil down to relying on the high-level consumer to manage the last-read offset. Using a simple consumer would solve that problem, since you control the persistence of that offset. Note that all the high-level consumer commit does is store the last-read offset in ZooKeeper. No other action is taken, and the message you just read is still there in the partition, readable by other consumers.
With the Kafka simple consumer, you have much more control over when and how that offset storage takes place. You can even persist the offset somewhere other than ZooKeeper (a database, for example).
The bad news is that while the simple consumer itself is simpler than the high-level consumer, there's a lot more work you have to do code-wise to make it work. You'll also have to write code to access multiple partitions, something the high-level consumer does quite nicely for you.
I think the issue is that the consumer's poll method is what triggers the consumer's heartbeat request, and even when you increase session.timeout the consumer's heartbeat may not reach the coordinator in time. Because of this heartbeat skipping, the coordinator marks the consumer dead. Consumer rejoining is also very slow, especially in the case of a single consumer.
I have faced a similar issue, and to solve it I had to change the following parameters in the consumer config properties:
session.timeout.ms=
request.timeout.ms=more than session timeout
You also have to add the following property in server.properties on the Kafka broker node:
group.max.session.timeout.ms =
You can see the following link for more detail.
http://grokbase.com/t/kafka/users/16324waa50/session-timeout-ms-limit