Kafka stream - define a retention policy for a changelog - apache-kafka

I use Kafka Streams for some aggregations of a TimeWindow.
I'm interested only in the final result of each window, so I use the .suppress() feature which creates a changelog topic for its state.
The retention policy configuration for this changelog topic is defined as "compact" which to my understanding will keep at least the last event for each key in the past.
The problem in my application is that keys often change. This means that the topic will grow indefinitely (each window will bring new keys which will never be deleted).
Since the aggregation is per window, after the aggregation was done, I don't really need the "old" keys.
Is there a way to tell Kafka Streams to remove keys from previous windows?
For that matter, I think configuring the changelog topic retention policy to "compact,delete" will do the job (which is available in kafka according to this: KIP-71, KAFKA-4015.
But is it possible to change the retention policy so using the Kafka Streams api?

suppress() operator sends tombstone messages to the changelog topic if a record is evicted from its buffer and sent downstream. Thus, you don't need to worry about unbounded growth of the topic. Changing the compaction policy might in fact break the guarantees that the operator provide and you might loose data.

Related

Kafka - uncompacted topics Vs compacted topics

I come across the following two phrases from the book "Mastering Kafka Streams and ksqlDB" and author used two terms, what does they really mean "compacted topics" and "uncompacted topics"
Does they got anything to with respect to "log compaction" ?
Tables can be thought of as updates to a database. In this view of the logs, only the current state (either the latest record for a given key or some kind of aggregation) for each key is retained. Tables are usually built from compacted topics.
Streams can be thought of as inserts in database parlance. Each distinct record remains in this view of the log. Streams are usually built from uncompacted topics.
Yes, log compaction according to kafka docs
Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition
https://kafka.apache.org/documentation/#compaction
If log compaction is enabled on topic, Kafka removes any old records when there is a newer version of it with the same key in the partition log.
For more detailed explanation of log compaction refer - https://medium.com/swlh/introduction-to-topic-log-compaction-in-apache-kafka-3e4d4afd2262
Yes, these terms are synonymous.
Ref: Log Compaction
From this article:
The idea behind compacted topics is that no duplicate keys exist. Only the most recent value for a message key is maintained.
It is mostly used for the scenarios such as restoring to the previous state before the application crashed or system failed, or while reloading cache after application restarts.
As an example of the above kafka has the topic __consumer_offsets which can be used to to continue from the last message which was read after a crash or a restart. A schema registry is also often used to ensure compatible communication between producers and consumers. The schemas used are maintained in the __schemas topic.

Kafka Topic Retention and impact on the State store in Kafka streams

I have a state store (using Materialized.as()) in Kafka streams word-count application. Based on my understanding the state-store is maintained in Kafka internal topic.
Following questions are :
Can state-stores have unlimited key-value pairs, or they are
governed by the rules of kafka topics based on the log.retention
policies or log.segment.bytes?
I set the log.retention.ms=60000 and
expected the state store value to be reset to 0 after a minute. But I find that it
is not happening, I can still see values from state store. Does kafka completely wipe out the logs or keeps
the SNAPSHOT in case log-compaction topic?
What does it mean by "segment gets committed"?
Please post along with the sources for solution if available.
Can state-stores have unlimited key-value pairs, or they are governed by the rules of kafka topics based on the log.retention policies or log.segment.bytes?
Yes, state stores can have unlimited key-value pairs = events (or 'messages'). Well, local app storage space and remote storage space in Kafka permitting, of course (the latter for durably storing the data in your state stores).
Your application's state stores are persisted remotely in compacted internal Kafka topics. Compaction means that Kafka periodically purges older events for the same event key (e.g., Bob's old account balances) from storage. But compacted topics do not remove the most recent event per event key (e.g., Bob's current account balance). There is no upper limit for how many such 'unique' key-value pairs will be stored in a compacted topic.
I set the log.retention.ms=60000 and expected the state store value to be reset to 0 after a minute. But I find that it is not happening, I can still see values from state store.
log.retention.ms is not used when a topic is configured to be compacted (log.cleanup.policy=compact). See the existing SO question Log compaction to keep exactly one message per key for details, including for why compaction doesn't happen immediately (in short that's because compaction operates on partition segment files, it will not touch the most current segment file, and there can be multiple events per event key in that file).
Note: You can nowadays set the configuration log.cleanup.policy to a combination of compaction and time/volume-based retention with log.cleanup.policy=compact,delete (see KIP-71 for details). But generally you shouldn't fiddle with this setting unless you really know what you are doing -- the defaults are what you need 99% of the time.
Does kafka completely wipe out the logs or keeps the SNAPSHOT in case log-compaction topic? What does it mean by "segment gets committed"?
I don't understand this question, unfortunately. :-) Perhaps my previous answers and reference links already cover your needs. What I can say is that, no, Kafka doesn't wipe out the logs completely. Compaction operates on a topic-partition's segment files. You will probably need to read up on how compaction works, for which I would suggest an article like https://medium.com/#sunny_81705/kafka-log-retention-and-cleanup-policies-c8d9cb7e09f8, in case the Apache Kafka docs weren't sufficiently clear yet.
State stores are maintained by compacted, internal topics. They therefore follow the same semantics of compacted topics, and have to finite retention duration
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Internal+Data+Management

Kafka stream KTable changelog TTL

Let’s say I have an A-Event KStream aggregated into an A-Snapshot KTable and a B-Event KStream aggregated into a B-Snapshot KTable. Neither A-Snapshot nor B-Snapshot conveys null values (delete events are instead aggregated as state attribute of the snapshot). At this point, we can assume we have a persisted kafka changelog topic and a rocksDB local store for both the A-KTable and B-KTable aggregations. Then, my topology would join the A-KTable with B-KTable to produce a joined AB-KStream. That said, my problem is with the A-KTable and B-KTable materialization lifecycles (both the changelog topic and the local rocksdb store). Let’s say A-Event topic and B-Event topic retention strategies were set to 2 weeks, is there a way to side effect the kafka internal KTable materialization topic retention policy (changelog and rocksDB) with the upstream event topic delete retention policies? Otherwise, can we configure the KTable materialization with some sort of retention policy that would manage both the changelog topic and the rockdb store lifecycle? Considering I can’t explicitly emit A-KTable and B-KTable snapshot tombstones? I am concerned that the changelog and the local store will grow indefinitely,..,
At the moment, KStream doesn't support out of the box functionality to inject the cleanup in Changelog topics based on the source topics retention policy. By default, it uses the "Compact" retention policy.
There is an open JIRA issue for the same :
https://issues.apache.org/jira/browse/KAFKA-4212
One option is to inject tombstone messages but that's not nice way.
In case of windowed store, you can use "compact, delete" retention policy.

Get latest values from a topic on consumer start, then continue normally

We have a Kafka producer that produces keyed messages in a very high frequency to topics whose retention time = 10 hours. These messages are real-time updates and the used key is the ID of the element whose value has changed. So the topic is acting as a changelog and will have many duplicate keys.
Now, what we're trying to achieve is that when a Kafka consumer launches, regardless of the last known state (new consumer, crashed, restart, etc..), it will somehow construct a table with the latest values of all the keys in a topic, and then keeps listening for new updates as normal, keeping the minimum load on Kafka server and letting the consumer do most of the job. We tried many ways and none of them seems the best.
What we tried:
1 changelog topic + 1 compact topic:
The producer sends the same message to both topics wrapped in a transaction to assure successful send.
Consumer launches and requests the latest offset of the changelog topic.
Consumes the compacted topic from beginning to construct the table.
Continues consuming the changelog since the requested offset.
Cons:
Having duplicates in compacted topic is a very high possibility even with setting the log compaction frequency the highest possible.
x2 number of topics on Kakfa server.
KSQL:
With KSQL we either have to rewrite a KTable as a topic so that consumer can see it (Extra topics), or we will need consumers to execute KSQL SELECT using to KSQL Rest Server and query the table (Not as fast and performant as Kafka APIs).
Kafka Consumer API:
Consumer starts and consumes the topic from beginning. This worked perfectly, but the consumer has to consume the 10 hours change log to construct the last values table.
Kafka Streams:
By using KTables as following:
KTable<Integer, MarketData> tableFromTopic = streamsBuilder.table("topic_name", Consumed.with(Serdes.Integer(), customSerde));
KTable<Integer, MarketData> filteredTable = tableFromTopic.filter((key, value) -> keys.contains(value.getRiskFactorId()));
Kafka Streams will create 1 topic on Kafka server per KTable (named {consumer_app_id}-{topic_name}-STATE-STORE-0000000000-changelog), which will result in a huge number of topics since we a big number of consumers.
From what we have tried, it looks like we need to either increase the server load, or the consumer launch time. Isn't there a "perfect" way to achieve what we're trying to do?
Thanks in advance.
By using KTables, Kafka Streams will create 1 topic on Kafka server per KTable, which will result in a huge number of topics since we a big number of consumers.
If you are just reading an existing topic into a KTable (via StreamsBuilder#table()), then no extra topics are being created by Kafka Streams. Same for KSQL.
It would help if you could clarify what exactly you want to do with the KTable(s). Apparently you are doing something that does result in additional topics being created?
1 changelog topic + 1 compact topic:
Why were you thinking about having two separate topics? Normally, changelog topics should always be compacted. And given your use case description, I don't see a reason why it should not be:
Now, what we're trying to achieve is that when a Kafka consumer launches, regardless of the last known state (new consumer, crashed, restart, etc..), it will somehow construct a table with the latest values of all the keys in a topic, and then keeps listening for new updates as normal [...]
Hence compaction would be very useful for your use case. It would also prevent this problem you described:
Consumer starts and consumes the topic from beginning. This worked perfectly, but the consumer has to consume the 10 hours change log to construct the last values table.
Note that, to reconstruct the latest table values, all three of Kafka Streams, KSQL, and the Kafka Consumer must read the table's underlying topic completely (from beginning to end). If that topic is NOT compacted, this might indeed take a long time depending on the data volume, topic retention settings, etc.
From what we have tried, it looks like we need to either increase the server load, or the consumer launch time. Isn't there a "perfect" way to achieve what we're trying to do?
Without knowing more about your use case, particularly what you want to do with the KTable(s) once they are populated, my answer would be:
Make sure the "changelog topic" is also compacted.
Try KSQL first. If this doesn't satisfy your needs, try Kafka Streams. If this doesn't satisfy your needs, try the Kafka Consumer.
For example, I wouldn't use the Kafka Consumer if it is supposed to do any stateful processing with the "table" data, because the Kafka Consumer lacks built-in functionality for fault-tolerant stateful processing.
Consumer starts and consumes the topic from beginning. This worked
perfectly, but the consumer has to consume the 10 hours change log to
construct the last values table.
During the first time your application starts up, what you said is correct.
To avoid this during every restart, store the key-value data in a file.
For example, you might want to use a persistent map (like MapDB).
Since you give the consumer group.id and you commit the offset either periodically or after each record is stored in the map, the next time your application restarts it will read it from the last comitted offset for that group.id.
So the problem of taking a lot of time occurs only initially (during first time). So long as you have the file, you don't need to consume from beginning.
In case, if the file is not there or is deleted, just seekToBeginning in the KafkaConsumer and build it again.
Somewhere, you need to store this key-values for retrieval and why cannot it be a persistent store?
In case if you want to use Kafka streams for whatever reason, then an alternative (not as simple as the above) is to use a persistent backed store.
For example, a persistent global store.
streamsBuilder.addGlobalStore(Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore(topic), keySerde, valueSerde), topic, Consumed.with(keySerde, valueSerde), this::updateValue);
P.S: There will be a file called .checkpoint in the directory which stores the offsets. In case if the topic is deleted in the middle you get OffsetOutOfRangeException. You may want to avoid this, perhaps by using UncaughtExceptionHandler
Refer to https://stackoverflow.com/a/57301986/2534090 for more.
Finally,
It is better to use Consumer with persistent file rather than Streams for this, because of simplicity it offers.

Apache Kafka streaming KTable changelog

I'm using Apache Kafka streaming to do aggregation on data consumed from a Kafka topic. The aggregation is then serialized to another topic, itself consumed and results stored in a DB. Pretty classic use-case I suppose.
The result of the aggregate call is creating a KTable backed up by a Kafka changelog "topic".
This is more complex than that in practice, but let's say it is storing the count and sum of events for a given key (to compute average):
KTable<String, Record> countAndSum = groupedByKeyStream.aggregate(...)
That changelog "topic" does not seem to have a retention period set (I don't see it "expires" on the contrary of the other topics per my global retention setting).
This is actually good/necessary because this avoids losing my aggregation state when a future event comes with the same key.
However on the long run this means this changelog will grow forever (as more keys get in)? And I do potentially have a lot of keys (and my aggregation are not as small as count/sum).
As I have a means to know that I won't get anymore events of a particular key (some events are marked as "final"), is there a way for me to strip the aggregation states for these particular keys of the changelog to avoid having it grows forever as I won't need them anymore, possibly with a slight delay "just" in case?
Or maybe there is a way to do this entirely differently with Kafka streaming to avoid this "issue"?
Yes: changelog topics are configured with log compaction and not with retention time. If you receive the "final" record, your aggregation can just return null as aggregation result. This will delete it from the local RocksDB store as well as the underlying changelog topic.