I'm experimenting with kafka streams and I have the following setup:
I have an existing kafka topic with a key space that is unbounded (but predictable and well known).
My topic has a retention policy (in bytes) to age out old records.
I'd like to materialize this topic into a KTable where I can use the Interactive Queries API to retrieve records by key.
Is there any way to make my KTable "inherit" the retention policy from my topic, so that when records are aged out of the primary topic, they're no longer available in the KTable?
I'm worried about dumping all of the records into the KTable and having the StateStore grow unbounded.
One solution that I can think of is to transform into a windowed stream with hopping windows equal to a TimeToLive for the record, but I'm wondering if there's a better solution in a more native way.
Thanks.
It is unfortunately not supported at the moment. There is a JIRA for it, though: https://issues.apache.org/jira/browse/KAFKA-4212
Another possibility would be to insert tombstone messages (<key,null>) into the input topic. The KTable would pick those up and delete the corresponding key from the store.
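For illustration, a tombstone can be written with the plain producer API. This is just a minimal sketch; the topic name, key, and serializers are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value is a tombstone: the KTable built from this topic will
            // remove the corresponding key from its state store when it reads it.
            producer.send(new ProducerRecord<String, String>("input-topic", "some-key", null));
        }
    }
}
```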
Most of the information I find relates to primary key joins. I understand foreign key joins are a relatively new feature for Kafka Streams. I'm interested in how this will scale. I understand that Kafka Streams parallelism is capped by the number of partitions on each topic; however, I have a few questions about what it means to increase the input topic partitions.
Does the foreign key join have the same requirement to co-partition input topics? That is, do both topics need to have the same number of partitions?
How does one add partitions later, after the application has been running in production for months or years? The changelog topics backing each KTable store data from certain input topic partitions. If one is to increase the partitions in the input topics, how does this impact our KTables' state stores and changelogs? Presumably, we cannot just start over and lose that data, since it has accumulated over months and years and is essential to performing the join; it may not be quickly replaced by upstream data. Do we need to blow away our state stores, create new input topics, and re-send all KTable changelog topic data to them?
How about the other internal "subscription" topics?
Does the foreign key join have the same requirement to co-partition input topics? That is, do both topics need to have the same number of partitions?
No. For more details check out https://www.confluent.io/blog/data-enrichment-with-kafka-streams-foreign-key-joins/
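For illustration, a foreign-key join in the DSL looks roughly like the sketch below. The topic names and the assumption that the order value carries the customer id as plain text are mine, not from the question:

```java
// Sketch only: assumes String keys/values and default serdes configured for the application.
StreamsBuilder builder = new StreamsBuilder();

KTable<String, String> orders = builder.table("orders");       // key = orderId, value = customerId
KTable<String, String> customers = builder.table("customers"); // key = customerId, value = customer name

// The foreign-key extractor pulls the customerId out of the order value; the two
// input topics do not need to be co-partitioned, Streams routes the data through
// internal subscription/response topics.
KTable<String, String> enriched = orders.join(
        customers,
        orderValue -> orderValue,
        (orderValue, customerName) -> orderValue + " -> " + customerName);
```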
How does one add partitions later, after the application has been running in production for months or years?
You cannot really do this, even if you don't use Kafka Streams. The issue is that your input data is partitioned by key, and if you add a partition, the existing partitioning of your input topic breaks. The recommended pattern is to create a new topic with a different number of partitions.
The changelog topics backing each KTable store data from certain input topic partitions. If one is to increase the partitions in the input topics, how does this impact our KTables' state stores and changelogs?
It would break the application. In fact, Kafka Streams checks this and will raise an exception if it detects that the number of input topic partitions does not match the number of changelog topic partitions.
We have a Kafka Streams aggregation topology. We need to keep the size of the changelog topic in check to reduce Kafka storage costs, so we use a transformer (DSL API) in the topology to schedule a punctuation that deletes old records from the state store using keyValueStore.delete().
I am able to verify that after a delete, on subsequent scheduled triggers of the punctuation, the deleted key is no longer in the state store. But does the delete remove the record from the changelog topic as well? More importantly, does it reduce the size of the changelog topic so that the Kafka storage cost stays in check?
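For context, the punctuation looks roughly like the sketch below. The store name "my-store", the TTL, and the use of the record timestamp as the stored value are simplifications, not the exact production code:

```java
import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class TtlTransformer implements Transformer<String, String, KeyValue<String, String>> {

    private static final long TTL_MS = Duration.ofDays(7).toMillis();

    private ProcessorContext context;
    private KeyValueStore<String, Long> store;   // key -> timestamp of last update

    @Override
    @SuppressWarnings("unchecked")
    public void init(final ProcessorContext context) {
        this.context = context;
        this.store = (KeyValueStore<String, Long>) context.getStateStore("my-store");
        // Periodically scan the store and delete entries older than the TTL.
        context.schedule(Duration.ofHours(1), PunctuationType.WALL_CLOCK_TIME, timestamp -> {
            try (KeyValueIterator<String, Long> it = store.all()) {
                while (it.hasNext()) {
                    final KeyValue<String, Long> entry = it.next();
                    if (timestamp - entry.value > TTL_MS) {
                        store.delete(entry.key);
                    }
                }
            }
        });
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        store.put(key, context.timestamp());   // remember when this key was last seen
        return KeyValue.pair(key, value);
    }

    @Override
    public void close() { }
}
```

It is wired into the topology with stream.transform(TtlTransformer::new, "my-store").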
Yes, changes to the state store are applied to the changelog topic.
No, records are not physically deleted from the changelog topic when you issue a "delete". Be aware that a "delete" is in fact a record with a null value (aka a tombstone) written to the topic (changelog or any other) - see here:

null values are interpreted in a special way: a record with a null value represents a "DELETE" or tombstone for the record's key

So, in fact, it is this interpretation that makes it feel like a deletion. One could read a changelog topic (you'll have to know the exact topic name) as a KStream or with the Kafka Consumer API and find the tombstone records there (until they are removed by the compaction or retention thread). But if you read a changelog or any other compacted topic as a KTable, a tombstone record will trigger a deletion from the associated store - you'll no longer find the related key in the store, even though the tombstone still exists in the underlying compacted topic.

If the compaction policy is enabled on a topic (it is enabled by default on changelog topics), then all but the last record for a specific key are eventually removed. So at some point you'll only have the delete record left, because the previous records with the same key have been removed by the Kafka compaction thread.
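For example, you can observe those tombstones by reading the changelog topic directly with the consumer API. A rough sketch, assuming the default changelog naming convention <application.id>-<store name>-changelog and String serdes:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ChangelogInspector {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "changelog-inspector");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // "my-app-my-store-changelog" is a placeholder changelog topic name.
            consumer.subscribe(Collections.singletonList("my-app-my-store-changelog"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    if (record.value() == null) {
                        // Tombstones are still physically present until compaction removes them.
                        System.out.println("tombstone for key " + record.key());
                    }
                }
            }
        }
    }
}
```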
Let's say I have an A-Event KStream aggregated into an A-Snapshot KTable and a B-Event KStream aggregated into a B-Snapshot KTable. Neither A-Snapshot nor B-Snapshot conveys null values (delete events are instead aggregated as a state attribute of the snapshot). At this point, we can assume we have a persisted Kafka changelog topic and a RocksDB local store for both the A-KTable and B-KTable aggregations. My topology then joins the A-KTable with the B-KTable to produce a joined AB-KStream. That said, my problem is with the A-KTable and B-KTable materialization lifecycles (both the changelog topic and the local RocksDB store). Let's say the A-Event topic and B-Event topic retention strategies are set to 2 weeks: is there a way to have the upstream topics' retention policies carry over to the internal KTable materializations (changelog and RocksDB)? Otherwise, can we configure the KTable materialization with some sort of retention policy that would manage both the changelog topic and the RocksDB store lifecycle, considering I can't explicitly emit A-KTable and B-KTable snapshot tombstones? I am concerned that the changelog and the local store will grow indefinitely.
At the moment, Kafka Streams doesn't support out-of-the-box functionality to clean up changelog topics based on the source topic's retention policy. By default, changelog topics use the "compact" cleanup policy.
There is an open JIRA issue for this:
https://issues.apache.org/jira/browse/KAFKA-4212
One option is to inject tombstone messages, but that's not a nice way to do it.
In the case of a windowed store, you can use the "compact,delete" cleanup policy.
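For illustration, here is a rough sketch of overriding the changelog configs of a windowed store via Materialized. The store name, window size, and retention values are placeholders:

```java
// Sketch only: assumes a KStream<String, String> named events and default String serdes.
Map<String, String> changelogConfig = new HashMap<>();
changelogConfig.put(TopicConfig.CLEANUP_POLICY_CONFIG, "compact,delete");
changelogConfig.put(TopicConfig.RETENTION_MS_CONFIG, String.valueOf(Duration.ofDays(14).toMillis()));

KTable<Windowed<String>, Long> counts = events
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
        .count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as("windowed-counts")
                .withRetention(Duration.ofDays(14))           // bounds the local window store
                .withLoggingEnabled(changelogConfig));        // extra configs for the changelog topic
```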
I use Kafka Streams for some TimeWindow aggregations.
I'm interested only in the final result of each window, so I use the .suppress() feature which creates a changelog topic for its state.
The retention policy configuration for this changelog topic is defined as "compact" which to my understanding will keep at least the last event for each key in the past.
The problem in my application is that keys often change. This means that the topic will grow indefinitely (each window will bring new keys which will never be deleted).
Since the aggregation is per window, after the aggregation was done, I don't really need the "old" keys.
Is there a way to tell Kafka Streams to remove keys from previous windows?
For that matter, I think configuring the changelog topic's cleanup policy to "compact,delete" will do the job (which is available in Kafka according to KIP-71 and KAFKA-4015).
But is it possible to change this retention policy using the Kafka Streams API?
The suppress() operator sends tombstone messages to its changelog topic when a record is evicted from its buffer and sent downstream. Thus, you don't need to worry about unbounded growth of the topic. Changing the compaction policy might in fact break the guarantees that the operator provides, and you might lose data.
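For reference, a final-results-only windowed aggregation looks roughly like this; the window size, grace period, and input are placeholders:

```java
// Sketch only: assumes a KStream<String, String> named input and default serdes.
KTable<Windowed<String>, Long> finalCounts = input
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMinutes(5)).grace(Duration.ofMinutes(1)))
        .count()
        // Emit only the final result per window once the grace period has passed;
        // evicted buffer entries are tombstoned in the suppress changelog.
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));

finalCounts.toStream().foreach((windowedKey, count) ->
        System.out.println(windowedKey + " -> " + count));
```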
Kafka guarantees that messages with the same key will always go to the same partition.
For instance, I have a message with the string key "2329" and two topics, t1 and t2. When I write this message, it goes into partition 1 in both topics, as expected.
Now the problem itself: I'm using the Kafka Streams 0.10.2.0 persistent state store, which automatically creates a backing changelog topic. In the case of this changelog topic, the message with key "2329" goes into another partition (partition 0), which is strange to me.
Has anyone encountered this issue?
I've found where the issue was.
To increase performance, I had skipped repartitioning before writing data to the state store and used another column from the value as the state store key. It worked until an additional topic with enrichment information was added. So I had simply forgotten to perform the repartitioning.
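For anyone hitting the same thing, a minimal sketch of the fix, assuming the newer repartition() operator and a key derived from the value (topic and store names are placeholders):

```java
// Sketch only: re-key by a field of the value and let Streams repartition before
// the data is materialized, so the store and its changelog are partitioned by the new key.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("enrichment-input");

KTable<String, String> enrichmentTable = source
        .selectKey((key, value) -> value)                      // derive the new key from the value
        .repartition(Repartitioned.as("enrichment-rekeyed"))   // explicit repartition topic
        .toTable(Materialized.as("enrichment-store"));
```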