How long is the data in KTable stored? - apache-kafka

Using this as a reference, a stream of profile updates is stored in a KTable object.
How long will this data be stored in the KTable object?
Let's say we run multiple instances of the application, and somehow an instance crashes. What happens to the KTable data belonging to that instance? Will it be "recovered" by another instance?
I am thinking about storing updates for data that is rarely updated. So if an instance crashes and another instance has to build that data from scratch again, it is possible it will never get that data again, because it will never be streamed again, or only very rarely.

The KTable is backed by a topic, so how long the data is kept is determined by that topic's retention and cleanup policies.
If the cleanup policy is compact, then each unique key is stored "forever", or until the broker runs out of space, whichever is sooner.
If you run multiple instances, each KTable will hold onto only the subset of data from the partitions its instance consumes; no single table will have all the data.
If any instance crashes, it will need to read all data from the beginning of its changelog topic, but you can configure standby replicas to account for that scenario.
More info at https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Internal+Data+Management
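For example, a minimal sketch of enabling standby replicas (the application id and broker address below are placeholders):
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "profile-updates-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Keep one warm replica of each state store on another instance, so a crashed
// instance's KTable data can be taken over without replaying the whole
// changelog topic from the beginning.
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);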

Related

How does KStreams handle state store data when adding additional partitions?

I have one partition of data with one app instance and one local state store. It's been running for a time and has lots of stateful data. I need to update that to 5 partitions with 5 app instances. What happens to the one local state store when the partitions are added and the app is brought back online? Do I have to delete the local state store and start over? Will the state store be shuffled across the additional app instance state stores automatically according to the partitioning strategy?
Do I have to delete the local state store and start over?
That is the recommended way to handle it (cf. https://docs.confluent.io/platform/current/streams/developer-guide/app-reset-tool.html). As a matter of fact, if you change the number of input topic partitions and restart your application, Kafka Streams will fail with an error, because the state store has only one shard, while 5 shards would be expected given that you now have 5 input topic partitions.
Will the state store be shuffled across the additional app instance state stores automatically according to the partitioning strategy?
No. Also note that this applies to the data in your input topic as well. Thus, if you partition your input data by key (i.e., when writing into the input topic upstream), old records will remain in their existing partitions and thus will not be partitioned properly.
In general, it is recommended to over-partition your input topics upfront to avoid having to change the number of partitions later on. Thus, you might also consider going up to 10, or even 20, partitions instead of just 5.
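To complement the reset tool, the instance's local state can also be wiped programmatically before restarting; a rough sketch, with the topology and configuration details elided or replaced by placeholders:
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

StreamsBuilder builder = new StreamsBuilder();
// ... topology definition ...

KafkaStreams streams = new KafkaStreams(builder.build(), props);
// Deletes this instance's local state directory, so the stores are rebuilt to
// match the new number of partitions. Run the application reset tool first to
// reset offsets and remove internal changelog/repartition topics.
streams.cleanUp();
streams.start();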

Can we share an application level cache between multiple Kafka Streams tasks

Let's say I have an in-memory cache in a Kafka Streams application. The input topic has 2 partitions, so for maximum parallelism I configure 1 streams application instance with 2 threads.
Within my stream processor, I make a remote call to fetch some data and put it in a Map to cache it.
Since Kafka Streams will assign 1 thread to each task and both tasks will try to update the cached map in parallel, do I have to take care of making the cached map thread-safe? Is it not advisable to share an application-level cache within an application instance that could be running multiple Kafka Streams tasks?
I believe what you are looking for is a GlobalKTable, which stores data from all the partitions. The way I see it, you would need to make that remote call, push the result into a topic, and then use that topic to create a GlobalKTable within the same app. A GlobalKTable is backed by a RocksDB instance which stores data on your "local" file system, and it can be queried by key, much like how you would query a Map.
Word of caution: GlobalKTable source topics can get really huge and might impact your startup times if you aren't using a persistent file system, since the GlobalKTable needs to be hydrated with all the data on the "source" topic (this is done by GlobalStreamThread) before the app actually starts. So, you might want to configure compaction on the "source" topic.
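A rough sketch of what that could look like, assuming the remote-call results are written upstream to a hypothetical compacted topic named "remote-data-cache" (all topic and store names below are made up):
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

StreamsBuilder builder = new StreamsBuilder();

// Every instance reads all partitions of "remote-data-cache", so any task can
// look up any key locally, no matter which input partition it processes.
GlobalKTable<String, String> cache = builder.globalTable(
        "remote-data-cache",
        Consumed.with(Serdes.String(), Serdes.String()));

KStream<String, String> input = builder.stream(
        "input-topic",
        Consumed.with(Serdes.String(), Serdes.String()));

// Join each input record against the global table by key and forward the result.
input.join(cache,
        (key, value) -> key,                       // map the record to the table key
        (value, cached) -> value + "|" + cached)   // combine input value with cached value
     .to("enriched-output", Produced.with(Serdes.String(), Serdes.String()));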

Apache Kafka - KStream and KTable hard disk space requirements

I am trying to better understand what happens at the level of resources when you create a KStream and a KTable. Below, I will mention some conclusions that I have come to, as I understand them (feel free to correct me).
Firstly, every topic has a number of partitions, and all the messages in those partitions are stored on the hard disk(s) in the order they were appended.
A KStream does not need to store the messages that are read from a topic again in another location, because the offset is sufficient to retrieve those messages from the topic it is connected to.
(Is this correct?)
The question regards the KTable. As I understand it, a KTable, in contrast with a KStream, updates the entry for each key as new messages with that key arrive. In order to do that, you either have to store the messages that arrive from the topic externally in a static table, or read the whole message queue each time a new message arrives. The latter does not seem very efficient in terms of time. Is the first approach I presented correct?
read the whole message queue each time a new message arrives.
All messages are only read at the fresh start of the application. Once the app reads up to the latest offset, it's just updating the table like any other consumer.
How much disk is used ultimately depends on the state store you've configured for the application, along with its own settings: for example, in-memory vs. RocksDB vs. an external state store interface that you've written on your own.
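For illustration, a hedged sketch of choosing the store that backs a KTable via Materialized; the topic and store names below are made up:
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

StreamsBuilder builder = new StreamsBuilder();

// Default: RocksDB-backed store, kept on local disk under state.dir.
KTable<String, String> onDisk = builder.table(
        "profiles",
        Consumed.with(Serdes.String(), Serdes.String()),
        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("profiles-rocksdb"));

// Alternative: in-memory store; no local disk usage, but the whole table must
// fit in heap and is restored from the changelog topic on restart.
KTable<String, String> inMemory = builder.table(
        "profiles-compacted",
        Consumed.with(Serdes.String(), Serdes.String()),
        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as(
                Stores.inMemoryKeyValueStore("profiles-inmem")));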

Difference between KTable and local store

What is the difference between these entities?
As I think, KTable - a simple Kafka topic with a compaction deletion policy. Also, if logging is enabled for a KTable, then there is also a changelog, and then the deletion policy is compact,delete.
Local store - an in-memory key-value cache based on RocksDB. But a local store also has a changelog.
In both cases, we get the last value for a key for a certain period of time (?). A local store is used for aggregation steps, joins, etc. But a new topic with a compaction strategy is also created after it.
For example:
KStream<K, V> source = builder.stream(topic1);
KTable<K, V> table = builder.table(topic2); // what will happen here if I read data from a topic with deletion policy delete and compaction? Will an additional topic be created to store the data, or will just a local store (cache) be used for it?
// or
KTable<K, V> table2 = builder.table(..., Materialized.as("key-value-store-name")) // what will happen here? As I think, I just specified a concrete name for the local store and now I can query it as a regular key-value store
source.groupByKey().aggregate(initialValue, aggregationLogic, Materialized.as(...)) // Will a new aggregation topic be created here with a compaction deletion policy? Or will only a local store be used?
Also I can create a state store using builder.addStateStore(...) where I can enable/disable logging (changelog) and caching (???).
I've read this: https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html, but some details are still unclear to me. Especially the case where we can disable the StreamsCache (but not the RocksDB cache) and then we get a full stream of every update, like a CDC system for a relational database.
A KTable is a logical abstraction of a table that is updated over time. Additionally, you can think of it not as a materialized table, but as a changelog stream that consists of all update records to the table. Compare https://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables. Hence, conceptually a KTable is something hybrid if you wish, however, it's easier to think of it as a table that is updated over time.
Internally, a KTable is implemented using RocksDB and a topic in Kafka. RocksDB stores the current data of the table (note that RocksDB is not an in-memory store, and can write to disk). At the same time, each update to the KTable (ie, to RocksDB) is written into the corresponding Kafka topic. The Kafka topic is used for fault-tolerance reasons (note that RocksDB itself is considered ephemeral and writing to disk via RocksDB does not provide fault-tolerance; the changelog topic does), and it is configured with log compaction enabled to make sure that the latest state of RocksDB can be restored by reading from the topic.
If you have a KTable that is created by a windowed aggregation, the Kafka topic is configured with compact,delete to expire old data (ie, old windows) to avoid that the table (ie, RocksDB) grows unbounded.
Instead of RocksDB, you can also use an in-memory store for a KTable that does not write to disk. This store would also have a changelog topic that tracks all updates to the store for fault-tolerance reasons.
If you add a store manually via builder.addStateStore() you can also add RocksDB or in-memory stores. In this case, you can enable changelogging for fault-tolerance similar to a KTable (note that when a KTable is created, internally, it uses the exact same API -- ie, a KTable is a higher-level abstraction hiding some internal details).
For caching: this is implemented within Kafka Streams on top of a store (either RocksDB or in-memory) and you can enable/disable it for "plain" stores you add manually, or for KTables. Compare https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html. Thus, this caching is independent of RocksDB's own caching.
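As a sketch of the manual-store case (the store name and serdes below are placeholders, not prescribed):
import java.util.Collections;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

StreamsBuilder builder = new StreamsBuilder();

// Manually registered RocksDB store with a changelog topic (for fault tolerance)
// and the Streams record cache enabled on top of it.
StoreBuilder<KeyValueStore<String, String>> storeBuilder =
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("my-store"),   // or Stores.inMemoryKeyValueStore(...)
                Serdes.String(),
                Serdes.String())
              .withLoggingEnabled(Collections.emptyMap())     // changelog with default topic configs; withLoggingDisabled() turns it off
              .withCachingEnabled();                          // Streams cache; withCachingDisabled() forwards every update downstream

builder.addStateStore(storeBuilder);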

Kafka: topic compaction notification?

I was given the following architecture that I'm trying to improve.
I receive a stream of DB changes which end up in a compacted topic. The stream is basically key/value pairs and the keyspace is large (~4 GB).
The topic is consumed by one Kafka Streams process that stores the data in RocksDB (separate for each consumer/shard). The processor does two different things:
join the data into another stream.
check if a message from the topic is a new key or an update to an existing one. If it is an update, it sends the old key/value pair and the new key/value pair to a different topic (updates are rare).
The construct has a couple of problems:
The two different functionalities of the stream processor belong to different teams and should not be part of the same code base. They were put together to save memory. If we separated them, we would have to duplicate the RocksDB stores.
I would prefer to use a normal KTable join instead of the handcrafted join that's currently in the code.
RocksDB seems to be a bit of overkill if the data is already persisted in a topic. We are currently running into some performance issues, and I assume it would be faster if we just kept everything in memory.
Question 1:
Is there a way to hook into the compaction process of a compacted topic? I would like a notification (to a different topic) for every key that is actually compacted (including the old and new value).
If this is somehow possible I could easily split the code bases apart and simplify the join.
Question 2:
Any other idea on how this can be solved more elegantly?
Your overall design makes sense.
About your join semantics: I guess you need to stick with the Processor API, as a regular KTable join cannot provide what you want. It's also not possible to hook into the compaction process.
However, Kafka Streams also supports in-memory state stores: https://kafka.apache.org/documentation/streams/developer-guide/processor-api.html#state-stores
RocksDB is used by default to allow the state to be larger than the available main memory. Spilling to disk with RocksDB is not what provides reliability (the changelog topic does) -- however, it has the advantage that stores can be recreated more quickly if an instance comes back online on the same machine, as it's not required to re-read the whole changelog topic.
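A small sketch of what swapping in an in-memory store could look like (the store name below is hypothetical):
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

// In-memory key-value store instead of RocksDB: no local disk I/O, but the whole
// ~4 GB keyspace must fit in heap, and a restarted instance always rebuilds the
// store from the changelog topic.
StoreBuilder<KeyValueStore<String, String>> inMemoryStore =
        Stores.keyValueStoreBuilder(
                Stores.inMemoryKeyValueStore("db-change-store"),
                Serdes.String(),
                Serdes.String());

StreamsBuilder builder = new StreamsBuilder();
builder.addStateStore(inMemoryStore);
// Connect the store to your processor by name ("db-change-store") when wiring
// up the Processor API topology.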
Whether you want to split the app into two is your own decision; it mostly comes down to how many resources you are willing to provide.