Kafka Streams TimestampExtractor - apache-kafka

Hi everybody, I have a question about TimestampExtractor and Kafka Streams.
In our application there is a possibility of receiving out-of-order events, so I would like to order the events by a business date inside the payload instead of by the point in time at which they were placed in the topic.
For this purpose I wrote a custom TimestampExtractor to pull the timestamp from the payload. Everything up to this point worked perfectly, but when I built a KTable on this topic, I noticed that an event received out of order (from a business point of view it is not the last event, but it arrived last) is displayed as the last state of the object, even though the ConsumerRecord carries the timestamp from the payload.
Maybe it was my mistake to assume that Kafka Streams would fix this out-of-order problem with the TimestampExtractor.
Then during debugging I saw that if the TimestampExtractor returns -1, Kafka Streams ignores the message, and that the TimestampExtractor is also handed the timestamp of the last accepted event. So I built logic that performs the following check: if (payloadTimestamp < previousTimestamp) return -1. This achieves the behaviour I want, but I am not sure whether I am sailing in dangerous waters.
Am I allowed to use logic like this, or what other ways exist to deal with out-of-order events in Kafka Streams?
Thanks for any answers.

Currently (Kafka 2.0), KTables don't consider timestamps when they are updated, because the assumption is that there is no out-of-order data in the input topic. The reason for this assumption is the "single writer principle": it's assumed that for a compacted KTable input topic there is only one producer per key, and thus there won't be any out-of-order data with regard to a single key.
It's a known issue: https://issues.apache.org/jira/browse/KAFKA-6521
As for your fix: it's not 100% correct or safe to do this "hack".
First, assume you have two messages with two different keys, <key1, value1, 5> and <key2, value2, 3>. The second record, with timestamp 3, arrives after the first record with timestamp 5 even though its timestamp is smaller. However, the two have different keys, and thus you actually want to put the second record into the KTable. Only if you have two records with the same key do you want to drop late-arriving data, IMHO.
Second, if you have two records with the same key, the second one is out-of-order, and you crash before processing the second one, the TimestampExtractor loses the timestamp of the first record. Thus, on restart, it would not discard the out-of-order record.
To get this right, you will need to filter "manually" in your application logic instead of in the stateless and key-agnostic TimestampExtractor. Instead of reading the data via builder#table(), you can read it as a stream and apply a .groupByKey().reduce() to build the KTable. In your Reducer logic, you compare the timestamps of the new and old record and return the record with the larger timestamp.
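A minimal sketch of that reducer logic in plain Java, without any Kafka dependency so that it can be run on its own. The `PayloadEvent` class and its `businessTimestamp` field are hypothetical stand-ins for your payload type, and the `Map` only simulates the per-key state that `.groupByKey().reduce()` would keep. In a real topology the same two-argument comparison would go into the `Reducer` you pass to `reduce()`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical payload type: only the business timestamp matters here.
final class PayloadEvent {
    final String state;
    final long businessTimestamp;

    PayloadEvent(String state, long businessTimestamp) {
        this.state = state;
        this.businessTimestamp = businessTimestamp;
    }
}

final class LatestEventReducer {
    // Same shape as a reducer: (current value for the key, newly arrived record) -> value to keep.
    // Keeps the record with the larger business timestamp; on a tie the newer arrival wins.
    static final BinaryOperator<PayloadEvent> KEEP_LATEST =
        (oldEvent, newEvent) ->
            newEvent.businessTimestamp >= oldEvent.businessTimestamp ? newEvent : oldEvent;

    // Simulates the per-key table that a grouped reduce would maintain.
    static Map<String, PayloadEvent> buildTable(String[] keys, PayloadEvent[] events) {
        Map<String, PayloadEvent> table = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            table.merge(keys[i], events[i], KEEP_LATEST);
        }
        return table;
    }
}
```

Because the comparison is done per key, a record for key2 with a smaller timestamp is still accepted, and only a late record for the same key is dropped.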

Related

Is there a way to tell which event occurred first in two kafka topics

If I have two topics in Kafka, is there a way to tell whether an event in one topic "occurred" before an event in another topic if they both come in within a millisecond of each other, i.e. they have the same timestamp?
Background:
I am building an event-sourcing-based, event-driven architecture. Often, when an event occurs in one topic, I need to do a scan to find out whether a separate event has already occurred in a second topic. Likewise, when the event in the second topic comes in, I need to scan to see whether the event in topic one has occurred.
In order not to duplicate processing, I need a deterministic way to order the events. If the events are more than 1 millisecond apart, I can just use the timestamp in the event. But because Kafka timestamps only have millisecond precision, I can no longer use this approach when two events occur close together.
In reality, I don't care which topic's event occurred "first", i.e. whether Kafka posted one before the other, or even whether they came in a different order. I just need a deterministic way to order them.
I could use some method such as arranging the events by topic alphabetically, but I was hoping there was a built-in mechanism. (I don't want to introduce weird bugs because I always process event A before event B; unlikely, but I've seen it happen.)
PS: I am open to other ideas. I'm considering this approach because it was possible in Redis Streams. However, because of things I can't control, I am restricted to Kafka. I do want to avoid using an external data store, as I would then need to start worrying about data synchronization there.

You're going to run into synchronization issues regardless. For example, you could try using a stream-table join in Kafka Streams. If the event doesn't exist for the join, then it hasn't happened yet, but then you're reliant on having absolutely zero lag in the consumer processes building that KTable.
You could try storing nanoseconds as part of the value or a header when you create the record if you need higher precision, but again, you're going to need either absolutely zero lag or very precise consumer poll events with some comparison window, as Kafka does not provide any ordering guarantees across multiple topics.
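If all that is needed is a deterministic tie-break for equal timestamps, one option is a comparator that falls back to the record's coordinates. The sketch below is plain Java under stated assumptions: `RecordRef` and its field names are hypothetical, and the order it produces is stable but says nothing about which event really happened first.

```java
import java.util.Comparator;

// Hypothetical holder for the coordinates of a consumed record.
final class RecordRef {
    final String topic;
    final int partition;
    final long offset;
    final long timestampMs;

    RecordRef(String topic, int partition, long offset, long timestampMs) {
        this.topic = topic;
        this.partition = partition;
        this.offset = offset;
        this.timestampMs = timestampMs;
    }
}

final class DeterministicOrder {
    // Order by timestamp first; break ties by topic name, then partition, then offset.
    // Topic, partition and offset together identify a record, so two distinct
    // records never compare as equal.
    static final Comparator<RecordRef> ORDER =
        Comparator.comparingLong((RecordRef r) -> r.timestampMs)
                  .thenComparing(r -> r.topic)
                  .thenComparingInt(r -> r.partition)
                  .thenComparingLong(r -> r.offset);
}
```

Every consumer that applies the same comparator reaches the same order for the same pair of records, regardless of the order in which the records arrived.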

Kafka log compaction pointers

Reading about log compaction on a topic, I was wondering if there is any way for a consumer to get hold of any of the positions/offsets of the following?
end of the head
start of the tail
compaction cleaner point
Basically the point at which the compacted and non-compacted parts of the log meet?
I've read that there is a cleaner-offset-checkpoint file that sits on the broker at /var/lib/kafka/data/cleaner-offset-checkpoint but is the info in this file available to a consumer?
My use case is a consumer that will consume compacted keys one way and non-compacted keys another way.
Thanks for any advice.
UPDATE:
Thinking, for example, of a topic holding various customer events like here: https://www.confluent.io/blog/put-several-event-types-kafka-topic/ (new customer, customer updates name, customer updates address, etc.). Log compaction, I believe, will leave one event per customer in the tail but still many events per customer in the head (assuming compaction is slower than message production?). A new consumer of this topic would have to treat all compacted messages as CREATES, but then also treat non-compacted messages as their more fine-grained events? In any case, I was wondering whether a consumer could tell how far along a topic's compaction has got at any given time.

No, it's not possible with the consumer API.
If you want to check that checkpoint file on disk, you could use Jssh, for example, to access a broker and read the file. If it has offset data, you could then use the seek methods, but keep in mind that the Log Cleaner thread may be actively running when you seek to or consume that data.
"A new consumer of this topic would have to treat all compacted messages as CREATES, but then also treat non-compacted message as their more fine grained event?"
I don't think this is a valid use case. For a stream of customer updates, you'd just update a customer model in a table via a streaming reduce function. If any consumer restarts, it will always have to read from the beginning of the topic to rebuild its local state and then continue reading any updates to those stored values, so it doesn't make sense to skip past them all or to have two separate consumers.
I also don't necessarily think you need different models. Some UUID would be unique, and every event can contain the full model of a "customer". Most fields can remain optional/nullable until they are provided by a new message with those fields set (or not), and this defines a batch update, since you can set/update/remove multiple attributes at once. If you need more granularity, that's also possible at the producer level by storing and looping over your attributes and producing individual "customer" objects, each with one new attribute.
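A rough sketch of such a reduce step in plain Java. The `CustomerModel` class and its fields are hypothetical, and the sketch assumes that a null field in an update means "not provided"; removing an attribute would need an explicit marker that is not shown here.

```java
// Hypothetical full "customer" model; every field except the id is nullable.
final class CustomerModel {
    final String id;
    final String name;
    final String address;

    CustomerModel(String id, String name, String address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    // Reduce step: fields that are null in the update keep their current value,
    // fields that are set overwrite it. Several fields can change in one update.
    static CustomerModel merge(CustomerModel current, CustomerModel update) {
        return new CustomerModel(
            current.id,
            update.name != null ? update.name : current.name,
            update.address != null ? update.address : current.address);
    }
}
```

A consumer that replays the topic from the beginning and applies `merge` per customer id ends up with the same state whether it sees one compacted record or the full history of updates.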

Kafka Streams Sort Within Processing Time Window

I wonder if there's any way to sort records within a window using Kafka Streams DSL or Processor API.
Imagine the following situation as an example (arbitrary one, but similar to what I need):
There is a Kafka topic of some events, let's say user clicks. Let's say the topic has 10 partitions. Messages are partitioned by key, but each key is unique, so it's effectively random partitioning. Each record contains a user id, which is used later to repartition the stream.
We consume the stream and publish each message to another topic, partitioning the record by its user id (i.e. we repartition the original stream by user id).
Then we consume this repartitioned stream and store the consumed records in a local state store, windowed by 10 minutes. All clicks of a particular user are always in the same partition, but order is not guaranteed, because the original topic had 10 partitions.
I understand the windowing model of Kafka Streams, and that time is advanced when new records come in, but I need this window to use processing time, not event time. Then, when the window has expired, I need to be able to sort the buffered events and emit them in that order to another topic.
Note:
We need to be able to flush/process records within the window using processing time, not event time. We can't wait for the next click to advance the time, because it may never happen.
We need to remove all the records from the store as soon as the window is sorted and flushed.
If the application crashes, we need to recover (in the same or another instance of the application) and process all the windows that were not processed, without waiting for new records to come in for a particular user.
I know Kafka Streams 1.0.0 allows using wall-clock time in the Processor API, but I'm not sure what would be the right way to implement what I need (more importantly, taking into account the recovery requirement described above).

You can see my answer to a similar question here:
https://stackoverflow.com/a/44345374/7897191
Since your message keys are already unique, you can ignore my comments about de-duplication.
Now that KIP-138 (wall-clock punctuation semantics) has been released in 1.0.0, you should be able to implement the outlined algorithm without issues. It uses the Processor API. I don't know of a way of doing this with only the DSL.
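The buffering and flushing part could look roughly like the following plain-Java sketch. It is not the algorithm from the linked answer and has no Kafka dependency: `ClickRecord`, `WindowSorter` and the in-memory maps are hypothetical, and the current time is passed in explicitly so the behaviour can be checked. In a real processor the maps would be persistent state stores (so that open windows survive a crash) and `punctuate` would be scheduled with `PunctuationType.WALL_CLOCK_TIME`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical click record: user id plus event time taken from the payload.
final class ClickRecord {
    final String userId;
    final long eventTimeMs;

    ClickRecord(String userId, long eventTimeMs) {
        this.userId = userId;
        this.eventTimeMs = eventTimeMs;
    }
}

final class WindowSorter {
    private final long windowMs;
    // userId -> buffered clicks; stands in for a persistent state store.
    private final Map<String, List<ClickRecord>> buffers = new HashMap<>();
    // userId -> wall-clock time at which that user's window was opened.
    private final Map<String, Long> openedAt = new HashMap<>();

    WindowSorter(long windowMs) {
        this.windowMs = windowMs;
    }

    // Called once per incoming record.
    void process(ClickRecord click, long nowMs) {
        openedAt.putIfAbsent(click.userId, nowMs);
        buffers.computeIfAbsent(click.userId, k -> new ArrayList<>()).add(click);
    }

    // Called on a wall-clock schedule. Returns the clicks of all expired windows,
    // sorted by event time within each user, and removes them from the buffers.
    List<ClickRecord> punctuate(long nowMs) {
        List<ClickRecord> out = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = openedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() >= windowMs) {
                List<ClickRecord> clicks = buffers.remove(entry.getKey());
                clicks.sort(Comparator.comparingLong((ClickRecord c) -> c.eventTimeMs));
                out.addAll(clicks);
                it.remove();
            }
        }
        return out;
    }
}
```

Since expiry depends only on the stored opening time and the current wall-clock time, a restarted instance that restores the stores can flush overdue windows on its first punctuation without waiting for new clicks.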

Delayed message consumption in Kafka

How can I produce/consume delayed messages with Apache Kafka? It seems that standard Kafka (and the Java kafka-client) doesn't have this feature. I know that I could implement it myself with the standard wait/notify mechanism, but that doesn't seem very reliable, so any advice and good practices are appreciated.
Found related question, but it didn't help.
As I see it, Kafka is based on sequential reads from the file system and can only be used to read topics straight through, keeping message ordering. Am I right?

Indeed, Kafka's lowest-level structure is a partition, which is a sequence of events in a queue with an incrementing offset. You can't insert a record anywhere other than at the end, at the moment you produce it. There is no concept of delayed messages.
What do you want to achieve exactly?
Some possibilities in your case:
You want to push a message at a specific time (for example, an event "start job"). In this case, use a scheduled task (not from Kafka; use some standard way of your OS / language / custom app / whatever) to send the message at the given time. Consumers will then receive it at the proper time.
You want to send an event now, but it should not be taken into account by consumers yet. In this case, you can use a custom structure which includes a "time" in its payload. Consumers will have to understand this field and have custom processing to deal with it. For example: "start job at 2017-12-27T20:00:00Z". You could also use headers for this, but headers are not supported by all clients for now.
You can change the timestamp of the message sent. Internally, it would still be read in order, but some functions involving time would work differently, and the consumer could use the timestamp of the message for its action. This is much like the previous proposition, except that the timestamp is metadata of the event and not part of the event payload itself. I would not use this personally; I only deal with timestamps when I proxy some events.
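For the second possibility, the consumer-side check could be sketched as follows. This is plain Java with hypothetical names (`DelayedMessage`, `notBefore`); it only shows how a consumer might decide whether a message is due, not how it should wait.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical payload structure carrying the earliest time the message may be acted on.
final class DelayedMessage {
    final String body;
    final Instant notBefore;

    DelayedMessage(String body, Instant notBefore) {
        this.body = body;
        this.notBefore = notBefore;
    }

    // True once the current time has reached the "time" field of the payload.
    boolean isDue(Instant now) {
        return !now.isBefore(notBefore);
    }

    // How long the consumer still has to wait; zero when the message is already due.
    Duration remainingDelay(Instant now) {
        Duration delay = Duration.between(now, notBefore);
        return delay.isNegative() ? Duration.ZERO : delay;
    }
}
```

Keep in mind that a partition is still read in order, so a consumer that waits for one message also holds back the messages behind it in the same partition.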
For your last question: basically, yes, but with some notes:
Topics are actually split into partitions, and order is only preserved within a partition. All messages with the same key are sent to the same partition.
Most of the time you only read from memory, unless you read old events. In that case, as those are read sequentially from disk, this is still very fast.
You can choose where to begin reading (a given offset or a given time) and even change it at runtime.
You can parallelize reads across processes: multiple consumers can read the same topics without ever reading the same messages twice (each reads different partitions, see consumer groups).

Processing records in order in Storm

I'm new to Storm and I'm having trouble figuring out how to process records in order.
I have a dataset which contains records with the following fields:
user_id, location_id, time_of_checking
Now, I would like to identify users who have completed a path I specify (for example, users that went from location A to location B to location C).
I'm using a Kafka producer and reading these records from a file to simulate live data. The data is sorted by date.
So, to check whether my pattern is fulfilled, I need to process records in order. The thing is, due to parallelization (bolt replication) I don't get a user's check-ins in order. Because of that, the patterns won't work.
How can I overcome this problem? How can I process records in order?

There is no general system support for ordered processing in Storm. Either you use a different system that supports ordered stream processing, like Apache Flink (disclaimer: I am a committer at Flink), or you need to take care of it in your bolt code yourself.
The only support Storm offers is via Trident. You can put the tuples of a certain time period (for example, one minute) into a single batch. Thus, you can process all tuples within a minute at once. However, this only works if your use case allows for it, because you cannot relate tuples from different batches to each other. In your case, this would only work if you know that there are points in time at which all users have reached their destination (and no other user has started a new interaction); i.e., you need points in time at which no overlap of any two users occurs. (It seems to me that your use case cannot fulfill this requirement.)
For a non-system, i.e. customized user-code-based, solution there would be two approaches:
You could, for example, buffer up tuples and sort them by timestamp within a bolt before processing. To make this work properly, you need to inject punctuations/watermarks that guarantee that no tuple with a smaller timestamp than the punctuation arrives after that punctuation. Once you have received a punctuation from each parallel input substream, you can safely trigger sorting and processing.
Another way would be to buffer tuples per incoming substream in distinct buffers (within a substream, order is preserved) and merge the tuples from the buffers in order. This has the advantage that sorting is avoided. However, you need to ensure that each operator emits its tuples in order. Furthermore, to avoid blocking (i.e., if no input is available for a substream), punctuations might be needed, too. (I implemented this approach. Feel free to use the code or adapt it to your needs: https://github.com/mjsax/aeolus/blob/master/queries/utils/src/main/java/de/hub/cs/dbis/aeolus/utils/TimestampMerger.java)
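The merge idea of the second approach can be illustrated with a small plain-Java sketch. It is not the linked TimestampMerger and has no Storm dependency; `OrderedMerger` is a hypothetical name, tuples are reduced to their timestamps, and each substream is assumed to deliver them in ascending order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class OrderedMerger {
    // One FIFO buffer of timestamps per input substream.
    private final List<Deque<Long>> buffers = new ArrayList<>();

    OrderedMerger(int substreams) {
        for (int i = 0; i < substreams; i++) {
            buffers.add(new ArrayDeque<>());
        }
    }

    // Appends a timestamp to the buffer of the substream it arrived on.
    void add(int substream, long timestamp) {
        buffers.get(substream).addLast(timestamp);
    }

    // Emits timestamps in global order for as long as every buffer has a head element.
    // Stops as soon as one buffer is empty, because that substream could still
    // deliver a smaller timestamp than the heads of the other buffers.
    List<Long> drain() {
        List<Long> out = new ArrayList<>();
        while (true) {
            Deque<Long> smallest = null;
            for (Deque<Long> buffer : buffers) {
                if (buffer.isEmpty()) {
                    return out;
                }
                if (smallest == null || buffer.peekFirst() < smallest.peekFirst()) {
                    smallest = buffer;
                }
            }
            if (smallest == null) {
                return out; // no substreams configured
            }
            out.add(smallest.pollFirst());
        }
    }
}
```

The early return on an empty buffer is exactly the blocking case mentioned above: without a punctuation from the idle substream, the merger cannot know whether it is safe to emit the remaining tuples.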

Storm supports this use case. For this you just have to ensure that order is maintained throughout your flow in all the involved components. So, as a first step, in the Kafka producer all the messages for a particular user id should go to the same partition in Kafka. For this you can implement a custom Partitioner in your KafkaProducer. Please refer to the link here for implementation details.
Since a partition in Kafka can be read by one and only one KafkaSpout instance in Storm, the messages in that partition arrive in order in the spout instance. This ensures that all the messages of the same user id arrive at the same spout.
Now comes the tricky part: to maintain order in the bolt, you want to use fields grouping on the bolt based on the "user_id" field emitted from the Kafka spout. The provided KafkaSpout does not split the message into fields, so you would have to override the KafkaSpout to read the message and emit a "user_id" field from the spout. Alternatively, you can add an intermediate bolt which reads the message from the KafkaSpout and emits a stream with a "user_id" field.
When you finally specify a bolt with fields grouping on "user_id", all messages with a particular user_id value go to the same instance of the bolt, whatever the degree of parallelism of the bolt.
A sample topology which works for your case could be as follows:
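// kafkaSpout is an already configured KafkaSpout; FieldsEmitterBolt and CalculatorBolt are your own bolt classes
builder.setSpout("KafkaSpout", kafkaSpout);
builder.setBolt("FieldsEmitterBolt", new FieldsEmitterBolt()).shuffleGrouping("KafkaSpout");
builder.setBolt("CalculatorBolt", new CalculatorBolt()).fieldsGrouping("FieldsEmitterBolt", new Fields("user_id")); // "user_id" is the field emitted by FieldsEmitterBolt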
Beware: if you have a limited number of user_ids, all the user_id values could end up at the same CalculatorBolt instance. This in turn would decrease the effective parallelism!
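To see why few distinct user_ids limit parallelism, here is a conceptual model of fields grouping in plain Java. It is an assumption for illustration, not Storm's actual implementation: the target task is simply taken as the hash of the grouping field modulo the number of bolt tasks, and `FieldsGroupingModel` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;

final class FieldsGroupingModel {
    // Conceptual routing rule: the same user_id always maps to the same task index.
    static int taskFor(String userId, int numTasks) {
        return Math.floorMod(userId.hashCode(), numTasks);
    }

    // Counts how many of the bolt tasks actually receive tuples for the given user ids.
    static int tasksInUse(String[] userIds, int numTasks) {
        Set<Integer> used = new HashSet<>();
        for (String userId : userIds) {
            used.add(taskFor(userId, numTasks));
        }
        return used.size();
    }
}
```

With only two distinct user_ids and eight bolt tasks, at most two tasks can ever be busy, and if both ids happen to hash to the same index only one is.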