KStream-KStream leftJoin not consistently emitting after window expiry

We have a service where people can order a battery with their solar panels. As part of provisioning, we try to fetch some details about the battery product. That lookup sometimes fails to return any data, but we still want to send the order through to our CRM system.
To achieve this we are using leftJoin from the latest version of Kafka Streams:
We receive an event on the order-received topic.
We filter out orders that do not contain a battery product.
We then wait up to 30 minutes for an event to come through on the order-battery-details topic.
If we don't receive that event, we want to send a new event to the battery-order topic with the data we do have.
This works fine when we receive both events, but it is inconsistent when we only receive the first event. Sometimes the order comes through immediately after the 30-minute window, sometimes it takes several hours.
My question is: if the window has expired (i.e. we failed to receive the right side of the join), what determines when the event will be sent? And what could be causing the long delay?
Here's a high level example of our service:
@Component
class BatteryOrderProducer {
@Autowired
fun buildPipeline(streamsBuilder: StreamsBuilder) {
// listen for new orders and filter out everything except orders with a battery
val orderReceivedReceivedStream = streamsBuilder.stream(
"order-received",
Consumed.with(Serdes.String(), JsonSerde<OrderReceivedEvent>())
).filter { _, order ->
// check if the order contains a battery product
}.peek { key, order ->
log.info("Received order with a battery product: $key", order)
}
// listen for battery details events
val batteryDetailsStream = streamsBuilder
.stream(
"order-battery-details",
Consumed.with(Serdes.String(), JsonSerde<BatteryDetailsEvent>())
).peek { key, order ->
log.info("Received battery details: $key", order)
}
val valueJoiner: ValueJoiner<OrderReceivedEvent, BatteryDetailsEvent, BatteryOrder> =
ValueJoiner { orderReceived: OrderReceivedEvent, BatteryDetails: BatteryDetailsEvent? ->
// new BatteryOrder
if (BatteryDetails != null) {
// add battery details to the order if we get them
}
// return the BatteryOrder
}
// we always want to send through the battery order, even if we don't get the 2nd event.
orderReceivedReceivedStream.leftJoin(
batteryDetailsStream,
valueJoiner,
JoinWindows.ofTimeDifferenceAndGrace(
Duration.ofMinutes(30),
Duration.ofMinutes(1)
),
StreamJoined.with(
Serdes.String(),
JsonSerde<OrderReceivedEvent>(),
JsonSerde<BatteryDetailsEvent>()
).withStoreName("battery-store")
).peek { key, value ->
log.info("Merged BatteryOrder", value)
}.to(
"battery-order",
Produced.with(
Serdes.String(),
JsonSerde<BatteryOrder>()
)
)
}
}

The leftJoin will not trigger as long as there are no new records. So if I have an order-received record with key A at time t, and then there is no new record (on either side of the join) for the next 5 hours, there will be no output for the join for those 5 hours, because the leftJoin will not be triggered. In particular, leftJoin needs to receive a record with a timestamp > t + 30m for a null result to be sent.
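To make the timing concrete, here is a toy model in plain Java. It is not Kafka code, and the class and method names are made up for illustration. It assumes that the null result for an unmatched left record is released only when a later record pushes stream time past the record's timestamp plus the window and the grace period; matching against the right side is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model, not Kafka code: stream time only moves when a record arrives,
// so an unmatched left record waits until a later record closes its window.
class StreamTimeLeftJoinModel {
    static final long WINDOW_MS = 30 * 60_000L; // join window from the question
    static final long GRACE_MS = 60_000L;       // grace period from the question

    private long streamTime = Long.MIN_VALUE;   // highest timestamp seen so far
    private final Map<String, Long> pendingLeft = new LinkedHashMap<>();
    final List<String> output = new ArrayList<>();

    // An order arrives on the left side and starts waiting (matching is omitted).
    void onLeftRecord(String key, long timestampMs) {
        pendingLeft.put(key, timestampMs);
        advanceStreamTime(timestampMs);
    }

    // Any record on either input advances stream time; wall-clock time plays no role.
    void advanceStreamTime(long timestampMs) {
        streamTime = Math.max(streamTime, timestampMs);
        Iterator<Map.Entry<String, Long>> it = pendingLeft.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (streamTime > entry.getValue() + WINDOW_MS + GRACE_MS) {
                output.add(entry.getKey() + " -> null");
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        StreamTimeLeftJoinModel join = new StreamTimeLeftJoinModel();
        join.onLeftRecord("A", 0L);
        System.out.println(join.output);          // still empty, however long we wait
        join.onLeftRecord("B", 5 * 60 * 60_000L); // the next order arrives 5 hours later
        System.out.println(join.output);          // only now is A released
    }
}
```

In this model, order A is released by the arrival of order B, not by the passage of time, which would explain delays that vary with traffic.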
I think to satisfy your requirements, you need to work with the more low-level Processor API: https://kafka.apache.org/documentation/streams/developer-guide/processor-api.html
In a Processor, you can define a Punctuator that runs regularly and checks if an order has been waiting for more than half an hour for details, and sends off the null record accordingly.
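As a rough sketch of that idea, the following plain-Java class models the bookkeeping such a processor would need. PendingOrderBuffer and its methods are hypothetical names, not part of Kafka Streams, and time is passed in explicitly so the behaviour is easy to follow.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the punctuator idea (hypothetical class, not a Kafka API).
class PendingOrderBuffer {
    private final long maxWaitMs;
    private final Map<String, Long> arrivalMsByOrderId = new LinkedHashMap<>();

    PendingOrderBuffer(Duration maxWait) {
        this.maxWaitMs = maxWait.toMillis();
    }

    // Remember when the order arrived so the periodic check can age it out.
    void orderReceived(String orderId, long nowMs) {
        arrivalMsByOrderId.putIfAbsent(orderId, nowMs);
    }

    // Details arrived in time; returns true if the order was still waiting.
    boolean detailsReceived(String orderId) {
        return arrivalMsByOrderId.remove(orderId) != null;
    }

    // The periodic check: remove and return every order that has waited at
    // least maxWait, so the caller can forward it without battery details.
    List<String> flushExpired(long nowMs) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = arrivalMsByOrderId.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() >= maxWaitMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        PendingOrderBuffer buffer = new PendingOrderBuffer(Duration.ofMinutes(30));
        buffer.orderReceived("order-1", 0L);
        buffer.orderReceived("order-2", 0L);
        buffer.detailsReceived("order-2"); // order-2 got its details in time
        System.out.println(buffer.flushExpired(Duration.ofMinutes(29).toMillis()));
        System.out.println(buffer.flushExpired(Duration.ofMinutes(31).toMillis()));
    }
}
```

In a real Processor, flushExpired would presumably be called from a Punctuator registered with PunctuationType.WALL_CLOCK_TIME, and the map would live in a state store so that pending orders survive a restart.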

Stale ktable records when joining kstream with ktable created by kstream aggregation

I'm trying to implement the event sourcing pattern with kafka streams in the following way.
I'm in a Security service and handle two use cases:
Register User, handling RegisterUserCommand should produce UserRegisteredEvent.
Change User Name, handling ChangeUserNameCommand should produce UserNameChangedEvent.
I have two topics:
Command Topic, 'security-command'. Every command is keyed and the key is the user's email. For example:
foo@bar.com:{"type": "RegisterUserCommand", "command": {"name":"Alex","email":"foo@bar.com"}}
foo@bar.com:{"type": "ChangeUserNameCommand", "command": {"email":"foo@bar.com","newName":"Alex1"}}
Event Topic, 'security-event'. Every record is keyed by the user's email:
foo@bar.com:{"type":"UserRegisteredEvent","event":{"email":"foo@bar.com","name":"Alex", "version":0}}
foo@bar.com:{"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex1","version":1}}
Kafka Streams version 2.8.0
Kafka version 2.8
The implementation idea can be expressed in the following topology:
commandStream = builder.stream("security-command");
eventStream = builder.stream("security-event",
Consumed.with(
...,
new ZeroTimestampExtractor()
/*always returns 0 to get the latest version of snapshot*/));
// build the snapshot to get the current state of the user.
userSnapshots = eventStream.groupByKey()
.aggregate(() -> new UserSnapshot(),
(key /*email*/, event, currentSnapshot) -> currentSnapshot.apply(event));
// join commands with latest snapshot at the time of the join
commandWithSnapshotStream =
commandStream.leftJoin(
userSnapshots,
(command, snapshot) -> new CommandWithUserSnapshot(command, snapshot),
joinParams
);
// handle the command given the current snapshot
resultingEventStream = commandWithSnapshotStream.flatMap((key /*email*/, commandWithSnapshot) -> {
var newEvents = commandHandler(commandWithSnapshot.command(), commandWithSnapshot.snapshot());
return Arrays.stream(newEvents )
.map(e -> new KeyValue<String, DomainEvent>(e.email(), e))
.toList();
});
// append events to events topic
resultingEventStream.to("security-event");
For this topology, I'm using EOS exactly_once_beta.
A more explicit version of this topology:
KStream<String, Command<DomainEvent[]>> commandStream =
builder.stream(
commandTopic,
Consumed.with(Serdes.String(), new SecurityCommandSerde()));
KStream<String, DomainEvent> eventStream =
builder.stream(
eventTopic,
Consumed.with(
Serdes.String(),
new DomainEventSerde(),
new LatestRecordTimestampExtractor() /* always returns 0 to get the latest version of the snapshot */));
// build the snapshots ktable by aggregating all the current events for a given user.
KTable<String, UserSnapshot> userSnapshots =
eventStream.groupByKey()
.aggregate(
() -> new UserSnapshot(),
(email, event, currentSnapshot) -> currentSnapshot.apply(event),
Materialized.with(
Serdes.String(),
new UserSnapshotSerde()));
// join command stream and snapshot table to get the stream of pairs <Command, UserSnapshot>
Joined<String, Command<DomainEvent[]>, UserSnapshot> commandWithSnapshotJoinParams =
Joined.with(
Serdes.String(),
new SecurityCommandSerde(),
new UserSnapshotSerde()
);
KStream<String, CommandWithUserSnapshot> commandWithSnapshotStream =
commandStream.leftJoin(
userSnapshots,
(command, snapshot) -> new CommandWithUserSnapshot(command, snapshot),
commandWithSnapshotJoinParams
);
var resultingEventStream = commandWithSnapshotStream.flatMap((key /*email*/, commandWithSnapshot) -> {
var command = commandWithSnapshot.command();
if (command instanceof RegisterUserCommand registerUserCommand) {
var handler = new RegisterUserCommandHandler();
var events = handler.handle(registerUserCommand);
// multiple events might be produced when a command is handled.
return Arrays.stream(events)
.map(e -> new KeyValue<String, DomainEvent>(e.email(), e))
.toList();
}
if (command instanceof ChangeUserNameCommand changeUserNameCommand) {
var handler = new ChangeUserNameCommandHandler();
var events = handler.handle(changeUserNameCommand, commandWithSnapshot.userSnapshot());
return Arrays.stream(events)
.map(e -> new KeyValue<String, DomainEvent>(e.email(), e))
.toList();
}
throw new IllegalArgumentException("...");
});
resultingEventStream.to(eventTopic, Produced.with(Serdes.String(), new DomainEventSerde()));
Problems I'm getting:
Launching the stream app on a command topic with existing records:
foo@bar.com:{"type": "RegisterUserCommand", "command": {"name":"Alex","email":"foo@bar.com"}}
foo@bar.com:{"type": "ChangeUserNameCommand", "command": {"email":"foo@bar.com","newName":"Alex1"}}
Outcome:
1. Stream application fails when processing the ChangeUserNameCommand, because the snapshot is null.
2. The events topic has a record for successful registration, but nothing for changing the name:
/*OK*/foo@bar.com:{"type":"UserRegisteredEvent","event":{"email":"foo@bar.com","name":"Alex", "version":0}}
Thoughts:
When processing the ChangeUserNameCommand, the snapshot is missing in the aggregated KTable, userSnapshots. Restarting the application successfully produces the following record:
foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex1","version":1}}
Tried increasing the max.task.idle.ms to 4 seconds - no effect.
Launching the stream app and producing a set of ChangeUserNameCommand commands at a time (fast).
Producing:
// Produce to command topic
foo@bar.com:{"type": "RegisterUserCommand", "command": {"name":"Alex","email":"foo@bar.com"}}
// event topic outcome
/*OK*/ foo@bar.com:{"type":"UserRegisteredEvent","event":{"email":"foo@bar.com","name":"Alex", "version":0}}
// Produce at once to command topic
foo@bar.com:{"type": "ChangeUserNameCommand", "command": {"email":"foo@bar.com","newName":"Alex1"}}
foo@bar.com:{"type": "ChangeUserNameCommand", "command": {"email":"foo@bar.com","newName":"Alex2"}}
foo@bar.com:{"type": "ChangeUserNameCommand", "command": {"email":"foo@bar.com","newName":"Alex3"}}
// event topic outcome
/*OK*/foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex1","version":1}}
/*NOK*/foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex2","version":1}}
/*NOK*/foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex3","version":1}}
Thoughts:
'ChangeUserNameCommand' commands are joined with a stale version of the snapshot (pay attention to the version attribute).
The expected outcome would be:
foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex1","version":1}}
foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex2","version":2}}
foo@bar.com: {"type":"UserNameChangedEvent","event":{"email":"foo@bar.com","name":"Alex3","version":3}}
Tried increasing max.task.idle.ms to 4 seconds, with no effect. Setting cache.max.bytes.buffering to 0 has no effect either.
What am I missing in building such a topology? I expect every command to be processed against the latest version of the snapshot. If I produce the commands with a few seconds' delay between them, everything works as expected.
I think you missed the changelog recovery part for tables. The following excerpt explains what happens with changelog recovery:
For tables, it is more complex because they must maintain additional
information—their state—to allow for stateful processing such as joins
and aggregations like COUNT() or SUM(). To achieve this while also
ensuring high processing performance, tables (through their state
stores) are materialized on local disk within a Kafka Streams
application instance or a ksqlDB server. But machines and containers
can be lost, along with any locally stored data. How can we make
tables fault tolerant, too?
The answer is that any data stored in a table is also stored remotely
in Kafka. Every table has its own change stream for this purpose—a
built-in change data capture (CDC) setup, we could say. So if we have
a table of account balances by customer, every time an account balance
is updated, a corresponding change event will be recorded into the
change stream of that table.
Also keep in mind that restarting a Kafka Streams application should not reprocess previously processed events. For that, you need to commit the offset of each message after processing it.
Found the root cause. Not sure whether it is by design or a bug, but a stream task waits only once per processing cycle for data in other partitions.
So if two records from the command topic are read first, the stream task waits max.task.idle.ms while processing the first command record, which allows the poll() phase to happen. Once that record is processed, the task does not allow another poll while processing the second one, so the events generated by the first command are not fetched.
In Kafka 2.8, the code responsible for this behavior is in StreamTask.java. isProcessable() is invoked at the beginning of the processing phase. If it returns false, the polling phase is repeated.
public boolean isProcessable(final long wallClockTime) {
if (state() == State.CLOSED) {
return false;
}
if (hasPendingTxCommit) {
return false;
}
if (partitionGroup.allPartitionsBuffered()) {
idleStartTimeMs = RecordQueue.UNKNOWN;
return true;
} else if (partitionGroup.numBuffered() > 0) {
if (idleStartTimeMs == RecordQueue.UNKNOWN) {
idleStartTimeMs = wallClockTime;
}
if (wallClockTime - idleStartTimeMs >= maxTaskIdleMs) {
return true;
// idleStartTimeMs is not reset to default, RecordQueue.UNKNOWN, value,
// therefore the next time when the check for all buffered partitions is done, `true` is returned, meaning that the task is ready to be processed.
} else {
return false;
}
} else {
// there's no data in any of the topics; we should reset the enforced
// processing timer
idleStartTimeMs = RecordQueue.UNKNOWN;
return false;
}
}
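The effect described above can be illustrated with a toy model in plain Java. This is not Kafka code, and it assumes a deliberately simplified picture: the snapshot table only changes when previously written events are polled back in, and a flag decides whether a poll happens between two buffered commands.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Kafka code: each command produces an event with
// version = snapshot version + 1, but the snapshot only changes once
// the written events have been polled back in.
class StaleSnapshotModel {
    // Returns the version carried by the event produced for each command.
    static List<Integer> process(int commands, boolean pollBetweenCommands) {
        int snapshotVersion = 0;                          // state after UserRegisteredEvent
        List<Integer> unpolledEvents = new ArrayList<>(); // written, not yet read back
        List<Integer> emittedVersions = new ArrayList<>();
        for (int i = 0; i < commands; i++) {
            if (pollBetweenCommands) {
                for (int version : unpolledEvents) {
                    snapshotVersion = Math.max(snapshotVersion, version);
                }
                unpolledEvents.clear();
            }
            int newVersion = snapshotVersion + 1;
            emittedVersions.add(newVersion);
            unpolledEvents.add(newVersion);
        }
        return emittedVersions;
    }

    public static void main(String[] args) {
        System.out.println(process(3, false)); // [1, 1, 1]: every command sees version 0
        System.out.println(process(3, true));  // [1, 2, 3]: the expected outcome
    }
}
```

The first call reproduces the stale versions from the question; the second shows what happens when each command's events are read back before the next command is handled, which is what producing the commands a few seconds apart achieves.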

Sample most recent element of Akka Stream with trigger signal, using zipWith?

I have a Planning system that computes kind of a global Schedule from customer orders. This schedule changes over time when customers place or revoke orders to this system, or when certain resources used by events within the schedule become unavailable.
Now another system needs to know the status of certain events in the Schedule. The system sends a StatusRequest(EventName) on a message queue to which I must react with a corresponding StatusSignal(EventStatus) on another queue.
The Planning system gives me an akka-streams Source[Schedule] which emits a Schedule whenever the schedule changed, and I also have a Source[StatusRequest] from which I receive StatusRequests and a Sink[StatusSignal] to which I can send StatusSignal responses.
Whenever I receive a StatusRequest I must inspect the current schedule, i.e., the most recent value emitted by Source[Schedule], and send a StatusSignal to the sink.
I came up with the following flow
scheduleSource
.zipWith(statusRequestSource) { (schedule, statusRequest) =>
findEventStatus(schedule, statusRequest.eventName)
}
.map(eventStatus => makeStatusSignal(eventStatus))
.runWith(statusSignalSink)
but I am not at all sure when this flow actually emits values and whether it actually implements my requirement (see bold text above).
The zipWith reference says (emphasis mine):
emits when all of the inputs have an element available
What does this mean? When statusRequestSource emits a value does the flow wait until scheduleSource emits, too? Or does it use the last value scheduleSource emitted? Likewise, what happens when scheduleSource emits a value? Does it trigger a status signal with the last element in statusRequestSource?
If the flow doesn't implement what I need, how could I achieve it instead?
To answer your first set of questions regarding the behavior of zipWith, here is a simple test:
val source1 = Source(1 to 5)
val source2 = Source(1 to 3)
source1
.zipWith(source2){ (s1Elem, s2Elem) => (s1Elem, s2Elem) }
.runForeach(println)
// prints:
// (1,1)
// (2,2)
// (3,3)
zipWith will emit downstream as long as both inputs have respective elements that can be zipped together.
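The pairing rule can be written down in a few lines of plain Java. This is only a model of the semantics on finite lists, not Akka code, and ZipModel is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

// Model of zip semantics on finite lists (not Akka code).
class ZipModel {
    // Pairs elements by position and stops as soon as either input runs out.
    static <A, B> List<String> zip(List<A> left, List<B> right) {
        List<String> pairs = new ArrayList<>();
        int pairCount = Math.min(left.size(), right.size());
        for (int i = 0; i < pairCount; i++) {
            pairs.add("(" + left.get(i) + "," + right.get(i) + ")");
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Same inputs as the Scala test above: five elements zipped with three.
        System.out.println(zip(List.of(1, 2, 3, 4, 5), List.of(1, 2, 3)));
    }
}
```

Applied to the question, this means the n-th status request would be paired with the n-th schedule, not with the most recent one, so the zipWith flow does not meet the requirement.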
One idea to fulfill your requirement is to decouple scheduleSource and statusRequestSource. Feed scheduleSource to an actor, and have the actor track the most recent element it has received from the stream. Then have statusRequestSource query this actor, which will reply with the most recent element from scheduleSource. This actor could look something like the following:
class LatestElementTracker extends Actor with ActorLogging {
var latestSchedule: Option[Schedule] = None
def receive = {
case schedule: Schedule =>
latestSchedule = Some(schedule)
case status: StatusRequest =>
if (latestSchedule.isEmpty) {
log.debug("No schedules have been received yet.")
} else {
val eventStatus = findEventStatus(latestSchedule.get, status.eventName)
sender() ! eventStatus
}
}
}
To integrate with the above actor:
scheduleSource.runForeach(s => trackerActor ! s)
statusRequestSource
.ask[EventStatus](parallelism = 1)(trackerActor) // adjust parallelism as needed
.map(eventStatus => makeStatusSignal(eventStatus))
.runWith(statusSignalSink)
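Stripped of the actor machinery, the tracker boils down to "keep the latest value, answer requests from it". A minimal plain-Java sketch of that pattern follows; LatestValueSampler is a hypothetical name, and an AtomicReference stands in for the actor's mailbox-protected state.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiFunction;

// Sketch of the "sample the latest value on request" pattern (not Akka code).
class LatestValueSampler<S, R, T> {
    private final AtomicReference<S> latest = new AtomicReference<>();
    private final BiFunction<S, R, T> lookup;

    LatestValueSampler(BiFunction<S, R, T> lookup) {
        this.lookup = lookup;
    }

    // Called for every element of the schedule stream; older values are overwritten.
    void onSchedule(S schedule) {
        latest.set(schedule);
    }

    // Called for every request; empty while no schedule has been seen yet.
    Optional<T> onRequest(R request) {
        S current = latest.get();
        return current == null
                ? Optional.empty()
                : Optional.ofNullable(lookup.apply(current, request));
    }

    public static void main(String[] args) {
        LatestValueSampler<String, String, String> sampler =
                new LatestValueSampler<>((schedule, event) -> event + " according to " + schedule);
        System.out.println(sampler.onRequest("launch")); // no schedule yet
        sampler.onSchedule("schedule-v1");
        sampler.onSchedule("schedule-v2");
        System.out.println(sampler.onRequest("launch")); // answered from schedule-v2
    }
}
```

Note that the actor above sends no reply when no schedule has arrived yet, so the ask would time out in that case; the Optional return value makes the same situation explicit here.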

Apache Kafka Grouping Twice

I'm writing an application where I'm trying to count the number of users who visit a page every hour. I'm trying to filter to specific events, group by the userId and event hour time, then group by just the hour to get the number of users. But grouping the KTable causes excessive cpu burn and locks when trying to close the streams. Is there a better way to do this?
events
.groupBy(...)
.aggregate(...)
.groupBy(...)
.count();
Given the answer to your question above, "I just want to know within an hour time window the number of users that performed a specific action", I would suggest the following.
Assuming you have a record something like this:
class ActionRecord {
String actionType;
String user;
}
You can define an aggregate class something like this:
class ActionRecordAggregate {
private Set<String> users = new HashSet<>();
public void add(ActionRecord rec) {
users.add(rec.user);
}
public int count() {
return users.size();
}
}
Then your streaming app can:
accept the events
rekey them according to event type (the .map() )
group them by event type (.groupByKey())
window them by time (selected 1 minute but YMMV)
aggregate them into ActionRecordAggregate
materialize them into a StateStore
so this looks something like:
stream()
.map((key, val) -> KeyValue.pair(val.actionType, val))
.groupByKey()
.windowedBy(TimeWindows.of(60*1000))
.aggregate(
ActionRecordAggregate::new,
(key, value, agg) -> { agg.add(value); return agg; },
Materialized
.<String, ActionRecordAggregate, WindowStore<Bytes, byte[]>>as("actionTypeLookup")
.withValueSerde(getSerdeForActionRecordAggregate())
);
Then, to get the events back, you can query your state store:
ReadOnlyWindowStore<String, ActionRecordAggregate> store =
streams.store("actionTypeLookup", QueryableStoreTypes.windowStore());
WindowStoreIterator<ActionRecordAggregate> wIt =
store.fetch("actionTypeToGet", startTimestamp, endTimestamp);
int totalCount = 0;
while(wIt.hasNext()) {
totalCount += wIt.next().value.count();
}
// totalCount sums the distinct-user counts of every window in your time
// interval for action type "actionTypeToGet"; a user who is active in
// several windows is counted once per window.
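To show the per-window bookkeeping without any Kafka dependency, here is a small plain-Java sketch. DistinctUsersPerHour is a hypothetical class; it assumes hourly tumbling windows aligned to the epoch, matching the hourly requirement from the question.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Sketch of distinct-user counting per hourly window (not Kafka code).
class DistinctUsersPerHour {
    private static final long HOUR_MS = 3_600_000L;
    private final Map<Long, Set<String>> usersByHourStart = new TreeMap<>();

    // Files the user under the hour window that contains the event timestamp.
    void record(String user, long timestampMs) {
        long hourStart = timestampMs - Math.floorMod(timestampMs, HOUR_MS);
        usersByHourStart.computeIfAbsent(hourStart, h -> new HashSet<>()).add(user);
    }

    // Number of distinct users seen in the window starting at hourStartMs.
    int distinctUsers(long hourStartMs) {
        return usersByHourStart.getOrDefault(hourStartMs, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        DistinctUsersPerHour counter = new DistinctUsersPerHour();
        counter.record("alice", 0L);
        counter.record("alice", 10 * 60_000L); // same user, same hour: counted once
        counter.record("bob", 20 * 60_000L);
        counter.record("alice", 65 * 60_000L); // next hour
        System.out.println(counter.distinctUsers(0L));      // 2
        System.out.println(counter.distinctUsers(HOUR_MS)); // 1
    }
}
```

The set per window is what removes the need for a second groupBy: duplicates of the same user within a window are absorbed by the set instead of by a separate aggregation step.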
Hope this helps!

How to send final kafka-streams aggregation result of a time windowed KTable?

What I'd like to do is this:
Consume records from a numbers topic (Longs)
Aggregate (count) the values for each 5 sec window
Send the FINAL aggregation result to another topic
My code looks like this:
KStream<String, Long> longs = builder.stream(
Serdes.String(), Serdes.Long(), "longs");
// In one ktable, count by key, on a five second tumbling window.
KTable<Windowed<String>, Long> longCounts =
longs.countByKey(TimeWindows.of("longCounts", 5000L));
// Finally, sink to the long-counts topic.
longCounts.toStream((wk, v) -> wk.key())
.to("long-counts");
It looks like everything works as expected, but the aggregations are sent to the destination topic for each incoming record. My question is how can I send only the final aggregation result of each window?
In Kafka Streams there is no such thing as a "final aggregation". Windows are kept open all the time to handle out-of-order records that arrive after the window end-time passed. However, windows are not kept forever. They get discarded once their retention time expires. There is no special action as to when a window gets discarded.
See Confluent documentation for more details: http://docs.confluent.io/current/streams/
Thus, for each update to an aggregation, a result record is produced (because Kafka Streams also updates the aggregation result on out-of-order records). Your "final result" would be the latest result record (before a window gets discarded). Depending on your use case, manual de-duplication would be a way to resolve the issue (using the lower-level API, transform() or process()).
This blog post might help, too: https://timothyrenner.github.io/engineering/2016/08/11/kafka-streams-not-looking-at-facebook.html
Another blog post addressing this issue without using punctuations: http://blog.inovatrend.com/2018/03/making-of-message-gateway-with-kafka.html
Update
With KIP-328, a KTable#suppress() operator was added that suppresses consecutive updates in a strict manner and emits a single result record per window; the tradeoff is increased latency.
From Kafka Streams version 2.1, you can achieve this using suppress.
There is an example in the Apache Kafka Streams documentation mentioned above that sends an alert when a user has fewer than three events in an hour:
KGroupedStream<UserId, Event> grouped = ...;
grouped
.windowedBy(TimeWindows.of(Duration.ofHours(1)).grace(ofMinutes(10)))
.count()
.suppress(Suppressed.untilWindowCloses(unbounded()))
.filter((windowedUserId, count) -> count < 3)
.toStream()
.foreach((windowedUserId, count) -> sendAlert(windowedUserId.window(), windowedUserId.key(), count));
As mentioned in the update of this answer, you should be aware of the tradeoff. Moreover, note that suppress() is based on event-time.
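What "based on event-time" means in practice can be seen in a toy model. The plain-Java class below is not Kafka code and the names are made up; it assumes tumbling windows and treats a window as closed once stream time (the highest record timestamp seen) reaches the window end plus the grace period.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model, not Kafka code: counts are buffered per window and released
// once, when stream time reaches the window end plus the grace period.
class FinalResultBuffer {
    private final long windowSizeMs;
    private final long graceMs;
    private long streamTime = Long.MIN_VALUE; // highest record timestamp seen so far
    private final TreeMap<Long, Long> countsByWindowStart = new TreeMap<>();
    final List<String> emitted = new ArrayList<>();

    FinalResultBuffer(long windowSizeMs, long graceMs) {
        this.windowSizeMs = windowSizeMs;
        this.graceMs = graceMs;
    }

    void onRecord(long timestampMs) {
        streamTime = Math.max(streamTime, timestampMs);
        long windowStart = timestampMs - Math.floorMod(timestampMs, windowSizeMs);
        if (!isClosed(windowStart)) {
            countsByWindowStart.merge(windowStart, 1L, Long::sum);
        } // otherwise the record is too late and is dropped
        while (!countsByWindowStart.isEmpty() && isClosed(countsByWindowStart.firstKey())) {
            long start = countsByWindowStart.firstKey();
            long count = countsByWindowStart.remove(start);
            emitted.add("[" + start + "," + (start + windowSizeMs) + ") -> " + count);
        }
    }

    private boolean isClosed(long windowStart) {
        return windowStart + windowSizeMs + graceMs <= streamTime;
    }

    public static void main(String[] args) {
        FinalResultBuffer buffer = new FinalResultBuffer(5_000L, 0L);
        buffer.onRecord(1_000L);
        buffer.onRecord(2_000L);
        buffer.onRecord(4_000L);
        System.out.println(buffer.emitted); // nothing yet: stream time is still 4000
        buffer.onRecord(6_000L);            // this record closes the window [0,5000)
        System.out.println(buffer.emitted);
    }
}
```

In this model the last window is never emitted unless another record arrives afterwards, which is the practical consequence of the event-time note above.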
I faced the same issue and solved it by adding grace(0) after the fixed window and using the Suppressed API:
public void process(KStream<SensorKeyDTO, SensorDataDTO> stream) {
buildAggregateMetricsBySensor(stream)
.to(outputTopic, Produced.with(String(), new SensorAggregateMetricsSerde()));
}
private KStream<String, SensorAggregateMetricsDTO> buildAggregateMetricsBySensor(KStream<SensorKeyDTO, SensorDataDTO> stream) {
return stream
.map((key, val) -> new KeyValue<>(val.getId(), val))
.groupByKey(Grouped.with(String(), new SensorDataSerde()))
.windowedBy(TimeWindows.of(Duration.ofMinutes(WINDOW_SIZE_IN_MINUTES)).grace(Duration.ofMillis(0)))
.aggregate(SensorAggregateMetricsDTO::new,
(String k, SensorDataDTO v, SensorAggregateMetricsDTO va) -> aggregateData(v, va),
buildWindowPersistentStore())
.suppress(Suppressed.untilWindowCloses(unbounded()))
.toStream()
.map((key, value) -> KeyValue.pair(key.key(), value));
}
private Materialized<String, SensorAggregateMetricsDTO, WindowStore<Bytes, byte[]>> buildWindowPersistentStore() {
return Materialized
.<String, SensorAggregateMetricsDTO, WindowStore<Bytes, byte[]>>as(WINDOW_STORE_NAME)
.withKeySerde(String())
.withValueSerde(new SensorAggregateMetricsSerde());
}

rx reactive extension: how to have each subscriber get a different value (the next one) from an observable?

Using reactive extension, it is easy to subscribe 2 times to the same observable.
When a new value is available in the observable, both subscribers are called with this same value.
Is there a way to have each subscriber get a different value (the next one) from this observable?
Example of what I'm after:
source sequence: [1,2,3,4,5,...] (infinite)
The source is constantly adding new items at an unknown rate.
I'm trying to execute a lengthy async action for each item using N subscribers.
1st subscriber: 1,2,4,...
2nd subscriber: 3,5,...
...
or
1st subscriber: 1,3,...
2nd subscriber: 2,4,5,...
...
or
1st subscriber: 1,3,5,...
2nd subscriber: 2,4,6,...
I would agree with Asti.
You could use Rx to populate a Queue (Blocking Collection) and then have competing consumers read from the queue. This way if one process was for some reason faster it could pick up the next item potentially before the other consumer if it was still busy.
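The competing-consumers idea is not specific to .NET. As a rough illustration, here is the same pattern in plain Java with a BlockingQueue; the class name, the worker naming and the poison-pill shutdown are choices made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of competing consumers: each item is taken by exactly one worker,
// and a faster worker simply takes more items.
class CompetingConsumers {
    private static final int POISON = -1; // tells a worker to stop

    static Map<String, List<Integer>> run(int items, int workers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Map<String, List<Integer>> handledByWorker = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            List<Integer> mine = Collections.synchronizedList(new ArrayList<>());
            handledByWorker.put("worker-" + w, mine);
            Thread thread = new Thread(() -> {
                try {
                    while (true) {
                        int item = queue.take(); // blocks until an item is available
                        if (item == POISON) {
                            return;
                        }
                        mine.add(item);          // the lengthy action would go here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (int i = 1; i <= items; i++) {
            queue.add(i);                        // the source pushes items at its own pace
        }
        for (int w = 0; w < workers; w++) {
            queue.add(POISON);                   // one stop signal per worker
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return handledByWorker;
    }

    public static void main(String[] args) {
        // The split between workers varies from run to run.
        System.out.println(run(10, 2));
    }
}
```

Which worker gets which item is not deterministic, which matches the first two example distributions in the question rather than the strict round-robin one.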
However, if you want to do it, against good advice :), then you could just use the Select operator overload that provides the index of each element. You can then pass that down to your subscribers and they can filter on a modulus. (Yuck! Leaky abstractions, magic numbers, potentially blocking, potential side effects to the source sequence etc.)
var source = Observable.Interval(TimeSpan.FromSeconds(1))
.Select((element, i) => new { Index = i, Element = element });
var subscription1 = source.Where(x=>x.Index%2==0).Subscribe(x=>DoWithThing1(x.Element));
var subscription2 = source.Where(x=>x.Index%2==1).Subscribe(x=>DoWithThing2(x.Element));
Also remember that if the work done in the OnNext handler is blocking, it will still block the scheduler it runs on. This could affect the speed of your source/producer, which is another reason why Asti's answer is the better option.
Ask if that is not clear :-)
How about:
IObservable<TRet> SomeLengthyOperation(T input)
{
return Observable.Defer(() => Observable.Start(() => {
return someCalculatedValueThatTookALongTime;
}, Scheduler.TaskPoolScheduler));
}
someObservableSource
.SelectMany(x => SomeLengthyOperation(x))
.Subscribe(x => Console.WriteLine("The result was {0}", x));
You can even limit the number of concurrent operations:
someObservableSource
.Select(x => SomeLengthyOperation(x))
.Merge(4 /* at a time */)
.Subscribe(x => Console.WriteLine("The result was {0}", x));
It's important for the Merge(4) to work, that the Observable returned by SomeLengthyOperation be a Cold Observable, which is what the Defer does here - it makes the Observable.Start not happen until someone Subscribes.
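For comparison, the same "at most N operations at a time" constraint can be sketched outside Rx. The plain-Java example below uses a fixed-size thread pool as a stand-in for Merge(4); it is an analogy under that assumption, not a translation of the Rx operators, and the names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of bounded concurrency: tasks are queued up front, but the pool
// never runs more than `limit` of them at the same time.
class BoundedConcurrency {
    // Runs `tasks` slow operations and reports the highest concurrency observed.
    static int maxObservedConcurrency(int tasks, int limit) {
        ExecutorService pool = Executors.newFixedThreadPool(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        List<Future<Integer>> results = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                final int input = i;
                // Nothing runs until the pool has a free thread, which plays the
                // role that Defer plays for the cold observable above.
                results.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    maxSeen.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stands in for the lengthy operation
                        return input * 2;
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<Integer> result : results) {
                result.get(); // wait for every operation to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("operation failed", e);
        } finally {
            pool.shutdown();
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        // Never prints a value above 4, however many tasks are queued.
        System.out.println(maxObservedConcurrency(20, 4));
    }
}
```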