How to check time difference between two events - complex-event-processing

How do I check the time difference between two events? Can someone suggest an approach?
I have different events coming from a device. Initially it sends a "started" event, and after some time it sends a "completed" event. I need to calculate the time between the "started" and "completed" events. How can I do this?

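In CEP events the time variable is of Java type Date, so you can use something like:
d1.getTime() - d2.getTime()
This gives you the difference in milliseconds. The result is positive when d1 is the later event (here, "completed") and d2 the earlier one ("started").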
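As a minimal sketch of the subtraction above, assuming both events expose their time as a java.util.Date: the class name EventDuration and the method name between are made up for illustration and are not part of any CEP engine's API.

```java
import java.time.Duration;
import java.util.Date;

/** Hypothetical helper: elapsed time between a "started" and a "completed" event. */
class EventDuration {

    /**
     * Returns completed minus started as a Duration.
     * The result is negative if the two arguments are passed in the wrong order.
     */
    static Duration between(Date started, Date completed) {
        // Date.getTime() returns milliseconds since the Unix epoch.
        long millis = completed.getTime() - started.getTime();
        return Duration.ofMillis(millis);
    }
}
```

Calling EventDuration.between(started, completed).toMillis() then gives the same millisecond value as the one-line expression, and the Duration can be converted with toSeconds() or toMinutes() if a coarser unit is more convenient.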

Related

anylogic triggering an event

I have a question about the modeling process. I want my agent to have an event that is triggered by both time and a condition, for example: goToSchool if it is after 6 am and there is a school bus. I am unsure whether to use the timeout trigger (which cannot take a condition) or the condition trigger (which cannot take a timeout). Is there an alternative?
In your example, "if it is more than 6 am" is a condition and not a timeout. A timeout trigger is used when you want an event to happen at an exact time. In your case, while "more than 6 am" is time related, it is still a condition. So I would use a condition triggered event with two conditions:
getHourOfDay() > 6 && <bus condition>
The getHourOfDay() function returns the hour of the day in 24-hour format.
Keep in mind one important thing about condition-triggered events: they are only evaluated "on change". I recommend you read this carefully:
https://help.anylogic.com/index.jsp?topic=%2Fcom.anylogic.help%2Fhtml%2Fstatecharts%2Fcondition-event.html
My recommendation would be to use the onChange() function in the block controlling your bus arrival so that the condition is evaluated each time a bus arrives.

Is the mongo timestamp type atomic with the reads?

I guess the title is confusing, but I could not find a better one.
I have an event stream in MongoDB with multiple producers and one consumer. To ensure that I read each event exactly once in the correct order, I use the MongoDB timestamp type as an incrementing value, populated by the server. In the SQL world I would probably use an auto-incremented integer.
My consumer just polls MongoDB and asks for all events since the last timestamp it has seen. In one of our environments we have noticed that the consumer sometimes does not handle all events. It does not happen very often, roughly one in 50,000 events is missed, but ideally it should not happen at all.
My assumption is that MongoDB does something like this internally:
ParseDocument(doc);
lock
{
SetTimestamp(doc);
}
WriteDocument(doc);
UpdateIndex(doc);
So it could happen that, for a very short period of time, a document is not available when the consumer queries the events, because only events #1, #2 and #4 have been written so far and event #3 is written a fraction of a millisecond later.
I have seen this with a C# client and MongoDB 4.2 running in Docker, but I guess the client does not matter here.
Is this assumption correct and, if so, what can I do about it?
My idea is to change my consumer to ask for all events since the last timestamp minus a few seconds and then filter out the already received events in the consumer.
But is there a more elegant solution? Perhaps some way to enforce collection level write locks or could transactions help?
Since you said "consumer" - singular, I suggest:
Use a change stream to be notified of events. Change stream, if correctly iterated, will not skip changes nor will it return the same change twice.
Whenever the change stream returns a document and the single consumer processes it, add a counter to that document. Since there is only one consumer, it is relatively easy to implement the counter without race conditions and the like.
Also write the current resume token into each event being processed.
If you wish, you can use the counter to uniquely identify the events.
To iterate events again, use the counter to look up events in the past. Given that each event has both a counter and a resume token, once you get to the most recent event you can seamlessly transition from iterating based on the counter to iterating based on the resume token.

Session window how calculate gap?

I am trying to understand this diagram of a session window:
As I understand it, we have four events:
12:00:00 - event started in this time
12:00:25 - another event was ended
12:00:30 - event started in this time
12:00:50 - another event was ended
How do we get a gap of 15 seconds?
Could you explain what start/end means: is it one event or two different events?
Events don't have a start or end time, but only a single scalar event-timestamp.
If you use session windows, events whose time difference to each other is smaller than the gap parameter fall into the same window.
Thus, the start and end of a session window always correspond to an event.
Note that session windows are not designed for the case where you have dedicated start/end events in your input stream. Think of session windows more as a "session detection" scenario, i.e., you don't have sessions in your input stream and want to sessionize your input data based on the record timestamps.
Check out the docs for more details: https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#session-windows
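To make the gap rule concrete, here is a small plain-Java sketch of session detection over sorted timestamps. It does not use the Kafka Streams API; the class name Sessionizer and its method are made up for illustration, and treating a difference exactly equal to the gap as a new session is an assumption of this sketch, so check the library docs for the real boundary behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups sorted event timestamps into sessions. */
class Sessionizer {

    /**
     * @param sortedTimestamps event timestamps in ascending order (any unit)
     * @param gap inactivity gap, in the same unit as the timestamps
     * @return one list of timestamps per detected session
     */
    static List<List<Long>> sessionize(List<Long> sortedTimestamps, long gap) {
        List<List<Long>> sessions = new ArrayList<>();
        List<Long> current = null;
        long previous = 0L;
        for (long ts : sortedTimestamps) {
            // Open a new session for the first event, or when the distance
            // to the previous event is not smaller than the gap (assumption:
            // a difference equal to the gap also starts a new session).
            if (current == null || ts - previous >= gap) {
                current = new ArrayList<>();
                sessions.add(current);
            }
            current.add(ts);
            previous = ts;
        }
        return sessions;
    }
}
```

Taking the four timestamps from the question as offsets in seconds (0, 25, 30, 50) with a gap of 15, this sketch produces three sessions: [0], [25, 30] and [50]. Each session starts and ends at an event timestamp, and the gap itself is a parameter you choose rather than something derived from the data.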

Delay fixed window from triggering for several minutes

Using Fixed Windows in Apache Beam. The watermark is set by the event time.
Some data may arrive out of order and cause the window to close.
How can a trigger be defined in Java to occur say 2 minutes after the last data was seen?
It's not entirely clear what behavior you expect. One question is what you expect to happen if more data arrives within the two minutes. Do you want to restart the two-minute interval or not, and do you want to re-emit the data or not?
Looks like the trigger you are trying to describe is something along these lines:
1. wait until the watermark has passed the end of the window, in event time;
2. wait for an additional 2 minutes in processing time;
3. emit the data.
If in step 2 it was event time, i.e. you wanted to re-emit the window if a late element arrives that fits within window + 2min, then you could use withAllowedLateness(). Though it sounds different from what you want, because it can keep re-emitting the window contents every time a matching late element arrives.
With processing time in step 2, this is not possible in general with the basic triggers that are available in Beam. You can probably achieve the behavior you want if you manually manage state and timers in your own ParDo, e.g. you can watch for the incoming elements, keep track of them in the state, and then emit what you want when the timer fires. This can become very complicated and might still not be flexible enough for your specific use case.
One of the major problems is that there is no good way to define processing time triggers in Beam in general. It would be complicated to define a general mechanism of working with timers in this manner. For example, when you want to express "wait for 2 minutes", the framework needs to understand in relation to what these two minutes are, when to start the timer, so you need a mechanism to express that as well. And with composition, continuation and other complications this doesn't seem easy to reason about. So it's not in the framework in this general form.
In order to implement only the "wait for 2 minutes after the last element was seen in the window", the framework has to watch for it and set the timer. Technically it is possible to do something like this but doesn't seem like anyone has done it yet.
There seems to be only one meaningful processing time trigger available in Beam but it's not generic enough and doesn't do what you want. You can look at composite triggers like AfterFirst or AfterAll but they likely won't help you without a better general processing time trigger.
I decided against using Beam and implemented the solution in Kafka Streams.
I basically grouped by key, then used fixed windows and aggregated the result.
The "grace" on the window allows data to arrive late.
KGroupedStream<Long, OxyStreamItem> grouped = input.groupByKey();
TimeWindowedKStream<Long, OxyStreamItem> windowed =
    grouped.windowedBy(
        TimeWindows.of(WIN_SIZE)
            .advanceBy(WIN_SIZE)
            .grace(Duration.ofSeconds(5L)));
return windowed
    .aggregate(
        makeInitializer(),
        makeAggregator(),
        Materialized
            .<Long, Aggregate, WindowStore<Bytes, byte[]>>as("tmp")
            .withValueSerde(new AggregateSerde()))
    .suppress(
        Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
    .toStream()
    .map(calculateAvg());

Flink session window with onEventTime trigger?

I want to create an event-time based session window in Flink, such that it triggers when the event time of a new message is more than 180 seconds greater than the event time of the message that created the window.
For example:
t1(0 seconds) : msg1 <-- This is the first message which causes the session-windows to be created
t2(13 seconds) : msg2
t3(39 seconds) : msg3
.
.
.
.
t7(190 seconds) : msg7 <-- The event time (t7) is more than 180 seconds after t1 (t7 - t1 = 190), so the window should be triggered and processed now.
t8(193 seconds) : msg8 <-- This message, and all subsequent messages have to be ignored as this window was processed at t7
I want to create a trigger such that the above behavior is achieved through appropriate watermark or onEventTime trigger. Can anyone please provide some examples to achieve this?
The best way to approach this might be with a ProcessFunction, rather than with custom windowing. If, as shown in your example, the events will be processed in timestamp order, then this will be pretty straightforward. If, on the other hand, you have to handle out-of-order events (which is common when working with event time data), it will be somewhat more complex. (Imagine that msg6, with event time 187, arrives after msg8. If that's possible, and if that will affect the results you want to produce, then this has to be handled.)
If the events are in order, then the logic would look roughly like this:
Use an AscendingTimestampExtractor as the basis for watermarking.
Use Flink state (perhaps ListState) to store the window contents. When an event arrives, add it to the window and check to see if it has been more than 180 seconds since the first event. If so, process the window contents and clear the list.
If your events can be out-of-order, then use a BoundedOutOfOrdernessTimestampExtractor, and don't process the window's contents until currentWatermark indicates that event time has passed 180 seconds past the window's start time (you can use an event time timer for this). Don't completely clear the list when triggering a window, but just remove the elements that belong to the window that is closing.
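The in-order case described above can be sketched in plain Java, without any Flink API, to show just the buffering and the 180-second check. The class name InOrderWindowBuffer is made up for illustration, and the in-memory list only stands in for Flink state such as ListState; in a real job this logic would live inside a ProcessFunction on a keyed stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java sketch of the in-order logic; not Flink API code. */
class InOrderWindowBuffer {

    private static final long MAX_SPAN_SECONDS = 180L;

    // Stands in for Flink keyed state such as ListState.
    private final List<Long> buffered = new ArrayList<>();

    /**
     * Adds an event time (in seconds) to the buffer. If this event is more
     * than 180 seconds after the first buffered event, returns the buffered
     * window contents (including this event) and clears the buffer;
     * otherwise returns an empty list.
     */
    List<Long> onEvent(long eventTimeSeconds) {
        buffered.add(eventTimeSeconds);
        long first = buffered.get(0);
        if (eventTimeSeconds - first > MAX_SPAN_SECONDS) {
            List<Long> window = new ArrayList<>(buffered);
            buffered.clear();
            return window;
        }
        return List.of();
    }
}
```

Fed the event times 0, 13, 39 and 190 from the question, the sketch returns nothing for the first three and the whole buffer for 190. Note that, following the steps in the answer, the triggering event is part of the emitted window and the next event (193) simply starts a fresh buffer; ignoring later messages, as the question asks, would need an extra flag.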