NEsper/Esper EPL event statement - complex-event-processing

Can anyone help me define the EPL statement to catch the event when the following situation occurs:
Assume that there are events with 3 attributes: (string) Symbol, (boolean) Value, (datetime) Timestamp.
Events that have the same Symbol and the same Timestamp but opposite Values (one true, one false) should be captured. For example, event1(Symbol - apple, Value - True, Timestamp - 20210614-14:00:00) and event2(Symbol - apple, Value - False, Timestamp - 20210614-14:00:00).
Events with different Symbols (like apple and banana) should be ignored (not captured).
Thanks for any help.
Narsu

This would match two events immediately following each other (no criteria were stated as to what can come in between)
select * from MyEvent
match_recognize (
partition by symbol
measures a, b
pattern (a b)
define
b as b.timestamp = a.timestamp and b.value != a.value
)
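
To make the matching rule easier to reason about outside the engine, here is a rough Python sketch of what the statement above is meant to do: keep the last unmatched event per symbol and report a pair when the next event for that symbol has the same timestamp and a different value. The `Event` tuple and the `conflicting_pairs` name are made up for illustration, and the reset after a match is my assumption about how non-overlapping matches behave, not something taken from the Esper documentation.

```python
from collections import namedtuple

# Hypothetical event shape mirroring the three attributes in the question.
Event = namedtuple("Event", ["symbol", "value", "timestamp"])


def conflicting_pairs(events):
    """Yield (a, b) pairs of adjacent events per symbol that share a
    timestamp but carry different values."""
    last_by_symbol = {}  # one slot per symbol, like "partition by symbol"
    for event in events:
        previous = last_by_symbol.get(event.symbol)
        if (previous is not None
                and previous.timestamp == event.timestamp
                and previous.value != event.value):
            yield previous, event
            # Assumption: a matched pair is consumed, so the next event
            # for this symbol starts a fresh candidate pair.
            del last_by_symbol[event.symbol]
        else:
            # No match: the current event becomes the new candidate "a".
            last_by_symbol[event.symbol] = event
```

Feeding it apple/True, banana/False and apple/False with the same timestamp yields only the apple pair, because the banana event lives in a different partition.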

Related

Siddhi query with conditions within multiple occurrences

We can write a Siddhi query that matches several occurrences of an event under some condition.
For example, to match 3 events with customerId 'xyz' and source 'log', we can use
from every (e1 = CargoStream[e1.customerId == 'xyz' AND e1.source == 'log']<3>)
But what we need to do is add conditions between these 3 events.
For example, all three events should have the same source, whatever its value is.
from every (e1 = CargoStream[e1.customerId == 'xyz' AND all these 3 events have same source does not matter the value]<3>)
We tried a query that accesses the indexed events within the occurrences, but it does not seem to trigger events reliably.
from every (e1 = CargoStream[e1.customerId == 'xyz' AND (e1[0].source == e1[1].source AND e1[1].source == e1[2].source)]<3>)
Is this even possible with a Siddhi query? If yes, then how?
To apply the same condition across the events, you can use partitions:
https://siddhi.io/en/v5.1/docs/query-guide/#partition
Also, look into this issue: https://github.com/siddhi-io/siddhi/issues/1425
The query would look like this:
define stream AuthenticationStream (ip string, type string);
#purge(enable='true', interval='15 sec', idle.period='2 min')
partition with (ip of AuthenticationStream)
begin
from every (e1=AuthenticationStream[type == 'FAILURE' ]<1:> ->
e2=AuthenticationStream[type == 'SUCCESS' ]) within 1 min
select e1[0].ip as ip, e1[3].ip as ip4
having not(ip4 is null)
insert into BreakIn
end;
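
As a rough illustration of why partitioning helps with the "same source, whatever the value" requirement: once events are routed by a key, every event inside one partition shares that key, so a plain count of 3 is enough. The Python sketch below assumes events are dicts with `customerId` and `source` keys and partitions on `source`. The function name is hypothetical, and the `within` and purge behaviour of the Siddhi query are not modelled.

```python
from collections import defaultdict


def groups_of_three_same_source(events, customer_id="xyz"):
    """Yield non-overlapping groups of 3 events that share a source."""
    pending = defaultdict(list)  # one buffer per source, like one partition per key
    for event in events:
        if event["customerId"] != customer_id:
            continue  # filter condition, like CargoStream[customerId == 'xyz']
        bucket = pending[event["source"]]
        bucket.append(event)
        if len(bucket) == 3:
            # Three events in the same partition necessarily share a source.
            yield tuple(bucket)
            bucket.clear()
```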

Esper distinct events on multiple attributes

I have a problem with the stream semantics in Esper. My aim is to output only events with pairwise distinct attributes. Additionally, there are temporal conditions which have to hold between the attributes (see Esper's Interval Algebra Reference).
An example statement:
insert into output_stream select a.*, b.*
from stream1#length(100) as a, stream2#length(100) as b
where a.before(b) or a.meets(b) or a.overlaps(b)
Pairwise distinct attributes means, I want to ensure that there are no two outputs o1, o2 where o1.a = o2.a or o1.b = o2.b. To give a more concrete example, if there are the results
o1: (a = a1, b = b1),
o2: (a = a1, b = b2),
o3: (a = a2, b = b2),
o4: (a = a2, b = b1)
only two of them shall be output (e.g. o1 and o3 or o2 and o4). Which one does not matter for now.
I wanted to accomplish the pairwise distinct attributes with a NOT EXISTS clause like this:
NOT EXISTS (
select * from output_stream#length(100) as otherOutput
where a = otherOutput.a or b = otherOutput.b )
which works partly: for successive outputs, the condition o1.a = o2.a or o1.b = o2.b never holds.
However, when stream1 first delivers multiple "a"s and then stream2 delivers one "b", that matches the conditions to be joined with both "a"s, there are multiple outputs at once. This is not covered by my NOT EXISTS clause, because in the same step multiple outputs with the same "b" occur, and thus they are not yet in the output_stream.
The distinct keyword is not suitable here, since it checks all attributes together and not pairwise. Likewise, a simple group by on all attributes is unsuitable. I would love to have something like "distinct on a and distinct on b" as a criterion, but it does not exist.
I could possibly solve this with nested group bys where I group on each attribute
select first(*) from (select first(*) from output_stream group by a) group by b
but according to one comment, this has no well-defined semantics in stream processing systems, and Esper does not allow subqueries in the from part of the query.
What I need is a way to force only one output at a time, so that the NOT EXISTS condition is rechecked for every further output, or a way to check the outputs that occur at the same time against one another before actually inserting them into the stream.
Update:
Timing of the output is not very critical. The output_stream will be used by other such statements, so I can account for delays by increasing the length of the windows. stream1 and stream2 deliver events in the order of their startTimestamp property.
create schema Pair(a string, b string);
create window PairWindow#length(100) as Pair;
insert into PairWindow select * from Pair;
on PairWindow as arriving select * from PairWindow as other
where arriving.a = other.a or arriving.b = other.b
Here is a sample self-join using a named window that keeps the last 100 pairs.
EDIT: The query above was designed for my understanding of the original requirements. The query below is designed for the new clarifications. It checks whether "a" or "b" had any previous value (in the last 100 events; leave off #length(100) as needed).
create schema Pair(a string, b string);
create window PairUniqueByA#firstunique(a)#length(100) as Pair;
create window PairUniqueByB#firstunique(b)#length(100) as Pair;
insert into PairUniqueByA select * from Pair;
insert into PairUniqueByB select * from Pair;
select * from Pair as pair
where not exists (select a from PairUniqueByA as uba where uba.a = pair.a)
and not exists (select a from PairUniqueByB as ubb where ubb.b = pair.b);
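
For comparison, the requirement as stated in the question (no two outputs may share an "a" or a "b") can be sketched in plain Python as a greedy filter that remembers the attribute values of pairs it has already emitted. This is not a translation of the EPL above: as written, the two windows are fed from every arriving Pair, while this sketch only records accepted pairs, and it keeps an unbounded history instead of the last 100. The function name is hypothetical.

```python
def pairwise_distinct(pairs):
    """Yield only pairs whose a and b were not used by an earlier output."""
    seen_a, seen_b = set(), set()
    for a, b in pairs:
        if a in seen_a or b in seen_b:
            continue  # would collide with an already emitted pair
        seen_a.add(a)
        seen_b.add(b)
        yield (a, b)
```

Because each candidate is checked one at a time against everything already accepted, several candidates arriving in the same step cannot slip through together.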

Can you do a sub select within a Case statement

Probably something really trivial but I haven't quite found the answer I am looking for on the internet and I get syntax errors with this. What I want/need to do is to provide a special case in my where clause where the doctype is 1. If it is, then it needs to match the claimID from a sub select of a temp table. If the doctype is not a 1 then we just need to continue on and ignore the select.
AND
CASE
WHEN #DocType = 1 THEN (c.ClaimID IN (SELECT TNE.ClaimID FROM TNE)
END
I have seen some for if statements but I didn't seem to get that to work and haven't found anything online as of yet that shows a case statement doing what I would like. Is this even possible?
You don't need a case statement, you could do:
AND (#DocType <> 1 or c.ClaimID in (SELECT TNE.ClaimID FROM TNE))
A CASE expression (not statement) returns a single value. SQL Server supports the bit data type. (Valid values are 0, 1, 'TRUE' and 'FALSE'.) There is a boolean data type (with values TRUE, FALSE and UNKNOWN), but you cannot get a firm grip on one. Your CASE expression attempts to return a boolean, give or take the unmatched parenthesis, which is not supported in this context.
You could use something like this, though Luc's answer is more applicable to the stated problem:
and
case
when #DocType = 1 and c.ClaimId in ( select TNE.ClaimId from TNE ) then 1
when #DocType = 2 and ... then 1
...
else 0
end = 1
Note that the CASE returns a value which you must then compare (= 1).
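
If you want to try the OR form without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module as a stand-in. The `Claims` table and its rows are invented for the example, the doc type is passed as a bound parameter instead of a variable, and SQLite is obviously not SQL Server, so treat it only as a check of the boolean logic.

```python
import sqlite3


def matching_claims(doc_type):
    """Return the ClaimIDs selected for the given doc type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Claims (ClaimID INTEGER);  -- hypothetical main table
        CREATE TABLE TNE (ClaimID INTEGER);     -- stands in for the temp table
        INSERT INTO Claims VALUES (1), (2), (3);
        INSERT INTO TNE VALUES (2);
    """)
    rows = conn.execute(
        """
        SELECT c.ClaimID
        FROM Claims AS c
        WHERE (? <> 1 OR c.ClaimID IN (SELECT TNE.ClaimID FROM TNE))
        ORDER BY c.ClaimID
        """,
        (doc_type,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With doc type 1 only the claim listed in TNE comes back; with any other doc type the subselect is effectively ignored and all claims are returned.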

SQLAlchemy Core: Creating a last_value window function for postgresql

I'm trying to create the following PostgreSQL query using SQLAlchemy Core:
SELECT DISTINCT ON (carrier) carrier,
LAST_VALUE(ground) OVER wnd AS ground,
LAST_VALUE(destinationzipstart) OVER wnd AS destinationzipstart,
LAST_VALUE(destinationzipend) OVER wnd AS destinationzipend
FROM tblshippingzone
WHERE sourcezipstart <= 43234
AND sourcezipend >= 43234
AND destinationzipstart NOT BETWEEN 99500 AND 99950
AND destinationzipstart NOT BETWEEN 96700 AND 96899
AND destinationzipstart >= 1000
AND (contiguous IS NULL OR contiguous = True)
AND ground IS NOT NULL
WINDOW wnd AS (
PARTITION BY carrier ORDER BY ground DESC, destinationzipstart);
This is what I have so far:
# Short-hand for accessing cols
all_cols = ShippingZoneDAL._table.c
# Window params
window_p = {'partition_by': all_cols.carrier,
'order_by': [desc(all_cols.ground), all_cols.destination_zip_start]}
# Select columns
select_cols = [distinct(all_cols.carrier).label('carrier'),
over(func.last_value(all_cols.ground), **window_p).label('ground'),
over(func.last_value(all_cols.destination_zip_start), **window_p).label('destination_zip_start'),
over(func.last_value(all_cols.destination_zip_end), **window_p).label('destination_zip_end')]
# Filter exprs
exprs = [all_cols.source_zip_start <= 43234,
all_cols.source_zip_end >= 43234,
~all_cols.destination_zip_start.between(99500, 99950), # Alaska zip codes
~all_cols.destination_zip_start.between(96700, 96899), # Hawaii zip codes
all_cols.destination_zip_start >= 1000, # Eliminates unusual territories
or_(all_cols.contiguous == True, all_cols.contiguous == None),
all_cols.ground != None]
# Build query
query = select(*select_cols).where(and_(*exprs))
But I get an error when building the query:
ArgumentError: FROM expression expected
Any ideas what I'm missing here?
BONUS POINTS:
I originally wanted the window function to be this instead:
WINDOW wnd AS (
PARTITION BY carrier ORDER BY ground
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
But it seemed like sqlalchemy didn't support the 'ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING', based on this support request:
https://bitbucket.org/zzzeek/sqlalchemy/issue/3049/support-range-specificaiton-in-window
Is there a way to use that clause, or no?
It was mostly just a matter of re-arranging various methods into a working order. Here's the answer, if anyone runs into something similar:
# Imports assumed by this snippet
from sqlalchemy import select, func, desc, and_, or_
# Short-hand for accessing cols
all_cols = ShippingZoneDAL._table.c
# Window params
window_p = {'partition_by': all_cols.carrier,
'order_by': [desc(all_cols.ground), all_cols.destination_zip_start]}
# Select columns
select_cols = select(
[all_cols.carrier,
func.last_value(all_cols.ground).over(**window_p).label('ground'),
func.last_value(all_cols.destination_zip_start).over(**window_p).label('destination_zip_start'),
func.last_value(all_cols.destination_zip_end).over(**window_p).label('destination_zip_end')])
# Filter exprs
exprs = [all_cols.source_zip_start <= 43234,
all_cols.source_zip_end >= 43234,
~all_cols.destination_zip_start.between(99500, 99950),
~all_cols.destination_zip_start.between(96700, 96899),
all_cols.destination_zip_start >= 1000,
or_(all_cols.contiguous == True, all_cols.contiguous == None),
all_cols.ground != None]
# Build query
query = select_cols.where(and_(*exprs)).distinct(all_cols.carrier)
Key notes to keep in mind with the solution above:
SQLAlchemy Core won't see select(*select_cols) as equivalent to select([all_cols.ground, etc]) in this scenario. Probably because the over method needs to be computed in the context of a select, or you lose reference to the FROM table.
To use DISTINCT ON from PostgreSQL, make sure the distinct comes after the primary select. If just used in the SELECT itself, it will just become a standard DISTINCT clause for that column.
Be careful with the labels themselves - the columns returned will only have key defined, and not name like a normal table column from the object.
If anyone still wants to tackle my bonus question, feel free to :) Still not sure if there's a way to use that yet in SQLAlchemy.
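
On the bonus question, it may help to see why the frame clause matters for LAST_VALUE at all. The sketch below uses Python's built-in sqlite3 (SQLite 3.25 or newer is needed for window functions) as a stand-in for PostgreSQL, with a made-up `zones` table. With the default frame, LAST_VALUE only looks up to the current row; with ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING it sees the whole partition. As far as I know, later SQLAlchemy releases accept a `rows=(None, None)` argument on `over()` for this frame, but check that against the version you are using.

```python
import sqlite3


def last_values(frame_clause=""):
    """Return (ground, LAST_VALUE(ground)) rows for the given frame clause."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE zones (carrier TEXT, ground INTEGER);  -- hypothetical table
        INSERT INTO zones VALUES ('ups', 1), ('ups', 2), ('ups', 3);
    """)
    rows = conn.execute(
        "SELECT ground, LAST_VALUE(ground) OVER wnd "
        "FROM zones "
        "WINDOW wnd AS (PARTITION BY carrier ORDER BY ground " + frame_clause + ") "
        "ORDER BY ground"
    ).fetchall()
    conn.close()
    return rows


default_frame = last_values()
full_frame = last_values("ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING")
```

Here `default_frame` pairs each row with its own value, while `full_frame` pairs every row with the last value in the partition.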

"Or" statement - ORMLite

I'm trying to do this query with ORMLite, but I just can't use the or() statement properly.
SELECT DISTINCT x FROM x x INNER JOIN x.w w WHERE :date >= x.startDate
AND w.company.id = :companyId AND w.status = :status AND x.status =
:status AND (x.endDate = NULL OR x.endDate >= :date)
My code:
QueryBuilder<x, Integer> xQB = this.xDao.queryBuilder();
xQB.where().eq("status", StatusEnum.ENABLED).and().le("startDate", date)
.and().ge("endDate", date).or().isNull("endDate");
If date is less than startDate, this statement still returns rows where endDate is null. If I remove the or() call, everything works fine.
Thanks.
I have a feeling you are getting confused around the AND and OR grouping. Any and() or or() operation (without args) takes the previous element on the stack and combines it with the next argument and then pushes the result on the stack. So a AND b AND c OR e turns into approximately (((a AND b) AND c) OR e).
The Where class also has and(...) and or(...) methods that take arguments and wrap the comparisons in parentheses. This is useful in situations like yours, when you need to be explicit about which two things you are combining. I'd change yours to be:
QueryBuilder<x, Integer> xQB = this.xDao.queryBuilder();
Where<x, Integer> where = xQB.where();
where.and(where.eq("status", StatusEnum.ENABLED),
where.and(where.le("startDate", date),
where.or(where.ge("endDate", date), where.isNull("endDate"))));
This should generate approximately `(a AND (b AND (c OR d)))`, which seems to be what you want.
To see the various ways to construct queries, check out the docs:
http://ormlite.com/docs/building-queries
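
To see the difference between the two groupings without touching a database, here is a tiny Python sketch that evaluates both shapes on plain booleans. The function and parameter names are made up; each parameter stands for the result of one comparison in the query.

```python
def chained(status_ok, start_ok, end_ok, end_is_null):
    # Grouping produced by eq().and().le().and().ge().or().isNull():
    # (((a AND b) AND c) OR d)
    return ((status_ok and start_ok) and end_ok) or end_is_null


def nested(status_ok, start_ok, end_ok, end_is_null):
    # Grouping produced by the explicit and(...)/or(...) calls:
    # (a AND (b AND (c OR d)))
    return status_ok and (start_ok and (end_ok or end_is_null))
```

For a row where the start date check fails but endDate is null, `chained(True, False, False, True)` is true while `nested(True, False, False, True)` is false, which is exactly the symptom described in the question.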