Suppose I have a chained F.when().otherwise() condition:
F.when(Condition A, 1).when(Condition B, 2).otherwise(0)
and both Condition A and Condition B are fulfilled, which takes precedence? Is there any way to stop after a condition is fulfilled, so evaluation does not cascade into subsequent conditions?
The first condition encountered takes precedence: the chained when clauses are evaluated in order, and the first one that is satisfied supplies the value. In your case that is Condition A, so the output is 1.
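As a mental model (pure Python, not the Spark API; the names here are illustrative), the chained form behaves like a first-match-wins cascade:

```python
def chained_when(branches, default, row):
    # Toy model of F.when(condA, 1).when(condB, 2).otherwise(0):
    # conditions are tried in the order the .when() calls were chained,
    # and the first one that holds supplies the value.
    for condition, value in branches:
        if condition(row):
            return value
    return default

row = {"x": 10}
result = chained_when(
    [(lambda r: r["x"] > 0, 1),   # Condition A: true
     (lambda r: r["x"] > 5, 2)],  # Condition B: also true, but never reached
    0,
    row,
)
print(result)  # 1 -- Condition A comes first, so it takes precedence
```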
I have some complex code in which I am using when to create a new column under some conditions. Consider the following:
df.select(
    '*',
    F.when((A) | (B) | (C), top_val['val']).alias('match'))
Let A, B, and C be my conditions. I want to impose an order on them like this:
If A is satisfied, don't check B and C.
If B is satisfied, don't check C.
Is there any way to enforce this order?
As stated in this blog and quoted in this answer, I don't think you can guarantee the order of evaluation of an or expression.
Spark SQL (including SQL and the DataFrame and Dataset API) does not guarantee the order of evaluation of subexpressions. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. For example, logical AND and OR expressions do not have left-to-right “short-circuiting” semantics.
However, you can nest the when() inside .otherwise() to form a cascade like this and achieve what you want:
df.select(
    '*',
    F.when(A, top_val['val'])
     .otherwise(F.when(B, top_val['val'])
     .otherwise(F.when(C, top_val['val']))).alias('match'))
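To see why the nesting pins the order, here is a pure-Python sketch (illustrative names, not the Spark API) in which each condition records when it is checked:

```python
checked = []

def make_cond(name, result):
    # Returns a predicate that logs its own evaluation.
    def cond(row):
        checked.append(name)
        return result
    return cond

A = make_cond("A", True)
B = make_cond("B", True)
C = make_cond("C", True)

def nested_when(row):
    # Mirrors F.when(A, v).otherwise(F.when(B, v).otherwise(F.when(C, v))):
    # each inner .when() is only reached if the outer condition failed.
    if A(row):
        return "val"
    if B(row):
        return "val"
    if C(row):
        return "val"
    return None

nested_when({})
print(checked)  # ['A'] -- B and C were never evaluated
```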
I am using the match_recognize syntax for CEP querying with Esper. I noticed that after matching some events, it ignores them for future matches. For example, with the following simple pattern:
select * from Event
match_recognize (
measures A as a, B as b, C as c
pattern (A B C)
)
it would match events 1, 2, and 3 in the stream, and after that events 4, 5, and 6. But I want it to match 1, 2, 3, then events 2, 3, 4, then 3, 4, 5, and so forth (of course I'll add more conditions later).
Is there some simple adjustment to this syntax that would do that?
Look at after match skip in the syntax (doc link).
match_recognize (
...
after match skip to current row
pattern (...)
)
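The difference between the default skip behaviour and skip to current row can be sketched in Python (a toy model of the matcher, not Esper itself; the exact restart semantics are an assumption here):

```python
def find_matches(events, pattern_len, skip_past_last_row=True):
    # Toy model of MATCH_RECOGNIZE restart behaviour.
    # skip_past_last_row=True  ~ the default: resume AFTER the matched rows,
    #                            so matches never overlap.
    # skip_past_last_row=False ~ after match skip to current row: resume one
    #                            row after the start of the previous match,
    #                            so matches may overlap.
    matches = []
    i = 0
    while i + pattern_len <= len(events):
        matches.append(events[i:i + pattern_len])
        i += pattern_len if skip_past_last_row else 1
    return matches

events = [1, 2, 3, 4, 5, 6]
print(find_matches(events, 3))         # [[1, 2, 3], [4, 5, 6]]
print(find_matches(events, 3, False))  # [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
```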
I have a SQL SELECT statement that reads items. There are conditions for which items to display, but when one condition fails, I don't need to check the others.
For example:
where item like 'M%'
and item_class='B'
and fGetOnHand(item)>0
If either of the first two fails, I do NOT want to evaluate the last one (a call to a user-defined function).
From what I have read on this site, SQL Server's AND and OR operators do not guarantee short-circuit evaluation. This means the call to the UDF could happen first, or not at all, should one of the other conditions be evaluated first and fail.
You can rewrite the logic using a CASE expression, whose evaluation order is fixed:
WHERE
    CASE WHEN item NOT LIKE 'M%' OR item_class <> 'B'
             THEN 0
         WHEN fGetOnHand(item) <= 0
             THEN 0
         ELSE 1
    END = 1
The above forces the checks on item and item_class to happen first. Should either fail, the first branch of the CASE expression evaluates to 0 and the condition fails. Only if both checks pass is the UDF evaluated.
This is verbose, but if the UDF call carries a serious penalty, phrasing your WHERE clause this way may be worth trading verbose code for better performance.
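Here is a sketch of the idea using SQLite (whose CASE also evaluates WHEN branches in order and stops at the first match) with a Python stand-in for the UDF that records each call. fGetOnHand's behaviour and the sample data are hypothetical:

```python
import sqlite3

calls = []

def fGetOnHand(item):
    # Stand-in for the expensive user-defined function; logs each call.
    calls.append(item)
    return 10 if item == "M100" else 0

conn = sqlite3.connect(":memory:")
conn.create_function("fGetOnHand", 1, fGetOnHand)
conn.execute("CREATE TABLE items (item TEXT, item_class TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [("M100", "B"),   # passes both cheap checks
                  ("M200", "A"),   # wrong class  -> UDF skipped
                  ("X300", "B")])  # wrong prefix -> UDF skipped

rows = conn.execute("""
    SELECT item FROM items
    WHERE CASE WHEN item NOT LIKE 'M%' OR item_class <> 'B' THEN 0
               WHEN fGetOnHand(item) <= 0 THEN 0
               ELSE 1 END = 1
""").fetchall()

print(rows)   # [('M100',)]
print(calls)  # ['M100'] -- the UDF ran only for the row that passed the cheap checks
```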
I have a decision table (image not shown), and my input to it looks like this:
A = 1, B = 1, 4, 5, and C = 1.
The requirement is that decision-table processing should halt when the first match is encountered. In this case that is row 1 (B = 1); it should not go on to check B = 4 or B = 5.
Please advise how to achieve this. I am using ODM 8.9. Thanks.
One way to execute just a single row of a decision table is to specify the Exit Criteria in the properties of the rule task in which the decision table appears. If the Exit Criteria is set to Rule Instance, only one rule will fire: after the first rule fires, the rule task ends. If your decision table is the only thing in the rule task, this gives the desired behavior. It also works if the decision table is the first thing evaluated in the rule task, in terms of order and priority.
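A toy model of that Exit Criteria behaviour in plain Python (not ODM; the row conditions and actions below are illustrative):

```python
def run_decision_table(rows, facts, exit_on_first_instance=True):
    # Toy model: with Exit Criteria = Rule Instance, the task ends as soon
    # as one rule fires; otherwise every matching row fires.
    fired = []
    for condition, action in rows:
        if condition(facts):
            fired.append(action)
            if exit_on_first_instance:
                break  # rule task ends after the first rule instance
    return fired

facts = {"A": 1, "B": {1, 4, 5}, "C": 1}
rows = [
    (lambda f: f["A"] == 1 and 1 in f["B"], "row 1 (B = 1)"),
    (lambda f: f["A"] == 1 and 4 in f["B"], "row 2 (B = 4)"),
    (lambda f: f["A"] == 1 and 5 in f["B"], "row 3 (B = 5)"),
]
print(run_decision_table(rows, facts))         # ['row 1 (B = 1)'] -- halts at the first match
print(run_decision_table(rows, facts, False))  # all three rows fire
```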
In PostgreSQL, if we have such a query:
SELECT --
FROM --
WHERE --
GROUP BY --
HAVING
func1 AND func2;
I think there are three possible strategies for the planner:
func1 is evaluated on the target list first, then func2 on the same target list
func1 is evaluated first, producing a smaller result set, and then func2 is evaluated on that smaller set
supposing func1 costs c1 and func2 costs c2 with c1 > c2: func2 is evaluated first, producing a smaller result set, and then func1 is evaluated on that smaller set
Which one does PostgreSQL actually use?
If either func is a non-aggregate and non-VOLATILE expression, the planner may effectively move it to the WHERE clause.
Otherwise, (func1 AND func2) would be applied as a single filter expression on the resulting groups. At this point, the executor's lazy boolean evaluation rules kick in; if the first condition evaluates to false, it will not bother to execute the second. So the behaviour is closest to your second or third options, but performed in a single pass of the result set.
The order of evaluation is up to the planner, so in theory it may decide to execute func2 first. However, I'm not sure what might trigger this behaviour; even when func1 has a cost of 1000000000, it still seems to favour left-to-right evaluation.
The EXPLAIN ANALYSE output will show you where in the execution plan these conditions are applied, and by adding some RAISE NOTICE statements to the bodies of the functions, you can observe the exact sequence of function calls.
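The lazy-AND behaviour described above can be illustrated in plain Python, whose and operator also short-circuits in a single pass (the logging functions are illustrative, not PostgreSQL internals):

```python
calls = []

def func1(g):
    calls.append(("func1", g))
    return g % 2 == 0      # evaluated for every group

def func2(g):
    calls.append(("func2", g))
    return g > 2           # only reached when func1 was true

groups = [1, 2, 3, 4]
kept = [g for g in groups if func1(g) and func2(g)]  # single pass, lazy AND

print(kept)  # [4]
print(calls)
# func1 ran for every group, but func2 ran only for 2 and 4 --
# analogous to the executor skipping func2 when func1 is false
```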