SQL translation of IF statement to Power BI DAX query

I'm kinda new to DAX and Power BI, and I need to translate my SQL IF statement into the equivalent Power BI syntax to get the output I want.
Sql code I want to translate:
IF (Payment.payment>0) AND (Account.PV = Account.GV) THEN 1 ELSE 0
I want to make a calculated column on the Payment table that returns 1 or 0, so that I can use it to filter all the records that meet my condition.
The two tables are related on account_id.
Here is some sample data for reference: Account table
account_id  pv    gv    due_date
123         100   200   08/08/2022
124         200   200   08/09/2022
125         300   800   08/10/2022
126         400   670   08/11/2022
127         500   500   08/12/2022
128         600   600   08/13/2022
129         700   1000  08/14/2022
130         800   760   08/15/2022
131         900   900   08/16/2022
132         1000  1000  08/17/2022
133         1100  2300  08/09/2022
Here is some sample data for reference: Payment table
payment_id  payment_number  payment  payment_date  account_id  _test
101         554321          1000     03/01/2022    123         0
102         554322          1200     03/21/2022    124         1
103         554322          1100     04/28/2022    124         1
104         554323          2500     05/04/2022    131         1
105         554324          3000     05/14/2022    133         0
106         554325          3000     05/14/2022    132         1
107         554322          1200     03/21/2022    124         1
108         554323          2500     04/05/2022    131         1
109         554328          3100     04/05/2022    128         0
Here is the code I tried, but I can't find the right way to write it so that it returns the output I need:
_test = IF(Payments[payment]>0 && RELATED('Account'[PV])=RELATED('Account'[GV]), 1)
_test = IF(AND(Payments[payment])>0, RELATED('Account'[PV])=RELATED('Account'[GV])),1,0)
Any suggestion is much appreciated. Please recommend what kind of syntax/function should be used in order to achieve the output or what would be the work around to use other than IF statement

The problem that you are facing with RELATED is that RELATED only works from 1 side to many side.
That means that if you bring the axis from the 1 side and perform a calculation on the many side, the filter works perfectly. Look at the direction of the filter on the relationship: it tells you that, under normal circumstances, you should bring your axis from Account, and whatever calculation you perform on the Payment table will work out.
But you are doing exactly the reverse. You are bringing the axis from Payment and hoping for RELATED to work. It won't, because the filter direction runs the other way.
However, DAX is more flexible than that. If, for whatever reason, you need to bring the axis from the many side while still filtering on the 1 side, you can define a reverse filter direction on the fly, without changing anything in the data model, by using CROSSFILTER. With CROSSFILTER you customize the filter direction as follows:
CROSSFILTER(<LEFtblcolumnName1>, <RIGHTtblcolumnName2>, <direction>)
This is how, (with your given dataset)
Column =
VAR cond1 =
    CALCULATE (
        MAX ( Account[Account.pv] ),
        CROSSFILTER ( Payment[Payment.account_id], Account[Account.account_id], BOTH )
    )
        - CALCULATE (
            MAX ( Account[Account.gv] ),
            CROSSFILTER ( Payment[Payment.account_id], Account[Account.account_id], BOTH )
        )
RETURN
    IF ( cond1 == 0 && Payment[Payment.payment] > 0, 1, 0 )
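
The row-level rule that the calculated column is meant to encode can be checked outside Power BI. The following is only a minimal Python sketch of that rule, not DAX: the dictionaries are hypothetical stand-ins for a few rows of the sample Account and Payment tables, and the helper name flag is made up for illustration.

```python
# Hypothetical in-memory stand-ins for a few rows of the sample tables.
accounts = {
    123: {"pv": 100, "gv": 200},
    124: {"pv": 200, "gv": 200},
    131: {"pv": 900, "gv": 900},
    133: {"pv": 1100, "gv": 2300},
}
payments = [
    {"payment_id": 101, "payment": 1000, "account_id": 123},
    {"payment_id": 102, "payment": 1200, "account_id": 124},
    {"payment_id": 104, "payment": 2500, "account_id": 131},
    {"payment_id": 105, "payment": 3000, "account_id": 133},
]


def flag(payment_row):
    """Return 1 when payment > 0 and the related account has pv == gv, else 0."""
    account = accounts[payment_row["account_id"]]  # look up the 1-side row
    if payment_row["payment"] > 0 and account["pv"] == account["gv"]:
        return 1
    return 0


# One flag per payment row, keyed by payment_id.
flags = {row["payment_id"]: flag(row) for row in payments}
print(flags)
```

For these four rows the flags agree with the _test column in the sample Payment table.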

How can I efficiently convert the output of one KDB function into three table columns?

I have a function that takes as input some of the values in a table and returns a tuple if you will - three separate return values, which I want to transpose into the output of a query. Here's a simplified example of what I want to achieve:
multiplier:{(x*2;x*3;x*4)};
select twoX:multiplier[price][0], threeX:multiplier[price][1], fourX:multiplier[price][2] from data
The above basically works (I think I've got the syntax right for the simplified example - if not then hopefully my intention is clear), but is inefficient because I'm calling the function three times and throwing away most of the output each time. I want to rewrite the query to only call the function once, and I'm struggling.
Update
I think I missed a crucial piece of information in my explanation of the problem which affects the outcome - I need to get other data in the query alongside the output of my function. Here's a hopefully more realistic example:
multiplier:{(x*2;x*3;x*4)};
select average:avg price, total:sum price, twoX:multiplier[sum price][0], threeX:multiplier[sum price][1], fourX:multiplier[sum price][2] by category from data
I'll have a go at adapting your answers to fit this requirement anyway, and apologies for missing this bit of information. The real function is a proprietary and fairly complex algorithm and the real query has about 30 output columns, hence the attempt at simplifying the example :)
If you're just looking for the results themselves you can extract (exec) as lists, create dictionary and then flip the dictionary into a table:
q)exec flip`twoX`threeX`fourX!multiplier[price] from ([]price:til 10)
twoX threeX fourX
-----------------
0 0 0
2 3 4
4 6 8
6 9 12
8 12 16
10 15 20
12 18 24
14 21 28
16 24 32
18 27 36
If you need other columns from the original table too then it's trickier, but you could join the tables sideways using ,'
q)t:([]price:til 10)
q)t,'exec flip`twoX`threeX`fourX!multiplier[price] from t
The amend form of @ can also achieve what you want. Here data is just a table with 10 random prices. @ is then used to apply the multiplier function to the price column while also assigning a column name to each of the three resulting lists:
q)data:([] price:10?100)
q)multiplier:{(x*2;x*3;x*3)}
q)@[data;`twoX`threeX`fourX;:;multiplier data`price]
price twoX threeX fourX
-----------------------
80 160 240 240
24 48 72 72
41 82 123 123
0 0 0 0
81 162 243 243
10 20 30 30
36 72 108 108
36 72 108 108
16 32 48 48
17 34 51 51
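
The idea shared by both answers (call the function once, then spread its three results across three named columns) can be illustrated outside q. This is only a rough Python sketch of that pattern, not a q solution. It uses the x*4 variant of multiplier from the question's update, and the column-dictionary layout of table is an assumption made for illustration.

```python
def multiplier(prices):
    """Return three lists (2x, 3x, 4x), mirroring the q function in the update."""
    return (
        [p * 2 for p in prices],
        [p * 3 for p in prices],
        [p * 4 for p in prices],
    )


prices = list(range(10))

# Single call: the three results are unpacked instead of calling three times.
two_x, three_x, four_x = multiplier(prices)

# Column-oriented "table": each key is a column name, each value a column.
table = {"price": prices, "twoX": two_x, "threeX": three_x, "fourX": four_x}

# Print the table row by row.
for row in zip(*table.values()):
    print(row)
```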

Wrong Output in Postgres

I am joining 3 tables to get the retention rate. Here is my query:
select first_visit.first_month as first_month,
new_users.new_users as new_users,
count(distinct visit_tracker.customer__id) as retained,
cast(count(distinct visit_tracker.customer__id) / new_users.new_users as float) as retention_percent
from first_visit
left join visit_tracker
on visit_tracker.customer__id=first_visit.customer__id
left join new_users
on new_users.first_month=first_visit.first_month
group by 1,2;
I get the following output:
first_month new_users retained retention_percent
0 93 34 0
1 119 42 0
2 188 102 0
3 223 71 0
and so on
What I want is this:
first_month new_users retained retention_percent
0 93 34 0.37
1 119 42 0.35
2 188 102 0.54
3 223 71 0.32
I am not sure why it's not producing the results I want. Any inputs?
This looks like a classic case of an integer division problem.
In this case count(distinct visit_tracker.customer__id) returns an integer (bigint). If new_users.new_users is also an integer column, which the output suggests, Postgres performs integer division and discards the fractional part. Because the expected answer is less than one, it truncates to zero. The cast to float in your query does not help, because it is applied after the truncation has already occurred.
Try casting the numerator or the denominator (or both) to float before performing the division, or multiply by 100 beforehand to get a whole-number percentage, as this stackoverflow answer suggests.
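
The truncation is easy to reproduce with the numbers from the question. This is a small Python sketch, not SQL: Python's // operator stands in for integer division here, and / stands in for the division you get once one operand is a float.

```python
# (first_month, new_users, retained) taken from the output in the question.
rows = [(0, 93, 34), (1, 119, 42), (2, 188, 102), (3, 223, 71)]

# Integer division drops the fractional part, so every ratio below 1 becomes 0.
truncated = [retained // new_users for _, new_users, retained in rows]

# True division keeps the fraction; round to two decimals for display.
ratios = [round(retained / new_users, 2) for _, new_users, retained in rows]

print(truncated)
print(ratios)
```

The second list matches the retention_percent values the question expects.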

How to insert a value

I want to insert a number in the following matrix: n x 1 matrix
6
103
104
660
579
750
300
299
300
750
579
661
580
760
302
301
302
760
580
662
581
How do I insert it in the middle and shift the remaining numbers? I tried the following code:
Idx=[723];
c=false(1,length(Element_set2)+length(Idx));
c(Idx)=true;
result=nan(size(c));
result(~c)=Element_set2;
result(c)=8
You are complicating things. Simply find the middle index by taking the length of the array, dividing by 2 and truncating any decimal part, then use simple indexing to build the new matrix. Supposing that result is the column vector that you created and number is the value you want to insert in the middle, do the following:
number = 8; %// Change to suit whatever number you desire
middle = floor(numel(result) / 2);
result = [result(1:middle); number; result(middle+1:end)];
In the future, please read this great MATLAB tutorial on indexing directly from MathWorks: http://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html. It's a good resource on the kinds of indexing operations one expects from starting out in MATLAB.
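
For comparison, the same split-and-concatenate idea looks like this in Python. It is a sketch only: the helper name insert_in_middle is made up, and the input is just the first six values of the matrix in the question.

```python
def insert_in_middle(values, number):
    """Return a new list with number inserted at the floor of the midpoint."""
    middle = len(values) // 2  # same role as floor(numel(result) / 2)
    # Everything before the midpoint, the new value, then everything after it.
    return values[:middle] + [number] + values[middle:]


result = insert_in_middle([6, 103, 104, 660, 579, 750], 8)
print(result)  # [6, 103, 104, 8, 660, 579, 750]
```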

How do I find the index of maximum and minimum values in MATLAB?

I need to write a code to display the location of the highest and lowest tx value. Nothing appears to be working. Here is my code:
%times
tx=[tf-to];
tx=[130 103 152 163 218 278 82 195 221 154 94 159 214 185];
s=(130+103+52+163+218+278+82+195+221+154+94+159+214+185);
%minimum and maximum times
minvalue=min(tx);
maxvalue=max(tx);
How do I edit this code to show the max and min values of tx only??
[minvalue,idx_min]=min(tx);
[maxvalue,idx_max]=max(tx);
This uses the second output of both min and max, which returns the index of the min/max value respectively.
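Adding two anonymous functions to return the indices of the min and max is a possibility. Unlike the second output of min and max, find returns every index where the value occurs, not just the first one.
min_index = @(vector) find(vector==min(vector))
max_index = @(vector) find(vector==max(vector))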
idx_min = min_index(tx);
idx_max = max_index(tx);
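
As a cross-check of the positions, here is the same lookup as a Python sketch on the tx values from the question. Python indices start at 0, so each one is one less than the corresponding MATLAB index.

```python
tx = [130, 103, 152, 163, 218, 278, 82, 195, 221, 154, 94, 159, 214, 185]

# Index of the first occurrence of the smallest and largest value (0-based).
idx_min = tx.index(min(tx))
idx_max = tx.index(max(tx))

# All positions holding the minimum, in case the value is repeated.
smallest = min(tx)
all_min = [i for i, value in enumerate(tx) if value == smallest]

print(idx_min, idx_max, all_min)  # 6 5 [6]
```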

Clustering data after linkage algorithm

I am not an expert in statistics and data analysis, hence I can't understand if the behavior which I obtain is correct or not. I am here looking for your help.
Assume I have these samples which I would like to cluster (10 points in the plane - reduced version of the problem):
[X Y] =
266 450
266 400
258 168
290 442
295 438
273 432
294 158
318 161
250 423
253 413
To cluster them I can use a cluster tree
Z = linkage([ X Y ],'complete');
which is (by dendrogram(Z,10))
Now I would like to extract clusters on the basis of the distance attached to the nodes of the tree.
Say that my distance is 150, I would expect that the call
T = cluster(Z,'Cutoff',150);
returns me 2 clusters. But it gives me just one (I suppose), i.e.
T =
1
1
1
1
1
1
1
1
1
1
What am I missing?
Use inconsistent(Z,150) and look at the values in column 4. Increasing the cutoff from a small positive number steps you along the tree.
E.g.
cluster(Z,'cutoff',0.7)
does not give you what you want (I think)
but
cluster(Z,'cutoff',0.8)
does.
The criterion for cluster is inconsistency ('inconsistent') by default.
Since the height in dendrogram is distance, you can change the criterion to 'distance',
i.e:
T = cluster(Z, 'Cutoff', 150, 'criterion', 'distance');
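
To see why a distance cutoff of 150 separates these ten points into two groups, here is a small pure-Python sketch of complete-linkage merging that stops once the closest pair of clusters is farther apart than the cutoff. It is not MATLAB's implementation: complete_linkage_clusters is a hypothetical helper, and details such as label numbering and tie handling may differ from what cluster returns.

```python
from math import dist

points = [(266, 450), (266, 400), (258, 168), (290, 442), (295, 438),
          (273, 432), (294, 158), (318, 161), (250, 423), (253, 413)]


def complete_linkage_clusters(points, cutoff):
    """Merge clusters by complete linkage until the closest pair exceeds cutoff."""
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def gap(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break  # every remaining merge would exceed the cutoff
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


clusters = complete_linkage_clusters(points, 150)

# Turn the list of clusters into one label per point, like the vector T.
labels = [0] * len(points)
for label, members in enumerate(clusters, start=1):
    for index in members:
        labels[index] = label

print(labels)  # [1, 1, 2, 1, 1, 1, 2, 2, 1, 1]
```

The three points with Y near 160 form one cluster and the seven points with Y between 400 and 450 form the other.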