I have a table that looks like this:
column1 column2 value
a d 10
a e 20
b d 30
b e 40
and I want to get the rank of the value without considering column1, so I use an LOD to first get the total value with: {EXCLUDE [column1]: SUM([value])}
this works and produces
rank
d 40
e 60
d 40
e 60
BUT what I want to do is get the rank. So I'd like
rank
d 2
e 1
d 2
e 1
When I do this (using my actual field names): RANK( {EXCLUDE [Pct Of Adv Buckets]: SUM([Notional])} )
I get the error "all fields must be aggregate or constants when using table calculations". Can you advise how to get the rank?
Have you tried RANK(SUM({EXCLUDE [Column1]: SUM([Value])}))?
I get a different ranking, but I think that is what you are looking for.
Just as we have the except function for lists in kdb to find the elements which are present in one list and not in another, do we have any utility to extract the rows present in one table and not in another, based on a column?
E.g. I have two tables:
l:([]c1:`a`b`c`d;c2:10 20 30 40)
r:([]c1:`a`a`a`b`b;c3:100 200 300 400 50)
For column c1 in table l, we have the rows c and d, which are not present in column c1 of table r.
Do we have any utility in kdb which can be used to get output like below?
c1 c2
-----
c 30
d 40
I got the output using -
select from l where c1 in l[`c1] except r`c1
But I'm searching for a better/optimised solution or utility to get the same output.
I don't think there's anything wrong with your current implementation but you could use drop (aka _) on a keyed table for a more succinct approach:
q)#[1#`c1;r]_1!l
c1| c2
--| --
c | 30
d | 40
This also remains pretty neat when the key is more than one column:
l0:([]c0:`x`y`z`w;c1:`a`b`c`d;c2:10 20 30 40)
r0:([]c0:`y`x`x`x`y;c1:`a`a`a`b`b;c3:100 200 300 400 50)
q)#[`c0`c1;r0]_2!l0
c0 c1| c2
-----| --
z c | 30
w d | 40
A more functional form would be this:
{cl:cols[x]inter cols y;x where not(cl#x)in cl#y}[l;r]
c1 c2
-----
c 30
d 40
This works even if you don't know the columns to match on, because cols[x] inter cols[y] at the start obtains the common columns between the two tables. It also works without the columns being keyed.
Although in this specific case, the following would be a little bit faster:
l where not l[`c1] in r[`c1]
I have a large table; here is a snippet of what it looks like:
name class brand rating
12 d 1 3.8
22 d 1 3.9
33 a 2 1.1
12 d 1 2.3
12 a 3 3.4
44 b 1 9.8
22 c 2 3.0
I calculated the average of the rating using the below:
select avg(rating) over(partition by name,class,brand) as avg_rating from df
I'm aware that Postgres doesn't have a median function, but I would like to calculate it for that column and have the output in a similar structure to that of my window function for the average.
In the case of an even number of rows, I would like the average of the middle two numbers.
To get the median, you should use percentile_cont
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY rating) FROM df;
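Note that percentile_cont is an ordered-set aggregate, not a window function, so it cannot be used directly with OVER (PARTITION BY ...). As a rough sketch (assuming the df table and the name, class, brand, rating columns from your example), you can compute the per-group median in a subquery and join it back to get the same row-level structure as your average:
-- per-group median joined back onto every row (table/column names assumed from the question)
SELECT df.*, m.median_rating
FROM df
JOIN (
    SELECT name, class, brand,
           percentile_cont(0.5) WITHIN GROUP (ORDER BY rating) AS median_rating
    FROM df
    GROUP BY name, class, brand
) m USING (name, class, brand);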
I have a function that does something with a date and a function that takes two arguments to perform a calculation. For now let's assume that they look as follows:
d:{[x] :x.hh}
f:{[x;y] :x+y}
Now I want to use function f in a query as follows:
select f each (columnOne,d[columnTwo]) from myTable
Hence, I first want to convert one column to the corresponding numbers using function d. Then, using both columnOne and the output of d[columnTwo], I want to calculate the outcome of f.
Clearly, the approach above does not work, as it fails with a 'rank error.
I've also tried select f ./: (columnOne,'d[columnTwo]) from myTable, which also doesn't work.
How do I do this? Note that I need to input columnOne and columnTwo into f such that the corresponding rows still match. E.g. input row 1 of columnOne and row 1 of columnTwo simultaneously into f.
I've also tried select f ./: (columnOne,'d[columnTwo]) from myTable, which also doesn't work.
You're very close with that code. The issue is the d function, in particular the x.hh within function d - the .hh notation doesn't work in this context, and you will need to do `hh$x instead, so d becomes:
d:{[x] :`hh$x}
So making only this change to the above code, we get:
q)d:{[x] :`hh$x}
q)f:{[x;y] :x+y}
q)myTable:([] columnOne:10?5; columnTwo:10?.z.t);
q)update res:f ./: (columnOne,'d[columnTwo]) from myTable
columnOne columnTwo res
--------------------------
1 21:10:45.900 22
0 20:23:25.800 20
2 19:03:52.074 21
4 00:29:38.945 4
1 04:30:47.898 5
2 04:07:38.923 6
0 06:22:45.093 6
1 19:06:46.591 20
1 10:07:47.382 11
2 00:45:40.134 2
(I've changed select to update so you can see other columns in result table)
Other syntax to achieve the same:
q)update res:f'[columnOne;d columnTwo] from myTable
columnOne columnTwo res
--------------------------
1 21:10:45.900 22
0 20:23:25.800 20
2 19:03:52.074 21
4 00:29:38.945 4
1 04:30:47.898 5
2 04:07:38.923 6
0 06:22:45.093 6
1 19:06:46.591 20
1 10:07:47.382 11
2 00:45:40.134 2
The only other noteworthy point: in the above example, function d is vectorised (works with a vector argument); if this weren't the case, you'd need to change d[columnTwo] to d each columnTwo (or d'[columnTwo]).
This would then result in one of the following queries:
select res:f'[columnOne;d'[columnTwo]] from myTable
select res:f ./: (columnOne,'d each columnTwo) from myTable
select res:f ./: (columnOne,'d'[columnTwo]) from myTable
I have a report like this:
Name Rupees Company
Group 1st
A 50 A
B 70 B
C 45 C
Group 2nd
D 150 D
E 100 E
G 60 F
and I want the result to be:
Name Rupees Company
Group 1st
A 50
B 70 B
C 45
Group 2nd
D 150 D
E 100
G 60
I want to display the Company field only on the row with the maximum Rupees in each group, and suppress it on the others.
You can write a suppress condition on the Company column.
The code looks something like this:
If Rupees = Maximum(Rupees, Group)
Then False
Else True
With this, when a row's Rupees value matches the maximum of its group, the Company field will display; for all the other rows it will be suppressed.
This should work
Let me know if you still face any issue.
I want to insert a row number into the records, counting rows within a specific repeating range. Example output:
RowNumber ID Name
1 20 a
2 21 b
3 22 c
1 23 d
2 24 e
3 25 f
1 26 g
2 27 h
3 28 i
1 29 j
2 30 k
I would rather use row_number() over (partition by ... order by column name), but my real records do not contain columns that would partition into a 1-3 row number.
I already tried looping over each record to insert a row count of 1-3, but the loop affects the performance of the query. The query will be used for an RDL report, which is why the performance of the query needs to be as good as possible.
Any suggestions are welcome. Thanks.
Have you tried taking the modulo of row_number()?
SELECT
((row_number() over (order by ID)-1) % 3) +1 as RowNumber
FROM table
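For completeness, a sketch that also returns the other columns alongside the generated row number, so the output matches the example in the question (assuming the table is called myTable and has the ID and Name columns shown):
-- row number cycling 1-3, with the original columns (myTable is an assumed name)
SELECT ((ROW_NUMBER() OVER (ORDER BY ID) - 1) % 3) + 1 AS RowNumber,
       ID,
       Name
FROM myTable
ORDER BY ID;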