How do I calculate the standard deviation of a table in kdb

I have a table of returns that I want to calculate standard deviation on. My columns look like
`day1`day2`day3
How can I calculate the standard deviation of each column efficiently?
I know there's a dev function. However, unlike avg, dev cannot be called on an entire table.
Any help will be much appreciated!
Thank you!

Yes, dev doesn't accept a table as input, but you can apply dev to individual columns in a select query:
q)t:([]day1:til 10;day2:2*til 10;day3:3*til 10)
q)t
day1 day2 day3
--------------
0 0 0
1 2 3
2 4 6
3 6 9
4 8 12
5 10 15
6 12 18
7 14 21
8 16 24
9 18 27
q)select dev day1, dev day2, dev day3 from t
day1 day2 day3
--------------------------
2.872281 5.744563 8.616844
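As a cross-check outside q, here is a minimal Python sketch of the same per-column calculation. It assumes the population definition of standard deviation (divide by n), which is what dev computes. The names population_std, table and column_std are illustrative only and are not part of kdb.

```python
import math

def population_std(values):
    """Population standard deviation (divide by n), the definition q's dev uses."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)

# Same data as the table t above: day1 = til 10, day2 = 2*til 10, day3 = 3*til 10
table = {
    "day1": list(range(10)),
    "day2": [2 * i for i in range(10)],
    "day3": [3 * i for i in range(10)],
}

# One standard deviation per column, keyed by column name
column_std = {name: population_std(col) for name, col in table.items()}
print(column_std)
```

The values agree with the select output above to six decimal places. If you need the sample standard deviation (divide by n-1) instead, q provides sdev for that.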
Edit: If you are unsure how to build a dynamic query in functional form, use parse:
q)parse"select dev day1 from t"
?
`t
()
0b
(,`day1)!,(dev;`day1)
It is useful for creating the code for multiple columns:
// [table;where;by;cols]
?[t;();0b;
raze { (enlist x)!enlist (dev;x) } each `day1`day2`day3]
day1 day2 day3
--------------------------
2.872281 5.744563 8.616844
Or, if you have many columns (100+), use cols with except to take every column you want the standard deviation of and skip the ones you don't:
?[t;();0b;
raze { (enlist x)!enlist (dev;x) } each except[cols[t];`columns`to`ignore]]

If your table is "flippable" then you could do:
q)enlist dev each flip t
day1 day2 day3
--------------------------
2.872281 5.744563 8.616844

Related

Calculate percentage difference between two rows

I have this query that produced the table below.
select season,
guildname,
count(guildname) as mp_count,
(count(guildname)/600::float)*100 as grank
from mp_rankings
group by season, guildname
order by grank desc
season guildname  mp_count grank
-------------------------------------------
10     LEGENDS    56       9.33333333333333
9      LEGENDS    54       9
10     EVERGLADE  50       8.33333333333333
9      Mystic     46       7.66666666666667
10     Mystic     42       7
9      EVERGLADE  39       6.5
10     100        36       6
9      PARABELLUM 33       5.5
10     PARABELLUM 29       4.83333333333333
9      100        29       4.83333333333333
I wanted to create a new column that calculates the percentage difference between the two seasons using identical guildnames. For example:
season guildname mp_count grank            prev_season_percent_diff
-------------------------------------------------------------------
10     LEGENDS   56       9.33333333333333 0.33%
10     EVERGLADE 50       8.33333333333333 1.83%
The resulting table will only show the current season (which is the highest season value, 10 in this case) and adds a new column prev_season_percent_diff, which is the current season's grank minus the previous season's grank.
How can I achieve this?

Use a Common Table Expression ("CTE") for the grouped result and join it to itself to calculate the difference to the previous season:
with summary as (
select
season,
guildname,
count(*) as mp_count, -- simplified equivalent expression
count(*) / 6.0 as grank -- same as count(*)/600.0*100; the 6.0 avoids integer division
from mp_rankings
group by season, guildname
)
select
a.season,
a.guildname,
a.mp_count,
a.grank,
a.grank - b.grank as prev_season_percent_diff
from summary a
left join summary b on b.guildname = a.guildname
and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
If you want a literal % sign in the result, concatenate '%' to the difference expression.
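To try the CTE self-join without a PostgreSQL server, here is a minimal sketch that runs it against SQLite through Python's sqlite3 module. It makes several assumptions: only two guilds from the question are loaded, one row is inserted per ranking entry so that count(*) reproduces mp_count, grank is computed with floating-point division, and the difference is taken between grank values as the question describes.

```python
import sqlite3

# Counts per (season, guildname) taken from the question's table (two guilds only)
counts = {
    (10, "LEGENDS"): 56,
    (9, "LEGENDS"): 54,
    (10, "EVERGLADE"): 50,
    (9, "EVERGLADE"): 39,
}

conn = sqlite3.connect(":memory:")
conn.execute("create table mp_rankings (season integer, guildname text)")
for (season, guild), n in counts.items():
    # One row per ranking entry, so count(*) reproduces mp_count
    conn.executemany(
        "insert into mp_rankings values (?, ?)", [(season, guild)] * n
    )

query = """
with summary as (
    select season,
           guildname,
           count(*)       as mp_count,
           count(*) / 6.0 as grank   -- 6.0 forces floating-point division
    from mp_rankings
    group by season, guildname
)
select a.season, a.guildname, a.mp_count, a.grank,
       a.grank - b.grank as prev_season_percent_diff
from summary a
left join summary b
       on b.guildname = a.guildname
      and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data the differences come out at roughly 0.33 for LEGENDS and 1.83 for EVERGLADE, which matches the example in the question. Because of the left join, a guild with no previous-season row gets NULL in the last column.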

Getting data from alternate dates of same ID column

I have table data as below. I need to fetch the records, within the same code, where (Value2-Value1)*2 of one row >= (Value2-Value1) of the row for the consecutive date. (All dates are uniform within all codes.)
---------------------------------------
code Date Value1 Value2
---------------------------------------
1 1-1-2018 13 14
1 2-1-2018 14 16
1 4-1-2018 15 18
2 1-1-2019 1 3
2 2-1-2018 2 3
2 4-1-2018 3 7
ex: output needs to be
1 1-1-2018 13 14
As I am a beginner at SQL, I tried my best but could not work out how to compare only consecutive dates.

Use a self join.
You can specify all the conditions you've listed in the ON clause:
SELECT T0.code, T0.Date, T0.Value1, T0.Value2
FROM Table As T0
JOIN Table As T1
ON T0.code = T1.code
AND T1.Date = DateAdd(Day, 1, T0.Date) -- T1 is the following day, so T0 is the row returned (matches the expected output)
AND (T0.Value2 - T0.Value1) * 2 >= T1.Value2 - T1.Value1
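Here is a minimal sketch of the self-join idea, run against SQLite through Python's sqlite3 module so it can be tried locally. Several things are assumptions: the table name readings is made up, the dates are read as day-month-year and stored as ISO strings, SQLite's date(x, '+1 day') stands in for DateAdd, and the earlier row (t0) is the one returned because that is what the expected output in the question implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table readings (code integer, date text, value1 integer, value2 integer)"
)
# Rows from the question, dates read as day-month-year and stored as ISO strings
conn.executemany(
    "insert into readings values (?, ?, ?, ?)",
    [
        (1, "2018-01-01", 13, 14),
        (1, "2018-01-02", 14, 16),
        (1, "2018-01-04", 15, 18),
        (2, "2019-01-01", 1, 3),
        (2, "2018-01-02", 2, 3),
        (2, "2018-01-04", 3, 7),
    ],
)

rows = conn.execute(
    """
    select t0.code, t0.date, t0.value1, t0.value2
    from readings as t0
    join readings as t1
      on t0.code = t1.code
     and t1.date = date(t0.date, '+1 day')   -- t1 is the next calendar day
     and (t0.value2 - t0.value1) * 2 >= t1.value2 - t1.value1
    """
).fetchall()
print(rows)
```

Only the row for code 1 on 2018-01-01 is returned: (14-13)*2 = 2 is at least the next day's 16-14 = 2, and no other row has a neighbour on the following calendar day.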

select all columns with suffix _test in q kdb

I have a partitioned table, similar to below table:
q)t:([]date:3#2019.01.01; a:1 2 3; a_test:2 3 4; b_test:3 4 5; c: 6 7 8);
date a a_test b_test c
----------------------------
2019.01.01 1 2 3 6
2019.01.01 2 3 4 7
2019.01.01 3 4 5 8
Now I want to fetch the date column and all columns whose names have the suffix "_test" from table t.
Expected output:
date a_test b_test
------------------------
2019.01.01 2 3
2019.01.01 3 4
2019.01.01 4 5
In my original table, there are more than 100 columns whose names contain _test, so the query below is not practical:
q)select date, a_test, b_test from t where date=2019.01.01
I tried various options like the one below, but to no avail:
q)delete all except date, *_test from select from t where date=2019.01.01

If the columns you are selecting are variable, you should use a functional qSQL statement to perform the query. The following can be used in your case:
q)query:{[tab;dt;c]?[tab;enlist (=;`date;dt);0b;(`date,c)!`date,c]}
q)query[t;2019.01.01;cols[t] where cols[t] like "*_test"]
date a_test b_test
------------------------
2019.01.01 2 3
2019.01.01 3 4
2019.01.01 4 5
In order to craft a particular functional statement, you can parse your query, putting dummy columns in place if you aren't sure what they should be
q)parse "select date,c1,c2 from tab where date=dt"
?
`tab
,,(=;`date;`dt)
0b
`date`c1`c2!`date`c1`c2

A functional select is probably the best way to go here if you require adding further filters.
?[`t;();0b;{x!x}`date,exec c from meta t where c like "*_test"]
The functional form of any select query can be obtained by applying -5! (the internal function behind parse) to a qSQL statement string.
In the example below I have created a table with 20 fields, each one beginning with either a or b.
I then use the functional form to define which fields I want.
q)tab:{[x] enlist x!count[x]#0}`$"_" sv ' raze string `a`b,/:\:til 10
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"b*"]
b_0 b_1 b_2 b_3 b_4 b_5 b_6 b_7 b_8 b_9
---------------------------------------
0 0 0 0 0 0 0 0 0 0
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"a*"]
a_0 a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9
---------------------------------------
0 0 0 0 0 0 0 0 0 0
q)-5!" select a,b from c"
?
`c
()
0b
`a`b!`a`b
Alternatively, if I don't need any filtering, I can use the # (take) operator, as below:
{[x;s] (cols[x] where cols[x] like s)#x}[ tab;"a*"]
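For readers who want to see the column-selection logic outside q, here is a minimal Python sketch of the same idea. It is only an analogy: the table is modelled as a plain dict of columns, fnmatchcase plays the role of q's like for the `*` wildcard, and select_columns is a hypothetical helper, not a kdb function.

```python
from fnmatch import fnmatchcase

# Column-oriented stand-in for the q table t from the question
t = {
    "date": ["2019.01.01"] * 3,
    "a": [1, 2, 3],
    "a_test": [2, 3, 4],
    "b_test": [3, 4, 5],
    "c": [6, 7, 8],
}

def select_columns(table, pattern, always=("date",)):
    """Keep the `always` columns plus every column whose name matches `pattern`."""
    wanted = [c for c in table if c in always or fnmatchcase(c, pattern)]
    return {c: table[c] for c in wanted}

result = select_columns(t, "*_test")
print(list(result))
```

The pattern "*_test" keeps only names that end in _test, whereas a looser pattern such as "*_*" would also pick up any other column that happens to contain an underscore.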

Tableau - How sum values with 12 last months

in Tableau I have a table with this form :
rows: Score.
columns:MY(month), sum(good), sum(bad).
This is the information when I use: month 201811
201611 201612 ... 201801 ... 201811 TOTAL
Score Good Bad Good Bad Good Bad ... Good Bad
1 3 0 7 3 6 3 2 1
2 5 1 1 1 1 1 4 4
3 10 3 2 1 0 3 3 3
I want to filter on the 'Month' column: when I filter month=201811, the Total column (the totals of the Good and Bad columns, by Score) should cover 201611 to 201711 (last 12 months).
Filter: 201811
Formula: sum(Good) and sum(Bad) since '201611' to '201711'
I tried "IF DATEDIFF('month', [Good], today()) <=12" but it doesn't work.
Thanks for your help.

Try this:
If DATEDIFF("month",TODAY(),[Your Date Field],"Sunday") <= -12
then [Your Date Field] else null end
Then use that as your date column. The "Sunday" argument should be whatever you consider the first day of the week. I wasn't sure what your date field is called, so I used "[Your Date Field]" as a placeholder.

Convert day of year (from extract) back to a date

I am trying to group data by the day of the year that it falls on. I have been able to achieve this with the code below. The issue is that I lose the information as to which day (i.e. Jan 1st, Jan 2nd etc.) each grouping represents. I am simply left with a number (e.g. 1, 2 etc.) representing the day of the year. Is there any way to convert this number back into the more descriptive date? Thanks a lot.
CREATE TABLE tmp2 AS
SELECT extract(doy from trd_exctn_dt) as day_of_year
,sum(dollar_vol) AS dollar_vol
FROM tmp
GROUP BY extract(doy from trd_exctn_dt);
Current Output:
day_of_year | dollar_vol
------------|------------
1 10
2 15
3 7
Desired Output: N.b. The exact format of the first column doesn't matter too much. I would be happy with DD/MM, MM/DD or any other clear output.
day_of_year | dollar_vol
------------|------------
Jan 1 | 10
Jan 2 | 15
Jan 3 | 7

Use the to_char function:
SELECT to_char(trd_exctn_dt,'MM/DD') as day_of_year ,sum(dollar_vol) AS dollar_vol
FROM tmp
GROUP BY day_of_year ;
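The same grouping idea can be tried locally with SQLite through Python's sqlite3 module. This is a sketch under assumptions: SQLite has no to_char, so strftime('%m/%d', ...) stands in for it, and the trade rows are hypothetical values chosen to reproduce the totals in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tmp (trd_exctn_dt text, dollar_vol real)")
# Hypothetical trades spanning two years, so a calendar day can have several rows
conn.executemany(
    "insert into tmp values (?, ?)",
    [
        ("2017-01-01", 4), ("2018-01-01", 6),
        ("2017-01-02", 15),
        ("2017-01-03", 3), ("2018-01-03", 4),
    ],
)

rows = conn.execute(
    """
    select strftime('%m/%d', trd_exctn_dt) as day_of_year,
           sum(dollar_vol)                 as dollar_vol
    from tmp
    group by strftime('%m/%d', trd_exctn_dt)
    order by day_of_year
    """
).fetchall()
print(rows)
```

Note that grouping by month and day is not identical to grouping by extract(doy ...): in a leap year every date after February 28 has a day-of-year number one higher than in other years, while its MM/DD label stays the same.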
