Pivot table with multiple keyed columns - kdb

I have the following table:
t:(([]y:2001 2002) cross ([]m:5 6 7) cross ([]sector:`running`hiking`swimming`cycling)),'([]sales: 14 12 5 9 4 894 1 4 87 12 24 6 4 8 64 354 3 4 86 43 1053 2 43 4);
y m sector sales
------------------------
2001 5 running 14
2001 5 hiking 12
2001 5 swimming 5
2001 5 cycling 9
2001 6 running 4
2001 6 hiking 894
2001 6 swimming 1
2001 6 cycling 4
...
2002 5 running 4
2002 5 hiking 8
2002 5 swimming 64
2002 5 cycling 354
2002 6 running 3
...
I want to pivot the sales values by sector, while keeping the first two y and m columns, such that the resulting table would look like this:
y m cycling hiking running swimming
--------------------------------------
2001 5 9 12 14 5
2001 6 4 894 4 1
2001 7 6 12 87 24
2002 5 354 8 4 64
2002 6 43 4 3 86
2002 7 4 2 1053 43

As per the pivoting guide at https://code.kx.com/v2/kb/pivoting-tables/:
q)P:asc exec distinct sector from t;
q)exec P#(sector!sales) by y:y,m:m from t
The result is keyed on y and m. If you need a normal table, unkey it with () xkey (or 0!).
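To make the reshaping concrete, here is a minimal Python sketch of what the pivot computes, using plain dictionaries rather than kdb. The names rows and pivot are illustrative only (they are not part of any library), and the data is the sales list from the question.

```python
from itertools import product

years = [2001, 2002]
months = [5, 6, 7]
sectors = ["running", "hiking", "swimming", "cycling"]
sales = [14, 12, 5, 9, 4, 894, 1, 4, 87, 12, 24, 6,
         4, 8, 64, 354, 3, 4, 86, 43, 1053, 2, 43, 4]

# Rebuild the long-format table: one (y, m, sector, sales) tuple per row.
rows = [(y, m, s, v)
        for (y, m, s), v in zip(product(years, months, sectors), sales)]


def pivot(rows):
    """Group rows by (y, m) and spread sector values into sorted columns."""
    columns = sorted({sector for _, _, sector, _ in rows})
    grouped = {}
    for y, m, sector, value in rows:
        grouped.setdefault((y, m), {})[sector] = value
    # Missing sectors become None, similar to a null in the pivoted table.
    table = {key: [cells.get(c) for c in columns]
             for key, cells in grouped.items()}
    return columns, table


columns, table = pivot(rows)
```

Here columns plays the role of P, and each (y, m) key of table corresponds to one row of the keyed result.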

merge lists alternating k items from each

Given a list of lists, how can they be merged into a single list by taking up to k items at a time from each list in turn, until all items are merged?
For example, the following input and output are expected:
q)alternate[1] ("abc";"de";"fghi")
"adfbegchi"
For k=1 (one item at a time), the solution is:
q)mesh:{raze[y]rank x} / https://code.kx.com/phrases/sort/#mesh
q)alternate:{mesh[raze where each 1&{0|x-1}\[count each x];x]}
q)alternate ("abc";"de";"fghi")
"adfbegchi"
The above works because the first argument to mesh names, for each output position, the list that supplies the next item:
q)mesh[0 1 2 0 1 2 0 2 2;] ("abc";"de";"fghi")
"adfbegchi"
How can alternate be elegantly generalized for any k<=max count each x? (A Python solution is here.)
The below should achieve this:
q)f:{raze[y]iasc raze til'[count'[y]] div x};
q)f[1;("abc";"de";"fghi")]
"adfbegchi"
q)f[2;("abc";"de";"fghi")]
"abdefgchi"
Not sure whether this is any faster or cleaner, but here is a different approach:
q)alt:{(raze/)value each(s;::;)each til max count each s:x cut'y};
q)alt[1;("abc";"de";"fghi")]
"adfbegchi"
q)alt[2;("abc";"de";"fghi")]
"abdefgchi"
q)alt[2;(1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)]
1 2 5 6 8 9 12 13 3 4 7 10 11 14 15
q)alt[3;(1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)]
1 2 3 5 6 7 8 9 10 12 13 14 4 11 15
I was hoping to use case, but unfortunately case doesn't accept lists of different counts (unless they are atoms).
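For readers less familiar with q, the cut-then-interleave idea behind alt can be sketched in Python. This is only an illustration under the assumption that each list is first split into pieces of at most k items and the pieces are then taken round by round; alternate_by_chunks is a hypothetical name.

```python
from itertools import zip_longest


def alternate_by_chunks(k, lists):
    """Merge lists by taking up to k items from each list per round."""
    # Split every list into consecutive pieces of at most k items.
    chunked = [[lst[i:i + k] for i in range(0, len(lst), k)]
               for lst in lists]
    merged = []
    # zip_longest walks the rounds; exhausted lists contribute an empty piece.
    for round_pieces in zip_longest(*chunked, fillvalue=()):
        for piece in round_pieces:
            merged.extend(piece)
    return merged


result = alternate_by_chunks(3, [[1, 2, 3, 4], [5, 6, 7],
                                 [8, 9, 10, 11], [12, 13, 14, 15]])
```

Strings work as well, since slicing a string yields a string and extend adds its characters one by one.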
Not sure if this is the fastest solution, but you can substitute x for the hardcoded 1 in the alternate function above, making it dyadic:
q)alternate:{mesh[raze where each x&{0|y-x}[x]\[count each y];y]}
q)alternate[2] ("abc";"de";"fghi")
"abdefgchi"
q)alternate[2] (1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)
1 2 5 6 8 9 12 13 3 4 7 10 11 14 15
q)alternate[3] (1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)
1 2 3 5 6 7 8 9 10 12 13 14 4 11 15
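The index vector that alternate passes to mesh can be pictured with a short Python sketch. It assumes the scan does the following: in every round each list gives up to k items, and the index of the giving list is recorded once per item. The names source_indices and mesh_py are hypothetical stand-ins for illustration, not translations of any library function.

```python
def source_indices(k, counts):
    """For each output position, the index of the list supplying the item."""
    remaining = list(counts)
    indices = []
    while any(remaining):
        for i, left in enumerate(remaining):
            take = min(k, left)          # up to k items from list i this round
            indices.extend([i] * take)
            remaining[i] = left - take
    return indices


def mesh_py(sources, lists):
    """Pull the next unused item from lists[s] for every s in sources."""
    iterators = [iter(lst) for lst in lists]
    return [next(iterators[s]) for s in sources]


lists = ["abc", "de", "fghi"]
idx = source_indices(1, [len(lst) for lst in lists])
merged = "".join(mesh_py(idx, lists))
```

With k=1 the index vector is 0 1 2 0 1 2 0 2 2, the same vector used in the mesh example in the question.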
Edit:
Naïve benchmark comparison with @terrylynch's alt and @Matthew Madill's f:
q)\t:10000 f[2] ("abc";"de";"fghi")
49
q)\t:10000 alternate[2] ("abc";"de";"fghi")
56
q)\t:10000 alt[2] ("abc";"de";"fghi")
87
q)\t:10000 f[3] (1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)
55
q)\t:10000 alternate[3] (1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)
61
q)\t:10000 alt[3] (1 2 3 4;5 6 7;8 9 10 11;12 13 14 15)
103

KDB/Q: multiple PEACH?

I have a function that takes 2 parameters: date and sym. I would like to do this for multiple dates and multiple sym. I have a list for each parameter. I can currently loop through 1 list using
raze function[2020.07.07;] peach symlist
How can I do something similar but looping through the list of dates too?
You can try the following:
Create a list of pairs of input parameters.
Write an anonymous function which calls your function, and use peach on the list of paired parameters.
For example:
symlist: `A`B`C; // symlist defined for testing
function: {(x;y)}; // function defined for testing
raze {function . x} peach (2020.07.07 2020.07.08 2020.07.09) cross symlist
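The same pattern (build every date/sym pair, then map a wrapper that unpacks each pair over them in parallel) can be sketched in Python. This is only an analogy: a thread pool stands in for peach, and function and run_all are placeholder names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def function(date, sym):
    """Placeholder for the real two-argument function."""
    return (date, sym)


def run_all(dates, syms, workers=4):
    # Equivalent of `dates cross syms`: every (date, sym) pair, date-major.
    pairs = list(product(dates, syms))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The lambda unpacks each pair, like {function . x}.
        return list(pool.map(lambda pair: function(*pair), pairs))


results = run_all(["2020.07.07", "2020.07.08"], ["A", "B", "C"])
```

pool.map returns results in input order, so the output lines up with the list of pairs.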
I think this could work:
raze function'[2020.07.07 2020.07.08 2020.07.09;] peach symlist
If not, here are some more things to consider. Could you change your function to accept a sym list instead of individual syms, by including an each/peach inside it? Then you could each over the dates.
Alternatively, you could pair each date with the whole sym list and write a new function which takes one such pair, separates its elements, and does whatever the initial function did.
q)dates
2020.08.31 2020.09.01 2020.09.02
q)sym
`llme`obpb`dhca`mhod`mgpg`jokg`kgnd`nhke`oofi`fnca`jffe`hjca`mdmc
q)func
{[date;syms]string[date],/:string peach syms}
q)func2
{[list]func[list 0;list 1]}
q)\t res1:func[;sym]each dates
220
q)\t res2:func[;sym]peach dates
102
q)
q)func2
{[list]func[list 0;list 1]}
q)dateSymList:dates,\:enlist sym
q)\t res3:func2 peach dateSymList
80
q)res3~res2
1b
q)res3~res1
1b
Let us know if any of those solutions work, thanks.
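As a rough Python analogy of the dateSymList approach, each task below receives one date together with the whole sym list, so there is one task per date rather than one per date/sym pair. The bodies of func and func2 are placeholders that mimic the q definitions, and run_by_date is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def func(date, syms):
    """Placeholder: prefix every sym with the date string."""
    return [date + sym for sym in syms]


def func2(pair):
    """Unpack one (date, syms) pair and call func, like func2 in q."""
    date, syms = pair
    return func(date, syms)


def run_by_date(dates, syms, workers=4):
    # Equivalent of dates,\:enlist sym: pair each date with the full list.
    date_sym_list = [(d, syms) for d in dates]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func2, date_sym_list))


res = run_by_date(["2020.08.31", "2020.09.01"], ["aa", "bb"])
```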
Some possible ways to do this:
You can wrap the dyadic f as a monadic function with (f .) and parallelise over a list of argument pairs:
q)a:"ABC";b:til 3;f:{(x;y)}
q)\s 4
q)(f .)peach l:raze a,\:/:b
"A" 0
"B" 0
"C" 0
"A" 1
"B" 1
"C" 1
"A" 2
"B" 2
"C" 2
Or you could define the function to take a dictionary argument and parallelise over a table:
q)f:{x`c1`c2}
q)f peach flip`c1`c2!flip l
"A" 0
"B" 0
"C" 0
"A" 1
"B" 1
"C" 1
"A" 2
"B" 2
"C" 2
Jason
To generalize: suppose you have a function foo which operates on an atom dt and a vector s:
q)foo:{[dt;s] dt +\: s}
q)dt:10?10
q)s:100?10
q)dt
8 1 9 5 4 6 6 1 8 5
q)s
4 9 2 7 0 1 9 2 1 8 8 1 7 2 4 5 4 2 7 8 5 6 4 1 3 3 7 8 2 1 4 2 8 0 5 8 5 2 8..
q)foo[;s] each dt
12 17 10 15 8 9 17 10 9 16 16 9 15 10 12 13 12 10 15 16 13 14 12 9 11 11 ..
5 10 3 8 1 2 10 3 2 9 9 2 8 3 5 6 5 3 8 9 6 7 5 2 4 4 ..
13 18 11 16 9 10 18 11 10 17 17 10 16 11 13 14 13 11 16 17 14 15 13 10 12 12 ..
9 14 7 12 5 6 14 7 6 13 13 6 12 7 9 10 9 7 12 13 10 11 9 6 8 8 ..
The solution is to project the function over the sym list, then use each (or peach) over the dates.
If your function requires an atomic date and an atomic sym, you can just create a new function to implement this:
q)bar:{[x;y] foo[x;] each y};
datelist:`date$10?10
symlist:10?`IBM`MSFT`GOOG
function:{0N!(x;y)}
{.[function;x]} each datelist cross symlist

Add a column which is the result of two queries

code table
code_grille code_grille_talend
------------------------------
s01         4 7 2 8
s02         5 2 8 9 6 3 7
s03         3 6 4 7 5 8 2
s04         2 6 4 8 5 2 8 0
s05         4 7 8 5 9 7 4 5 8
s06         2 4 7 8 9 3 6 5
s07         2 5 4 7 8
s08         2 3 4 5 6 7 8 9
s09         9 8 2 5 7 3 6 4
s10         2 4 5 2 8 7 9 3 6
s11         4 5 7 2 3 2 3 8
commande table
code_commande code_article taille
001 1 s
001 1 m
001 1 xl
001 1 x52
001 2 m
001 1 5566
001 2 x52
001 1 xl
002 1 s
002 2 m
001 3 xxl
code T table (result of the first query)
code
2
3
4
1
12
I have two queries, and I need to use the result of the first query in the second query dynamically.
The first query returns many codes, which I need to feed into the second query to get a result for each row.
I have loaded the result of the first query into a table, but the second query returns only one result.
The first query is:
select [code_commande],[code_article],[code]
from [dbo].[conversion],[dbo].[commande]
where [dbo].[conversion].taille=[dbo].[commande].taille
and code_article=? and code_commande=?
The second query is:
select top 1 (G.[code_grille_talend]), count(C.code) as counter
from [dbo].[code] G
left join [dbo].[codeT] C
on G.code_grille_talend not like '%'+LTRIM(RTRIM(C.code))+'%'
group by G.code_grille_talend
having LEN(g.code_grille_talend)+count(C.code)<=40 or count(C.code)=0
order by len(g.code_grille_talend)desc
I have loaded the result of the first query in a table (codeT)

Plot dates matlab

I've a matrix called datevector containing the year, month, day, hour, minutes, seconds of the timeseries that I would like to plot.
datevector = [...
2009 11 4 11 35 0
2009 11 4 11 36 0
2009 11 4 11 37 0
2009 11 4 11 38 0
2009 11 4 11 39 0
2009 11 4 11 40 0]
To plot my data with respect to this time series I create the array containing the time series
xdate = datenum(datevector);
and then I try to plot my data = [1 2 3 4 5 6]
figure
plot(xdate',data)
datetick('x','yyyy-mm-dd HH:MM:SS')
However, the figure I get is not the one I expected. I would like the x-axis to have minute resolution, as in datevector. Can you help me?
Thanks!

Sorting a vector by the number of time each value occurs

We have the following case:
Q = [idxcell{:,1}];
Sort = sort(Q,'descend')
Sort =
Columns 1 through 13
23 23 22 22 20 19 18 18 18 18 17 17 17
Columns 14 through 26
15 15 14 14 13 13 13 12 12 12 11 10 9
Columns 27 through 39
9 9 8 8 8 8 8 7 7 7 7 7 7
Columns 40 through 52
7 6 6 6 5 4 4 3 3 3 3 2 2
Columns 53 through 64
2 2 2 2 2 2 2 1 1 1 1 1
How can we sort the vector Sort according to how many times each of its values is repeated?
The expected result is (each value followed by its count in parentheses):
repeatedSort = 2(9) 7(7) 1(5) 8(5) 3(4) 18(4) 6(3) 9(3) 12(3) 13(3) 17(3) 4(2) 14(2) 15(2) 22(2) 23(2) 5(1) 10(1) 11(1) 19(1) 20(1)
or
repeatedSort = 2 7 1 8 3 18 6 9 12 13 17 4 14 15 22 23 5 10 11 19 20
Thank you in advance.
You can use the TABULATE function from the Statistics Toolbox, then call SORTROWS to sort by the frequency.
Example:
x = randi(10, [20 1]); %# random values
t = tabulate(x); %# unique values and counts
t = t(find(t(:,2)),1:2); %# get rid of entries with zero count
t = sortrows(t, -2) %# sort according to frequency
The result, where the first column holds the unique values and the second column their counts:
t =
2 4 %# value 2 appeared four times
5 4 %# etc...
1 3
8 3
7 2
9 2
4 1
6 1
Here's one way of doing it:
d = randi(10,1,30); %Some fake data
n = histc(d,1:10);
[y,ii] = sort(n,'descend');
disp(ii) % ii lists the values 1:10 ordered from most to least frequent
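As a cross-check of the expected ordering, the same count-then-sort idea can be sketched in Python on the vector from the question. This is not MATLAB code; by_frequency is a hypothetical helper, and ties are assumed to be broken by ascending value, as in the expected result.

```python
from collections import Counter

# The sorted vector from the question (64 values).
values = [23, 23, 22, 22, 20, 19, 18, 18, 18, 18, 17, 17, 17,
          15, 15, 14, 14, 13, 13, 13, 12, 12, 12, 11, 10, 9,
          9, 9, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 7,
          7, 6, 6, 6, 5, 4, 4, 3, 3, 3, 3, 2, 2,
          2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]


def by_frequency(values):
    """Unique values ordered by descending count, then ascending value."""
    counts = Counter(values)
    return sorted(counts, key=lambda v: (-counts[v], v))


repeated_sort = by_frequency(values)
```

The tuple key does the work of sortrows(t, -2): the negated count sorts the most frequent values first, and the value itself settles ties.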