Cannot access the value of the kdb table column - kdb

I am new to KDB and cannot understand why I can access the order column for the table stock but not for trader. Below is my code with the error.
q)trader
item     brand | price order
---------------| -----------
soda     fry   | 1.5   200
bacon    prok  | 1.99  180
mushroom veggie| 0.88  110
eggs     veggie| 1.55  210
tomatoes veggie| 1.35  100
q)trader.order
'order
  [0]  trader.order
       ^
q)stock.order
50 82 45 92
q)stock
item     brand  price order
---------------------------
soda     fry    1.5   50
bacon    prok   1.99  82
mushroom veggie 0.88  45
eggs     veggie 1.55  92
q)trader.order
'order
  [0]  trader.order
       ^

Your table trader is keyed and you cannot use trader.order to select the order column.
You can use this instead if you want
(0!trader)`order
The reason is that trader.order is really indexing: the dot indexes into the table with the column name, just as list.index would. A simple table is a list of dictionaries, so indexing it with a column name returns that column. A keyed table has a different structure: it is a dictionary that maps a table of key columns to a table of value columns, so you have to unkey it first.
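A minimal sketch of the options, assuming the keyed table can be rebuilt from the output shown in the question (the definition of trader here is a reconstruction, not the asker's original code):

```q
/ hypothetical reconstruction of the keyed table from the question
trader:([item:`soda`bacon`mushroom`eggs`tomatoes; brand:`fry`prok`veggie`veggie`veggie] price:1.5 1.99 0.88 1.55 1.35; order:200 180 110 210 100)

(0!trader)`order          / unkey, then index the column
(value trader)`order      / the value table holds the non-key columns
exec order from trader    / qSQL works on keyed tables directly
```

Each of the three expressions should return 200 180 110 210 100. Which one to prefer is largely a matter of taste; exec reads most naturally inside a larger query.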

Related

How to roll numbers up in Tableau for aggregation?

I have a data structure issue. I have a problem where I need to roll up my data within tableau so that aggregated numbers do not skew in a certain manner.
Example of current data
ID    Model_Number  Value
123   fff           2
123   ggg           2
123   hhh           2
123   uuu           2
124   yyy           5
124   qqq           5
124   eee           5
Avg:  NA            3.28
Ideal state of data and aggregation
ID    Value
123   2
124   5
Avg   3.5
As you can see, since the data is at two different grains, the aggregated number (avg) will be different. I would like to roll up my numbers to the distinct value of ID and then calculate my average, which will result in a different (correct in my context) aggregated number.
Here is one calculated field that could help.
{ FIXED [ID] : AVG([Value]) }
This will give you the average value per ID. You can then use a grand total (avg) to get the 3.5.

duplicating table columns in KDB

Consider the code below:
q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
q)tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
I would like to duplicate the prices column. I can write a query like this:
q)update prices_copy: prices from tab
I also can write a query like this:
q)select items, sales, prices, prices_copy: first prices by items from tab
Both would work. I would like to know how the "by" version would work and the motivation for writing each version. I cannot help but think the "by" version is more thinking in rows.
Your first query (the plain update) is the ideal way to duplicate a column.
The by phrase groups the rows on the items column and collapses every other column in the select query according to the row indices of each group. More info on by here - http://code.kx.com/wiki/Reference/select and http://code.kx.com/wiki/JB:QforMortals2/queries_q_sql#The_by_Phrase
In your example, the column items is already unique, so no rows are actually collapsed into groups. However, the by still creates nested lists from the other columns (i.e. lists of lists). The use of first just un-nests the prices column, collapsing prices_copy to a normal (long-typed) vector.
When the grouping is finished the by columns are used as the key column[s] of the result and you will see this by the use of a vertical line to the right hand side of the key column[s]. All other columns within the select query are placed to the right hand side of the key.
The logic of the by version coincidentally creates a copy of prices. But by changes the order:
q)ungroup select sales, prices by items from tab
items sales prices
------------------
bolt  8     20
cam   0     15
cog   3     20
nut   6     10
q)tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
The by version works only because items is unique. For a table with repeated values of items, e.g. 8#tab, the query produces only 4 values for prices_copy, one per group rather than one per row.
q)select items, sales, prices, prices_copy: first prices by items from 8#tab
items| items     sales prices prices_copy
-----| ----------------------------------
bolt | bolt bolt 8 8   20 20  20
cam  | cam cam   0 0   15 15  15
cog  | cog cog   3 3   20 20  20
nut  | nut nut   6 6   10 10  10
There is a fundamental difference between a simple update and update by queries.
Let's explore it by adding an extra column brand to the table
tab2:flip `items`sales`prices`brand!(`nut`bolt`cam`cog`nut`bolt`cam`cog;6 8 0 3 1 2 3 4;10 20 15 20 30 40 50 60;`b1`b1`b1`b1`b2`b2`b2`b2)
The following simply copies the column:
asc update prices_copy: prices from tab2
However, the following query takes the first price of each item, regardless of brand, and assigns it to every row of that item.
asc ungroup select sales, prices,brand, prices_copy: first prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt  2     40     b2    20          // b1 price
bolt  8     20     b1    20
cam   0     15     b1    15
cam   3     50     b2    15          // b1 price
cog   3     20     b1    20
cog   4     60     b2    20          // b1 price
nut   1     30     b2    10          // b1 price
nut   6     10     b1    10
A by query is useful when you want to attach a group-level aggregate to each row, for example the max price of each item regardless of brand.
asc ungroup select sales, prices,brand, prices_copy: max prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt  2     40     b2    40
bolt  8     20     b1    40          // max price in bolts regardless of the brand
cam   0     15     b1    50
cam   3     50     b2    50
cog   3     20     b1    60
cog   4     60     b2    60
nut   1     30     b2    30
nut   6     10     b1    30
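The queries above use select ... by followed by ungroup. As a rough sketch of the update ... by form itself, using the same tab2 (the column names firstPrice and maxPrice are illustrative, not from the original answers):

```q
tab2:flip `items`sales`prices`brand!(`nut`bolt`cam`cog`nut`bolt`cam`cog;6 8 0 3 1 2 3 4;10 20 15 20 30 40 50 60;`b1`b1`b1`b1`b2`b2`b2`b2)

/ plain update: row-by-row copy of prices
update prices_copy:prices from tab2

/ update ... by: each aggregate is computed per item and written to every row
/ of that item; the row order and row count of tab2 are unchanged
update firstPrice:first prices, maxPrice:max prices by items from tab2
```

In the second result both nut rows should carry firstPrice 10 and maxPrice 30. Unlike select ... by, this form does not key or re-sort the table, so no ungroup is needed.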

Tableau: How to perform "Summarize totals except top 3"?

I have data something like below for the name of person and the total sales he/she made:
ABC1 34
ABC2 45
ABC3 78
ABC4 79
ABC5 23
ABC6 61
ABC7 34
ABC8 54
ABC9 90
I have to display the dashboard as below: the top 3 sales guys, plus the overall total sales made by the rest of the team as ROT, which is 498 - (90 + 78 + 79) = 251:
ABC9 90
ABC4 79
ABC3 78
ROT 251
For the top sales made, I added a filter on sales person name, with the Limit condition set to "Top 3". But I am struggling to display the ROT, even in a separate worksheet. Any help is appreciated.
Right click on your dimension [Sales Guy] and choose Create/Set
Define the set by the Top N (either hard code it or use a parameter to change it easily) and call it [TopNSalesGuy]
Create a calculated field [TopNSalesGuysPlusOther] with the formula:
IF [TopNSalesGuy] THEN [Sales Guy] ELSE 'ROT' END
Use [TopNSalesGuysPlusOther] in your table/graph and you should have the top N sales guys by name and everything else as 'ROT'

Difference between rows in KDB/Q

I'm new to KDB/Q and have a question around getting the difference between two (not necessarily adjacent) rows.
I have only one table, which looks like the below:
q)tickers:`ibm`bac`dis`gs`ibm`gs`dis`bac
q)pxs:100 50 30 250 110 240 45 48
q)dates:2013.05.01 2013.01.05 2013.02.03 2013.02.11 2013.06.17 2013.06.21 2013.04.24 2013.01.06
q)trades:([tickers;dates];pxs)
q)trades
tickers dates     | pxs
------------------| ---
ibm     2013.05.01| 100
bac     2013.01.05| 50
dis     2013.02.03| 30
gs      2013.02.11| 250
ibm     2013.06.17| 110
gs      2013.06.21| 240
dis     2013.04.24| 45
bac     2013.01.06| 48
I would like to have either another column in the table that stores the difference between the current and the previous price, or another structure of similar shape. The key question that the result needs to answer is "by how much did the stock change compared to the previous time a price was recorded?"
So far I've tried something along the lines of:
select tickers, dates, pxs - pxs(dates bin (exec dates from trades where tickers = trades.tickers)) from trades
which doesn't really work (at all). Definitely due to trying to do SQL-like queries and having a row-oriented mindset.
Please find below an example of the sought-after answer:
q)trades: do magic with trades
q)trades
tickers dates     | pxs | delta
------------------| --- | -----
ibm     2013.05.01| 100 | 0
bac     2013.01.05| 50  | 0
dis     2013.02.03| 30  | 0
gs      2013.02.11| 250 | 0
ibm     2013.06.17| 110 | 10
gs      2013.06.21| 240 | -10
dis     2013.04.24| 45  | 15
bac     2013.01.06| 48  | -2
Thanks for your help,
Dan
q)update delta:{0,1_deltas x}pxs by tickers from trades
tickers dates     | pxs delta
------------------| ---------
ibm     2013.05.01| 100 0
bac     2013.01.05| 50  0
dis     2013.02.03| 30  0
gs      2013.02.11| 250 0
ibm     2013.06.17| 110 10
gs      2013.06.21| 240 -10
dis     2013.04.24| 45  15
bac     2013.01.06| 48  -2
If you do:
select pxs by tickers from trades
you will have a nested column (pxs) which holds the list of prices for each ticker. (Grouping by dates as well would leave a single price per group, because each ticker and date pair is unique in this table.) You can then apply deltas:
select deltas pxs by tickers from trades
which will give you the running difference within each ticker. The first value in each list is the original pxs, though, so you'll need to set that one to 0.
EDIT
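Having re-read the question and looked at your expected result, you want the delta next to every row of the original trades table, which update ... by gives you. Note that the subtraction must be current minus previous:
update delta:0N,(1_ pxs)-(-1_ pxs) by tickers from trades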
Here is how it works:
select pxs by tickers from trades
creates a table whose rows each contain a ticker and its list of pxs.
So in every row we have a list:
tickers| pxs
-------| -------
bac    | 50 48
dis    | 30 45
gs     | 250 240
ibm    | 100 110
Now we have to apply a function that calculates the delta. The best one is deltas, mentioned above, but my version does about the same.
If we use select, we get a table of tickers | list of pxs | list of deltas, but if we use update ... by, the grouped values are ungrouped back onto the original rows.
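The difference between the two forms can be sketched by counting rows. This assumes trades is built as in the question; the explicit column names in the definition are only a restatement of that table:

```q
trades:([tickers:`ibm`bac`dis`gs`ibm`gs`dis`bac; dates:2013.05.01 2013.01.05 2013.02.03 2013.02.11 2013.06.17 2013.06.21 2013.04.24 2013.01.06] pxs:100 50 30 250 110 240 45 48)

count select delta:deltas pxs by tickers from trades   / 4: one row per ticker, delta is a nested list
count update delta:deltas pxs by tickers from trades   / 8: one row per original trade
```

So select ... by changes the grain of the table to one row per group, while update ... by keeps the original grain and only fills in the new column.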
You can get the same results using the prev function. One thing worth highlighting is that prev automatically puts a null (0N) in the first position. This matters because no previous price is available for the first record, whereas a 0 there would suggest that the price did not change. Which of the two you want depends on how you intend to handle the first record.
q)update delta:pxs-prev[pxs] by tickers from trades
tickers dates     | pxs delta
------------------| ---------
ibm     2013.05.01| 100
bac     2013.01.05| 50
dis     2013.02.03| 30
gs      2013.02.11| 250
ibm     2013.06.17| 110 10
gs      2013.06.21| 240 -10
dis     2013.04.24| 45  15
bac     2013.01.06| 48  -2
Using deltas to get the same result (0N instead of 0 for the first record):
q)update delta:{0N,1_deltas x}pxs by tickers from trades
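To compare the variants side by side, here they are applied to a plain list. The three-element pxs is a made-up example, not data from the question:

```q
pxs:100 110 125          / hypothetical prices for a single ticker

deltas pxs               / 100 10 15  first item is the price itself
{0,1_deltas x} pxs       / 0 10 15    first item forced to 0
pxs-prev pxs             / 0N 10 15   first item is null
{0N,1_deltas x} pxs      / 0N 10 15   same result built from deltas
```

Inside update ... by tickers, each of these is applied to the price list of one ticker at a time, which is why the first record of every ticker gets the 0 or the null.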

How to sum multiple elements from single record

I have table trade:([]time:`time$(); sym:`symbol$(); price:`float$(); size:`long$())
with e.g. 1000 records, with e.g. 10 unique syms. I want to sum the first 4 prices for each sym.
My code looks like:
priceTable: select price by sym from trade;
amountTable: select count price by sym from trade;
amountTable: `sym`amount xcol amountTable;
resultTable: amountTable ij priceTable;
So my new table looks like: resultTable
sym | amount price
-------| --------------------------------------------------------------
instr0 | 106 179.2208 153.7646 155.2658 143.8163 107.9041 195.521 ..
The result of command: res: select sum price from resultTable where i = 1:
price
..
----------------------------------
14.71512 153.2244 154.1642 196.5744
Now, when I want to sum elements I receive: sum res
price| 14.71512 153.2244 154.1642 196.5744 170.6052 61.26522 45.70606
46.9057..
When I want to count elements in res: count res
1
I assume that res is a single record with many values. How can I sum all of those values, or how can I sum the first four?
You can use "each" to run the sum on each row:
select sum each price from res
Or if you want to run it on resultTable:
select sum each price from resultTable
To sum the first four prices for each row, use a dyadic each-right:
select sum each 4#/:price from resultTable
Or you could do all of this very easily, in one step:
select COUNT:count i, SUM:sum price, SUM4:sum 4#price by sym from trade
q)trade:([]time:10?.z.d; sym:10#`a`b`c; price:100.+til 10; size:10+til 10)
One caveat with the take (#) operator: if the list has fewer elements than the take count, it treats the list as circular and starts returning repeated values. E.g. check out the 4th price for symbols b and c.
q)select 4#price by sym from trade
sym| price
---| ---------------
a  | 100 103 106 109
b  | 101 104 107 101 //101 - 2 times
c  | 102 105 108 102 //102 - 2 times
Using sublist ensures that if the list has fewer elements than the count argument, it just returns the shorter list.
q)select sublist[4;price] by sym from trade
sym| price
---| ----------------
a  | 100 103 106 109f
b  | 101 104 107f
c  | 102 105 108f
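Putting the pieces together, a sketch of the "first four" sum that does not wrap around. The table here is a deterministic stand-in for the sample above with the time column left out, and SUM4 is just an illustrative column name:

```q
/ hypothetical stand-in for the sample trade table (time column omitted)
trade:([] sym:10#`a`b`c; price:100.+til 10; size:10+til 10)

4#1 2 3            / 1 2 3 1  take wraps around a short list
4 sublist 1 2 3    / 1 2 3    sublist stops at the end of the list

/ sum of at most the first four prices per sym
select SUM4:sum 4 sublist price by sym from trade
```

With this data the sums should be 418 for a (four prices) and 312 and 315 for b and c (three prices each). With 4#price the b and c sums would also include a repeated first price.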