Duplicating table columns in KDB

Consider the code below:
q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
q)tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
I would like to duplicate the prices column. I can write a query like this:
q)update prices_copy: prices from tab
I can also write a query like this:
q)select items, sales, prices, prices_copy: first prices by items from tab
Both would work. I would like to know how the "by" version would work and the motivation for writing each version. I cannot help but think the "by" version is more thinking in rows.

Your first query (the plain update) is the right tool for duplicating a column.
The by groups the rows on the column items in your example and collapses every other column in the select query according to the row indices of each group. More info on by here - http://code.kx.com/wiki/Reference/select and http://code.kx.com/wiki/JB:QforMortals2/queries_q_sql#The_by_Phrase
In your example, the column items is already unique, so each group contains exactly one row. Even so, the by turns the other columns into nested lists (i.e. lists of lists, here each of length one). The use of first takes the first element of each nested prices list, which is why prices_copy comes out as a normal (long-typed) vector while sales and prices stay nested.
When the grouping is finished, the by columns become the key column[s] of the result; the console shows this with a vertical line to the right of the key column[s]. All other columns in the select query are placed to the right of the key.
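To make the grouping concrete, here is a rough Python sketch (not q) of what the by phrase does to the example table. The helper select_by and the lambdas identity and first are hypothetical names made up for this illustration; they only imitate the behaviour described above and are not part of any kdb+ API.

```python
from collections import defaultdict

# Columns of the example table `tab`, as plain Python lists.
tab = {
    "items": ["nut", "bolt", "cam", "cog"],
    "sales": [6, 8, 0, 3],
    "prices": [10, 20, 15, 20],
}

def select_by(table, key, aggregates):
    """Hypothetical helper: group row indices by `key`, then apply each
    aggregate to the slice of its source column that belongs to the group.

    `aggregates` maps output column name -> (source column, function).
    Keys are returned in sorted order, imitating the sorted key of a `by` result.
    """
    groups = defaultdict(list)
    for idx, k in enumerate(table[key]):
        groups[k].append(idx)
    result = {}
    for k in sorted(groups):
        rows = groups[k]
        result[k] = {
            out: fn([table[src][r] for r in rows])
            for out, (src, fn) in aggregates.items()
        }
    return result

identity = lambda xs: xs   # no aggregate: the column stays a nested list
first = lambda xs: xs[0]   # imitates `first`: one atom per group

grouped = select_by(tab, "items", {
    "prices": ("prices", identity),
    "prices_copy": ("prices", first),
})
for k, row in grouped.items():
    print(k, row)
```

With unique items every group holds a single row, so prices is a list of one-element lists and prices_copy is a flat column, which is why the result merely looks like a copy.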

The by version creates a copy of prices only coincidentally. Note also that by changes the row order, because the result is sorted by the key:
q)ungroup select sales, prices by items from tab
items sales prices
------------------
bolt  8     20
cam   0     15
cog   3     20
nut   6     10
q)tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
The by version works only because items is unique. For a table with repeated items, e.g. 8#tab, the query produces only 4 values for prices_copy (one per group, not one per row):
q)select items, sales, prices, prices_copy: first prices by items from 8#tab
items| items     sales prices prices_copy
-----| ----------------------------------
bolt | bolt bolt 8 8   20 20  20
cam  | cam  cam  0 0   15 15  15
cog  | cog  cog  3 3   20 20  20
nut  | nut  nut  6 6   10 10  10

There is a fundamental difference between a simple update and a query that uses by.
Let's explore it by adding an extra column, brand, to the table:
tab2:flip `items`sales`prices`brand!(`nut`bolt`cam`cog`nut`bolt`cam`cog;6 8 0 3 1 2 3 4;10 20 15 20 30 40 50 60;`b1`b1`b1`b1`b2`b2`b2`b2)
The following simply copies the column:
asc update prices_copy: prices from tab2
However, the following query takes the first price of each item, regardless of brand, and applies it to every row of that item:
asc ungroup select sales, prices,brand, prices_copy: first prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt  2     40     b2    20          //b1 price
bolt  8     20     b1    20
cam   0     15     b1    15
cam   3     50     b2    15          //b1 price
cog   3     20     b1    20
cog   4     60     b2    20          //b1 price
nut   1     30     b2    10          //b1 price
nut   6     10     b1    10
Grouping with by is useful when you want an aggregate per item regardless of brand, for example the max price of each item:
asc ungroup select sales, prices,brand, prices_copy: max prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt  2     40     b2    40
bolt  8     20     b1    40          //max price in bolts regardless of the brand
cam   0     15     b1    50
cam   3     50     b2    50
cog   3     20     b1    60
cog   4     60     b2    60
nut   1     30     b2    30
nut   6     10     b1    30
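The difference between a plain copy and a per-group aggregate can be sketched in Python as follows. This is only an illustration of the idea on the tab2 data, not q code; broadcast_by is a hypothetical helper that computes one value per item and writes it back onto every row of that item, keeping the original row order.

```python
# items and prices columns of tab2, in their original row order.
items = ["nut", "bolt", "cam", "cog", "nut", "bolt", "cam", "cog"]
prices = [10, 20, 15, 20, 30, 40, 50, 60]

def broadcast_by(keys, values, agg):
    """Hypothetical helper: aggregate `values` per key, then repeat the
    aggregate on every row that carries that key (row order is unchanged)."""
    groups = {}
    for k, v in zip(keys, values):
        groups.setdefault(k, []).append(v)
    per_key = {k: agg(vs) for k, vs in groups.items()}
    return [per_key[k] for k in keys]

plain_copy = list(prices)                                    # true duplicate of the column
first_copy = broadcast_by(items, prices, lambda vs: vs[0])   # first price per item
max_copy = broadcast_by(items, prices, max)                  # max price per item

print(plain_copy)
print(first_copy)
print(max_copy)
```

Only plain_copy equals the original column. The other two carry one value per item, which matches the prices_copy columns in the two tables above once the rows are sorted.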

Related

Calculate percentage difference between two rows

I have this query that produced the table below.
select season,
guildname,
count(guildname) as mp_count,
(count(guildname)/600::float)*100 as grank
from mp_rankings
group by season, guildname
order by grank desc
season  guildname   mp_count  grank
10      LEGENDS     56        9.33333333333333
9       LEGENDS     54        9
10      EVERGLADE   50        8.33333333333333
9       Mystic      46        7.66666666666667
10      Mystic      42        7
9       EVERGLADE   39        6.5
10      100         36        6
9       PARABELLUM  33        5.5
10      PARABELLUM  29        4.83333333333333
9       100         29        4.83333333333333
I wanted to create a new column that calculates the percentage difference between the two seasons using identical guildnames. For example:
season  guildname   mp_count  grank             prev_season_percent_diff
10      LEGENDS     56        9.33333333333333  0.33%
10      EVERGLADE   50        8.33333333333333  1.83%
The resulting table will only show the current season (which is the highest season value, 10 in this case) and adds a new column prev_season_percent_diff, which is the current season's grank minus the previous season's grank.
How can I achieve this?
Use a Common Table Expression ("CTE") for the grouped result and join it to itself to calculate the difference to the previous season:
with summary as (
    select
        season,
        guildname,
        count(*) as mp_count, -- simplified equivalent expression
        count(*)/6.0 as grank -- simplified equivalent expression (6.0 avoids integer division)
    from mp_rankings
    group by season, guildname
)
select
    a.season,
    a.guildname,
    a.mp_count,
    a.grank,
    a.grank - b.grank as prev_season_percent_diff -- current grank minus previous season's grank
from summary a
left join summary b on b.guildname = a.guildname
    and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
If you actually want a % in the result, concatenate a % to the difference calculation.
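The self-join can be tried out with a small, hedged sketch using Python's built-in sqlite3 module (SQLite rather than PostgreSQL, so details such as casting may differ). The raw mp_rankings rows are not shown in the question, so the sketch fabricates one row per counted member from the mp_count values of two guilds in the question's table; that layout is an assumption. The difference is taken as current grank minus previous grank, as the question describes, and rounded to two decimals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mp_rankings (season integer, guildname text)")

# Assumed raw data: one row per member, sized to match the question's mp_count values.
counts = {
    (10, "LEGENDS"): 56,
    (9, "LEGENDS"): 54,
    (10, "EVERGLADE"): 50,
    (9, "EVERGLADE"): 39,
}
rows = [(season, guild) for (season, guild), n in counts.items() for _ in range(n)]
conn.executemany("insert into mp_rankings values (?, ?)", rows)

query = """
with summary as (
    select season, guildname, count(*) as mp_count, count(*) / 6.0 as grank
    from mp_rankings
    group by season, guildname
)
select a.season, a.guildname, a.mp_count, a.grank,
       round(a.grank - b.grank, 2) as prev_season_percent_diff
from summary a
left join summary b
       on b.guildname = a.guildname and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because it is a left join, a guild with no row in the previous season would still appear, with a NULL difference.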

KDB: how do I print a string literal along each record of table?

How do I select a table along with a string literal value?
tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
select a:"abcdef", items, sales from tab
Expected output:
a        items sales prices
---------------------------
"abcdef" nut   6     10
"abcdef" bolt  8     20
"abcdef" cam   0     15
"abcdef" cog   3     20
If you just want to add a new column to a table, you can use the hidden index column i to count the records:
q)update a:count[i]#enlist "abcdef" from tab
items sales prices a
---------------------------
nut   6     10     "abcdef"
bolt  8     20     "abcdef"
cam   0     15     "abcdef"
cog   3     20     "abcdef"
You can do it within a select statement, provided the fabricated column isn't first:
q)select items,a:count[i]#enlist"abcdef",sales from tab
items a        sales
--------------------
nut   "abcdef" 6
bolt  "abcdef" 8
cam   "abcdef" 0
cog   "abcdef" 3
If the fabricated column is first, it groups the values into lists, which would require an ungroup.
An alternative but less conventional approach would be to use cross:
q)([]a:enlist "abcdef")cross tab
a        items sales prices
---------------------------
"abcdef" nut   6     10
"abcdef" bolt  8     20
"abcdef" cam   0     15
"abcdef" cog   3     20
You can do this:
q) update a:count[t]#enlist "abcdef" from t:select items, sales from tab
This will also work if you have a where clause:
q)update a:count[t]#enlist "abcdef" from t:select items, sales from tab where sales<4
Output (update appends the new column a after the selected columns items and sales):
items sales a
--------------------
cam   0     "abcdef"
cog   3     "abcdef"

Postgresql conditional sum

I have a table with the following data for sales and inventory (oh):
category  sales  oh  item_num
Clothes   12     10  1
Clothes   11     10  1
Clothes   10     10  1
Clothes   5      10  1
Clothes   8      10  1
Clothes   4      10  1
Clothes   23     10  2
Clothes   5      10  2
Clothes   20     10  2
Clothes   5      10  2
Clothes   13     10  2
Clothes   9      10  2
Food      6      25  3
Food      8      25  3
Food      7      25  3
Food      14     25  3
I am trying to query this table to get a sum of both the sales and oh columns by category:
SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
FROM data
GROUP BY category
However, the problem is I need the SUM(oh) to only sum distinct items but the SUM(sales) to sum all the values. So the result should be:
category  sales  oh
Clothes   125    20
Food      35     25
I tried SUM(DISTINCT oh), but that only works for distinct oh values not distinct items. I really need something like SUM(DISTINCT(item_num) oh).
I experimented with various window functions, but could not come up with a solution. Does anyone know how to return this kind of sum on a unique key?
Here's how I'd do it:
SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
FROM (
    SELECT category, SUM(sales) AS sales, oh
    FROM data
    GROUP BY category, item_num, oh
) ttl
GROUP BY category;
Basically, tackle the problem in stages. First group by category and item number (and oh) to get the sum of sales per item, with each item's oh appearing exactly once. Then group by category and sum both columns, so sales adds up over all rows while oh is counted once per item.
Result:
 category | sales | oh
----------+-------+----
 Food     |    35 | 25
 Clothes  |   125 | 20
(2 rows)
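As a quick check of the two-stage approach, the sketch below loads the question's rows into an in-memory SQLite database through Python's sqlite3 module and runs both the naive query and the staged one. SQLite stands in for PostgreSQL here, which is an assumption of the sketch; the queries use only standard GROUP BY and SUM. An ORDER BY is added so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table data (category text, sales integer, oh integer, item_num integer)"
)
rows = [
    ("Clothes", 12, 10, 1), ("Clothes", 11, 10, 1), ("Clothes", 10, 10, 1),
    ("Clothes", 5, 10, 1), ("Clothes", 8, 10, 1), ("Clothes", 4, 10, 1),
    ("Clothes", 23, 10, 2), ("Clothes", 5, 10, 2), ("Clothes", 20, 10, 2),
    ("Clothes", 5, 10, 2), ("Clothes", 13, 10, 2), ("Clothes", 9, 10, 2),
    ("Food", 6, 25, 3), ("Food", 8, 25, 3), ("Food", 7, 25, 3), ("Food", 14, 25, 3),
]
conn.executemany("insert into data values (?, ?, ?, ?)", rows)

# Naive version: oh is summed once per sales row, so it is over-counted.
naive = conn.execute("""
    SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
    FROM data
    GROUP BY category
    ORDER BY category
""").fetchall()

# Staged version: collapse to one row per item first, then sum per category.
staged = conn.execute("""
    SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
    FROM (
        SELECT category, SUM(sales) AS sales, oh
        FROM data
        GROUP BY category, item_num, oh
    ) ttl
    GROUP BY category
    ORDER BY category
""").fetchall()

print(naive)
print(staged)
```

The inner query yields one row per item, so the outer SUM(oh) adds each item's inventory exactly once.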

Compare number of instances in a column for two different dates in SAS

I have the following dataset (items) with transactions on any date and the amount paid on the next business day.
The amount paid for each id on the next business day is $10 for each instance where the rate is > 5.
My task is to check that the number of instances where rate > 5 is in line with the amount paid on the next business day (the payment row has the standard code 121).
For instance, there are four instances with rate > 5 on 4/14/2017, so the amount $40 (4*10) is paid on 4/17/2017.
Date       id  rate  code  batch
4/14/2017  1   12    100   A1
4/14/2017  1   2     101   A1
4/14/2017  1   13    101   A1
4/14/2017  1   10    100   A1
4/14/2017  1   10    100   A1
4/17/2017  1   40    121
4/20/2017  2   12    100   A1
4/20/2017  2   2     101   A1
4/20/2017  2   3     101   A1
4/20/2017  2   10    100   A1
4/20/2017  2   10    100   A1
4/21/2017  2   30    121
My code
proc sql;
create table items2 as select
count(id) as id_count,
(case when code='121' then rate/10 else 0 end) as rate_count
from items
group by date,id;
quit;
This has not yielded the desired result and the challenge I have here is to check the transaction dates (4/14/2017 and 4/20/2017) and next business day dates (4/17/2017,4/21/2017).
Appreciate your help.
The LAG function will do the trick here: lagged values let us build the condition we want without having to use the rate>5 condition.
Here is the solution:
Data items;
    set items;
    Lag_dt=LAG(Date);
    Lag_id=LAG(id);
    Lag_rate=LAG(rate);
    if ((id=lag_id) and (code=121) and (Date>lag_dt)) then rate_count=(rate/lag_rate);
    else rate_count=0;
    Drop lag_dt lag_id lag_rate;
run;
Hope this helps.
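For readers who want to trace the LAG logic outside SAS, here is a rough Python sketch of the same row-by-row comparison on the sample data. It is an illustration, not a translation of SAS semantics in general: the variable prev simply holds the previous row, and the final dictionary direct is an extra cross-check (an addition of this sketch) that counts rate > 5 directly. Note that dividing by the previous row's rate gives the instance count here only because the row just before each payment happens to have a rate of 10, the per-instance amount.

```python
from datetime import date

# (date, id, rate, code) rows from the question, in their original order.
rows = [
    (date(2017, 4, 14), 1, 12, 100),
    (date(2017, 4, 14), 1, 2, 101),
    (date(2017, 4, 14), 1, 13, 101),
    (date(2017, 4, 14), 1, 10, 100),
    (date(2017, 4, 14), 1, 10, 100),
    (date(2017, 4, 17), 1, 40, 121),
    (date(2017, 4, 20), 2, 12, 100),
    (date(2017, 4, 20), 2, 2, 101),
    (date(2017, 4, 20), 2, 3, 101),
    (date(2017, 4, 20), 2, 10, 100),
    (date(2017, 4, 20), 2, 10, 100),
    (date(2017, 4, 21), 2, 30, 121),
]

rate_counts = []
prev = None  # previous row, standing in for the lagged values
for row in rows:
    d, id_, rate, code = row
    # Same condition as the data step: same id, payment code, later date.
    if prev is not None and id_ == prev[1] and code == 121 and d > prev[0]:
        rate_counts.append(rate / prev[2])
    else:
        rate_counts.append(0)
    prev = row

# Cross-check: count the transaction rows with rate > 5 for each id.
direct = {}
for d, id_, rate, code in rows:
    if code != 121 and rate > 5:
        direct[id_] = direct.get(id_, 0) + 1

print(rate_counts)
print(direct)
```

On this data the payment rows get 4.0 and 3.0, which agree with the direct counts for ids 1 and 2.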

take sum of similar column from multiple data table based on unique id in crystal report

I have four datatables like
Table 1
id  name  Afee  Insfee
1   a     100   10
2   b     100   10
Table 2
id  name  Bfee  Insfee
2   b     100   10
1   a     100   10
3   c     100   10
Table 3
id  name  Cfee  Insfee
1   a     100   10
3   c     100   10
Table 4
id  name  Dfee  Insfee
1   a     100   10
2   b     100   10
In the Crystal Report I want to get the result as
Name  Afee  Bfee  Cfee  Dfee  Insfee  total
a     100   100   100   100   40      440
b     100   100   0     100   30      330
c     0     100   100   0     20      220
where Insfee should be the sum from all four tables for a particular id, and total should be the sum of the row in the report.
How can I do this in SAP Crystal Reports?
To get the sum of Insfee, create a formula that adds the Insfee field from all tables using the + operator, and place it next to Afee, Dfee, etc.
To get the total, first create formulas for all the fee fields (Afee, Bfee, etc.); in the code below they are named a, a 1 and a 2.
Then create another formula for "total" and implement the code below.
Place the formulas in the Details section and you will get the result.
EvaluateAfter({#a});
EvaluateAfter({#a 1});
EvaluateAfter({#a 2});
{#a}+{#a 1}+{#a 2}