Getting a "The sum function requires 1 argument(s)." error - select

I am trying to add 2 values in a column together. The idea is to get the m1, m2, and m3 values for the rows that fit the criteria: area = '000000', ownership = '50', and code = 113 or 114. The summed values should be 42, 40, and 44 respectively. Until now I have been doing this in Excel, but I am trying to take Excel out of the process. There are no NULL values involved.
Any idea why I am getting this error?
select sum (m1,m2,m3),
from dbo.tablename
where area='000000' and ownership='50' and (code='113' or code='114');
sample data
area ownership code m1 m2 m3
000000 50 113 40 38 42
000000 50 114 2 2 2
desired result
000000 50 113+114 42 40 44

In SQL, SUM(column) is an aggregate function that sums the values of one column across different rows. If you want to add values within a single row, you can do SELECT m1 + m2 + m3 FROM.... You can also add the column values within each row and then sum the result across rows, as in SUM(m1 + m2 + m3). I would rewrite your query as:
SELECT SUM(m1) sum1, SUM(m2) sum2, SUM(m3) sum3
FROM dbo.tablename
WHERE area='000000' AND ownership='50' AND (code='113' OR code='114');
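
The difference between summing down rows and adding across columns can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so the dbo. schema prefix is dropped and the column types are assumptions; the rows are the sample data from the question.

```python
import sqlite3

# In-memory table mirroring the sample data from the question (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (area TEXT, ownership TEXT, code TEXT, m1 INT, m2 INT, m3 INT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?, ?)",
    [("000000", "50", "113", 40, 38, 42),
     ("000000", "50", "114", 2, 2, 2)],
)

where = "WHERE area = '000000' AND ownership = '50' AND code IN ('113', '114')"

# One SUM per column: each aggregate adds a single column down the matching rows.
per_column = conn.execute(
    f"SELECT SUM(m1), SUM(m2), SUM(m3) FROM tablename {where}"
).fetchone()

# The + operator adds across columns within each row (one result per row).
per_row = conn.execute(
    f"SELECT m1 + m2 + m3 FROM tablename {where} ORDER BY code"
).fetchall()

# Both combined: add within each row, then sum across rows for one grand total.
grand_total = conn.execute(
    f"SELECT SUM(m1 + m2 + m3) FROM tablename {where}"
).fetchone()[0]

print(per_column, per_row, grand_total)
```

Only the first query produces the three per-column totals the question asks for.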

That gives the specific result you asked for:
desired result
area | ownership| code | m1 | m2 | m3
000000| 50 | 113+114| 42 | 40 | 44
If you also want to see area and ownership in the output, add those columns to the select list and to a group by clause, like this:
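-- string_agg requires SQL Server 2017 or later; sum(code) would add the codes up (227) or fail on a text column
select area, ownership, string_agg(code, '+') within group (order by code) as code, sum(m1) as m1, sum(m2) as m2, sum(m3) as m3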
from dbo.tablename
where area='000000' and ownership='50' and (code='113' or code='114')
group by area, ownership;


Calculate percentage difference between two rows

I have this query that produced the table below.
select season,
guildname,
count(guildname) as mp_count,
(count(guildname)/600::float)*100 as grank
from mp_rankings
group by season, guildname
order by grank desc
season | guildname | mp_count | grank
10 | LEGENDS | 56 | 9.33333333333333
9 | LEGENDS | 54 | 9
10 | EVERGLADE | 50 | 8.33333333333333
9 | Mystic | 46 | 7.66666666666667
10 | Mystic | 42 | 7
9 | EVERGLADE | 39 | 6.5
10 | 100 | 36 | 6
9 | PARABELLUM | 33 | 5.5
10 | PARABELLUM | 29 | 4.83333333333333
9 | 100 | 29 | 4.83333333333333
I wanted to create a new column that calculates the percentage difference between the two seasons using identical guildnames. For example:
season | guildname | mp_count | grank | prev_season_percent_diff
10 | LEGENDS | 56 | 9.33333333333333 | 0.33%
10 | EVERGLADE | 50 | 8.33333333333333 | 1.83%
The resulting table will only show the current season (which is the highest season value, 10 in this case) and adds a new column prev_season_percent_diff, which is the current season's grank minus the previous season's grank.
How can I achieve this?
Use a Common Table Expression ("CTE") for the grouped result and join it to itself to calculate the difference to the previous season:
with summary as (
    select
        season,
        guildname,
        count(*) as mp_count,    -- same as count(guildname) when guildname is never null
        count(*) / 6.0 as grank  -- same as (count/600::float)*100; 6.0 avoids integer division
    from mp_rankings
    group by season, guildname
)
select
    a.season,
    a.guildname,
    a.mp_count,
    a.grank,
    a.grank - b.grank as prev_season_percent_diff  -- current grank minus previous season's grank
from summary a
left join summary b on b.guildname = a.guildname
    and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
If you actually want a % in the result, concatenate a % to the difference calculation.
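
As a rough check of the self-joined CTE, here is a sketch that runs the same idea on SQLite through Python's sqlite3 module. It assumes the difference is taken between grank values (which is what the expected 0.33 and 1.83 in the question imply) and divides by 6.0 to force floating-point division. The row counts are copied from the question's table for two guilds only, and the rounding to two decimals is an addition for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mp_rankings (season INT, guildname TEXT)")

# Row counts per (season, guild) taken from the question's output; one row per ranked entry.
counts = {(10, "LEGENDS"): 56, (9, "LEGENDS"): 54,
          (10, "EVERGLADE"): 50, (9, "EVERGLADE"): 39}
for (season, guild), n in counts.items():
    conn.executemany("INSERT INTO mp_rankings VALUES (?, ?)", [(season, guild)] * n)

query = """
with summary as (
    select season, guildname, count(*) as mp_count, count(*) / 6.0 as grank
    from mp_rankings
    group by season, guildname
)
select a.guildname, a.mp_count, round(a.grank - b.grank, 2) as prev_season_percent_diff
from summary a
left join summary b on b.guildname = a.guildname
    and b.season = a.season - 1
where a.season = (select max(season) from summary)
order by a.grank desc
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because of the left join, a guild with no rows in the previous season would still appear, with a null difference.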

Utility like except for tables in kdb

kdb has the except function for lists, which finds the elements present in one list but not in another. Is there a similar utility to extract the rows that are present in one table but not in another, based on a column?
Eg: I have two tables:
l:([]c1:`a`b`c`d;c2:10 20 30 40)
r:([]c1:`a`a`a`b`b;c3:100 200 300 400 50)
In column c1, table l has the values c and d, which are not present in column c1 of table r.
Do we have any utility in kdb which can be used to get output like below?
c1 c2
-----
c 30
d 40
I got the output using -
select from l where c1 in l[`c1] except r`c1
But, I'm searching for better/optimised solution/utility to get the same output.
I don't think there's anything wrong with your current implementation but you could use drop (aka _) on a keyed table for a more succinct approach:
q)#[1#`c1;r]_1!l
c1| c2
--| --
c | 30
d | 40
This also remains pretty neat when the "key" is more than one column:
l0:([]c0:`x`y`z`w;c1:`a`b`c`d;c2:10 20 30 40)
r0:([]c0:`y`x`x`x`y;c1:`a`a`a`b`b;c3:100 200 300 400 50)
q)#[`c0`c1;r0]_2!l0
c0 c1| c2
-----| --
z c | 30
w d | 40
A more functional form would be this:
{cl:cols[x]inter cols y;x where not(cl#x)in cl#y}[l;r]
c1 c2
-----
c 30
d 40
This should work if you don't know the columns to match on because of cols[x] inter cols[y] at the start which obtains common cols between the two tables. It also works without columns being keyed.
Although in this specific case, the following would be a little bit faster:
l where not l[`c1] in r[`c1]
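
For readers who do not use q, the last expression is an anti-join on c1. A rough Python analogue (not kdb code; the names left, right and only_in_left are made up for this sketch) looks like this:

```python
# Tables from the question, represented as lists of row dictionaries.
left = [{"c1": "a", "c2": 10}, {"c1": "b", "c2": 20},
        {"c1": "c", "c2": 30}, {"c1": "d", "c2": 40}]
right = [{"c1": "a", "c3": 100}, {"c1": "a", "c3": 200}, {"c1": "a", "c3": 300},
         {"c1": "b", "c3": 400}, {"c1": "b", "c3": 50}]

# Collect the join-column values of the right table once, then keep the
# left rows whose c1 is not among them (the same idea as: not l[`c1] in r[`c1]).
right_keys = {row["c1"] for row in right}
only_in_left = [row for row in left if row["c1"] not in right_keys]

print(only_in_left)
```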

How to divide two values from the same column but at different rows

I have a table like this:
postcode | value | uns
AA | 10 | 51
AB | 20 | 78
AA | 20 | 78
AB | 50 | 51
and I want to get a result like:
AA | 0.5
AB | 2.5
where the new values are the division for the same postcode between the value with uns = 51 and the value with uns = 78.
How can I do that with Postgres? I already checked window functions and partitions but I am not sure how to do it.
If (postcode, uns) is unique, all you need is a self-join:
select postcode, uns51.value / nullif(uns78.value, 0)
from t uns51
join t uns78 using (postcode)
where uns51.uns = 51
and uns78.uns = 78
If the rows with either t.uns = 51 or t.uns = 78 may be missing, you could use a full join instead (with possibly coalesce() to provide default values for missing rows).
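
The self-join can be tried out with the sample rows from the question. This sketch runs on SQLite through Python's sqlite3 module rather than Postgres, and it assumes value is stored as a floating-point column; with an integer column, both SQLite and Postgres would truncate 10 / 20 to 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# value is declared REAL (an assumption) so that the division is not truncated.
conn.execute("CREATE TABLE t (postcode TEXT, value REAL, uns INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("AA", 10, 51), ("AB", 20, 78), ("AA", 20, 78), ("AB", 50, 51)],
)

rows = conn.execute("""
    select postcode, uns51.value / nullif(uns78.value, 0) as ratio
    from t uns51
    join t uns78 using (postcode)
    where uns51.uns = 51
      and uns78.uns = 78
    order by postcode
""").fetchall()

print(rows)
```

The nullif call turns a zero divisor into NULL, so the division yields NULL instead of a division-by-zero error.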
pozs' solution is nice and simple, nothing wrong with it. Just adding two alternatives:
1. Correlated subquery
SELECT postcode
, value / (SELECT NULLIF(value, 0) FROM t WHERE postcode = uns51.postcode AND uns = 78)
FROM t uns51
WHERE uns = 51;
For only one or a few rows.
2. Conditional aggregate
SELECT postcode
, min(value) FILTER (WHERE uns = 51)/ NULLIF(min(value) FILTER (WHERE uns = 78), 0)
FROM t
GROUP BY postcode;
May be faster when processing most or all of the table.
It can also deal with duplicates per (postcode, uns): use an aggregate function of your choice to pick the right value from each group. For just one row in each group, min() is just as good as max() or sum().
About the aggregate FILTER:
Aggregate columns with additional (distinct) filters

KDB selecting first row from each group

Very silly question... Consider the table t1 below which is sorted by sym.
t1:([]sym:(3#`A),(2#`B),(4#`C);val:10 40 12 50 58 75 22 103 108)
sym val
A 10
A 40
A 12
B 50
B 58
C 75
C 22
C 103
C 108
I want to select the first row corresponding to each sym, like this:
(`sym`val)!(`A`B`C;10j, 50j, 75j)
sym val
A 10
B 50
C 75
There's got to be a one-liner to do this. To get the LAST row for each sym, it would be as simple as select by sym from t1. Any hints?
select first val by sym from t1
Or for multiple columns, you can reverse the table and run your query:
select by sym from reverse t1
You could use fby
q)select from t1 where i=(first;i) fby sym
sym val
-------
A 10
B 50
C 75

How to sum multiple elements from single record

I have table trade:([]time:`time$(); sym:`symbol$(); price:`float$(); size:`long$())
with e.g. 1000 records, with e.g. 10 unique syms. I want to sum the first 4 prices for each sym.
My code looks like:
priceTable: select price by sym from trade;
amountTable: select count price by sym from trade;
amountTable: `sym`amount xcol amountTable;
resultTable: amountTable ij priceTable;
So my new table looks like: resultTable
sym | amount price
-------| --------------------------------------------------------------
instr0 | 106 179.2208 153.7646 155.2658 143.8163 107.9041 195.521 ..
The result of command: res: select sum price from resultTable where i = 1:
price
..
----------------------------------
14.71512 153.2244 154.1642 196.5744
Now, when I want to sum elements I receive: sum res
price| 14.71512 153.2244 154.1642 196.5744 170.6052 61.26522 45.70606
46.9057..
When I want to count elements in res: count res
1
I assume that res is a single record with many values. How can I sum all of those values, or how can I sum the first four?
You can use "each" to run the sum on each row:
select sum each price from res
Or if you want to run on resultTable:
select sum each price from resultTable
To sum the first four prices for each row, use take (#) with each-right (/:):
select sum each 4#/:price from resultTable
Or you could do all of this very easily, in one step:
select COUNT:count i, SUM:sum price, SUM4:sum 4#price by sym from trade
q)trade:([]time:10?.z.d; sym:10#`a`b`c; price:100.+til 10; size:10+til 10)
One caveat with the take (#) operator: if the list has fewer elements than the take count, it treats the list as circular and starts repeating elements from the beginning. E.g. check out the 4th price for symbols b and c.
q)select 4#price by sym from trade
sym| price
---| ---------------
a | 100 103 106 109
b | 101 104 107 101 //101 - 2 times
c | 102 105 108 102 //102 - 2 times
Using sublist ensures that if the list has fewer elements than the count argument, it just returns the shorter list.
q)select sublist[4;price] by sym from trade
sym| price
---| ----------------
a | 100 103 106 109f
b | 101 104 107f
c | 102 105 108f
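
The difference between the two behaviours can be mimicked outside q. The following Python sketch is only an analogue for non-empty lists and positive counts; take and sublist here are hypothetical helper functions named after the q operators, not kdb APIs.

```python
from itertools import cycle, islice


def take(n, xs):
    """Like q's n#xs for a positive n: wraps around when xs is shorter than n."""
    return list(islice(cycle(xs), n))


def sublist(n, xs):
    """Like q's sublist[n;xs] for a positive n: returns at most n items, never repeats."""
    return list(xs[:n])


prices_b = [101.0, 104.0, 107.0]  # the three prices for symbol b in the example above

wrapped = take(4, prices_b)       # fourth element wraps back to the first price
trimmed = sublist(4, prices_b)    # just the three available prices

print(wrapped, sum(wrapped))
print(trimmed, sum(trimmed))
```

Summing the wrapped list counts the first price twice, which is why sublist is the safer choice when a group may hold fewer than four prices.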