Aggregation Column must be in Group By - postgresql

In PostgreSQL I can't execute this query:
SELECT name, SUM(sws) as sws, SUM(sws) over (partition by sws) swsrange FROM professoren
JOIN vorlesungen on persnr = gelesenvon
group by name
order by sws desc
The error is:
ERROR: column "vorlesungen.sws" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT name, SUM(sws) as sws, SUM(sws) over (partition by s...
which means that "sws" must be in the GROUP BY clause or inside an aggregate function (which it actually is).
UPDATE:
I changed it to
SELECT name, SUM(sws) as swscount, SUM(sws) over (partition by name) swsrange FROM professoren
JOIN vorlesungen on persnr = gelesenvon
group by name, sws
The output is:
Augustinus,2,2|Kant,8,4|Popper,2,2|Russel,2,5|Russel,6,5|Sokrates,2,6|Sokrates,8,6
but the first column should be the name, the second the sum() of all lessons the professor gives, and the third a ranking of who gives the most lessons:
Sokrates,10,1|Kant,8,2|Russel,8,2|Augustinus,2,3|Popper,2,3
I can't see the issue here.
Thanks for your help.

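You specified SUM(sws) over (partition by sws), but sws is not in the GROUP BY. Window functions are evaluated after grouping, so the sws inside the window clause refers to an ungrouped column.
After the edit to your question, could this be what you are looking for?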
DROP TABLE IF EXISTS T2;
CREATE TABLE T2 (NAME VARCHAR(20), SWSCOUNT INT)
;
INSERT INTO T2 VALUES ('Augustinus',2), ('Kant',8), ('Popper',2), ('Russel',2), ('Russel',6),('Sokrates',2),('Sokrates',8);
SELECT * FROM T2;
SELECT *, DENSE_RANK() OVER (ORDER BY swscount DESC)
FROM (SELECT NAME, SUM(SWSCOUNT) AS SWSCOUNT
FROM T2
GROUP BY NAME) X;
Output:
name swscount
1 Augustinus 2
2 Kant 8
3 Popper 2
4 Russel 2
5 Russel 6
6 Sokrates 2
7 Sokrates 8
name swscount dense_rank
1 Sokrates 10 1
2 Kant 8 2
3 Russel 8 2
4 Popper 2 3
5 Augustinus 2 3
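The same ranking can also be sketched directly against the two tables from the question, without a pre-aggregated helper table. This is only an illustration under assumptions: it uses Python's sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or newer is needed for window functions), and the vorlnr column and the sample rows are made up so that the totals match the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal schema; only the columns used in the question's query matter.
CREATE TABLE professoren (persnr INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vorlesungen (vorlnr INTEGER PRIMARY KEY, sws INTEGER, gelesenvon INTEGER);
INSERT INTO professoren VALUES
  (1, 'Sokrates'), (2, 'Russel'), (3, 'Kant'), (4, 'Popper'), (5, 'Augustinus');
-- Sample lectures chosen to reproduce the sums from the question.
INSERT INTO vorlesungen (sws, gelesenvon) VALUES
  (2, 1), (4, 1), (4, 1), (2, 2), (3, 2), (3, 2), (4, 3), (4, 3), (2, 4), (2, 5);
""")

# Step 1 (inner query): one row per professor with the total sws.
# Step 2 (outer query): rank those totals; ties share a rank.
rows = conn.execute("""
SELECT name, swscount,
       DENSE_RANK() OVER (ORDER BY swscount DESC) AS swsrank
FROM (SELECT p.name, SUM(v.sws) AS swscount
      FROM professoren p
      JOIN vorlesungen v ON p.persnr = v.gelesenvon
      GROUP BY p.name) totals
ORDER BY swscount DESC, name
""").fetchall()

for row in rows:
    print(row)
```

Grouping only by name in the inner query is what collapses Russel and Sokrates to one row each; the window function then only ever sees the aggregated totals.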

Related

How to get id of the row which was selected by aggregate function? [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
I have the following data:
id | name | amount | datefrom
---------------------------
3 | a | 8 | 2018-01-01
4 | a | 3 | 2018-01-15 10:00
5 | b | 1 | 2018-02-20
I can group the result with the following query:
select name, max(amount) from table group by name
But I need the id of the selected row too, so I tried:
select max(id), name, max(amount) from table group by name
As expected, it returns:
id | name | amount
-----------
4 | a | 8
5 | b | 1
But I need the id to be 3 for the amount of 8:
id | name | amount
-----------
3 | a | 8
5 | b | 1
Is this possible?
PS. This is required for a billing task. On 2018-01-15 the configuration of a was changed: the user consumed a resource for 10 hours at the amount of 8 and for the remaining 14 hours at the amount of 3. I need to bill such a day by the maximum value, so the row with id = 4 is ignored for 2018-01-15 (for the next day, 2018-01-16, I will bill the amount of 3).
So for billing I take the row:
3 | a | 8 | 2018-01-01
If something is wrong with it, I must report that the row with id = 3 is wrong.
But when I use an aggregate function, the information about the id is lost.
It would be awesome if something like this were possible:
select current(id), name, max(amount) from table group by name
select aggregated_row(id), name, max(amount) from table group by name
Here aggregated_row refers to the row that was selected by the aggregate function max.
UPD
I solved the task with:
SELECT
(
SELECT id FROM t2
WHERE id = ANY ( ARRAY_AGG( tf.id ) ) AND amount = MAX( tf.amount )
) id,
name,
MAX(amount) ma,
SUM( ratio )
FROM t2 tf
GROUP BY name
UPD
It would be much better to use window functions
There are at least 3 ways, see below:
CREATE TEMP TABLE test (
id integer, name text, amount numeric, datefrom timestamptz
);
COPY test FROM STDIN (FORMAT csv);
3,a,8,2018-01-01
4,a,3,2018-01-15 10:00
5,b,1,2018-02-20
6,b,1,2019-01-01
\.
Method 1. using DISTINCT ON (PostgreSQL-specific)
SELECT DISTINCT ON (name)
id, name, amount
FROM test
ORDER BY name, amount DESC, datefrom ASC;
Method 2. using window functions
SELECT id, name, amount FROM (
SELECT *, row_number() OVER (
PARTITION BY name
ORDER BY amount DESC, datefrom ASC) AS __rn
FROM test) AS x
WHERE x.__rn = 1;
Method 3. using a correlated subquery
SELECT id, name, amount FROM test
WHERE id = (
SELECT id FROM test AS t2
WHERE t2.name = test.name
ORDER BY amount DESC, datefrom ASC
LIMIT 1
);
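To check that the window-function and correlated-subquery methods pick the same rows, here is a small sketch. It is not PostgreSQL: it runs the two queries in SQLite through Python's sqlite3 module (an assumption made only so the example is self-contained; DISTINCT ON is PostgreSQL-specific and is left out), with datefrom stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id INTEGER, name TEXT, amount NUMERIC, datefrom TEXT);
INSERT INTO test VALUES
  (3, 'a', 8, '2018-01-01'),
  (4, 'a', 3, '2018-01-15 10:00'),
  (5, 'b', 1, '2018-02-20'),
  (6, 'b', 1, '2019-01-01');
""")

# Method 2: number the rows per name, best row first, and keep number 1.
window_sql = """
SELECT id, name, amount FROM (
  SELECT *, row_number() OVER (
    PARTITION BY name ORDER BY amount DESC, datefrom ASC) AS rn
  FROM test) AS x
WHERE x.rn = 1
ORDER BY name
"""

# Method 3: for each row, look up the id of the best row with the same name.
subquery_sql = """
SELECT id, name, amount FROM test
WHERE id = (
  SELECT id FROM test AS t2
  WHERE t2.name = test.name
  ORDER BY amount DESC, datefrom ASC
  LIMIT 1)
ORDER BY name
"""

window_rows = conn.execute(window_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(window_rows)
print(subquery_rows)
```

The second sort key, datefrom ASC, is what breaks the tie between ids 5 and 6, which both have amount 1.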
You need DISTINCT ON which filters the first row per group.
SELECT DISTINCT ON (name)
*
FROM table
ORDER BY name, amount DESC
You need a join against a grouped subquery. Try this:
SELECT T.id, T2.name, T2.amount
FROM TABLE T
INNER JOIN (SELECT name, MAX(amount) amount
FROM TABLE
GROUP BY name) T2
ON T.name = T2.name AND T.amount = T2.amount

Subsetting records that contain multiple values in one column

In my Postgres table I have two columns of interest, id and name. My goal is to keep only the records whose id has more than one distinct value in name. In other words, I would like to keep all records of ids that have multiple values, where at least one of those values is B.
UPDATE: I have tried adding WHERE EXISTS to the queries below, but this does not work.
The sample data would look like this:
> test
id name
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
6 6 A
7 7 A
8 2 B
9 1 B
10 2 B
and the output would look like this:
> output
id name
1 1 A
2 2 A
8 2 B
9 1 B
10 2 B
How would one write a query to select only these kinds of records?
Based on your description you would seem to want:
select id, name
from (select t.*, min(name) over (partition by id) as min_name,
max(name) over (partition by id) as max_name
from t
) t
where min_name < max_name;
This can be done using EXISTS:
select id, name
from test t1
where exists (select *
from test t2
where t1.id = t2.id
and t1.name <> t2.name) -- this will select those with multiple names for the id
and exists (select *
from test t3
where t1.id = t3.id
and t3.name = 'B') -- this will select those with at least one b for that id
You mean the records whose id shows up with more than one name, right?
This could be formulated in SQL as follows:
select * from table t1
where id in (
select id
from table t2
group by id
having count(distinct name) > 1)
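For what it's worth, both conditions from the question (more than one distinct name, and at least one B) can be put into a single HAVING clause. The sketch below runs that against the sample data, using Python's sqlite3 module as a stand-in for PostgreSQL; the CASE expression is a portable way to count the B rows and is my choice, not something taken from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id INTEGER, name TEXT);
INSERT INTO test VALUES
  (1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A'), (6, 'A'), (7, 'A'),
  (2, 'B'), (1, 'B'), (2, 'B');
""")

rows = conn.execute("""
SELECT id, name
FROM test
WHERE id IN (
  SELECT id
  FROM test
  GROUP BY id
  -- more than one distinct name for this id ...
  HAVING COUNT(DISTINCT name) > 1
  -- ... and at least one of them is 'B'
     AND SUM(CASE WHEN name = 'B' THEN 1 ELSE 0 END) > 0)
ORDER BY id, name
""").fetchall()

for row in rows:
    print(row)
```

COUNT(DISTINCT name) matters here: a plain COUNT(name) would also keep an id that merely has the same name twice.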

Row_number() over partition

I am working on PeopleSoft. I have a requirement to update a column with a sequence number that restarts for each ID.
For example:
CA24100001648- 1
CA24100001648- 2
CA24100001664- 1
CA24100001664- 2
CA24100001664- 3
CA24100001664- 4
CA24100001664- 5
CA24100001664- 6
But after the update I get '1' as the value for all rows.
Here is my query; can anyone please help with this?
UPDATE PS_UC_CA_CONT_STG C
SET C.CONTRACT_LINE_NUM2 = ( SELECT row_number() over(PARTITION BY D.CONTRACT_NUM
order by D.CONTRACT_NUM)
FROM PS_UC_CA_HDR_STG D
WHERE C.CONTRACT_NUM=D.CONTRACT_NUM );
Thanks
update emp a
set comm =
(with cnt as ( select deptno,empno,row_number() over (partition by deptno order by deptno) rn from emp)
select c.rn from cnt c where c.empno=a.empno)
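The idea in the answer above is to number all rows first and then match each row back by a unique key. A minimal sketch of that pattern follows. It runs in SQLite via Python's sqlite3 module rather than in Oracle, and the table contract_lines and its key line_id are hypothetical names introduced for the example; the real staging table would need some column (or combination of columns) that identifies a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract_lines (
  line_id INTEGER PRIMARY KEY,      -- hypothetical unique row key
  contract_num TEXT,
  contract_line_num2 INTEGER);
INSERT INTO contract_lines (contract_num) VALUES
  ('CA24100001648'), ('CA24100001648'),
  ('CA24100001664'), ('CA24100001664'), ('CA24100001664'),
  ('CA24100001664'), ('CA24100001664'), ('CA24100001664');
""")

# Number every row within its contract, then pick the number that
# belongs to the row being updated (matched on the unique key).
conn.execute("""
UPDATE contract_lines
SET contract_line_num2 = (
  SELECT rn FROM (
    SELECT line_id,
           row_number() OVER (PARTITION BY contract_num
                              ORDER BY line_id) AS rn
    FROM contract_lines) AS numbered
  WHERE numbered.line_id = contract_lines.line_id)
""")

rows = conn.execute("""
SELECT contract_num, contract_line_num2
FROM contract_lines
ORDER BY line_id
""").fetchall()
for row in rows:
    print(row)
```

Matching only on the contract number, as in the question's query, cannot tell the rows of one contract apart, so the row key in the final WHERE is the important part.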

How to normalize group by count results?

How can the results of a "group by" count be normalized by the count's sum?
For example, given:
User Rating (1-5)
----------------------
1 3
1 4
1 2
3 5
4 3
3 2
2 3
The result will be:
User Count Percentage
---------------------------
1 3 .42 (=3/7)
2 1 .14 (=1/7)
3 2 .28 (...)
4 1 .14
So for each user the number of ratings they provided is given as the percentage of the total ratings provided by everyone.
SELECT DISTINCT ON ("user") "user", count(*) OVER (PARTITION BY "user") AS cnt,
count(*) OVER (PARTITION BY "user")::numeric / count(*) OVER () AS percentage
FROM ratings; -- table name assumed; the question does not give one
The count(*) OVER (PARTITION BY user) is a so-called window function. Window functions let you perform some operation over a "window" created by some "partition" which is here made over the user id. In plain and simple English: the partitioned count(*) is calculated for each distinct user value, so in effect it counts the number of rows for each user value.
Without using a window function or variables, you will need to cross join a grouped subquery with a second, summed subquery, then select again to return a subset you can work with.
SELECT
B.UserID,
B.UserCount,
C.CountAll
FROM
(
SELECT
CountAll=SUM(UserCount)
FROM
(
SELECT
UserCount=COUNT(*)
FROM
MyTable
GROUP BY
UserID
) AS A
)AS C
CROSS JOIN(
SELECT
UserID,
UserCount=COUNT(*)
FROM
MyTable
GROUP BY
UserID
)AS B
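The query above returns the per-user count and the overall count but stops short of the division. A compact sketch of the complete calculation, using a scalar subquery for the total, is shown below. It runs in SQLite through Python's sqlite3 module, and the table name ratings and column name user_id are assumptions, since the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (user_id INTEGER, rating INTEGER);
INSERT INTO ratings VALUES
  (1, 3), (1, 4), (1, 2), (3, 5), (4, 3), (3, 2), (2, 3);
""")

# Multiplying by 1.0 forces floating-point division; with plain integers
# the result would be truncated to 0 for every user.
rows = conn.execute("""
SELECT user_id,
       COUNT(*) AS cnt,
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM ratings) AS percentage
FROM ratings
GROUP BY user_id
ORDER BY user_id
""").fetchall()

for user_id, cnt, percentage in rows:
    print(user_id, cnt, round(percentage, 2))
```

The same integer-division pitfall applies in PostgreSQL and SQL Server, where one of the operands would need a cast to a numeric type.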

T-SQL End of Month sum

I have a table with some transaction fields; the primary id is a CUSTomer field plus a TXN_DATE. For two of the fields, NOM_AMOUNT and GRS_AMOUNT, I need an end-of-month SUM (not rolling, just the total per month). Months with no transactions should be reported as 0. How can I do it?
Thank you!
If you group by the expression month(txn_date) you can calculate the sum. If you join against a temporary table of months, you can determine which months have no records and report a 0 for them (or null if you don't use the coalesce function).
This will be your end result; I assume you are able to add any other column you need to sum and adapt it to your schema.
select mnt as month
, sum(coalesce(NOM_AMOUNT ,0)) as NOM_AMOUNT_EOM
, sum(coalesce(GRS_AMOUNT ,0)) as GRS_AMOUNT_EOM
from (
select 1 as mnt
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9
union all select 10
union all select 11
union all select 12) as m
left outer join Table1 as t
on m.mnt = month(txn_date)
group by mnt
Here is the initial working sqlfiddle
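To see the zero-filling in action, here is a small self-contained sketch of the same month-list-plus-left-join approach. It is not T-SQL: it uses SQLite through Python's sqlite3 module, so month(txn_date) becomes strftime('%m', ...), and the month list is generated with a recursive CTE instead of twelve UNION ALL branches. The column name CUST and the sample transactions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (
  CUST TEXT,            -- assumed name of the customer column
  TXN_DATE TEXT,        -- ISO date stored as text
  NOM_AMOUNT INTEGER,
  GRS_AMOUNT INTEGER);
INSERT INTO Table1 VALUES
  ('c1', '2023-01-05', 100, 120),
  ('c1', '2023-01-20',  50,  60),
  ('c1', '2023-03-02',  10,  12);
""")

rows = conn.execute("""
WITH RECURSIVE months(mnt) AS (
  SELECT 1
  UNION ALL
  SELECT mnt + 1 FROM months WHERE mnt < 12)
SELECT m.mnt AS month,
       SUM(COALESCE(t.NOM_AMOUNT, 0)) AS NOM_AMOUNT_EOM,
       SUM(COALESCE(t.GRS_AMOUNT, 0)) AS GRS_AMOUNT_EOM
FROM months AS m
LEFT OUTER JOIN Table1 AS t
  ON m.mnt = CAST(strftime('%m', t.TXN_DATE) AS INTEGER)
GROUP BY m.mnt
ORDER BY m.mnt
""").fetchall()

for row in rows:
    print(row)
```

Note that grouping on the month number alone merges the same month of different years and all customers into one row; if the data spans several years or the totals are needed per customer, the year and the customer would have to be part of both the join and the GROUP BY.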