I made a game, with level and scores saved into an sql table like this :
create table if not exists api.scores (
id serial primary key,
pseudo varchar(50),
level int,
score int,
created_at timestamptz default CURRENT_TIMESTAMP
);
I want to display the scores in the ui with the rank of each score, based on the score column, ordered by desc.
Here is a sample data :
id | pseudo | level | score | created_at
----+----------+-------+-------+-------------------------------
1 | test | 1 | 1 | 2020-05-01 11:25:20.446402+02
2 | test | 1 | 1 | 2020-05-01 11:28:11.04001+02
3 | szef | 1 | 115 | 2020-05-01 15:45:06.201135+02
4 | erg | 1 | 115 | 2020-05-01 15:55:19.621372+02
5 | zef | 1 | 115 | 2020-05-01 16:14:09.718861+02
6 | aa | 1 | 115 | 2020-05-01 16:16:49.369718+02
7 | zesf | 1 | 115 | 2020-05-01 16:17:42.504354+02
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02
9 | zef | 1 | 115 | 2020-05-01 16:22:23.406013+02
10 | zefzef | 1 | 115 | 2020-05-01 16:23:49.720094+02
Here is what I want :
id | pseudo | level | score | created_at | rank
----+----------+-------+-------+-------------------------------+------
31 | zef | 7 | 730 | 2020-05-01 18:40:42.586224+02 | 1
50 | Cyprien | 5 | 588 | 2020-05-02 14:08:39.034112+02 | 2
49 | cyprien | 4 | 438 | 2020-05-01 23:35:13.440595+02 | 3
51 | Cyprien | 3 | 374 | 2020-05-02 14:13:41.071752+02 | 4
47 | cyprien | 3 | 337 | 2020-05-01 23:27:53.025475+02 | 5
45 | balek | 3 | 337 | 2020-05-01 19:57:39.888233+02 | 5
46 | cyprien | 3 | 337 | 2020-05-01 23:25:56.047495+02 | 5
48 | cyprien | 3 | 337 | 2020-05-01 23:28:54.190989+02 | 5
54 | Cyzekfj | 2 | 245 | 2020-05-02 14:14:34.830314+02 | 9
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02 | 10
13 | zef | 1 | 197 | 2020-05-01 16:28:59.95383+02 | 11
14 | azd | 1 | 155 | 2020-05-01 17:53:30.372793+02 | 12
38 | balek | 1 | 155 | 2020-05-01 19:08:57.622195+02 | 12
I want to retreive the rank based on the full table whatever the result set.
I'm using the postgrest webserver.
How do I do that ?
You are describing window function rank():
select t.*, rank() over(order by score desc) rnk
from mytable t
order by score desc
I have the following data in a reviews table for certain set of items, using a score system that ranges from 0 to 100
+-----------+---------+-------+
| review_id | item_id | score |
+-----------+---------+-------+
| 1 | 1 | 90 |
+-----------+---------+-------+
| 2 | 1 | 40 |
+-----------+---------+-------+
| 3 | 1 | 10 |
+-----------+---------+-------+
| 4 | 2 | 90 |
+-----------+---------+-------+
| 5 | 2 | 90 |
+-----------+---------+-------+
| 6 | 2 | 70 |
+-----------+---------+-------+
| 7 | 3 | 80 |
+-----------+---------+-------+
| 8 | 3 | 80 |
+-----------+---------+-------+
| 9 | 3 | 80 |
+-----------+---------+-------+
| 10 | 3 | 80 |
+-----------+---------+-------+
| 11 | 4 | 10 |
+-----------+---------+-------+
| 12 | 4 | 30 |
+-----------+---------+-------+
| 13 | 4 | 50 |
+-----------+---------+-------+
| 14 | 4 | 80 |
+-----------+---------+-------+
I am trying to create a histogram of the score values with a bin size of five. My goal is to generate a histogram per item. In order to create a histogram of the entire table, it is possible to use the width_bucket. This can also be tuned to operate on a per-item basis:
SELECT item_id, g.n as bucket, COUNT(m.score) as count
FROM generate_series(1, 5) g(n) LEFT JOIN
review as m
ON width_bucket(score, 0, 100, 4) = g.n
GROUP BY item_id, g.n
ORDER BY item_id, g.n;
However, the result looks like this:
+---------+--------+-------+
| item_id | bucket | count |
+---------+--------+-------+
| 1 | 5 | 1 |
+---------+--------+-------+
| 1 | 3 | 1 |
+---------+--------+-------+
| 1 | 1 | 1 |
+---------+--------+-------+
| 2 | 5 | 2 |
+---------+--------+-------+
| 2 | 4 | 2 |
+---------+--------+-------+
| 3 | 4 | 4 |
+---------+--------+-------+
| 4 | 1 | 1 |
+---------+--------+-------+
| 4 | 2 | 1 |
+---------+--------+-------+
| 4 | 3 | 1 |
+---------+--------+-------+
| 4 | 4 | 1 |
+---------+--------+-------+
That is, bins with no entries are not included. While I find this not to be a bad solution, I would rather have either all buckets, with 0 on those with no entries. Even better, using this structure:
+---------+----------+----------+----------+----------+----------+
| item_id | bucket_1 | bucket_2 | bucket_3 | bucket_4 | bucket_5 |
+---------+----------+----------+----------+----------+----------+
| 1 | 1 | 0 | 1 | 0 | 1 |
+---------+----------+----------+----------+----------+----------+
| 2 | 0 | 0 | 0 | 2 | 2 |
+---------+----------+----------+----------+----------+----------+
| 3 | 0 | 0 | 0 | 4 | 0 |
+---------+----------+----------+----------+----------+----------+
| 4 | 1 | 1 | 1 | 1 | 0 |
+---------+----------+----------+----------+----------+----------+
I prefer this solution as it uses a row per item (instead of 5n), which is simpler to query and minimizes memory consumption and data transfer costs. My current approach is as follows:
select item_id,
(sum(case when score >= 0 and score <= 19 then 1 else 0 end)) as bucket_1,
(sum(case when score >= 20 and score <= 39 then 1 else 0 end)) as bucket_2,
(sum(case when score >= 40 and score <= 59 then 1 else 0 end)) as bucket_3,
(sum(case when score >= 60 and score <= 79 then 1 else 0 end)) as bucket_4,
(sum(case when score >= 80 and score <= 100 then 1 else 0 end)) as bucket_5
from review;
Even though this query satisfies my requirements, I am curious to see if there might be a more elegant approach. so many case statements are not easy to read and changes in the bin criteria might require updating every sum. Also I am curious about the potential performance concerns that this query might have.
The second query can be rewritten to use ranges to make editing and writing the query a bit easier:
with buckets (b1, b2, b3, b4, b5) as (
values (
int4range(0, 20), int4range(20, 40), int4range(40, 60), int4range(60, 80), int4range(80, 100)
)
)
select item_id,
count(*) filter (where b1 #> score) as bucket_1,
count(*) filter (where b2 #> score) as bucket_2,
count(*) filter (where b3 #> score) as bucket_3,
count(*) filter (where b4 #> score) as bucket_4,
count(*) filter (where b5 #> score) as bucket_5
from review
cross join buckets
group by item_id
order by item_id;
A range constructed with int4range(0,20) includes the lower end and excludes the upper end.
The CTE named buckets only creates a single row, so the cross join does not change the number of rows from the review table.
I found this post useful
CREATE FUNCTION temp_histogram(table_name_or_subquery text, column_name text)
RETURNS TABLE(bucket int, "range" numrange, freq bigint, bar text)
AS $func$
BEGIN
RETURN QUERY EXECUTE format('
WITH
source AS (
SELECT * FROM %s
),
min_max AS (
SELECT min(%s) AS min, max(%s) AS max FROM source
),
temp_histogram AS (
SELECT
width_bucket(%s, min_max.min, min_max.max, 100) AS bucket,
numrange(min(%s)::numeric, max(%s)::numeric, ''[]'') AS "range",
count(%s) AS freq
FROM source, min_max
WHERE %s IS NOT NULL
GROUP BY bucket
ORDER BY bucket
)
SELECT
bucket,
"range",
freq::bigint,
repeat(''*'', (freq::float / (max(freq) over() + 1) * 15)::int) AS bar
FROM temp_histogram',
table_name_or_subquery,
column_name,
column_name,
column_name,
column_name,
column_name,
column_name,
column_name
);
END
$func$ LANGUAGE plpgsql;
Use the bucket numbers(100 in above script) in your favour.
Invoke like this
SELECT * FROM histogram($table_name_or_subquery, $column_name);
Example:
SELECT * FROM histogram('transactions_tbl', 'amount_colm');
How to do a sum of different values but same ID without duplicate different values on a column?
My Input in SQL Command.
SELECT
students.id AS student_id,
students.name,
COUNT(*) AS enrolled,
c2.price AS course_price,
(COUNT(*) * price) AS paid
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE student_id NOTNULL
GROUP BY students.id, students.name, c2.price
ORDER BY student_id ASC;
My result.
student_id | name | enrolled | paid
------------+---------------------+----------+------
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 1 | 75
1004 | Persis Havlíček | 5 | 450
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 1 | 75
1009 | Iiris Levitt | 2 | 180
1013 | Artair Kovač | 1 | 30
1013 | Artair Kovač | 1 | 90
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
My expected result.
student_id | name | enrolled | paid
------------+---------------------+----------+------
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 6 | 525
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 3 | 255
1013 | Artair Kovač | 2 | 120
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
I think that the cause come from a GROUP BY command but it will throw an error if I do not write a GROUP BY price.
Perhaps you can use SUM() function.
Please see link below, maybe it's same case with you:
how to group by and return sum row in Postgres
You have excluded course_price column both in your current and expected result. It seems you had wrongly included that in group by.
SELECT
students.id AS student_id,
students.name,
COUNT(*) AS enrolled,
--c2.price AS course_price, --exclude this in o/p?
(COUNT(*) * price) AS paid
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE student_id NOTNULL
GROUP BY students.id, students.name --,c2.price --and remove it from here
ORDER BY student_id ASC;
My table is:
id sub_id datetime resource
---|-----|------------|-------
1 | 10 | 04/03/2009 | 399
2 | 11 | 04/03/2009 | 244
3 | 10 | 04/03/2009 | 555
4 | 10 | 03/03/2009 | 300
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
7 | 11 | 24/12/2008 | 600
8 | 13 | 01/01/2009 | 750
9 | 10 | 01/01/2009 | 760
10 | 13 | 01/01/2009 | 570
11 | 11 | 01/01/2009 | 870
12 | 13 | 01/01/2009 | 670
13 | 13 | 01/01/2009 | 703
14 | 13 | 01/01/2009 | 705
I need to select for each sub_id only 2 times
Result would be:
id sub_id datetime resource
---|-----|------------|-------
1 | 10 | 04/03/2009 | 399
3 | 10 | 04/03/2009 | 555
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
8 | 13 | 01/01/2009 | 750
10 | 13 | 01/01/2009 | 570
How can I achieve this result in postgres ?
Use the window function row_number():
select id, sub_id, datetime, resource
from (
select *, row_number() over (partition by sub_id order by id)
from my_table
) s
where row_number < 3;
look at the order column (I use id to match your sample):
t=# with data as (select *,count(1) over (partition by sub_id order by id) from t)
select id,sub_id,datetime,resource from data where count <3;
id | sub_id | datetime | resource
----+--------+------------+----------
1 | 10 | 2009-03-04 | 399
3 | 10 | 2009-03-04 | 555
2 | 11 | 2009-03-04 | 244
5 | 11 | 2009-03-03 | 200
8 | 13 | 2009-01-01 | 750
10 | 13 | 2009-01-01 | 570
(6 rows)
I have two table table:
I. Table 1 like this:
------------------------------------------
codeid | pos | neg | category
-----------------------------------------
1 | 10 | 3 | begin2016
1 | 3 | 5 | justhere
3 | 7 | 7 | justthere
4 | 1 | 1 | else
4 | 12 | 0 | begin2015
4 | 5 | 12 | begin2013
1 | 2 | 50 | now
2 | 5 | 33 | now
5 | 33 | 0 | Begin2011
5 | 11 | 7 | begin2000
II. Table 2 like this:
------------------------------------------
codeid | codedesc | codegroupid
-----------------------------------------
1 | road runner | 1
2 | bike warrior | 2
3 | lazy driver | 4
4 | clever runner | 1
5 | worker | 3
6 | smarty | 1
7 | sweety | 3
8 | sweeper | 1
I want to have one result like this having two (or more) conditions:
sum pos and neg where codegroupid IN('1', '2', '3')
BUt do not sum pos and neg if category like 'begin%'
So the result will like this:
------------------------------------------
codeid | codedesc | sumpos | sumneg
-----------------------------------------
1 | roadrunner | 5 | 55 => (sumpos = 3+2, because 10 have category like 'begin%' so doesn't sum)
2 | bike warrior | 5 | 33
4 | clever runner | 1 | 1
5 | worker | 0 | 0 => (sumpos=sumneg=0) becase codeid 5 category ilike 'begin%'
Group by codeid, codedesc;
Sumpos is sum(pos) where category NOT ILIKE 'begin%', BUT IF category ILKIE 'begin%' make all pos values become zero (0);
Sumpos is sum(neg) where category NOT ILIKE 'begin%', BUT IF category ILKIE 'begin%' make all neg values become zero;
Any ideas how to do it?
Try:
SELECT
b.codeid,
b.codedesc,
sum(CASE WHEN category LIKE 'begin%' THEN 0 ELSE a.pos END) AS sumpos,
sum(CASE WHEN category LIKE 'begin%' THEN 0 ELSE a.neg END) AS sumneg
FROM
table1 AS a
JOIN
table2 AS b ON a.codeid = b.codeid
WHERE b.codegroupid IN (1, 2, 3)
GROUP BY
b.codeid,
b.codedesc;