Postgres - partial pivot with crosstab

I'm struggling, and I suspect my real problem is not knowing the right terminology to search for, as I can't imagine this is an edge case.
dbfiddle available
I have a table in Postgres 9.4:
CREATE TABLE test (
id serial PRIMARY KEY, cust_id INTEGER,
category VARCHAR, key INTEGER, value INTEGER
);
INSERT INTO test (cust_id, category, key, value)
VALUES
(1, 'alpha', 0,300),(1, 'bravo', 0,150),(1, 'alpha', 1,300),
(1, 'bravo', 1,200),(1, 'alpha', 2,300),(1, 'bravo', 2,250),
(2, 'alpha', 0,301),(2, 'bravo', 0,151),(2, 'alpha', 1,301),
(2, 'bravo', 1,201),(2, 'alpha', 2,301),(2, 'bravo', 2,251),
(3, 'alpha', 0,302),(3, 'bravo', 0,152),(3, 'alpha', 1,302),
(3, 'bravo', 1,202),(3, 'alpha', 2,302),(3, 'bravo', 2,252);
id | cust_id | category | key | value
1 | 1 | alpha | 0 | 300
2 | 1 | bravo | 0 | 150
3 | 1 | alpha | 1 | 300
4 | 1 | bravo | 1 | 200
5 | 1 | alpha | 2 | 300
6 | 1 | bravo | 2 | 250
7 | 2 | alpha | 0 | 301
8 | 2 | bravo | 0 | 151
9 | 2 | alpha | 1 | 301
10 | 2 | bravo | 1 | 201
11 | 2 | alpha | 2 | 301
12 | 2 | bravo | 2 | 251
13 | 3 | alpha | 0 | 302
14 | 3 | bravo | 0 | 152
15 | 3 | alpha | 1 | 302
16 | 3 | bravo | 1 | 202
17 | 3 | alpha | 2 | 302
18 | 3 | bravo | 2 | 252
(18 rows)
I'd like to query the results to look like the following:
cust_id | category | 0 | 1 | 2
1 | alpha | 300 | 300 | 300
1 | bravo | 150 | 200 | 250
2 | alpha | 301 | 301 | 301
2 | bravo | 151 | 201 | 251
3 | alpha | 302 | 302 | 302
3 | bravo | 152 | 202 | 252
(6 rows)
I've tried:
SELECT * FROM crosstab(
'SELECT cust_id,category,key,value FROM test ORDER BY cust_id,category,key',
$$values ('0'::INT),
('1'::INT),
('2'::INT) $$
) AS ct (
"cust_id" INT, "category" TEXT, "0" INT,
"1" INT, "2" INT
);
which nets me the following (the bravo rows are missing, and the bravo values show up in columns 0, 1 and 2 of the alpha rows):
cust_id | category | 0 | 1 | 2
1 | alpha | 150 | 200 | 250
2 | alpha | 151 | 201 | 251
3 | alpha | 152 | 202 | 252
(3 rows)
I get closer with the following by removing the cust_id field and limiting to a single id:
SELECT * FROM crosstab(
'SELECT category,key,value FROM test WHERE cust_id = 1 ORDER BY category,key',
$$values ('0'::INT),
('1'::INT),
('2'::INT) $$
) AS ct (
"category" TEXT, "0" INT,
"1" INT, "2" INT
);
but this only gives the result for a single cust_id, and I need it for all customers:
category | 0 | 1 | 2
alpha | 300 | 300 | 300
bravo | 150 | 200 | 250
(2 rows)

Here is one way, using conditional aggregation instead of crosstab:
select cust_id , category
, max(case when key = 0 then value end) "0"
, max(case when key = 1 then value end) "1"
, max(case when key = 2 then value end) "2"
from test
group by cust_id , category
order by cust_id , category
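The conditional aggregation above can be tried without a Postgres server. The following is a minimal sketch that assumes an in-memory SQLite database as a stand-in (the query itself is plain SQL and should behave the same way in Postgres); the variable names `data`, `PIVOT_SQL` and `pivot_rows` are made up for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for Postgres (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test (id INTEGER PRIMARY KEY, cust_id INTEGER, '
    'category TEXT, "key" INTEGER, "value" INTEGER)'
)

# Rebuild the question's 18 rows: alpha is constant per customer,
# bravo grows by 50 per key.
data = []
for cust in (1, 2, 3):
    for key in (0, 1, 2):
        data.append((cust, "alpha", key, 299 + cust))
        data.append((cust, "bravo", key, 149 + cust + 50 * key))
conn.executemany(
    'INSERT INTO test (cust_id, category, "key", "value") VALUES (?, ?, ?, ?)',
    data,
)

# Each CASE keeps the value only for its own key; max() then picks the single
# non-NULL value inside each (cust_id, category) group.
PIVOT_SQL = """
SELECT cust_id, category,
       max(CASE WHEN "key" = 0 THEN "value" END) AS "0",
       max(CASE WHEN "key" = 1 THEN "value" END) AS "1",
       max(CASE WHEN "key" = 2 THEN "value" END) AS "2"
FROM test
GROUP BY cust_id, category
ORDER BY cust_id, category
"""
pivot_rows = conn.execute(PIVOT_SQL).fetchall()
for row in pivot_rows:
    print(row)
```

Grouping by both cust_id and category is what keeps the bravo rows separate, which is the part the crosstab attempts were missing.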


PostgREST: retrieve ranked results

I made a game, with levels and scores saved in an SQL table like this:
create table if not exists api.scores (
id serial primary key,
pseudo varchar(50),
level int,
score int,
created_at timestamptz default CURRENT_TIMESTAMP
);
I want to display the scores in the UI together with the rank of each score, based on the score column, in descending order.
Here is some sample data:
id | pseudo | level | score | created_at
1 | test | 1 | 1 | 2020-05-01 11:25:20.446402+02
2 | test | 1 | 1 | 2020-05-01 11:28:11.04001+02
3 | szef | 1 | 115 | 2020-05-01 15:45:06.201135+02
4 | erg | 1 | 115 | 2020-05-01 15:55:19.621372+02
5 | zef | 1 | 115 | 2020-05-01 16:14:09.718861+02
6 | aa | 1 | 115 | 2020-05-01 16:16:49.369718+02
7 | zesf | 1 | 115 | 2020-05-01 16:17:42.504354+02
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02
9 | zef | 1 | 115 | 2020-05-01 16:22:23.406013+02
10 | zefzef | 1 | 115 | 2020-05-01 16:23:49.720094+02
Here is what I want :
id | pseudo | level | score | created_at | rank
31 | zef | 7 | 730 | 2020-05-01 18:40:42.586224+02 | 1
50 | Cyprien | 5 | 588 | 2020-05-02 14:08:39.034112+02 | 2
49 | cyprien | 4 | 438 | 2020-05-01 23:35:13.440595+02 | 3
51 | Cyprien | 3 | 374 | 2020-05-02 14:13:41.071752+02 | 4
47 | cyprien | 3 | 337 | 2020-05-01 23:27:53.025475+02 | 5
45 | balek | 3 | 337 | 2020-05-01 19:57:39.888233+02 | 5
46 | cyprien | 3 | 337 | 2020-05-01 23:25:56.047495+02 | 5
48 | cyprien | 3 | 337 | 2020-05-01 23:28:54.190989+02 | 5
54 | Cyzekfj | 2 | 245 | 2020-05-02 14:14:34.830314+02 | 9
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02 | 10
13 | zef | 1 | 197 | 2020-05-01 16:28:59.95383+02 | 11
14 | azd | 1 | 155 | 2020-05-01 17:53:30.372793+02 | 12
38 | balek | 1 | 155 | 2020-05-01 19:08:57.622195+02 | 12
I want to retrieve the rank based on the full table, whatever the result set.
I'm using the PostgREST web server.
How do I do that?
You are describing the window function rank():
select t.*, rank() over(order by score desc) rnk
from mytable t
order by score desc
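Since the rank should reflect the full table whatever rows are returned, one option is to wrap the rank() query in a view and filter the view afterwards. The sketch below assumes in-memory SQLite (3.25 or newer, for window functions) as a stand-in for Postgres, uses a handful of rows from the expected output, and invents the view name `ranked_scores` for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, pseudo TEXT, "
    "level INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        (31, "zef", 7, 730),
        (50, "Cyprien", 5, 588),
        (47, "cyprien", 3, 337),
        (45, "balek", 3, 337),
        (54, "Cyzekfj", 2, 245),
    ],
)

# Hypothetical view: the rank is computed over the whole table,
# and any filter applied to the view runs after ranking.
conn.execute("""
CREATE VIEW ranked_scores AS
SELECT id, pseudo, level, score,
       rank() OVER (ORDER BY score DESC) AS rnk
FROM scores
""")

all_ranked = conn.execute(
    "SELECT id, rnk FROM ranked_scores ORDER BY rnk, id"
).fetchall()
one_player = conn.execute(
    "SELECT id, rnk FROM ranked_scores WHERE pseudo = 'Cyzekfj'"
).fetchall()
print(all_ranked)
print(one_player)
```

Tied scores share a rank and the next rank is skipped, as in the expected output. With PostgREST, a view like this placed in the exposed schema could presumably be queried like a table (for example with a `pseudo=eq.…` filter), but that setup is an assumption and not something the answer states.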

Generate a histogram of values grouped by a column

I have the following data in a reviews table for a certain set of items, using a score system that ranges from 0 to 100:
| review_id | item_id | score |
| 1 | 1 | 90 |
| 2 | 1 | 40 |
| 3 | 1 | 10 |
| 4 | 2 | 90 |
| 5 | 2 | 90 |
| 6 | 2 | 70 |
| 7 | 3 | 80 |
| 8 | 3 | 80 |
| 9 | 3 | 80 |
| 10 | 3 | 80 |
| 11 | 4 | 10 |
| 12 | 4 | 30 |
| 13 | 4 | 50 |
| 14 | 4 | 80 |
I am trying to create a histogram of the score values with five bins. My goal is to generate a histogram per item. To create a histogram of the entire table, it is possible to use width_bucket. This can also be tuned to operate on a per-item basis:
SELECT item_id, g.n as bucket, COUNT(m.score) as count
FROM generate_series(1, 5) g(n) LEFT JOIN
review as m
ON width_bucket(score, 0, 100, 4) = g.n
GROUP BY item_id, g.n
ORDER BY item_id, g.n;
However, the result looks like this:
| item_id | bucket | count |
| 1 | 5 | 1 |
| 1 | 3 | 1 |
| 1 | 1 | 1 |
| 2 | 5 | 2 |
| 2 | 4 | 2 |
| 3 | 4 | 4 |
| 4 | 1 | 1 |
| 4 | 2 | 1 |
| 4 | 3 | 1 |
| 4 | 4 | 1 |
That is, bins with no entries are not included. While I do not find this a bad solution, I would rather have all buckets, with 0 for those with no entries. Even better would be this structure:
| item_id | bucket_1 | bucket_2 | bucket_3 | bucket_4 | bucket_5 |
| 1 | 1 | 0 | 1 | 0 | 1 |
| 2 | 0 | 0 | 0 | 2 | 2 |
| 3 | 0 | 0 | 0 | 4 | 0 |
| 4 | 1 | 1 | 1 | 1 | 0 |
I prefer this solution as it uses a row per item (instead of 5n), which is simpler to query and minimizes memory consumption and data transfer costs. My current approach is as follows:
select item_id,
(sum(case when score >= 0 and score <= 19 then 1 else 0 end)) as bucket_1,
(sum(case when score >= 20 and score <= 39 then 1 else 0 end)) as bucket_2,
(sum(case when score >= 40 and score <= 59 then 1 else 0 end)) as bucket_3,
(sum(case when score >= 60 and score <= 79 then 1 else 0 end)) as bucket_4,
(sum(case when score >= 80 and score <= 100 then 1 else 0 end)) as bucket_5
from review
group by item_id;
Even though this query satisfies my requirements, I am curious to see if there might be a more elegant approach. So many CASE expressions are not easy to read, and changes in the bin criteria might require updating every sum. I am also curious about any performance concerns this query might raise.
The second query can be rewritten to use ranges to make editing and writing the query a bit easier:
with buckets (b1, b2, b3, b4, b5) as (
values (
int4range(0, 20), int4range(20, 40), int4range(40, 60), int4range(60, 80), int4range(80, 100)
)
)
select item_id,
count(*) filter (where b1 @> score) as bucket_1,
count(*) filter (where b2 @> score) as bucket_2,
count(*) filter (where b3 @> score) as bucket_3,
count(*) filter (where b4 @> score) as bucket_4,
count(*) filter (where b5 @> score) as bucket_5
from review
cross join buckets
group by item_id
order by item_id;
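A range constructed with int4range(0,20) includes the lower end and excludes the upper end, so adjacent buckets do not overlap. Note that int4range(80, 100) therefore excludes a score of exactly 100; use int4range(80, 100, '[]') if 100 must be counted.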
The CTE named buckets only creates a single row, so the cross join does not change the number of rows from the review table.
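The half-open bucket bounds described above can be sketched outside Postgres as well. This example assumes in-memory SQLite as a stand-in; SQLite has no range types, so `score >= lo AND score < hi` replaces the range containment test, and the aggregate columns are generated from one list of bounds so that changing the bins means editing a single place. The names `BOUNDS`, `columns` and `histogram` are illustrative, and the last upper bound of 101 is an assumption made so that a score of 100 would still be counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review (review_id INTEGER PRIMARY KEY, "
    "item_id INTEGER, score INTEGER)"
)
# (item_id, score) pairs from the question's sample data.
scores = [
    (1, 90), (1, 40), (1, 10),
    (2, 90), (2, 90), (2, 70),
    (3, 80), (3, 80), (3, 80), (3, 80),
    (4, 10), (4, 30), (4, 50), (4, 80),
]
conn.executemany("INSERT INTO review (item_id, score) VALUES (?, ?)", scores)

# Half-open bounds [lo, hi); the final 101 keeps a score of 100 in bucket 5.
BOUNDS = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 101)]

# One conditional sum per bucket, generated from BOUNDS.
columns = ",\n       ".join(
    f"sum(CASE WHEN score >= {lo} AND score < {hi} THEN 1 ELSE 0 END) "
    f"AS bucket_{i}"
    for i, (lo, hi) in enumerate(BOUNDS, start=1)
)
sql = f"""
SELECT item_id,
       {columns}
FROM review
GROUP BY item_id
ORDER BY item_id
"""
histogram = conn.execute(sql).fetchall()
for row in histogram:
    print(row)
```

With these bounds a score of 80 falls into bucket_5, which matches the CASE query in the question.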
I found this post useful
CREATE FUNCTION temp_histogram(table_name_or_subquery text, column_name text)
RETURNS TABLE(bucket int, "range" numrange, freq bigint, bar text)
AS $func$
BEGIN
RETURN QUERY EXECUTE format('
WITH
source AS (
SELECT * FROM %s
),
min_max AS (
SELECT min(%s) AS min, max(%s) AS max FROM source
),
temp_histogram AS (
SELECT
width_bucket(%s, min_max.min, min_max.max, 100) AS bucket,
numrange(min(%s)::numeric, max(%s)::numeric, ''[]'') AS "range",
count(%s) AS freq
FROM source, min_max
GROUP BY bucket
ORDER BY bucket
)
SELECT
bucket,
"range",
freq::bigint,
repeat(''*'', (freq::float / (max(freq) over() + 1) * 15)::int) AS bar
FROM temp_histogram',
table_name_or_subquery,
column_name, column_name, column_name, column_name, column_name, column_name
);
END
$func$ LANGUAGE plpgsql;
Adjust the number of buckets (100 in the script above) to suit your data.
Invoke it like this:
SELECT * FROM temp_histogram($table_name_or_subquery, $column_name);
SELECT * FROM temp_histogram('transactions_tbl', 'amount_colm');
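To see how the bucket numbers come out, here is a small Python emulation of width_bucket, which the function above relies on. It is a sketch for integer inputs with low < high, not the Postgres implementation: values below the lower bound go to bucket 0, and values at or above the upper bound go to bucket count + 1.

```python
def width_bucket(operand, low, high, count):
    """Emulate Postgres width_bucket(operand, low, high, count) for low < high."""
    if operand < low:
        return 0            # underflow bucket
    if operand >= high:
        return count + 1    # overflow bucket (the upper bound is exclusive)
    # Equal-width buckets numbered from 1.
    return int((operand - low) * count // (high - low)) + 1


for score in (10, 40, 80, 90, 100):
    print(score, width_bucket(score, 0, 100, 5))
```

One consequence is that when the column's own min and max are used as the bounds, as in the function above, the row holding the maximum value lands in the extra bucket count + 1.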

How to do sum of different values without duplicate

How can I sum different values for the same ID without getting duplicate rows for that ID?
My SQL query:
SELECT students.id AS student_id, students.name,
COUNT(*) AS enrolled,
c2.price AS course_price,
(COUNT(*) * price) AS paid
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE student_id NOTNULL
GROUP BY students.id, students.name, c2.price
ORDER BY student_id ASC;
My result.
student_id | name | enrolled | paid
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 1 | 75
1004 | Persis Havlíček | 5 | 450
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 1 | 75
1009 | Iiris Levitt | 2 | 180
1013 | Artair Kovač | 1 | 30
1013 | Artair Kovač | 1 | 90
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
My expected result.
student_id | name | enrolled | paid
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 6 | 525
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 3 | 255
1013 | Artair Kovač | 2 | 120
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
I think the cause is the GROUP BY clause, but the query throws an error if I do not include price in the GROUP BY.
Perhaps you can use the SUM() function.
Please see the related question below; it may be the same case as yours:
how to group by and return sum row in Postgres
You have excluded the course_price column from both your current and expected results. It seems you wrongly included it in the GROUP BY. Once price is no longer grouped, paid has to be computed with SUM(c2.price) instead of COUNT(*) * price:
SELECT students.id AS student_id, students.name,
COUNT(*) AS enrolled,
--c2.price AS course_price, --exclude this in o/p?
SUM(c2.price) AS paid --instead of COUNT(*) * price
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE student_id NOTNULL
GROUP BY students.id, students.name --,c2.price --and remove it from here
ORDER BY student_id ASC;
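The effect of dropping price from the GROUP BY and summing it can be checked on a tiny data set. This sketch assumes in-memory SQLite in place of Postgres; the column names (`students.id`, `courses.id`, `enrolls.course_id`) and the course prices are hypothetical, chosen so that three students from the expected result come out with the same totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only the table names come from the question.
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses  (id INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE enrolls  (student_id INTEGER, course_id INTEGER);
""")
conn.executemany("INSERT INTO students VALUES (?, ?)", [
    (1001, "Gulbadan Bálint"),
    (1004, "Persis Havlíček"),
    (1013, "Artair Kovač"),
])
# Five courses at 90, one at 75, one at 30 (made-up prices).
conn.executemany("INSERT INTO courses VALUES (?, ?)", [
    (1, 90), (2, 90), (3, 90), (4, 90), (5, 90), (6, 75), (7, 30),
])
conn.executemany("INSERT INTO enrolls VALUES (?, ?)", [
    (1001, 1),
    (1004, 1), (1004, 2), (1004, 3), (1004, 4), (1004, 5), (1004, 6),
    (1013, 1), (1013, 7),
])

# Grouping by student only: one row per student, prices summed across courses.
totals = conn.execute("""
SELECT s.id AS student_id, s.name,
       COUNT(*) AS enrolled,
       SUM(c.price) AS paid
FROM students s
LEFT JOIN enrolls e ON s.id = e.student_id
LEFT JOIN courses c ON e.course_id = c.id
WHERE e.student_id IS NOT NULL
GROUP BY s.id, s.name
ORDER BY s.id
""").fetchall()
for row in totals:
    print(row)
```

Grouping by price as well would split student 1004 into one row per distinct price, which is the duplication shown in the question.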

Select rows so that each value of one column is repeated at most N times

My table is:
id | sub_id | datetime | resource
1 | 10 | 04/03/2009 | 399
2 | 11 | 04/03/2009 | 244
3 | 10 | 04/03/2009 | 555
4 | 10 | 03/03/2009 | 300
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
7 | 11 | 24/12/2008 | 600
8 | 13 | 01/01/2009 | 750
9 | 10 | 01/01/2009 | 760
10 | 13 | 01/01/2009 | 570
11 | 11 | 01/01/2009 | 870
12 | 13 | 01/01/2009 | 670
13 | 13 | 01/01/2009 | 703
14 | 13 | 01/01/2009 | 705
I need to select only 2 rows for each sub_id.
The result would be:
id | sub_id | datetime | resource
1 | 10 | 04/03/2009 | 399
3 | 10 | 04/03/2009 | 555
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
8 | 13 | 01/01/2009 | 750
10 | 13 | 01/01/2009 | 570
How can I achieve this result in Postgres?
Use the window function row_number():
select id, sub_id, datetime, resource
from (
select *, row_number() over (partition by sub_id order by id)
from my_table
) s
where row_number < 3;
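The row_number() approach above can be tried on the question's data as follows. This is a sketch assuming in-memory SQLite (3.25 or newer) in place of Postgres; the alias `rn`, the parameter `n` and the function name `first_n_per_sub_id` are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, sub_id INTEGER, "
    "datetime TEXT, resource INTEGER)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", [
    (1, 10, "04/03/2009", 399), (2, 11, "04/03/2009", 244),
    (3, 10, "04/03/2009", 555), (4, 10, "03/03/2009", 300),
    (5, 11, "03/03/2009", 200), (6, 11, "03/03/2009", 500),
    (7, 11, "24/12/2008", 600), (8, 13, "01/01/2009", 750),
    (9, 10, "01/01/2009", 760), (10, 13, "01/01/2009", 570),
    (11, 11, "01/01/2009", 870), (12, 13, "01/01/2009", 670),
    (13, 13, "01/01/2009", 703), (14, 13, "01/01/2009", 705),
])

TOP_N_SQL = """
SELECT id, sub_id, datetime, resource
FROM (
    SELECT *, row_number() OVER (PARTITION BY sub_id ORDER BY id) AS rn
    FROM my_table
) AS s
WHERE rn <= ?
ORDER BY sub_id, id
"""


def first_n_per_sub_id(n):
    """Return the first n rows (by id) of every sub_id."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


for row in first_n_per_sub_id(2):
    print(row)
```

Which rows survive depends entirely on the ORDER BY inside the window; ordering by id keeps ids 2 and 5 for sub_id 11, so a different ordering column would be needed to get rows 5 and 6 as in the question's expected result.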
Pay attention to the ordering column (I use id to match your sample):
t=# with data as (select *,count(1) over (partition by sub_id order by id) from t)
select id,sub_id,datetime,resource from data where count <3;
id | sub_id | datetime | resource
1 | 10 | 2009-03-04 | 399
3 | 10 | 2009-03-04 | 555
2 | 11 | 2009-03-04 | 244
5 | 11 | 2009-03-03 | 200
8 | 13 | 2009-01-01 | 750
10 | 13 | 2009-01-01 | 570
(6 rows)

Postgresql: Select sum with different conditions

I have two tables:
I. Table 1 looks like this:
codeid | pos | neg | category
1 | 10 | 3 | begin2016
1 | 3 | 5 | justhere
3 | 7 | 7 | justthere
4 | 1 | 1 | else
4 | 12 | 0 | begin2015
4 | 5 | 12 | begin2013
1 | 2 | 50 | now
2 | 5 | 33 | now
5 | 33 | 0 | Begin2011
5 | 11 | 7 | begin2000
II. Table 2 looks like this:
codeid | codedesc | codegroupid
1 | road runner | 1
2 | bike warrior | 2
3 | lazy driver | 4
4 | clever runner | 1
5 | worker | 3
6 | smarty | 1
7 | sweety | 3
8 | sweeper | 1
I want a single result that satisfies two (or more) conditions:
sum pos and neg where codegroupid IN ('1', '2', '3'),
but do not sum pos and neg if category is like 'begin%'.
So the result will look like this:
codeid | codedesc | sumpos | sumneg
1 | road runner | 5 | 55 => (sumpos = 3+2; the 10 has a category like 'begin%', so it is not summed)
2 | bike warrior | 5 | 33
4 | clever runner | 1 | 1
5 | worker | 0 | 0 => (sumpos = sumneg = 0, because every category for codeid 5 is ILIKE 'begin%')
Group by codeid, codedesc;
Sumpos is sum(pos) where category NOT ILIKE 'begin%'; if category ILIKE 'begin%', the pos value counts as zero (0).
Sumneg is sum(neg) where category NOT ILIKE 'begin%'; if category ILIKE 'begin%', the neg value counts as zero.
Any ideas how to do it?
SELECT a.codeid, b.codedesc,
sum(CASE WHEN category ILIKE 'begin%' THEN 0 ELSE a.pos END) AS sumpos,
sum(CASE WHEN category ILIKE 'begin%' THEN 0 ELSE a.neg END) AS sumneg
FROM
table1 AS a
JOIN
table2 AS b ON a.codeid = b.codeid
WHERE b.codegroupid IN (1, 2, 3)
GROUP BY a.codeid, b.codedesc;
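The conditional sums above can be checked against the question's sample data. This sketch assumes in-memory SQLite instead of Postgres; because SQLite has no ILIKE, `lower(category) LIKE 'begin%'` is used as a case-insensitive stand-in, and the name `sums` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (codeid INTEGER, pos INTEGER, neg INTEGER, category TEXT);
CREATE TABLE table2 (codeid INTEGER, codedesc TEXT, codegroupid INTEGER);
""")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    (1, 10, 3, "begin2016"), (1, 3, 5, "justhere"), (3, 7, 7, "justthere"),
    (4, 1, 1, "else"), (4, 12, 0, "begin2015"), (4, 5, 12, "begin2013"),
    (1, 2, 50, "now"), (2, 5, 33, "now"),
    (5, 33, 0, "Begin2011"), (5, 11, 7, "begin2000"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
    (1, "road runner", 1), (2, "bike warrior", 2), (3, "lazy driver", 4),
    (4, "clever runner", 1), (5, "worker", 3), (6, "smarty", 1),
    (7, "sweety", 3), (8, "sweeper", 1),
])

# Rows whose category starts with 'begin' (any case) contribute 0 instead of
# being filtered out, so codeid 5 still appears with zero sums.
sums = conn.execute("""
SELECT a.codeid, b.codedesc,
       sum(CASE WHEN lower(a.category) LIKE 'begin%' THEN 0 ELSE a.pos END) AS sumpos,
       sum(CASE WHEN lower(a.category) LIKE 'begin%' THEN 0 ELSE a.neg END) AS sumneg
FROM table1 AS a
JOIN table2 AS b ON a.codeid = b.codeid
WHERE b.codegroupid IN (1, 2, 3)
GROUP BY a.codeid, b.codedesc
ORDER BY a.codeid
""").fetchall()
for row in sums:
    print(row)
```

Putting the 'begin%' test inside the CASE rather than in the WHERE clause is what keeps codeid 5 in the output; a WHERE filter would remove all of its rows and the group would disappear.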