Postgres: How to get aggregates of multiple columns

I want to implement filtered navigation with Postgres, but I'm not sure how to return the count of each value in multiple fields from the result set.
Example schema:
id, name, status
For this query I'd want to see something like (it doesn't have to mimic this structure):
name: [(Bob, 20), (Joe, 15), (Sue, 5)]
status: [(active, 15), (inactive, 25)]

Check GROUPING SETS:
with t (name, status) as (values
    ('Bob', 'active'), ('Bob', 'active'),
    ('Joe', 'inactive'), ('Joe', 'active'), ('Joe', 'active')
)
select json_object_agg(case g when 1 then 'name' else 'status' end, a)
from (
    select jsonb_agg(jsonb_build_object(coalesce(name, status), total)) as a, g
    from (
        select name, status, count(*) as total, grouping(name, status) as g
        from t
        group by grouping sets ((name), (status))
    ) s
    group by g
) s
;
json_object_agg
------------------------------------------------------------------------------------
{ "name" : [{"Bob": 2}, {"Joe": 3}], "status" : [{"active": 4}, {"inactive": 1}] }

Related

Grouping user id columns together with string_agg on PostgreSQL 13

This is my emails table:
create table emails (
id bigint not null primary key generated by default as identity,
name text not null
);
And contacts table:
create table contacts (
id bigint not null primary key generated by default as identity,
email_id bigint not null,
user_id bigint not null,
full_name text not null,
ordering int not null
);
As you can see, I have a user_id field here. There can be multiple rows with the same user ID in my result, so I want to join them with a comma (,).
Insert some data into the tables:
insert into emails (name)
values
('dennis1'),
('dennis2');
insert into contacts (id, email_id, user_id, full_name, ordering)
values
(5, 1, 1, 'dennis1', 9),
(6, 2, 1, 'dennis1', 5),
(7, 2, 1, 'dennis1', 1),
(8, 1, 3, 'john', 2),
(9, 2, 4, 'dennis7', 1),
(10, 2, 4, 'dennis7', 1);
My query is:
select em.name,
c.user_ids
from emails em
join (
select email_id, string_agg(user_id::text, ',' order by ordering desc) as user_ids
from contacts
group by email_id
) c on c.email_id = em.id
order by em.name;
Actual Result
name     user_ids
dennis1  1,3
dennis2  1,1,4,4
Expected Result
name     user_ids
dennis1  1,3
dennis2  1,4
On my real-world data I get the same user id around 50 times, when it should appear only once. In the example above, user 1 and user 4 each appear twice for dennis2.
How can I deduplicate them?
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=2e957b52eb46742f3ddea27ec36effb1
P.S.: I tried adding user_id to the GROUP BY, but then I get duplicate rows...
demo:db<>fiddle
SELECT
    name,
    string_agg(user_id::text, ',' ORDER BY ordering DESC)
FROM (
    SELECT DISTINCT ON (em.id, c.user_id)
        *
    FROM emails em
    JOIN contacts c ON c.email_id = em.id
) s
GROUP BY name
1. Join the tables.
2. Take DISTINCT ON the email id and the user_id, so that for every email record no duplicate users remain.
3. Aggregate.
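If the ORDER BY ordering DESC part is not essential, a shorter alternative (a sketch, not part of the original answer) is to deduplicate inside the aggregate itself; note that with DISTINCT, string_agg's internal ORDER BY may only reference the aggregated expression:

SELECT em.name,
       string_agg(DISTINCT c.user_id::text, ',') AS user_ids
FROM emails em
JOIN contacts c ON c.email_id = em.id
GROUP BY em.name
ORDER BY em.name;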

Merging an array of JSON objects to one JSON column in Postgres

I have two tables, products and products_ext that can be reduced essentially
to this basic form:
CREATE TABLE "products" (
"product_id" TEXT PRIMARY KEY
);
CREATE TABLE "products_ext" (
"product_id" TEXT NOT NULL,
"key" TEXT NOT NULL,
"value" JSON,
PRIMARY KEY ("product_id", "key"),
FOREIGN KEY ("product_id") REFERENCES "products"("product_id")
ON DELETE CASCADE
ON UPDATE CASCADE
);
Let us assume this mock data:
INSERT INTO "products" ("product_id") VALUES
('test1'),
('test2'),
('test3');
INSERT INTO "products_ext" (product_id, "key", "value") VALUES
('test1', 'foo', '"Foo"'),
('test1', 'bar', '"Bar"'),
('test2', 'foo', '"Foo"');
I can use the query
SELECT
"P"."product_id",
ARRAY(
SELECT
json_build_object(
"E"."key",
"E"."value"
)
FROM "products_ext" AS "E"
WHERE "E"."product_id" = "P"."product_id"
)
FROM
"products" AS "P";
which yields
product_id | array
------------+-----------------------------------------------
test1 | {"{\"foo\" : \"Foo\"}","{\"bar\" : \"Bar\"}"}
test2 | {"{\"foo\" : \"Foo\"}"}
but I cannot make it yield a merged JSON object. Is there an easy way in Postgres 10
to merge an array of multiple JSON objects into one JSON object, so that the query would yield:
product_id | json
------------+---------------------------------
test1      | {"foo" : "Foo", "bar" : "Bar"}
test2      | {"foo" : "Foo"}
test3      | {}
The composite primary key on "product_id" and "key" already ensures that there are no
key collisions. There may be rows in products that have no data in products_ext, and in those cases an empty JSON object should be provided.
demo:db<>fiddle
Use json_object_agg():
SELECT
p.product_id AS product_id,
json_object_agg(e.key, e.value)
FROM
products AS p
JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;
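With the mock data this returns only test1 and test2; test3 has no products_ext rows and is dropped by the inner join, which motivates the edit below (a sketch; the exact whitespace of json_object_agg output may differ):

product_id | json_object_agg
------------+-----------------------------------
test1      | { "foo" : "Foo", "bar" : "Bar" }
test2      | { "foo" : "Foo" }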
Edit, for products without products_ext rows:
demo:db<>fiddle
SELECT
p.product_id AS product_id,
COALESCE(
json_object_agg(e.key, e.value) FILTER (WHERE e.key IS NOT NULL),
'{}'
)
FROM
products AS p
LEFT JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;
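With the mock data this should yield (again a sketch; exact whitespace may differ):

product_id | coalesce
------------+-----------------------------------
test1      | { "foo" : "Foo", "bar" : "Bar" }
test2      | { "foo" : "Foo" }
test3      | {}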

Convert jsonb in PostgreSQL to rows without cycle

I have a json array stored in my Postgres database. The first table "Orders" looks like this:
order_id, basket_items_id
1, {1,2}
2, {3}
3, {1,2,3,1}
Second table "Items" looks like this:
item_id, price
1,5
2,3
3,20
I already tried loading the data with multiple SQL statements, selecting the jsonb records one by one, but that is not a silver bullet. A plain join like the following does not work, because basket_items_id is a jsonb array, not an integer:
SELECT
sum(price)
FROM orders
INNER JOIN items on
orders.basket_items_id = items.item_id
WHERE order_id = 3;
Want to get this as output:
order_id, basket_items_id, price
1, 1, 5
1, 2, 3
2, 3, 20
3, 1, 5
3, 2, 3
3, 3, 20
3, 1, 5
or this:
order_id, sum(price)
1, 8
2, 20
3, 33
demo:db<>fiddle
SELECT
o.order_id,
elems.value::int as basket_items_id,
i.price
FROM
orders o, jsonb_array_elements_text(basket_items_id) as elems
LEFT JOIN items i
ON i.item_id = elems.value::int
ORDER BY 1,2,3
jsonb_array_elements_text expands the jsonb array into one row per element. With this you are able to join against your second table directly.
Since the expanded array gives you text elements, you have to cast them to integers using ::int.
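As a minimal illustration of the expansion (a sketch with a literal array):

SELECT value::int AS item
FROM jsonb_array_elements_text('[1,2,3,1]'::jsonb);

 item
------
    1
    2
    3
    1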
Of course you can GROUP and SUM aggregate this as well:
SELECT
o.order_id,
SUM(i.price)
FROM
orders o, jsonb_array_elements_text(basket_items_id) as elems
LEFT JOIN items i
ON i.item_id = elems.value::int
GROUP BY o.order_id
ORDER BY 1
Is your orders.basket_items_id column of type jsonb or int[]?
If the type is jsonb, you can use jsonb_array_elements_text to expand the column:
SELECT
o.order_id,
o.basket_item_id,
items.price
FROM
(
SELECT
order_id,
jsonb_array_elements_text(basket_items_id)::int basket_item_id
FROM
orders
) o
JOIN
items ON o.basket_item_id = items.item_id
ORDER BY
1, 2, 3;
See this DB-Fiddle.
If the type is int[] (array of integers), you can run a similar query with the unnest function:
SELECT
o.order_id,
o.basket_item_id,
items.price
FROM
(
SELECT
order_id,
unnest(basket_items_id) basket_item_id
FROM
orders
) o
JOIN
items ON o.basket_item_id = items.item_id
ORDER BY
1, 2, 3;
See this DB-fiddle.
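If you are not sure which of the two types the column has, you can check the catalog (a sketch):

SELECT data_type, udt_name
FROM information_schema.columns
WHERE table_name = 'orders' AND column_name = 'basket_items_id';
-- jsonb gives data_type 'jsonb'; int[] gives data_type 'ARRAY' with udt_name '_int4'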

Postgres very hard dynamic select statement with COALESCE

Having tables and data like this:
CREATE TABLE solicitations
(
id SERIAL PRIMARY KEY,
name text
);
CREATE TABLE donations
(
id SERIAL PRIMARY KEY,
solicitation_id integer REFERENCES solicitations, -- can be null
created_at timestamp without time zone NOT NULL DEFAULT (now() at time zone 'utc'),
amount bigint NOT NULL DEFAULT 0
);
INSERT INTO solicitations (name) VALUES
('solicitation1'), ('solicitation2');
INSERT INTO donations (created_at, solicitation_id, amount) VALUES
('2018-06-26', null, 10), ('2018-06-26', 1, 20), ('2018-06-26', 2, 30),
('2018-06-27', null, 10), ('2018-06-27', 1, 20),
('2018-06-28', null, 10), ('2018-06-28', 1, 20), ('2018-06-28', 2, 30);
How can I make the solicitation ids dynamic in the following select statement, using only Postgres?
SELECT
"created_at"
-- make dynamic this begins
, COALESCE("no_solicitation", 0) AS "no_solicitation"
, COALESCE("1", 0) AS "1"
, COALESCE("2", 0) AS "2"
-- make dynamic this ends
FROM crosstab(
$source_sql$
SELECT
created_at::date as row_id
, COALESCE(solicitation_id::text, 'no_solicitation') as category
, SUM(amount) as value
FROM donations
GROUP BY row_id, category
ORDER BY row_id, category
$source_sql$
, $category_sql$
-- parametrize with ids from here begins
SELECT unnest('{no_solicitation}'::text[] || ARRAY(SELECT DISTINCT id::text FROM solicitations ORDER BY id))
-- parametrize with ids from here ends
$category_sql$
) AS ct (
"created_at" date
-- make dynamic this begins
, "no_solicitation" bigint
, "1" bigint
, "2" bigint
-- make dynamic this ends
)
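Side note: crosstab() comes from the tablefunc extension, which must be enabled once per database:

CREATE EXTENSION IF NOT EXISTS tablefunc;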
The select should return data like this:
 created_at | no_solicitation | 1  | 2
------------+-----------------+----+----
 2018-06-26 | 10              | 20 | 30
 2018-06-27 | 10              | 20 | 0
 2018-06-28 | 10              | 20 | 30
The solicitation ids that should parametrize select are the same as in
SELECT unnest('{no_solicitation}'::text[] || ARRAY(SELECT DISTINCT id::text FROM solicitations ORDER BY id))
One can fiddle the code here
I decided to use json, which is much simpler than crosstab:
WITH
all_solicitation_ids AS (
SELECT
unnest('{no_solicitation}'::text[] ||
ARRAY(SELECT DISTINCT id::text FROM solicitations ORDER BY id))
AS col
)
, all_days AS (
SELECT
-- TODO: compute days ad hoc, from min created_at day of donations to max created_at day of donations
generate_series('2018-06-26', '2018-06-28', '1 day'::interval)::date
AS col
)
, all_days_and_all_solicitation_ids AS (
SELECT
all_days.col AS created_at
, all_solicitation_ids.col AS solicitation_id
FROM all_days, all_solicitation_ids
ORDER BY all_days.col, all_solicitation_ids.col
)
, donations_ AS (
SELECT
created_at::date as created_at
, COALESCE(solicitation_id::text, 'no_solicitation') as solicitation_id
, SUM(amount) as amount
FROM donations
GROUP BY created_at, solicitation_id
ORDER BY created_at, solicitation_id
)
, donations__ AS (
SELECT
all_days_and_all_solicitation_ids.created_at
, all_days_and_all_solicitation_ids.solicitation_id
, COALESCE(donations_.amount, 0) AS amount
FROM all_days_and_all_solicitation_ids
LEFT JOIN donations_
ON all_days_and_all_solicitation_ids.created_at = donations_.created_at
AND all_days_and_all_solicitation_ids.solicitation_id = donations_.solicitation_id
)
SELECT
jsonb_object_agg(solicitation_id, amount) ||
jsonb_object_agg('date', created_at)
AS data
FROM donations__
GROUP BY created_at
which results in:
data
______________________________________________________________
{"1": 20, "2": 30, "date": "2018-06-28", "no_solicitation": 10}
{"1": 20, "2": 30, "date": "2018-06-26", "no_solicitation": 10}
{"1": 20, "2": 0, "date": "2018-06-27", "no_solicitation": 10}
Though it's not quite what I requested: it returns only a data column instead of the columns date, no_solicitation, 1, 2, .... To get those I would need json_to_record, but I don't know how to produce its AS (column definition) argument dynamically.
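For reference, json_to_record runs into the same wall: record-returning functions require an explicit column list at parse time, so the ids still cannot be supplied dynamically in a single statement. A sketch with one hard-coded row:

SELECT r.*
FROM json_to_record(
    '{"1": 20, "2": 30, "date": "2018-06-28", "no_solicitation": 10}'::json
) AS r ("date" date, no_solicitation bigint, "1" bigint, "2" bigint);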

Limit query by count distinct column values

I have a table with people, something like this:
ID  PersonId  SomeAttribute
1   1         yellow
2   1         red
3   2         yellow
4   3         green
5   3         black
6   3         purple
7   4         white
Previously I was returning all Persons to the API as separate objects. So if the user set the limit to 3, I just set the query maxResults in Hibernate to 3 and returned:
{"PersonID": 1, "attr":"yellow"}
{"PersonID": 1, "attr":"red"}
{"PersonID": 2, "attr":"yellow"}
and if someone specified limit 3 and page 2 (setMaxResults(3), setFirstResult(6)), it would be:
{"PersonID": 3, "attr":"green"}
{"PersonID": 3, "attr":"black"}
{"PersonID": 3, "attr":"purple"}
But now I want to select people and combine them into one json object that looks like this:
{
"PersonID":3,
"attrs": [
{"attr":"green"},
{"attr":"black"},
{"attr":"purple"}
]
}
And here is the problem: is there any way in PostgreSQL or Hibernate to set the limit not by the number of rows but by the number of distinct person ids? If the user specifies a limit of 4, I should return persons 1, 2, 3 and 4, but with my current limiting mechanism I would return person 1 with two attributes, person 2, and person 3 with only one attribute. Pagination has the same problem: right now I can return half of person 3's attrs array on one page and the other half on the next.
You can use row_number to simulate LIMIT:
-- Test data
CREATE TABLE person AS
WITH tmp ("ID", "PersonId", "SomeAttribute") AS (
VALUES
(1, 1, 'yellow'::TEXT),
(2, 1, 'red'),
(3, 2, 'yellow'),
(4, 3, 'green'),
(5, 3, 'black'),
(6, 3, 'purple'),
(7, 4, 'white')
)
SELECT * FROM tmp;
-- Returning as a normal column (limit by someAttribute size)
SELECT * FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(PARTITION BY "PersonId" ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 3;
-- Returning as a normal column (overall limit)
SELECT * FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 4;
-- Returning as a JSON column (limit by someAttribute size)
SELECT "PersonId", json_object_agg('color', "SomeAttribute") AS attributes FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(PARTITION BY "PersonId" ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 3 GROUP BY "PersonId";
-- Returning as a JSON column (limit by person)
SELECT "PersonId", json_object_agg('color', "SomeAttribute") AS attributes FROM (
select
"PersonId",
"SomeAttribute"
from
person) as tmp
GROUP BY "PersonId"
LIMIT 4;
In this case, of course, you must use a native query, but this is a small trade-off IMHO.
More info here and here.
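A related option not covered above (a sketch against the same test table): dense_rank() numbers distinct persons instead of rows, so it supports both limiting and paginating by person:

SELECT "PersonId", "SomeAttribute"
FROM (
    SELECT
        "PersonId",
        "SomeAttribute",
        dense_rank() OVER (ORDER BY "PersonId") AS person_num
    FROM person
) AS tmp
WHERE person_num BETWEEN 4 AND 6; -- page 2 with 3 persons per page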
I'm assuming you have another Person table. With JPA, you should run the query on the Person table (the one side), not on PersonColor (the many side). Then the limit will be applied to the number of Person rows.
If you don't have the Person table and can't modify the DB, what you can do is use SQL, group by PersonId, and aggregate the colors:
select PersonId, array_agg(Color) FROM my_table group by PersonId limit 2
SQL Fiddle
Thank you, guys. After I realized that it could not be done with one query, I just did something like:
temp_query = select distinct x.person_id from (my_original_query) x
with the user-specific page/per_page, and then:
my_original_query += " AND person_id in (temp_query_results)"
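Spelled out as a single SQL statement against the test table above (a sketch of the same two-step idea):

SELECT "PersonId", "SomeAttribute"
FROM person
WHERE "PersonId" IN (
    SELECT DISTINCT "PersonId"
    FROM person
    ORDER BY "PersonId"
    LIMIT 3 OFFSET 0 -- per_page = 3, page = 1
);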