Making multiple optional joins on same table using json array in postgres - postgresql

I have two tables users and rooms. I'm trying to join both tables and get list of user names related to the room. I'm getting results only if both columns have values. If any one of the column is null, it's not giving the value of non-null column. Please find the details below.
users schema:
id name
---------
1 X
3 Y
4 Z
rooms schema:
id active_users inactive_users
-----------------------------------
101 [1] [3]
102 [3] null
103 null [4]
Tried query:
SELECT
r.id, json_agg(u.name) as active_users, json_agg(u1.name) as inactive_users
FROM rooms r,
json_array_elements(active_users) as elems
LEFT OUTER JOIN users u
ON u.id = elems::TEXT::INT,
json_array_elements(inactive_users) as elems1
LEFT OUTER JOIN users u1 ON
u1.id = elems1::TEXT::INT
GROUP BY r.id
Query output:
id active_users inactive_users
----------------------------------
101 ["X"] ["Y"]
Expected output:
id active_users inactive_users
----------------------------------
101 ["X"] ["Y"]
102 ["Y"] NULL
103 NULL ["Z"]

Your main issue here is that you're doing an inner join with the json_array_elements(active_users) function, which produces null when active_users is null. Therefore, that join is excluding those lines you want to include. It will work if you make those joins into outer joins as well:
SELECT
r.id,
json_agg(u.name) FILTER (WHERE u.name IS NOT NULL) as active_users,
json_agg(u1.name) FILTER (WHERE u1.name IS NOT NULL) as inactive_users
FROM rooms r
LEFT JOIN json_array_elements(active_users) as elems
ON TRUE
LEFT JOIN json_array_elements(inactive_users) as elems1
ON TRUE
LEFT JOIN users u
ON u.id = elems::TEXT::INT
LEFT JOIN users u1 ON
u1.id = elems1::TEXT::INT
GROUP BY r.id;
You'll notice that I also added FILTERs in the select, that's so null usernames are excluded from the aggregation.

Related

Postgresql recursive query

I have table with self-related foreign keys and can not get how I can receive firs child or descendant which meet condition. My_table structure is:
id
parent_id
type
1
null
union
2
1
group
3
2
group
4
3
depart
5
1
depart
6
5
unit
7
1
unit
I should for id 1 (union) receive all direct child or first descendant, excluding all groups between first descendant and union. So in this example as result I should receive:
id
type
4
depart
5
depart
7
unit
id 4 because it's connected to union through group with id 3 and group with id 2 and id 5 because it's connected directly to union.
I've tried to write recursive query with condition for recursive part: when parent_id = 1 or parent_type = 'depart' but it doesn't lead to expected result
with recursive cte AS (
select b.id, p.type_id
from my_table b
join my_table p on p.id = b.parent_id
where b.id = 1
union
select c.id, cte.type_id
from my_table c
join cte on cte.id = c.parent_id
where c.parent_id = 1 or cte.type_id = 'group'
)
Here's my interpretation:
if type='group', then id and parent_id are considered in the same group
id#1 and id#2 are in the same group, they're equals
id#2 and id#3 are in the same group, they're equals
id#1, id#2 and id#3 are in the same group
If the above is correct, you want to get all the first descendent of id#1's group. The way to do that:
Get all the ids in the same group with id#1
Get all the first descendants of the above group (type not in ('union', 'group'))
with recursive cte_group as (
select 1 as id
union all
select m.id
from my_table m
join cte_group g
on m.parent_id = g.id
and m.type = 'group')
select mt.id,
mt.type
from my_table mt
join cte_group cg
on mt.parent_id = cg.id
and mt.type not in ('union','group');
Result:
id|type |
--+------+
4|depart|
5|depart|
7|unit |
Sounds like you want to start with the row of id 1, then get its children, and continue recursively on rows of type group. To do that, use
WITH RECURSIVE tree AS (
SELECT b.id, b.type, TRUE AS skip
FROM my_table b
WHERE id = 1
UNION ALL
SELECT c.id, c.type, (c.type = 'group') AS skip
FROM my_table c
JOIN tree p ON c.parent_id = p.id AND p.skip
)
SELECT id, type
FROM tree
WHERE NOT skip

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many to many relation with three columns, (owner_id,property_id,ownership_perc) and for this table applies (many owners have many properties).
So I would like to find all the owner_id who has many properties (property_id) and connect them with other three tables (Table 1,3,4) in order to get further information for the requested result.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
So, when I'm trying it like this, the query runs but it returns empty.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I'm trying the code below, it returns the owner_id who has many properties (see image below) but without joining it with the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns all the owner_ids with more than 1 property_ids.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

Handle Null in jsonb_array_elements

I have 2 tables a and b
Table a
id | name | code
VARCHAR VARCHAR jsonb
1 xyz [14, 15, 16 ]
2 abc [null]
3 def [null]
Table b
id | name | code
1 xyz [16, 15, 14 ]
2 abc [null]
I want to figure out where the code does not match for same id and name. I sort code column in b b/c i know it same but sorted differently
SELECT a.id,
a.name,
a.code,
c.id,
c.name,
c.code
FROM a
FULL OUTER JOIN ( SELECT id,
name,
jsonb_agg(code ORDER BY code) AS code
FROM (
SELECT id,
name,
jsonb_array_elements(code) AS code
FROM b
GROUP BY id,
name,
jsonb_array_elements(code)
) t
GROUP BY id,
name
) c
ON a.id = c.id
AND a.name = c.name
AND COALESCE (a.code, '[]'::jsonb) = COALESCE (c.code, '[]'::jsonb)
WHERE (a.id IS NULL OR c.id IS NULL)
My answer in this case should only return id = 3 b/c its not in b table but my query is returning id = 2 as well b/c i am not handling the null case well enough in the inner subquery
How can i handle the null use case in the inner subquery?
demo:db<>fiddle
The <# operator checks if all elements of the left array occur in the right one. The #> does other way round. So using both you can ensure that both arrays contain the same elements:
a.code #> b.code AND a.code <# b.code
Nevertheless it will be accept as well if one array contains duplicates. So [42,42] will be the same as [42]. If you want to avoid this as well you should check the array length as well
AND jsonb_array_length(a.code) = jsonb_array_length(b.code)
Furthermore you might check if both values are NULL. This case has to be checked separately:
a.code IS NULL and b.code IS NULL
A little bit shorter form is using the COALESCE function:
COALESCE(a.code, b.code) IS NULL
So the whole query could look like this:
SELECT
*
FROM a
FULL OUTER JOIN b
ON a.id = b.id AND a.name = b.name
AND (
COALESCE(a.code, b.code) IS NULL -- both null
OR (a.code #> b.code AND a.code <# b.code
AND jsonb_array_length(a.code) = jsonb_array_length(b.code) -- avoid accepting duplicates
)
)
After that you are able to filter the NULL values in the WHERE clause

Cascading sum hierarchy using recursive cte

I'm trying to perform recursive cte with postgres but I can't wrap my head around it. In terms of performance issue there are only 50 items in TABLE 1 so this shouldn't be an issue.
TABLE 1 (expense):
id | parent_id | name
------------------------------
1 | null | A
2 | null | B
3 | 1 | C
4 | 1 | D
TABLE 2 (expense_amount):
ref_id | amount
-------------------------------
3 | 500
4 | 200
Expected Result:
id, name, amount
-------------------------------
1 | A | 700
2 | B | 0
3 | C | 500
4 | D | 200
Query
WITH RECURSIVE cte AS (
SELECT
expenses.id,
name,
parent_id,
expense_amount.total
FROM expenses
WHERE expenses.parent_id IS NULL
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
UNION ALL
SELECT
expenses.id,
expenses.name,
expenses.parent_id,
expense_amount.total
FROM cte
JOIN expenses ON expenses.parent_id = cte.id
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
)
SELECT
id,
SUM(amount)
FROM cte
GROUP BY 1
ORDER BY 1
Results
id | sum
--------------------
1 | null
2 | null
3 | 500
4 | 200
You can do a conditional sum() for only the root row:
with recursive tree as (
select id, parent_id, name, id as root_id
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.root_id
from expense c
join tree p on c.parent_id = p.id
)
select e.id,
e.name,
e.root_id,
case
when e.id = e.root_id then sum(ea.amount) over (partition by root_id)
else amount
end as amount
from tree e
left join expense_amount ea on e.id = ea.ref_id
order by id;
I prefer doing the recursive part first, then join the related tables to the result of the recursive query, but you could do the join to the expense_amount also inside the CTE.
Online example: http://rextester.com/TGQUX53703
However, the above only aggregates on the top-level parent, not for any intermediate non-leaf rows.
If you want to see intermediate aggregates as well, this gets a bit more complicated (and is probably not very scalable for large results, but you said your tables aren't that big)
with recursive tree as (
select id, parent_id, name, 1 as level, concat('/', id) as path, null::numeric as amount
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.level + 1, concat(p.path, '/', c.id), ea.amount
from expense c
join tree p on c.parent_id = p.id
left join expense_amount ea on ea.ref_id = c.id
)
select e.id,
lpad(' ', (e.level - 1) * 2, ' ')||e.name as name,
e.amount as element_amount,
(select sum(amount)
from tree t
where t.path like e.path||'%') as sub_tree_amount,
e.path
from tree e
order by path;
Online example: http://rextester.com/MCE96740
The query builds up a path of all IDs belonging to a (sub)tree and then uses a scalar sub-select to get all child rows belonging to a node. That sub-select is what will make this quite slow as soon as the result of the recursive query can't be kept in memory.
I used the level column to create a "visual" display of the tree structure - this helps me debugging the statement and understanding the result better. If you need the real name of an element in your program you would obviously only use e.name instead of pre-pending it with blanks.
I could not get your query to work for some reason. Here's my attempt that works for the particular table you provided (parent-child, no grandchild) without recursion. SQL Fiddle
--- step 1: get parent-child data together
with parent_child as(
select t.*, amount
from
(select e.id, f.name as name,
coalesce(f.name, e.name) as pname
from expense e
left join expense f
on e.parent_id = f.id) t
left join expense_amount ea
on ea.ref_id = t.id
)
--- final step is to group by id, name
select id, pname, sum(amount)
from
(-- step 2: group by parent name and find corresponding amount
-- returns A, B
select e.id, t.pname, t.amount
from expense e
join (select pname, sum(amount) as amount
from parent_child
group by 1) t
on t.pname = e.name
-- step 3: to get C, D we union and get corresponding columns
-- results in all rows and corresponding value
union
select id, name, amount
from expense e
left join expense_amount ea
on e.id = ea.ref_id
) t
group by 1, 2
order by 1;

Postgres: Getting a total related count based on a condition from a related table

My sql-fu is not strong, and I'm sure I'm missing something simple in trying to get this working. I have a fairly standard group of tables:
users
-----
id
name
carts
-----
id
user_id
purchased_at
line_items
----------
id
cart_id
product_id
products
--------
id
permalink
I want to get a total count of purchased carts for each user, if that user has purchased a particular product. That is: if at least one of their purchased carts has a product with a particular permalink, I'd like a count of the total number of purchased carts, regardless of their contents.
The definition a purchased cart is when carts.purchased_at is not null.
select
u.id,
count(c2.*) as purchased_carts
from users u
inner join carts c on u.id = c.user_id
inner join line_items li on c.id = li.cart_id
inner join products p on p.id = li.product_id
left join carts c2 on u.id = c2.user_id
where
c.purchased_at is not NULL
and
c2.purchased_at is not NULL
and
p.permalink = 'product-name'
group by 1
order by 2 desc
The numbers that are coming up for purchased_carts are strangely high, possibly related to the total number of line items multiplied by the number of carts? Maybe? I'm pretty stumped at the result. Any help would be greatly appreciated.
This ought to help:
select u.id,
count(*)
from users u join
carts c on c.user_id = u.id
where c.purchased_at is not NULL and
exists (
select null
from carts c2
join line_items l on l.cart_id = c2.id
join products p on p.id = l.product_id
where c2.user_id = u.id and
c2.purchased_at is not NULL
p.permalink = 'product-name')
group by u.id
order by count(*) desc;
The exists predicate is a semi-join.
bool_or is what you need
select
u.id,
count(distinct c.id) as purchased_carts
from
users u
inner join
carts c on u.id = c.user_id
inner join
line_items li on c.id = li.cart_id
inner join
products p on p.id = li.product_id
where c.purchased_at is not NULL
group by u.id
having bool_or (p.permalink = 'product-name')
order by 2 desc