Handle Null in jsonb_array_elements - postgresql

I have 2 tables a and b
Table a
id | name | code
VARCHAR VARCHAR jsonb
1 xyz [14, 15, 16 ]
2 abc [null]
3 def [null]
Table b
id | name | code
1 xyz [16, 15, 14 ]
2 abc [null]
I want to figure out where the code does not match for same id and name. I sort code column in b b/c i know it same but sorted differently
SELECT a.id,
a.name,
a.code,
c.id,
c.name,
c.code
FROM a
FULL OUTER JOIN ( SELECT id,
name,
jsonb_agg(code ORDER BY code) AS code
FROM (
SELECT id,
name,
jsonb_array_elements(code) AS code
FROM b
GROUP BY id,
name,
jsonb_array_elements(code)
) t
GROUP BY id,
name
) c
ON a.id = c.id
AND a.name = c.name
AND COALESCE (a.code, '[]'::jsonb) = COALESCE (c.code, '[]'::jsonb)
WHERE (a.id IS NULL OR c.id IS NULL)
My answer in this case should only return id = 3 b/c its not in b table but my query is returning id = 2 as well b/c i am not handling the null case well enough in the inner subquery
How can i handle the null use case in the inner subquery?

demo:db<>fiddle
The <# operator checks if all elements of the left array occur in the right one. The #> does other way round. So using both you can ensure that both arrays contain the same elements:
a.code #> b.code AND a.code <# b.code
Nevertheless it will be accept as well if one array contains duplicates. So [42,42] will be the same as [42]. If you want to avoid this as well you should check the array length as well
AND jsonb_array_length(a.code) = jsonb_array_length(b.code)
Furthermore you might check if both values are NULL. This case has to be checked separately:
a.code IS NULL and b.code IS NULL
A little bit shorter form is using the COALESCE function:
COALESCE(a.code, b.code) IS NULL
So the whole query could look like this:
SELECT
*
FROM a
FULL OUTER JOIN b
ON a.id = b.id AND a.name = b.name
AND (
COALESCE(a.code, b.code) IS NULL -- both null
OR (a.code #> b.code AND a.code <# b.code
AND jsonb_array_length(a.code) = jsonb_array_length(b.code) -- avoid accepting duplicates
)
)
After that you are able to filter the NULL values in the WHERE clause

Related

Postgresql recursive query

I have table with self-related foreign keys and can not get how I can receive firs child or descendant which meet condition. My_table structure is:
id
parent_id
type
1
null
union
2
1
group
3
2
group
4
3
depart
5
1
depart
6
5
unit
7
1
unit
I should for id 1 (union) receive all direct child or first descendant, excluding all groups between first descendant and union. So in this example as result I should receive:
id
type
4
depart
5
depart
7
unit
id 4 because it's connected to union through group with id 3 and group with id 2 and id 5 because it's connected directly to union.
I've tried to write recursive query with condition for recursive part: when parent_id = 1 or parent_type = 'depart' but it doesn't lead to expected result
with recursive cte AS (
select b.id, p.type_id
from my_table b
join my_table p on p.id = b.parent_id
where b.id = 1
union
select c.id, cte.type_id
from my_table c
join cte on cte.id = c.parent_id
where c.parent_id = 1 or cte.type_id = 'group'
)
Here's my interpretation:
if type='group', then id and parent_id are considered in the same group
id#1 and id#2 are in the same group, they're equals
id#2 and id#3 are in the same group, they're equals
id#1, id#2 and id#3 are in the same group
If the above is correct, you want to get all the first descendent of id#1's group. The way to do that:
Get all the ids in the same group with id#1
Get all the first descendants of the above group (type not in ('union', 'group'))
with recursive cte_group as (
select 1 as id
union all
select m.id
from my_table m
join cte_group g
on m.parent_id = g.id
and m.type = 'group')
select mt.id,
mt.type
from my_table mt
join cte_group cg
on mt.parent_id = cg.id
and mt.type not in ('union','group');
Result:
id|type |
--+------+
4|depart|
5|depart|
7|unit |
Sounds like you want to start with the row of id 1, then get its children, and continue recursively on rows of type group. To do that, use
WITH RECURSIVE tree AS (
SELECT b.id, b.type, TRUE AS skip
FROM my_table b
WHERE id = 1
UNION ALL
SELECT c.id, c.type, (c.type = 'group') AS skip
FROM my_table c
JOIN tree p ON c.parent_id = p.id AND p.skip
)
SELECT id, type
FROM tree
WHERE NOT skip

Rewrite Where condition using Lateral Join

I have a WITH statement which returns a table like followings:
id
parent_id
1
null
3
2
4
null
5
4
The desired output of my query is where parent_id is null or parent_id is present in the id column:
id
parent_id
1
null
4
null
5
4
I wrote the following query:
SELECT * FROM items
WHERE parent_id IS NULL OR parent_id = ANY(SELECT id from items)
As far I have understood, lateral joins are faster than the ANY operator so my idea was to rewrite the above query using them. I started with:
SELECT * FROM items i1
JOIN LATERAL (SELECT * FROM items i2 WHERE i2.parent_id = i1.id ) t ON true
But where do I add the condition to take the items where parent_id is null?
Use a self join:
SELECT t1.*
FROM items t1
LEFT JOIN items t2 ON t1.parent_id = t2.id
WHERE t1.parent_id IS NULL
OR t2.id IS NOT NULL
This is the best performing approach (assuming you have an index on the id column, which is almost certainly the case).

SQL left join case statement

Need some help working out the SQL. Unfortunately the version of tsql is SybaseASE which I'm not too familiar with, in MS SQL I would use a windowed function like RANK() or ROW_NUMBER() in a subquery and join to those results ...
Here's what I'm trying to resolve
TABLE A
Id
1
2
3
TABLE B
Id,Type
1,A
1,B
1,C
2,A
2,B
3,A
3,C
4,B
4,C
I would like to return 1 row for each ID and if the ID has a type 'A' record that should display, if it has a different type then it doesn't matter but it cannot be null (can do some arbitrary ordering, like alpha to prioritize "other" return value types)
Results:
1, A
2, A
3, A
4, B
A regular left join (ON A.id = B.id and B.type = 'A') ALMOST returns what I am looking for however it returns null for the type when I want the 'next available' type.
You can use a INNER JOIN on a SubQuery (FirstTypeResult) that will return the minimum type per Id.
Eg:
SELECT TABLEA.[Id], FirstTypeResult.[Type]
FROM TABLEA
JOIN (
SELECT [Id], Min([Type]) As [Type]
FROM TABLEB
GROUP BY [Id]
) FirstTypeResult ON FirstTypeResult.[Id] = TABLEA.[Id]

Using a where clause in a combination with a subquery which result is a record containin an array

I found some results/answers, regarding searching in array, but:
WHERE = ANY, works with a column, but not with a subquery that returns one record that contains an array as a result, triggers error
WHERE IN, also triggers the same error
I also tested untest on the subquery, similar error as the first 2
I don't want to check if a value is in an array but get/execute the query for the values in the array, like WHERE IN (1,2,3,4), not like in the other questions/answers
The error:
No operator matches the given name and argument types. You might need
to add explicit type casts.
or
No operator matches int = int[]
path is an int[]array type.
Structure:
id | name | | slug | path | parent_id
1 name1 slug1 {1} null
2 name2 slug2 {1,2} 1
3 name3 slug3 {1,2,3} 2
4 nam4 slug4 {4} null
What I tries as base:
SELECT t.id, t.name, t.slug FROM types AS t
WHERE t.id in (SELECT t.path FROM types AS t WHERE t.id = 24)
ORDER BY depth ASC
Basically path is like a breadcrumb , {grandparent,parent,type}
Here's one using IN and unnest()
SELECT t1.id,
t1."name",
t1.slug
FROM types t1
WHERE t1.id IN (SELECT un.e
FROM types t2
CROSS JOIN LATERAL unnest(t2.path) un (e)
WHERE t2.id = 2)
ORDER BY array_length(t1.path, 1);
And another one using the array is contained by operator <#.
SELECT t1.id,
t1."name",
t1.slug
FROM types t1
WHERE ARRAY[t1.id] <# (SELECT t2.path
FROM types t2
WHERE t2.id = 2)
ORDER BY array_length(t1.path, 1);
And one using = ANY.
SELECT t1.id,
t1."name",
t1.slug
FROM types t1
WHERE t1.id = ANY ((SELECT t2.path
FROM types t2
WHERE t2.id = 2)::integer[])
ORDER BY array_length(t1.path, 1);
db<>fiddle
You didn't include depth in your sample data so I replaced it with array_length(t1.path, 1) which is probably what it is.

Cascading sum hierarchy using recursive cte

I'm trying to perform recursive cte with postgres but I can't wrap my head around it. In terms of performance issue there are only 50 items in TABLE 1 so this shouldn't be an issue.
TABLE 1 (expense):
id | parent_id | name
------------------------------
1 | null | A
2 | null | B
3 | 1 | C
4 | 1 | D
TABLE 2 (expense_amount):
ref_id | amount
-------------------------------
3 | 500
4 | 200
Expected Result:
id, name, amount
-------------------------------
1 | A | 700
2 | B | 0
3 | C | 500
4 | D | 200
Query
WITH RECURSIVE cte AS (
SELECT
expenses.id,
name,
parent_id,
expense_amount.total
FROM expenses
WHERE expenses.parent_id IS NULL
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
UNION ALL
SELECT
expenses.id,
expenses.name,
expenses.parent_id,
expense_amount.total
FROM cte
JOIN expenses ON expenses.parent_id = cte.id
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
)
SELECT
id,
SUM(amount)
FROM cte
GROUP BY 1
ORDER BY 1
Results
id | sum
--------------------
1 | null
2 | null
3 | 500
4 | 200
You can do a conditional sum() for only the root row:
with recursive tree as (
select id, parent_id, name, id as root_id
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.root_id
from expense c
join tree p on c.parent_id = p.id
)
select e.id,
e.name,
e.root_id,
case
when e.id = e.root_id then sum(ea.amount) over (partition by root_id)
else amount
end as amount
from tree e
left join expense_amount ea on e.id = ea.ref_id
order by id;
I prefer doing the recursive part first, then join the related tables to the result of the recursive query, but you could do the join to the expense_amount also inside the CTE.
Online example: http://rextester.com/TGQUX53703
However, the above only aggregates on the top-level parent, not for any intermediate non-leaf rows.
If you want to see intermediate aggregates as well, this gets a bit more complicated (and is probably not very scalable for large results, but you said your tables aren't that big)
with recursive tree as (
select id, parent_id, name, 1 as level, concat('/', id) as path, null::numeric as amount
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.level + 1, concat(p.path, '/', c.id), ea.amount
from expense c
join tree p on c.parent_id = p.id
left join expense_amount ea on ea.ref_id = c.id
)
select e.id,
lpad(' ', (e.level - 1) * 2, ' ')||e.name as name,
e.amount as element_amount,
(select sum(amount)
from tree t
where t.path like e.path||'%') as sub_tree_amount,
e.path
from tree e
order by path;
Online example: http://rextester.com/MCE96740
The query builds up a path of all IDs belonging to a (sub)tree and then uses a scalar sub-select to get all child rows belonging to a node. That sub-select is what will make this quite slow as soon as the result of the recursive query can't be kept in memory.
I used the level column to create a "visual" display of the tree structure - this helps me debugging the statement and understanding the result better. If you need the real name of an element in your program you would obviously only use e.name instead of pre-pending it with blanks.
I could not get your query to work for some reason. Here's my attempt that works for the particular table you provided (parent-child, no grandchild) without recursion. SQL Fiddle
--- step 1: get parent-child data together
with parent_child as(
select t.*, amount
from
(select e.id, f.name as name,
coalesce(f.name, e.name) as pname
from expense e
left join expense f
on e.parent_id = f.id) t
left join expense_amount ea
on ea.ref_id = t.id
)
--- final step is to group by id, name
select id, pname, sum(amount)
from
(-- step 2: group by parent name and find corresponding amount
-- returns A, B
select e.id, t.pname, t.amount
from expense e
join (select pname, sum(amount) as amount
from parent_child
group by 1) t
on t.pname = e.name
-- step 3: to get C, D we union and get corresponding columns
-- results in all rows and corresponding value
union
select id, name, amount
from expense e
left join expense_amount ea
on e.id = ea.ref_id
) t
group by 1, 2
order by 1;