Rewrite Where condition using Lateral Join

Rewrite Where condition using Lateral Join - postgresql

I have a WITH statement which returns a table like followings:
id
parent_id
1
null
3
2
4
null
5
4
The desired output of my query is where parent_id is null or parent_id is present in the id column:
id
parent_id
1
null
4
null
5
4
I wrote the following query:
SELECT * FROM items
WHERE parent_id IS NULL OR parent_id = ANY(SELECT id from items)
As far I have understood, lateral joins are faster than the ANY operator so my idea was to rewrite the above query using them. I started with:
SELECT * FROM items i1
JOIN LATERAL (SELECT * FROM items i2 WHERE i2.parent_id = i1.id ) t ON true
But where do I add the condition to take the items where parent_id is null?

Use a self join:
SELECT t1.*
FROM items t1
LEFT JOIN items t2 ON t1.parent_id = t2.id
WHERE t1.parent_id IS NULL
OR t2.id IS NOT NULL
This is the best performing approach (assuming you have an index on the id column, which is almost certainly the case).

Related

Postgresql recursive query

I have table with self-related foreign keys and can not get how I can receive firs child or descendant which meet condition. My_table structure is:
id
parent_id
type
1
null
union
2
1
group
3
2
group
4
3
depart
5
1
depart
6
5
unit
7
1
unit
I should for id 1 (union) receive all direct child or first descendant, excluding all groups between first descendant and union. So in this example as result I should receive:
id
type
4
depart
5
depart
7
unit
id 4 because it's connected to union through group with id 3 and group with id 2 and id 5 because it's connected directly to union.
I've tried to write recursive query with condition for recursive part: when parent_id = 1 or parent_type = 'depart' but it doesn't lead to expected result
with recursive cte AS (
select b.id, p.type_id
from my_table b
join my_table p on p.id = b.parent_id
where b.id = 1
union
select c.id, cte.type_id
from my_table c
join cte on cte.id = c.parent_id
where c.parent_id = 1 or cte.type_id = 'group'
)

Here's my interpretation:
if type='group', then id and parent_id are considered in the same group
id#1 and id#2 are in the same group, they're equals
id#2 and id#3 are in the same group, they're equals
id#1, id#2 and id#3 are in the same group
If the above is correct, you want to get all the first descendent of id#1's group. The way to do that:
Get all the ids in the same group with id#1
Get all the first descendants of the above group (type not in ('union', 'group'))
with recursive cte_group as (
select 1 as id
union all
select m.id
from my_table m
join cte_group g
on m.parent_id = g.id
and m.type = 'group')
select mt.id,
mt.type
from my_table mt
join cte_group cg
on mt.parent_id = cg.id
and mt.type not in ('union','group');
Result:
id|type |
--+------+
4|depart|
5|depart|
7|unit |

Sounds like you want to start with the row of id 1, then get its children, and continue recursively on rows of type group. To do that, use
WITH RECURSIVE tree AS (
SELECT b.id, b.type, TRUE AS skip
FROM my_table b
WHERE id = 1
UNION ALL
SELECT c.id, c.type, (c.type = 'group') AS skip
FROM my_table c
JOIN tree p ON c.parent_id = p.id AND p.skip
)
SELECT id, type
FROM tree
WHERE NOT skip

select count from both tables using join in postgesql

how to find number of records in both table using join.
i have two tables table1 and table2 with same structure.
table1
id
item
1
A
1
B
1
C
2
A
2
B
table2
id
item
1
A
1
B
2
A
2
B
2
C
2
D
Output should be like this.
id
table1.itemcount
table2.itemcount
1
3
2
2
2
4

SELECT DISTINCT id, (
SELECT COUNT(*) FROM table1 AS table1_2 WHERE table1_2.id=table1.id
) AS "table1.itemcount", (
SELECT COUNT(*) FROM table2 AS table2_2 WHERE table2_2.id=table1.id
) AS "table2.itemcount"
FROM table1;

Assuming that each id is guaranteed to exist in both tables, the following would work
select
t1.id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
join t2 on t1.id = t2.id
group by 1;
But if that is not guaranteed then we'll have to use full outer join to get unique ids from both tables
select
coalesce(t1.id, t2.id) id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
full outer join t2 on t1.id = t2.id
group by 1;
We're using coalesce here as well for id because if it only exists in t2, t1.id would result in null.
#DeeStark's answer also works if ids are guaranteed to be in both tables but it's quite inefficient because count is essentially run twice for every distinct id in the table. Here's the fiddle where you can test out different approaches. I've prefixed each query with explain which shows the cost
Hope this helps

How to use OPENJSON on multiple rows

I have a temp table with multiple rows in it and each row has a column called Categories; which contains a very simple json array of ids for categories in a different table.
A few example rows of the temp table:
Id Name Categories
---------------------------------------------------------------------------------------------
'539f7e28-143e-41bb-8814-a7b93b846007' Test 1 ["category1Id", "category2Id", "category3Id"]
'f29e2ecf-6e37-4aa9-aa56-4a351d298bfc' Test 2 ["category1Id", "category2Id"]
'34e41a0a-ad92-4cd7-bf5c-8df6bfd6ed5c' Test 3 NULL
Now what I would like to do is to select all of the category ids from all of the rows in the temp table.
What I have is the following and it's not working as it's giving me the error of :
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
SELECT
c.Id
,c.[Name]
,c.Color
FROM
dbo.Category as c
WHERE
c.Id in (SELECT [value] FROM OPENJSON((SELECT Categories FROM #TempTable)))
and c.IsDeleted = 0
Which I guess it makes sense that's failing on that because I'm selecting multiple rows and needing to parse each row's respective category ids json. I'm just not sure what to do/change to give me the results that I want. Thank you in advance for any help.

You'd need to use CROSS APPLY like so:
SELECT id ,
name ,
t.Value AS category_id
FROM #temp
CROSS APPLY OPENJSON(categories, '$') t;
And then, you can JOIN to your Categories table using the category_id column, something like this:
SELECT id ,
name ,
t.Value AS category_id,
c.*
FROM #temp
CROSS APPLY OPENJSON(categories, '$') t
LEFT JOIN Categories c ON c.Id = t.Value

Handle Null in jsonb_array_elements

I have 2 tables a and b
Table a
id | name | code
VARCHAR VARCHAR jsonb
1 xyz [14, 15, 16 ]
2 abc [null]
3 def [null]
Table b
id | name | code
1 xyz [16, 15, 14 ]
2 abc [null]
I want to figure out where the code does not match for same id and name. I sort code column in b b/c i know it same but sorted differently
SELECT a.id,
a.name,
a.code,
c.id,
c.name,
c.code
FROM a
FULL OUTER JOIN ( SELECT id,
name,
jsonb_agg(code ORDER BY code) AS code
FROM (
SELECT id,
name,
jsonb_array_elements(code) AS code
FROM b
GROUP BY id,
name,
jsonb_array_elements(code)
) t
GROUP BY id,
name
) c
ON a.id = c.id
AND a.name = c.name
AND COALESCE (a.code, '[]'::jsonb) = COALESCE (c.code, '[]'::jsonb)
WHERE (a.id IS NULL OR c.id IS NULL)
My answer in this case should only return id = 3 b/c its not in b table but my query is returning id = 2 as well b/c i am not handling the null case well enough in the inner subquery
How can i handle the null use case in the inner subquery?

demo:db<>fiddle
The <# operator checks if all elements of the left array occur in the right one. The #> does other way round. So using both you can ensure that both arrays contain the same elements:
a.code #> b.code AND a.code <# b.code
Nevertheless it will be accept as well if one array contains duplicates. So [42,42] will be the same as [42]. If you want to avoid this as well you should check the array length as well
AND jsonb_array_length(a.code) = jsonb_array_length(b.code)
Furthermore you might check if both values are NULL. This case has to be checked separately:
a.code IS NULL and b.code IS NULL
A little bit shorter form is using the COALESCE function:
COALESCE(a.code, b.code) IS NULL
So the whole query could look like this:
SELECT
*
FROM a
FULL OUTER JOIN b
ON a.id = b.id AND a.name = b.name
AND (
COALESCE(a.code, b.code) IS NULL -- both null
OR (a.code #> b.code AND a.code <# b.code
AND jsonb_array_length(a.code) = jsonb_array_length(b.code) -- avoid accepting duplicates
)
)
After that you are able to filter the NULL values in the WHERE clause

Avoiding Order By in T-SQL

Below sample query is a part of my main query. I found SORT operator in below query is consuming 30% of the cost.
To avoid SORT, there is need of creation of Indexes. Is there any other way to optimize this code.
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA
WHERE ID = r.ID
AND Status = 3
AND TableA_ID >ISNULL((
SELECT TOP 1 TableA_ID
FROM TableA
WHERE ID = r.ID
AND Status <> 3
ORDER BY T_Date DESC
), 0)
ORDER BY T_Date ASC

Looks like you can use not exists rather than the sorts. I think you'll probably get a better performance boost by use a CTE or derived table instead of the a scalar subquery.
select *
from r ... left outer join
(
select ID, min(t_date) as min_date from TableA t1
where status = 3 and not exists (
select 1 from TableA t2
where t2.ID = t1.ID
and t2.status <> 3 and t2.t_date > t1.t_date
)
group by ID
) as md on md.ID = r.ID ...
or
select *
from r ... left outer join
(
select t1.ID, min(t1.t_date) as min_date
from TableA t1 left outer join TableA t2
on t2.ID = t1.ID and t2.status <> 3
where t1.status = 3 and t1.t_date < t2.t_date
group by t1.ID
having count(t2.ID) = 0
) as md on md.ID = r.ID ...
It also appears that you're relying on an identity column but it's not clear what those values mean. I'm basically ignoring it and using the date column instead.

Try this:
SELECT TOP 1 CONVERT( DATE, T_Date) AS T_Date
FROM TableA a1
LEFT JOIN (
SELECT ID, MAX(TableA_ID) AS MaxAID
FROM TableA
WHERE Status <> 3
GROUP BY ID
) a2 ON a2.ID = a1.ID AND a1.TableA_ID > coalesce(a2.MAXAID,0)
WHERE a1.ID = r.ID AND a1.Status = 3
ORDER BY T_Date ASC
The use of TOP 1 in combination with the unexplained r alias concern me. There's almost certainly a MUCH better way to get this data into your results that doesn't involve doing this in a sub query (unless this is for an APPLY operation).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Rewrite Where condition using Lateral Join - postgresql

Use a self join: SELECT t1.* FROM items t1 LEFT JOIN items t2 ON t1.parent_id = t2.id WHERE t1.parent_id IS NULL OR t2.id IS NOT NULL This is the best performing approach (assuming you have an index on the id column, which is almost certainly the case).

Related

Postgresql recursive query

select count from both tables using join in postgesql

How to use OPENJSON on multiple rows

Handle Null in jsonb_array_elements

Avoiding Order By in T-SQL

Categories

Resources