This recursive CTE runs forever (it never returns results), even though obtaining the same results by hand would take about 10 seconds, most of it spent copy-pasting.
Did I misimplement the RekeyLevel part? Is it not leveling up appropriately?
How would I make it so the recursion stops when no results are found, rather than needing a failsafe like RekeyLevel <= 2?
Current query:
with RekeysAllLevelsDeep as (
select
a.claimid as Rekey
,a.ClaimIDAdjFromOrig as Original
,0 as RekeyLevel
from <base table> (nolock) a
where a.ClaimIDAdjFromOrig is not null
and a.ClaimIDAdjFromOrig <> a.ClaimID
union all
select
a.claimid as Rekey
,a.ClaimIDAdjFromOrig as Original
,RekeyLevel + 1
from RekeysAllLevelsDeep
join <base table> (nolock) a
on RekeysAllLevelsDeep.Original = a.ClaimID
where a.ClaimIDAdjFromOrig is not null
and a.ClaimIDAdjFromOrig <> a.ClaimID
and RekeyLevel <= 2
)
select distinct
Rekey
,Original
,RekeyLevel
from RekeysAllLevelsDeep
where Original is not null
and Original <> Rekey
and Rekey = '(<number>)'
I needed to move the condition that I had placed outside the recursive CTE, and Rekey = '(<number>)', inside of it. Doing so made the recursive CTE return correct results immediately. With the condition outside the recursive CTE, the recursion was being carried out for every number in the entire table before the filter was applied.
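A minimal sketch of that fix, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server (SQLite needs the RECURSIVE keyword, SQL Server does not). The table name claims and the sample rows are hypothetical. The starting claim is filtered in the anchor member, and there is no RekeyLevel failsafe: a recursive CTE stops by itself as soon as the recursive member returns no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table claims (ClaimID text primary key, ClaimIDAdjFromOrig text)")
conn.executemany(
    "insert into claims values (?, ?)",
    [
        ("100", None),   # original claim
        ("200", "100"),  # rekey of 100
        ("300", "200"),  # rekey of 200
        ("400", None),   # unrelated original
        ("500", "400"),  # unrelated rekey, never visited
    ],
)

sql = """
with recursive RekeysAllLevelsDeep as (
    -- anchor member: start from the single claim of interest
    select a.ClaimID as Rekey, a.ClaimIDAdjFromOrig as Original, 0 as RekeyLevel
    from claims a
    where a.ClaimID = ?
      and a.ClaimIDAdjFromOrig is not null
      and a.ClaimIDAdjFromOrig <> a.ClaimID
    union all
    -- recursive member: follow Original one level further
    select a.ClaimID, a.ClaimIDAdjFromOrig, r.RekeyLevel + 1
    from RekeysAllLevelsDeep r
    join claims a on r.Original = a.ClaimID
    where a.ClaimIDAdjFromOrig is not null
      and a.ClaimIDAdjFromOrig <> a.ClaimID
)
select Rekey, Original, RekeyLevel
from RekeysAllLevelsDeep
order by RekeyLevel
"""
rows = conn.execute(sql, ("300",)).fetchall()
print(rows)
```

The recursion ends on its own here because claim 100 has no ClaimIDAdjFromOrig. If the data could contain a cycle (A adjusted from B and B adjusted from A), a depth cap would still be needed as a safety net.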
I notice a slowdown when the query runs: from 5 ms to 200 ms (plus 44 ms of JIT).
https://explain.depesz.com/s/lZYf#l12
similar, but JIT is off
The underlined expression is NULL, so the whole filter expression can never be true and no row passes it.
Why does PG spend 227 ms here? What did I do wrong?
EXPLAIN( ANALYSE, FORMAT JSON, VERBOSE, settings, buffers )
WITH
_app_period AS ( select ?::tstzrange ),
ready AS (
SELECT
min( lower( o.app_period ) ) OVER ( PARTITION BY agreement_id ) <# (select * from _app_period) AS new_order,
max( upper( o.app_period ) ) OVER ( PARTITION BY agreement_id ) <# (select * from _app_period) AS del_order
,o.*
FROM "order_bt" o
LEFT JOIN acc_ready( 'Usage', (select * from _app_period), o ) acc_u ON acc_u.ready
LEFT JOIN acc_ready( 'Invoice', (select * from _app_period), o ) acc_i ON acc_i.ready
LEFT JOIN agreement a ON a.id = o.agreement_id
LEFT JOIN xcheck c ON c.doc_id = o.id and c.doctype = 'OrderDetail'
WHERE o.sys_period #> sys_time() AND o.app_period && app_period()
)
SELECT * FROM ready
UPD
Server version is 13.1
Is the second execution faster?
No. Result is reproducible all the time.
Perhaps sys_time() is expensive - what is that function?
It is a STABLE function that does select coalesce( biconf( 'sys_time' )::timestamptz, now() ). app_period() is a STABLE SQL function and does a similar thing.
Are you sure that the expression is NULL for all rows?
Yes. I checked the result of app_period() and it is NULL, so it does not matter how many rows are in the table: o.app_period && NULL yields NULL for every row.
Does the execution time change if you replace the expression with a literal NULL?
Yes, changing the condition to WHERE o.sys_period #> sys_time() AND o.app_period && NULL reduces the time to 0.08 ms. The plan changes.
Do you have indexes on o.sys_period and o.app_period?
Yes. I have: "order_id_sys_period_app_period_excl" EXCLUDE USING gist (id WITH =, sys_period WITH &&, app_period WITH &&)
And what happens when you execute the query without the CTE?
Without the CTE many things are inlined and the time drops to 0.5 ms. A similar condition is still used for the Index Scan, but now it is fast.
When I put (select * from _app_period) everywhere, the query also runs fast: 15 ms. The filter is then planned with the parameter $3: (o.app_period && $3) AND (o.sys_period #> sys_time())
I am trying to learn graph traversal using a recursive CTE in PostgreSQL.
Below is my data set:
I am using the code below to get the path along with the existing columns (node and edges).
It gives me output, but the path column is not in ARRAY format.
;WITH RECURSIVE CTE AS
(
SELECT NODE,EDGES,ARRAY[G.NODE]::TEXT AS PATH,1 AS LEVEL
FROM property_graph G
UNION ALL
SELECT G.NODE,G.EDGES,C.PATH || G.NODE,LEVEL + 1
FROM property_graph G
INNER JOIN CTE C ON G.NODE = ANY(C.EDGES)
WHERE G.NODE <> ALL(STRING_TO_ARRAY(C.PATH,'')) --Cond added to avoid cyclic graph
)
SELECT NODE,EDGES,PATH,LEVEL
FROM CTE
ORDER BY NODE,LEVEL;
Output:
Could you guys help me?
Thanks in advance.
The problem is that your PATH column is of type TEXT, and so is NODE, therefore the || operator performs string concatenation rather than array concatenation.
You should change the type of your PATH column from TEXT to TEXT[] (and then you can remove the STRING_TO_ARRAY in the WHERE clause).
For example:
WITH RECURSIVE CTE AS
(
SELECT NODE,EDGES,ARRAY[G.NODE]::TEXT[] AS PATH,1 AS LEVEL
FROM property_graph G
UNION ALL
SELECT G.NODE,G.EDGES,C.PATH || ARRAY[G.NODE]::TEXT[],LEVEL + 1
FROM property_graph G
INNER JOIN CTE C ON G.NODE = ANY(C.EDGES)
WHERE G.NODE <> ALL(C.PATH) --Cond added to avoid cyclic graph
)
SELECT NODE,EDGES,PATH,LEVEL
FROM CTE
ORDER BY NODE,LEVEL;
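The same distinction exists outside SQL, which may make it easier to see. The following is a rough Python analogy, not PostgreSQL, and the three-node graph is made up because the original data set is not shown. The path is kept as a list (the counterpart of TEXT[]), so adding a node extends the list, and the cycle check is a plain membership test.

```python
# Hypothetical graph: node -> neighbouring nodes (the EDGES column).
graph = {"A": ["B"], "B": ["C", "A"], "C": []}

def walk(graph):
    """Mimic the recursive CTE: emit (node, path, level) rows level by level."""
    # Non-recursive term: one single-node path per node.
    frontier = [(node, [node], 1) for node in graph]
    rows = []
    while frontier:  # stop when a round produces no new rows
        rows.extend(frontier)
        next_frontier = []
        for node, path, level in frontier:
            for nxt in graph[node]:
                if nxt not in path:  # like G.NODE <> ALL(C.PATH)
                    # list + list keeps a list, like C.PATH || ARRAY[G.NODE]
                    next_frontier.append((nxt, path + [nxt], level + 1))
        frontier = next_frontier
    return sorted(rows, key=lambda r: (r[0], r[2]))

rows = walk(graph)
for row in rows:
    print(row)

# String concatenation, by contrast, flattens the path into one value.
flat = "A" + "B"
```

With strings, "A" followed by "B" collapses into the single value "AB", which is what happened to the TEXT path in the question.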
I recently asked "Why select from function is slow?".
But now, when I LEFT JOIN this function, the query takes 11500 ms.
When I rewrite the LEFT JOIN as a subquery, it takes only 111 ms:
SELECT
(SELECT next_ots FROM order_total_suma( next_range ) next_ots
WHERE next_ots.order_id = ots.order_id AND next_ots.consumed_period #> (ots.o).billed_to
) AS next_suma, --<< this took only 111ms. See plan
ots.* FROM (
SELECT
tstzrange(
NULLIF( (ots.o).billed_to, 'infinity' ),
NULLIF( (ots.o).billed_to +p.interval, 'infinity' )
) as next_range,
ots.*
FROM order_total_suma() ots
LEFT JOIN period p ON p.id = (ots.o).period_id
) ots
--LEFT JOIN order_total_suma( next_range ) next_ots ON next_ots.order_id = 6154
-- AND next_ots.consumed_period #> (ots.o).billed_to --<< this is fine. plan is not posted
--LEFT JOIN order_total_suma( next_range ) next_ots ON next_ots.order_id = ots.order_id
-- AND next_ots.consumed_period #> (ots.o).billed_to --<< this takes 11500ms. See Plan
WHERE ots.order_id IN ( 6154, 10805 )
Attached plans
While googling I found this blog post:
In most cases, joins are also a better solution than subqueries — Postgres will even internally “rewrite” a subquery, creating a join, whenever possible, but this of course increases the time it takes to come up with the query plan
Many SO questions have answers like this:
A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better—a fact that is not specific to MySQL Server alone.
So why is LEFT JOINing a function significantly slower than the subquery?
Is there a way to make the LEFT JOIN take about the same time as the subquery?
In SQL Server, I know for sure that the following query;
SELECT things.*
FROM things
LEFT OUTER JOIN (
SELECT thingreadings.thingid, reading
FROM thingreadings
INNER JOIN things on thingreadings.thingid = things.id
ORDER BY reading DESC LIMIT 1) AS readings
ON things.id = readings.thingid
WHERE things.id = '1'
would join against thingreadings only after the WHERE id = 1 had restricted the record set, so it left-joins against just one row. However, for performance to be acceptable in Postgres, I have to add the WHERE id = 1 to the INNER JOIN things on thingreadings.thingid = things.id line too.
This isn't ideal; is it possible to force postgres to know that what I am joining against is only one row without explicitly adding the WHERE clauses everywhere?
An example of this problem can be seen here;
I am trying to recreate the following query in a more efficient way;
SELECT things.id, things.name,
(SELECT thingreadings.id FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1),
(SELECT thingreadings.reading FROM thingreadings WHERE thingid = things.id ORDER BY id DESC LIMIT 1)
FROM things
WHERE id IN (1,2)
http://sqlfiddle.com/#!15/a172c/2
Not really sure why you did all that work. Isn't the inner query enough?
SELECT t.*
FROM thingreadings tr
INNER JOIN things t on tr.thingid = t.id AND t.id = '1'
ORDER BY tr.reading DESC
LIMIT 1;
sqlfiddle demo
When you want to select the highest reading for each thingid (the row that ORDER BY reading DESC LIMIT 1 would pick), you can do:
SELECT t.*,a.reading
FROM things t
INNER JOIN (
SELECT t1.*
FROM thingreadings t1
LEFT JOIN thingreadings t2
ON (t1.thingid = t2.thingid AND t1.reading < t2.reading)
WHERE t2.thingid IS NULL
) a ON a.thingid = t.id
sqlfiddle demo
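The derived table gets you, for each thingid, the record with the highest reading (no other row for that thingid has a greater reading); compare on id instead if you want the most recently inserted row. The JOIN then gets you the information from the things table for that record.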
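To see the anti-join from this answer at work, here is a small sketch run on SQLite through Python's sqlite3 module, used as a stand-in for Postgres. The sample rows are invented. Because the comparison is on reading, it returns the row with the highest reading per thing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table things (id integer primary key, name text);
create table thingreadings (id integer primary key, thingid integer, reading integer);
insert into things values (1, 'thing one'), (2, 'thing two'), (3, 'no readings');
insert into thingreadings (thingid, reading)
values (1, 10), (1, 30), (1, 20), (2, 5), (2, 7);
""")

sql = """
select t.id, t.name, a.reading
from things t
inner join (
    select t1.*
    from thingreadings t1
    left join thingreadings t2
      on t1.thingid = t2.thingid and t1.reading < t2.reading
    where t2.thingid is null   -- keep rows for which no higher reading exists
) a on a.thingid = t.id
where t.id in (1, 2)
order by t.id
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Two things to keep in mind: a thing without readings drops out because of the INNER JOIN, and two rows that tie for the highest reading are both returned.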
The where clause in SQL applies to the result set you're requesting, NOT to the join.
What your code is NOT saying: "do this join only for the ID of 1"...
What your code IS saying: "do this join, then pull records out of it where the ID is 1"...
This is why you need the inner where clause. Incidentally, I also think Filipe is right about the unnecessary code.
In a table I have records with ids 2, 4, 5, 8. How can I get a list with the missing values 1, 3, 6, 7? I have tried it this way:
SELECT t1.id + 1
FROM table t1
WHERE NOT EXISTS (
SELECT *
FROM table t2
WHERE t2.id = t1.id + 1
)
but it's not working correctly. It doesn't bring all available positions.
Is it possible without another table?
You can get all the missing IDs from a recursive CTE, like this:
with recursive numbers as (
select 1 number
from rdb$database
union all
select number+1
from rdb$database
join numbers on numbers.number < 1024
)
select n.number
from numbers n
where not exists (select 1
from table t
where t.id = n.number)
The number < 1024 condition in my example limits the query to the maximum recursion depth of 1024; going deeper makes the query end with an error. If you need more than 1024 consecutive IDs, you have to either run the query multiple times, adjusting the interval of numbers generated, or come up with a different query that produces consecutive numbers without reaching that recursion depth, which is not too difficult to write.
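The same idea can be tried out quickly on SQLite through Python's sqlite3 module. This is only a sketch under assumptions: SQLite stands in for Firebird (so there is no rdb$database), the table is called t, and the recursion is bounded by the highest existing id instead of a fixed 1024, which is my variation rather than part of the answer. On Firebird the recursion depth limit still applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key)")
conn.executemany("insert into t values (?)", [(2,), (4,), (5,), (8,)])

sql = """
with recursive numbers(number) as (
    select 1
    union all
    select number + 1
    from numbers
    where number < (select max(id) from t)   -- stop at the highest existing id
)
select n.number
from numbers n
where not exists (select 1 from t where t.id = n.number)
order by n.number
"""
missing = [row[0] for row in conn.execute(sql)]
print(missing)
```

Unlike the t1.id + 1 attempt in the question, this checks every number in the range, so it also finds a gap at the very start and gaps that are more than one id wide.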