PostgreSQL - CTE slowing the query - postgresql

I have a concern regarding the use of multiple WITH clauses in a query because
in some condition, it is slowing down the performance of the query like the below example,
So first WITH clause taking the 0.345 sec to fetch the 98948 records and second WITH clause taking the 13 sec to fetch the 68199 records even its less record as compare to first one so the only difference is that we have used the aggregate function in the second WITH clause to calculate the sum of charges.
Can anybody please help us to understand why the second query taking too much time.
1.This clause taking the 0.318 sec to fetch the 98948 record,
WITH delinquency_lease_details AS (
SELECT
dp.cid,
dp.id AS delinquency_policy_id,
p.id,
dp.threshold_amount,
dp.small_balance_amount,
dp.delinquency_threshold_type_id,
cl.id,
cl.primary_customer_id AS customer_id,
cl.lease_status_type_id,
cl.occupancy_type_id,
COALESCE( cl.building_name || ' - ' || cl.unit_number_cache, cl.building_name, cl.unit_number_cache ) AS unit_number, func_format_customer_name ( cl.name_first, cl.name_last, cl.company_name ) customer_name, cl.property_name, TZ.time_zone_name AS property_timezone
FROM
cached_leases cl
JOIN lease_details ld ON ( ld.cid = cl.cid AND ld.lease_id = cl.id )
JOIN delinquency_policies dp ON ( dp.cid = ld.cid AND ld.delinquency_policy_id = dp.id )
JOIN properties p ON ( p.cid = lp.cid AND p.id = lp.property_id )
JOIN time_zones TZ ON ( TZ.id = p.time_zone_id )
WHERE
cl.cid = 1111
AND cl.lease_status_type_id IN ( 4, 5 )
AND cl.occupancy_type_id IN ( 1, 2, 3, 4, 6, 9, 10, 11 )
AND cl.termination_list_type_id IS NULL
)
SELECT * FROM delinquency_lease_details;
2. This clause taking the 13 sec to fetch the 68199 records and if I just run the query without WITH clause then it is taking 0.564 seconds,
WITH delinquent_balance AS (
SELECT
dld.cid,
dld.id,
min( c.post_date ) AS min_post_date,
sum( c.transaction_amount_due ) AS delinquent_amount
FROM
cached_leases dld
JOIN charges at ON ( at.cid = dld.cid AND at.lease_id = dld.id AND c.is_temporary = FALSE AND c.is_deleted = FALSE )
JOIN charge_codes cc ON ( c.ar_code_id = cc.id AND c.cid = cc.cid AND cc.ledger_filter_id = 27 )
WHERE
dld.cid = 1111
AND ( ( c.transaction_amount_due > 0 AND c.post_date < CURRENT_DATE ) OR c.transaction_amount_due < 0 )
AND NOT EXISTS (
select
1
from
repayment_charges
WHERE
cid = c.cid
AND property_id = c.property_id
AND charge_id = c.id
AND is_active = true
)
GROUP BY
dld.cid,
dld.id
) select * from delinquent_balance;
As per this link, the WITH clause is the optimization barrier for Postgres database so it's really a cause then what we should use in place of WITH clause for complex queries because I have used the 10 WITH clauses in query and it is slowing down the performance of the query but out of that I have given two clauses only to get the some conclusion because the second clause taking more time as compared to another one.

Related

How To Properly Use Lateral Joins In PSQL

When I run the command below, there is a syntax error highlighted at UNION on the last line. I am really not sure why.
SELECT *
FROM (
SELECT (final_exp.deploymentyear + number) AS deploymentyear1,
(final_exp.deploymentyear + number) - final_exp.deploymentyear AS diff,
final_exp.*
FROM (
SELECT number
FROM hubb."spt_values"
WHERE type = 'P'
AND (
2000 + number) <= 2023 ) s
LEFT JOIN lateral (
SELECT *
FROM final_exp
WHERE (
final_exp.deploymentyear + number) < 2023
AND NOT EXISTS
(
SELECT 1
FROM final_exp g
WHERE g.deploymentyear = (final_exp.deploymentyear + number)
AND g.sacstate = final_exp.sacstate
AND g.fund = final_exp.fund
AND g.companyname = final_exp.companyname )
AND diff = 1 )df
UNION
I verified parentheses and comments and can't find where the issue might be coming from. Is it the lateral join?
You are missing the join condition for the Join portion. You need to join the "S" sub query and the join like "ON df.some_column=s.some_column"

How can I update a table using the following CROSS APPLY?

UPDATE ItemDim_DEV
set ActiveFlag = 0,
EndUTCDate = GETUTCDATE()
from (select i.SKU, i.itemname, i.category, i.CategoryInternalId, i.itemtype, i.IsActive, i.assetaccount, i.InternalId,
ni.sku,
ni.itemname,
ni.class_name,
ni.productclass,
ni.ItemType,
ni.isactive,
ni.assetaccount,
i.CalculatedHash, ni.hashid
from (
SELECT i.*
FROM itemDim_Dev i
WHERE i.SourceSystem = 'NetSuite'
--and InternalId = '1692'
) i
CROSS APPLY (
SELECT (CONVERT([binary](64),hashbytes('SHA2_512',
concat (ISNULL(im.TargetSKU, ni.sku), ni.itemname, pc.class_name, ni.productclass, ni.ItemType,
CAST(CASE WHEN ni.IsInactive = 'T' THEN 0 ELSE 1 END AS BIT ), ni.assetaccount)))) hashid,
ISNULL(im.TargetSKU, ni.sku) sku,
ni.itemname,
pc.class_name,
ni.productclass,
ni.ItemType,
CAST(CASE WHEN ni.IsInactive = 'T' THEN 0 ELSE 1 END AS BIT ) isactive, ni.assetaccount
FROM NetSuiteInventory ni
INNER JOIN Product_Class pc ON ni.productclass = pc.class_id
LEFT JOIN ItemMapping im ON ni.sku = im.SourceSKU
WHERE ni.ItemInternalId = CAST(i.InternalId as bigint)
and
(CONVERT([binary](64),hashbytes('SHA2_512',
concat (ISNULL(im.TargetSKU, ni.sku), ni.itemname, pc.class_name, ni.productclass, ni.ItemType,
CAST(CASE WHEN ni.IsInactive = 'T' THEN 0 ELSE 1 END AS BIT ), ni.assetaccount)))) = i.CalculatedHash
) ni
I understand that I can put a select after the from of the update but in the CROSS APPLY that is in the query starts with a select as in the second one, can you help me please.
Make a join to the target table
UPDATE A
SET XXX = YYY
FROM ItemDim_DEV A
JOIN (
--huge SELECT that I recommend you try to shorten in the next questions
CROSS APPLY (any secrets...)
) B ON A.PK = B.PK

How to use a column value in one of the select statement in the join in postgresql

I'm updating sitename as below. But then I need to utilize the levelindex from a in another query which is getting joined with the first query.
update billofquantity_temp a
set sitename = b.boqitemname
from (
select a.* from
(
select levelindex,productno,boqitemid,boqitemname,fileid from billofquantity_temp
where productno ='' and levelindex !='' //first query
) a
join
(
select distinct substring(levelindex,1,length(a.levelindex) lblindex from billofquantity_temp
where boqitemid !='' and levelindex !=''
and length(levelindex) > 2
) b on a.levelindex=b.lblindex // second query
) b
where a.levelindex=b.levelindex;
I need to use the a.levelindex in second query to get the substrings of the it's length. But this throws error invalid reference to FROM-clause entry for table "a"
Is there any other way to use the column from query 1 in 2 while joining them?
Update:
Based on kims answer, I could get the selection done but there is error syntax error at c.a.levelindex. There error is thrown even with c.levelindex.
A slight modification to the query made no errors but the intened update doesn't happen. Instead only one particular values is updated. Below is the updated query to remove errors
update billofquantity_temp d
set sitename = c.boqitemname
from (
select * from (
select levelindex, boqitemname from billofquantity_temp
where productno = '' and levelindex != '' -- first query
) a
join (
select distinct levelindex lblindex from billofquantity_temp
where boqitemid != '' and levelindex != '' and length(levelindex) > 2 ) b
on a.levelindex = substring(b.lblindex, 1, length(a.levelindex)) -- second query
) c
where d.levelindex = c.lblindex;
Move the computation of lblindex outside the sub-select to the on clause:
update billofquantity_temp d
set sitename = c.boqitemname
from (
select *
from (
select levelindex, boqitemname
from billofquantity_temp
where productno = '' and levelindex != '' -- first query
) a
join (
select distinct levelindex
from billofquantity_temp
where boqitemid != '' and levelindex != '' and length(levelindex) > 2
) b
on a.levelindex = substring(b.levelindex, 1, length(a.levelindex)) -- second query
) c
where d.levelindex = c.a.levelindex
;
I've also used c and d to avoid confusion from using a and b twice, added a missing ), and removed unused fields from select.

Why using same field when filtering cause different execution time? (different index usage)

When I run query and filter by agreement_id it is slow,
but when I filter by an alias id it is fast. (Look at the end of the query)
Why using same field when filtering cause different execution time?
Links to explain analyze:
slow1, slow2
fast1, fast2
Difference start at #20: Where different indexes are used:
Index Cond: (o.sys_period #> sys_time()) VS Index Cond: (o.agreement_id = 38)
PS. It would be nice if I can contact to developer of this feature (I have one more similar problem)
UPD I did some experiments. when I remove window functions from my query it works fast in any case. So why window function stop index usage in some cases? How to escape/workaround that?
dbfiddle with minimal test case
Server version is v13.1
Full query:
WITH gconf AS
-- https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-SELECT
NOT MATERIALIZED -- force it to be merged into the parent query
-- it gives a net savings because each usage of the WITH query needs only a small part of the WITH query's full output.
( SELECT
ocd.*,
tstzrange( '2021-05-01', '2021-05-01', '[]') AS acc_period,
(o).agreement_id AS id, -- Required to passthrough WINDOW FUNCTION
(o).id AS order_id,
(ic).consumed_period AS consumed_period,
dense_rank() OVER ( PARTITION BY (o).agreement_id, (o).id ORDER BY (ic).consumed_period ) AS nconf,
row_number() OVER ( wconf ORDER BY (c).sort_order NULLS LAST ) AS nitem,
(sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2) AS conf_cost,
max((ocd.ic).consumed) OVER wconf AS consumed,
CASE WHEN true
THEN (sum( ocd.item_suma ) OVER wconf)::numeric( 10, 2 )
ELSE (sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2 )
END AS conf_suma
FROM order_cost_details( tstzrange( '2021-05-01', '2021-05-01', '[]') ) ocd
WHERE true OR (ocd.ic).consumed_period #> lower( tstzrange( '2021-05-01', '2021-05-01', '[]') )
WINDOW wconf AS ( PARTITION BY (o).agreement_id, (o).id, (ic).consumed_period )
),
gorder AS (
SELECT *,
(conf_suma/6)::numeric( 10, 2 ) as conf_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1) OVER worder AS order_suma
FROM gconf
WINDOW worder AS ( PARTITION BY gconf.id, (o).id )
-- TODO: Ask PG developers: Why changing to (o).agreement_id slows down query?
-- WINDOW worder AS ( PARTITION BY (o).agreement_id, (o).id )
)
SELECT
u.id, consumed_period, nconf, nitem,
(c).id as item_id,
COALESCE( (c).sort_order, pd.sort_order ) as item_order,
COALESCE( st.display, st.name, rt.display, rt.name ) as item_name,
COALESCE( item_qty, (c).amount/rt.unit ) as item_qty,
COALESCE( (p).label, rt.label ) as measure,
item_price, item_cost, item_suma,
conf_cost, consumed, conf_suma, conf_nds, order_suma,
(order_suma/6)::numeric( 10, 2 ) as order_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1 ) OVER wagreement AS total_suma,
sum( (order_suma/6)::numeric( 10, 2 ) ) FILTER (WHERE nitem = 1 AND nconf = 1) OVER wagreement AS total_nds,
pkg.id as package_id,
pkg.link_1c_id as package_1c_id,
COALESCE( pkg.display, pkg.name ) as package,
acc_period
FROM gorder u
LEFT JOIN resource_type rt ON rt.id = (c).resource_type_id
LEFT JOIN service_type st ON st.id = (c).service_type_id
LEFT JOIN package pkg ON pkg.id = (o).package_id
LEFT JOIN package_detail pd ON pd.package_id = (o).package_id
AND pd.resource_type_id IS NOT DISTINCT FROM (c).resource_type_id
AND pd.service_type_id IS NOT DISTINCT FROM (c).service_type_id
-- WHERE (o).agreement_id = 38 -- slow
WHERE u.id = 38 -- fast
WINDOW wagreement AS ( PARTITION BY (o).agreement_id )
As problem workaround we can additionally SELECT an alias for column used at PARTITION BY expression. Then PG apply optimization and use index.
The answer to the question could be: PG does not apply optimization if composite type is used. Notice as it works:
PARTITION | FILTER | IS USED?
------------------------------
ALIAS | ORIG | NO
ALIAS | ALIAS | YES
ORIG | ALIAS | NO
ORIG | ORIG | NO
See this dbfiddle
create table agreement ( ag_id int, name text, cost numeric(10,2) );
create index ag_idx on agreement (ag_id);
insert into agreement (ag_id, name, cost) values ( 1, '333', 22 ),
(1,'333', 33), (1, '333', 7), (2, '555', 18 ), (2, '555', 2), (3, '777', 4);
select * from agreement;
create function initial ()
returns table( agreement_id int, ag agreement ) language sql stable AS $$
select ag_id, t from agreement t;
$$;
select * from initial() t;
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is used when alias for column is used
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is NOT USED because grouping by original column
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED even if at both cases original column

How to append CTE results to main query output?

I've created a TSQL query that pulls from two sets of tables in my database. The tables in the Common Table Expression are different from the tables in the main query. I'm joining on MRN and need the end result to contain accounts from both sets of tables. I've written the following query to this end:
with cteHosp as(
select Distinct p.EncounterNumber, p.MRN, p.AdmitAge
from HospitalPatients p
inner join Eligibility e on p.MRN = e.MRN
inner join HospChgDtl c on p.pt_id = c.pt_id
inner join HospitalDiagnoses d on p.pt_id = d.pt_id
where p.AdmitAge >=12
and d.dx_cd in ('G89.4','R52.1','R52.2','Z00.129')
)
Select Distinct a.AccountNo, a.dob, DATEDIFF(yy, a.dob, GETDATE()) as Age
from RHCCPTDetail c
inner join RHCAppointments a on c.ClaimID = a.ClaimID
inner join Eligibility e on c.hl7Id = e.MRN
full outer join cteHosp on e.MRN = cteHosp.MRN
where DATEDIFF(yy, a.dob, getdate()) >= 12
and left(c.PriDiag,7) in ('G89.4','R52.1','R52.2', 'Z00.129')
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode2,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode3,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode4,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
order by AccountNo
How do I merge together the output of both the common table expression and the main query into one set of results?
Merge performs inserts, updates or deletes. I believe you want to join the cte. If so, here is an example.
Notice the cteBatch is joined to the Main query below.
with
cteBatch (BatchID,BatchDate,Creator,LogID)
as
(
select
BatchID
,dateadd(day,right(BatchID,3) -1,
cast(cast(left(BatchID,4) as varchar(4))
+ '-01-01' as date)) BatchDate
,Creator
,LogID
from tblPriceMatrixBatch b
unpivot
(
LogID
for Logs in (LogIDISI,LogIDTG,LogIDWeb)
)u
)
Select
0 as isCurrent
,i.InterfaceID
,i.InterfaceName
,b.BatchID
,b.BatchDate
,case when isdate(l.start) = 0 and isdate(l.[end]) = 0 then 'Scheduled'
when isdate(l.start) = 1 and isdate(l.[end]) = 0 then 'Running'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) = 1 then 'Failed'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) != 1 then 'Success'
else 'idunno' end as stat
,l.Start as StartTime
,l.[end] as CompleteTime
,b.Creator as Usr
from EOCSupport.dbo.Interfaces i
join EOCSupport.dbo.Logs l
on i.InterfaceID = l.InterfaceID
join cteBatch b
on b.logid = l.LogID