Performance degrades when fetching data from views in PostgreSQL

I am running this query and getting poor performance. We fetch the data from views, but somehow it performs badly.
This is the EXPLAIN ANALYZE output:
"Aggregate (cost=387.95..387.96 rows=1 width=0) (actual time=0.561..0.561 rows=1 loops=1)"
" -> Unique (cost=387.95..387.95 rows=1 width=36) (actual time=0.558..0.558 rows=0 loops=1)"
" -> Sort (cost=387.95..387.95 rows=1 width=36) (actual time=0.558..0.558 rows=0 loops=1)"
" Sort Key: at.id, at.cid, at.created_at, ps.channel"
" Sort Method: quicksort Memory: 25kB"
" -> Nested Loop (cost=15.89..387.94 rows=1 width=36) (actual time=0.525..0.525 rows=0 loops=1)"
" -> Hash Join (cost=15.78..269.20 rows=56 width=108) (actual time=0.212..0.347 rows=11 loops=1)"
" Hash Cond: (at."LV" = br.id)"
" -> Nested Loop (cost=8.47..261.68 rows=56 width=105) (actual time=0.078..0.209 rows=11 loops=1)"
" Join Filter: (at."aRR" = ar.id)"
" Rows Removed by Join Filter: 11"
" -> Hash Join (cost=8.47..260.00 rows=56 width=89) (actual time=0.071..0.196 rows=11 loops=1)"
" Hash Cond: (at."Type" = at.id)"
" -> Nested Loop (cost=6.28..257.60 rows=56 width=90) (actual time=0.043..0.161 rows=11 loops=1)"
" Join Filter: (at."Src" = sa.id)"
" Rows Removed by Join Filter: 231"
" -> Bitmap Heap Scan on at (cost=6.28..252.88 rows=67 width=94) (actual time=0.026..0.109 rows=11 loops=1)"
" Recheck Cond: (created_at > '2018-01-05 11:33:28'::timestamp without time zone)"
" Filter: (status = 't'::text)"
" Heap Blocks: exact=11"
" -> Bitmap Index Scan on created_date_ids (cost=0.00..6.28 rows=128 width=0) (actual time=0.011..0.011 rows=12 loops=1)"
" Index Cond: (created_at > '2018-01-05 11:33:28'::timestamp without time zone)"
" -> Materialize (cost=0.00..2.04 rows=10 width=28) (actual time=0.001..0.002 rows=22 loops=11)"
" -> Seq Scan on sa (cost=0.00..2.03 rows=10 width=28) (actual time=0.002..0.006 rows=22 loops=1)"
" -> Hash (cost=2.09..2.09 rows=29 width=31) (actual time=0.018..0.018 rows=30 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 10kB"
" -> Seq Scan on at (cost=0.00..2.09 rows=29 width=31) (actual time=0.005..0.010 rows=30 loops=1)"
" -> Materialize (cost=0.00..1.01 rows=3 width=48) (actual time=0.000..0.000 rows=2 loops=11)"
" -> Seq Scan on ar (cost=0.00..1.01 rows=3 width=48) (actual time=0.002..0.002 rows=2 loops=1)"
" -> Hash (cost=6.06..6.06 rows=355 width=35) (actual time=0.122..0.122 rows=370 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 33kB"
" -> Seq Scan on br (cost=0.00..6.06 rows=355 width=35) (actual time=0.006..0.048 rows=370 loops=1)"
" -> Index Only Scan using prs_Src_application_at_activit_key on prs ps (cost=0.11..2.12 rows=1 width=63) (actual time=0.015..0.015 rows=0 loops=11)"
" Index Cond: ((Src_application = (sa."Name")::text) AND (at = (at."Name")::text) AND (aRR = (ar."Name")::text) AND (LV = (br."Name")::text))"
" Filter: (btrim((channel)::text) = 'V'::text)"
" Rows Removed by Filter: 1"
" Heap Fetches: 0"
"Planning time: 7.735 ms"
"Execution time: 0.721 ms"
Our view is defined like this:
SELECT DISTINCT at.id,
at.cid,
at.created_at,
at.status,
ps.channel
FROM at
JOIN sa ON sa.id = at."Src"
JOIN at ON at.id = at."Type"
JOIN ar ON ar.id = at."aRR"
JOIN br ON br.id = at."LV"
JOIN prs ps ON ps.aRR::text = ar."Name"::text AND ps.at::text = at."Name"::text AND ps.LV::text = br."Name"::text AND ps.Src_application::text = sa."Name"::text
WHERE at.status = 't'::text and
trim(ps.channel)= 'V' and at.created_at > '2018-01-05 11:33:28'
This query is taking too much time. How can I improve its performance?

Related

Postgresql Query performance optimization

I have a query that runs fast during off-peak periods but very slowly when the system is under load.
New Relic sometimes shows it running for 5-8 minutes.
The query looks simple, but the view definition behind it is not.
I would like to know whether there is any scope for optimization.
Database version - "PostgreSQL 10.14 on x86_64-pc-linux-gnu, compiled by x86_64-unknown-linux-gnu-gcc (GCC) 4.9.4, 64-bit"
The query that shows up in the monitoring tools is:
SELECT
esnpartvie0_.esn_id AS col_0_0_,
esnpartvie0_.esn AS col_1_0_,
esnpartvie0_.quarter_point AS col_2_0_,
esnpartvie0_.work_order_number AS col_3_0_,
esnpartvie0_.site AS col_4_0_,
sum(esnpartvie0_.critical) AS col_5_0_,
sum(esnpartvie0_.numshort) AS col_6_0_,
sum(esnpartvie0_.wa) AS col_7_0_,
esnpartvie0_.customer AS col_8_0_,
esnpartvie0_.adj_accum_date AS col_9_0_,
esnpartvie0_.g2_otr AS col_10_0_,
esnpartvie0_.induct_date AS col_11_0_,
min(esnpartvie0_.delta) AS col_12_0_,
esnpartvie0_.fiscal_week_bucket_date AS col_13_0_
FROM
moa.esn_part_view esnpartvie0_
WHERE
esnpartvie0_.esn_id = 140339
GROUP BY
esnpartvie0_.esn_id,
esnpartvie0_.esn,
esnpartvie0_.quarter_point,
esnpartvie0_.work_order_number,
esnpartvie0_.site,
esnpartvie0_.customer,
esnpartvie0_.adj_accum_date,
esnpartvie0_.g2_otr,
esnpartvie0_.induct_date,
esnpartvie0_.fiscal_week_bucket_date
The EXPLAIN (ANALYZE, BUFFERS) plan for this query is shown below and is also available at https://explain.depesz.com/s/mr76#html
"GroupAggregate (cost=69684.12..69684.17 rows=1 width=82) (actual time=976.163..976.228 rows=1 loops=1)"
" Group Key: esnpartvie0_.esn_id, esnpartvie0_.esn, esnpartvie0_.quarter_point, esnpartvie0_.work_order_number, esnpartvie0_.site, esnpartvie0_.customer, esnpartvie0_.adj_accum_date, esnpartvie0_.g2_otr, esnpartvie0_.induct_date, esnpartvie0_.fiscal_week_bucket_date"
" Buffers: shared hit=20301, temp read=48936 written=6835"
" -> Sort (cost=69684.12..69684.13 rows=1 width=70) (actual time=976.153..976.219 rows=14 loops=1)"
" Sort Key: esnpartvie0_.esn, esnpartvie0_.quarter_point, esnpartvie0_.work_order_number, esnpartvie0_.site, esnpartvie0_.customer, esnpartvie0_.adj_accum_date, esnpartvie0_.g2_otr, esnpartvie0_.induct_date, esnpartvie0_.fiscal_week_bucket_date"
" Sort Method: quicksort Memory: 26kB"
" Buffers: shared hit=20301, temp read=48936 written=6835"
" -> Subquery Scan on esnpartvie0_ (cost=69684.02..69684.11 rows=1 width=70) (actual time=976.078..976.158 rows=14 loops=1)"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" -> GroupAggregate (cost=69684.02..69684.10 rows=1 width=2016) (actual time=976.077..976.155 rows=14 loops=1)"
" Group Key: e.esn_id, w.number, ed.adj_accum_date, (COALESCE(ed.gate_2_otr, 0)), ed.gate_0_start, ed.gate_1_stop, p.part_id, st.name, mat.name, so.name, dr.name, hpc.hpc_status_name, module.module_name, c.customer_id, m.model_id, ef.engine_family_id, s.site_id, ws.name, ic.comment"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" CTE indexed_comments"
" -> WindowAgg (cost=40573.82..45076.80 rows=225149 width=118) (actual time=182.537..291.895 rows=216974 loops=1)"
" Buffers: shared hit=5226, temp read=3319 written=3327"
" -> Sort (cost=40573.82..41136.69 rows=225149 width=110) (actual time=182.528..215.549 rows=216974 loops=1)"
" Sort Key: part_comment.part_id, part_comment.created_at DESC"
" Sort Method: external merge Disk: 26552kB"
" Buffers: shared hit=5226, temp read=3319 written=3327"
" -> Seq Scan on part_comment (cost=0.00..7474.49 rows=225149 width=110) (actual time=0.014..38.209 rows=216974 loops=1)"
" Buffers: shared hit=5223"
" -> Sort (cost=24607.21..24607.22 rows=1 width=717) (actual time=976.069..976.133 rows=14 loops=1)"
" Sort Key: w.number, ed.adj_accum_date, (COALESCE(ed.gate_2_otr, 0)), ed.gate_0_start, ed.gate_1_stop, p.part_id, st.name, mat.name, so.name, dr.name, hpc.hpc_status_name, module.module_name, c.customer_id, m.model_id, ef.engine_family_id, s.site_id, ws.name, ic.comment"
" Sort Method: quicksort Memory: 28kB"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" -> Nested Loop (cost=1010.23..24607.20 rows=1 width=717) (actual time=442.381..976.017 rows=14 loops=1)"
" Buffers: shared hit=20287, temp read=48936 written=6835"
" -> Nested Loop Left Join (cost=1009.94..24598.88 rows=1 width=697) (actual time=442.337..975.670 rows=14 loops=1)"
" Join Filter: (ic.part_id = p.part_id)"
" Rows Removed by Join Filter: 824838"
" Buffers: shared hit=20245, temp read=48936 written=6835"
" -> Nested Loop Left Join (cost=1009.94..19518.95 rows=1 width=181) (actual time=56.148..57.676 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.81..19518.35 rows=1 width=183) (actual time=56.139..57.635 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.67..19517.67 rows=1 width=181) (actual time=56.133..57.598 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.55..19516.82 rows=1 width=179) (actual time=56.124..57.544 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.42..19516.04 rows=1 width=178) (actual time=56.105..57.439 rows=14 loops=1)"
" Buffers: shared hit=14991"
" -> Nested Loop Left Join (cost=1009.28..19515.37 rows=1 width=175) (actual time=56.089..57.335 rows=14 loops=1)"
" Buffers: shared hit=14963"
" -> Nested Loop Left Join (cost=1009.14..19514.77 rows=1 width=170) (actual time=56.068..57.206 rows=14 loops=1)"
" Join Filter: (e.work_scope_id = ws.work_scope_id)"
" Buffers: shared hit=14935"
" -> Nested Loop Left Join (cost=1009.14..19513.55 rows=1 width=166) (actual time=56.043..57.102 rows=14 loops=1)"
" Join Filter: (e.esn_id = p.esn_id)"
" Buffers: shared hit=14921"
" -> Nested Loop (cost=9.14..31.40 rows=1 width=125) (actual time=0.081..0.130 rows=1 loops=1)"
" Buffers: shared hit=15"
" -> Nested Loop (cost=8.87..23.08 rows=1 width=118) (actual time=0.069..0.117 rows=1 loops=1)"
" Buffers: shared hit=12"
" -> Nested Loop (cost=8.73..21.86 rows=1 width=108) (actual time=0.055..0.102 rows=1 loops=1)"
" Buffers: shared hit=10"
" -> Nested Loop (cost=8.60..21.65 rows=1 width=46) (actual time=0.046..0.091 rows=1 loops=1)"
" Buffers: shared hit=8"
" -> Hash Join (cost=8.31..13.34 rows=1 width=41) (actual time=0.036..0.081 rows=1 loops=1)"
" Hash Cond: (m.model_id = e.model_id)"
" Buffers: shared hit=5"
" -> Seq Scan on model m (cost=0.00..4.39 rows=239 width=17) (actual time=0.010..0.038 rows=240 loops=1)"
" Buffers: shared hit=2"
" -> Hash (cost=8.30..8.30 rows=1 width=28) (actual time=0.009..0.010 rows=1 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" Buffers: shared hit=3"
" -> Index Scan using esn_pkey on esn e (cost=0.29..8.30 rows=1 width=28) (actual time=0.006..0.006 rows=1 loops=1)"
" Index Cond: (esn_id = 140339)"
" Filter: active"
" Buffers: shared hit=3"
" -> Index Scan using work_order_pkey on work_order w (cost=0.29..8.30 rows=1 width=13) (actual time=0.008..0.008 rows=1 loops=1)"
" Index Cond: (work_order_id = e.work_order_id)"
" Buffers: shared hit=3"
" -> Index Scan using engine_family_pkey on engine_family ef (cost=0.14..0.20 rows=1 width=66) (actual time=0.009..0.009 rows=1 loops=1)"
" Index Cond: (engine_family_id = m.engine_family_id)"
" Buffers: shared hit=2"
" -> Index Scan using site_pkey on site s (cost=0.14..1.15 rows=1 width=14) (actual time=0.013..0.013 rows=1 loops=1)"
" Index Cond: (site_id = ef.site_id)"
" Buffers: shared hit=2"
" -> Index Scan using customer_pkey on customer c (cost=0.27..8.29 rows=1 width=11) (actual time=0.012..0.012 rows=1 loops=1)"
" Index Cond: (customer_id = e.customer_id)"
" Buffers: shared hit=3"
" -> Gather (cost=1000.00..19481.78 rows=29 width=41) (actual time=55.958..56.949 rows=14 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=14906"
" -> Parallel Seq Scan on part p (cost=0.00..18478.88 rows=12 width=41) (actual time=51.855..52.544 rows=5 loops=3)"
" Filter: (active AND (esn_id = 140339))"
" Rows Removed by Filter: 226662"
" Buffers: shared hit=14906"
" -> Seq Scan on work_scope ws (cost=0.00..1.10 rows=10 width=12) (actual time=0.004..0.004 rows=1 loops=14)"
" Buffers: shared hit=14"
" -> Index Scan using source_pkey on source so (cost=0.14..0.57 rows=1 width=13) (actual time=0.005..0.005 rows=1 loops=14)"
" Index Cond: (p.source_id = source_id)"
" Buffers: shared hit=28"
" -> Index Scan using status_pkey on status st (cost=0.13..0.56 rows=1 width=11) (actual time=0.004..0.004 rows=1 loops=14)"
" Index Cond: (p.status_id = status_id)"
" Buffers: shared hit=28"
" -> Index Scan using material_stream_pkey on material_stream mat (cost=0.13..0.56 rows=1 width=9) (actual time=0.004..0.004 rows=1 loops=14)"
" Index Cond: (p.material_stream_id = material_stream_id)"
" Buffers: shared hit=28"
" -> Index Scan using dr_status_pkey on dr_status dr (cost=0.13..0.56 rows=1 width=10) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.dr_status_id = dr_status_id)"
" -> Index Scan using hpc_status_pkey on hpc_status hpc (cost=0.13..0.56 rows=1 width=10) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.hpc_status_id = hpc_status_id)"
" -> Index Scan using module_pkey on module (cost=0.14..0.57 rows=1 width=6) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.module_id = module_id)"
" -> CTE Scan on indexed_comments ic (cost=0.00..5065.85 rows=1126 width=520) (actual time=13.043..61.251 rows=58917 loops=14)"
" Filter: (comment_index = 1)"
" Rows Removed by Filter: 158057"
" Buffers: shared hit=5226, temp read=48936 written=6835"
" -> Index Scan using esn_dates_esn_id_key on esn_dates ed (cost=0.29..8.32 rows=1 width=20) (actual time=0.019..0.020 rows=1 loops=14)"
" Index Cond: (esn_id = 140339)"
" Filter: ((gate_3_stop_actual AND (gate_3_stop >= now())) OR (gate_3_stop IS NULL) OR ((NOT gate_3_stop_actual) AND (gate_3_stop IS NOT NULL) AND (gate_3_stop >= (now() - '730 days'::interval))))"
" Buffers: shared hit=42"
"Planning time: 6.564 ms"
"Execution time: 988.335 ms"
The actual view definition that the above SELECT runs against:
with indexed_comments as (
select
part_comment.part_id,
part_comment.comment,
row_number() over (partition by part_comment.part_id
order by
part_comment.created_at desc) as comment_index
from
moa.part_comment
)
select
e.esn_id,
e.name as esn,
e.is_qp_engine as quarter_point,
w.number as work_order_number,
case
when (p.part_id is null) then 0
else p.part_id
end as part_id,
p.part_number,
p.part_description,
p.quantity,
st.name as status,
p.status_id,
mat.name as material_stream,
p.material_stream_id,
so.name as source,
p.source_id,
p.oem,
p.po_number,
p.manual_cso_commit,
p.auto_cso_commit,
coalesce(p.manual_cso_commit, p.auto_cso_commit) as calculated_cso_commit,
(coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start) + p.accum_offset) as adjusted_accum,
dr.name as dr_status,
p.dr_status_id,
p.airway_bill,
p.core_material,
hpc.hpc_status_name as hpc_status,
p.hpc_status_id,
module.module_name,
p.module_id,
c.name as customer,
c.customer_id,
m.name as model,
m.model_id,
ef.name as engine_family,
ef.engine_family_id,
s.label as site,
s.site_id,
case
when (coalesce(p.manual_cso_commit, p.auto_cso_commit) > coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start)) then 1
else 0
end as critical,
case
when (coalesce(p.manual_cso_commit, p.auto_cso_commit) <= coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start)) then 1
else 0
end as numshort,
case
when ((p.esn_id is not null)
and (coalesce(p.manual_cso_commit, p.auto_cso_commit) is null)) then 1
else 0
end as wa,
ed.adj_accum_date,
(ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)) as g2_otr,
ed.gate_0_start as induct_date,
coalesce((coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0))) - max(coalesce(p.manual_cso_commit, p.auto_cso_commit))), 0) as delta,
coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start) as fiscal_week_bucket_date,
p.po_line_num,
p.ship_out,
p.receipt,
p.crit_ship,
e.work_scope_id,
ws.name as work_scope,
p.late_call,
p.ex_esn,
p.accum_offset,
ic.comment as latest_comment
from
(((((((((((((((moa.esn e
join moa.work_order w
using (work_order_id))
join moa.model m
using (model_id))
join moa.engine_family ef on
((m.engine_family_id = ef.engine_family_id)))
join moa.site s on
((ef.site_id = s.site_id)))
join moa.customer c
using (customer_id))
left join moa.part p on
(((e.esn_id = p.esn_id)
and (p.active <> false))))
left join moa.work_scope ws on
((e.work_scope_id = ws.work_scope_id)))
left join moa.source so on
((p.source_id = so.source_id)))
left join moa.status st on
((p.status_id = st.status_id)))
left join moa.material_stream mat
using (material_stream_id))
left join moa.dr_status dr
using (dr_status_id))
left join moa.hpc_status hpc
using (hpc_status_id))
left join moa.module module
using (module_id))
left join indexed_comments ic on
(((ic.part_id = p.part_id)
and (ic.comment_index = 1))))
join moa.esn_dates ed on
((e.esn_id = ed.esn_id)))
where
((e.active = true)
and (((ed.gate_3_stop_actual = true)
and (ed.gate_3_stop >= now()))
or (ed.gate_3_stop is null)
or ((ed.gate_3_stop_actual = false)
and (ed.gate_3_stop is not null)
and (ed.gate_3_stop >= (now() - '730 days'::interval)))))
group by
e.esn_id,
w.number,
s.label,
c.name,
p.active,
ed.adj_accum_date,
coalesce(ed.gate_2_otr, 0),
ed.gate_0_start,
ed.gate_1_stop,
p.part_id,
st.name,
mat.name,
so.name,
dr.name,
hpc.hpc_status_name,
module.module_name,
c.customer_id,
m.name,
m.model_id,
ef.name,
ef.engine_family_id,
s.site_id,
ws.name,
ic.comment;

What a horrific query.
Most of the time is going to this:
-> CTE Scan on indexed_comments ic (cost=0.00..5065.85 rows=1126 width=520) (actual time=13.043..61.251 rows=58917 loops=14)
The main culprit there is a misestimate on the upper sibling node. The planner thinks it will need to do the CTE Scan once, but it actually does it 14 times (apparently returning the same result each time). If it knew the scan would be repeated, it would build a hash table rather than iterate through the CTE each time. But since building the hash itself requires one pass through the CTE, hashing doesn't appear to save anything when the planner expects only one iteration in the first place.
I don't know how to fix the estimation problem. But you could compute the ranks on the fly, rather than computing them all up front and then searching through them. You would do that with a LATERAL join.
Change
left join indexed_comments ic on
(((ic.part_id = p.part_id)
and (ic.comment_index = 1))))
to
left join lateral (select comment from part_comment pc where p.part_id=pc.part_id order by created_at desc limit 1) ic on true
and remove the with indexed_comments as... CTE from the view definition.
For this to be fast, you would need an index ON part_comment (part_id, created_at).
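
A minimal sketch of the index and the lateral lookup described above, assuming the table lives in the moa schema as in the view definition. The index name part_comment_part_id_created_at_idx is made up for illustration, and CONCURRENTLY is an optional choice rather than part of the original answer. The standalone SELECT is a cut-down test query, not the full view.

```sql
-- Hypothetical index name; the columns are the ones named in the answer.
-- CONCURRENTLY avoids blocking writes during the build, but it cannot be
-- run inside a transaction block.
CREATE INDEX CONCURRENTLY part_comment_part_id_created_at_idx
    ON moa.part_comment (part_id, created_at);

-- Cut-down test of the lateral lookup: for each part of one engine, fetch
-- only the newest comment. With the index in place, PostgreSQL can read it
-- backward per part_id instead of ranking every comment up front.
SELECT p.part_id, ic.comment
FROM moa.part p
LEFT JOIN LATERAL (
    SELECT pc.comment
    FROM moa.part_comment pc
    WHERE pc.part_id = p.part_id
    ORDER BY pc.created_at DESC
    LIMIT 1
) ic ON true
WHERE p.esn_id = 140339;
```

Running EXPLAIN (ANALYZE, BUFFERS) on the test query before and after creating the index should show whether the per-part lookup switches from a scan plus sort to an index scan.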

Postgresql LEFT OUTER JOIN and performance

I have two tables: wells(id, name, extra_name) and geodesies(id, well_id, plot)
For these two tables I run the following query:
EXPLAIN ANALYZE
SELECT
wells.name, geodesies.plot
FROM "geodesies" LEFT OUTER JOIN "wells" ON "wells"."id" = "geodesies"."well_id"
ORDER BY LOWER("wells"."name_nso"), "wells"."extra_name"
LIMIT 10;
Output:
"Limit (cost=1146.27..1146.29 rows=10 width=58) (actual time=64.482..64.488 rows=10 loops=1)"
" -> Sort (cost=1146.27..1176.83 rows=12225 width=58) (actual time=64.480..64.484 rows=10 loops=1)"
" Sort Key: (lower(wells.name_nso)), wells.extra_name"
" Sort Method: top-N heapsort Memory: 27kB"
" -> Hash Left Join (cost=568.17..882.09 rows=12225 width=58) (actual time=11.214..56.280 rows=12225 loops=1)"
" Hash Cond: (geodesies.well_id = wells.id)"
" -> Seq Scan on geodesies (cost=0.00..251.25 rows=12225 width=23) (actual time=0.017..5.533 rows=12225 loops=1)"
" -> Hash (cost=415.30..415.30 rows=12230 width=118) (actual time=11.126..11.127 rows=12230 loops=1)"
" Buckets: 16384 Batches: 1 Memory Usage: 1848kB"
" -> Seq Scan on wells (cost=0.00..415.30 rows=12230 width=118) (actual time=0.009..5.611 rows=12230 loops=1)"
"Planning Time: 0.804 ms"
"Execution Time: 64.544 ms"
This query does not use any index.
If I remove the ORDER BY from the query:
EXPLAIN ANALYZE
SELECT
wells.name, geodesies.plot
FROM "geodesies" LEFT OUTER JOIN "wells" ON "wells"."id" = "geodesies"."well_id"
LIMIT 10;
it uses indexes, and the output looks like this:
"Limit (cost=0.57..2.86 rows=10 width=19) (actual time=0.042..0.146 rows=10 loops=1)"
" -> Merge Left Join (cost=0.57..2794.76 rows=12225 width=19) (actual time=0.040..0.142 rows=10 loops=1)"
" Merge Cond: (geodesies.well_id = wells.id)"
" -> Index Scan using index_geodesies_on_well_id on geodesies (cost=0.29..979.64 rows=12225 width=23) (actual time=0.023..0.056 rows=10 loops=1)"
" -> Index Scan using wells_pkey on wells (cost=0.29..1631.73 rows=12230 width=28) (actual time=0.013..0.069 rows=10 loops=1)"
"Planning Time: 0.654 ms"
"Execution Time: 0.293 ms"
How can I speed up the query with the ORDER BY clause?
Regards
PostgreSQL 13

Speed of Postgres SELECT

I am quite new to optimizing the speed of a SELECT, and the one below is time-consuming. I would be grateful for suggestions to improve its performance.
SELECT DISTINCT p.id "pub_id",
p.submission_year,
pip.level,
mv_s.integer_value "total_citations",
1 "count_pub"
FROM publication p
JOIN organisation_association "oa" ON (oa.publication_id = p.id
AND oa.organisation_id IN (249189578,
249189824))
JOIN bfi_2017 "pip" ON (p.uuid = pip.uuid
AND pip.bfi_score > 0
AND pip.bfi_score IS NOT NULL)
LEFT JOIN metric_value mv_s ON (mv_s.name = 'citations'
AND EXISTS
(SELECT *
FROM publication_metrics pm_s
JOIN metrics m_s ON (m_s.id = pm_s.metrics_id
AND m_s.source_id = 210247389
AND pm_s.publication_id = p.id
AND mv_s.metrics_id = m_s.id)))
WHERE p.peer_review = 'true'
AND (p.type_classification_id IN (57360320,
57360322,
57360324,
57360326,
57360350))
AND p.submission_year = 2017
Execution plan:
"Unique (cost=532129954.32..532286422.32 rows=4084080 width=24) (actual time=1549616.424..1549616.582 rows=699 loops=1)"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Sort (cost=532129954.32..532169071.32 rows=15646800 width=24) (actual time=1549616.422..1549616.445 rows=712 loops=1)"
" Sort Key: p.id, pip.level, mv_s.integer_value"
" Sort Method: quicksort Memory: 80kB"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Nested Loop Left Join (cost=393.40..529618444.45 rows=15646800 width=24) (actual time=1832.122..1549614.196 rows=712 loops=1)"
" Join Filter: (SubPlan 1)"
" Rows Removed by Join Filter: 607313310"
" Buffers: shared read=27411, temp read=1774656 written=2496"
" -> Nested Loop (cost=393.40..8704.01 rows=37 width=16) (actual time=5.470..125.773 rows=712 loops=1)"
" Buffers: shared hit=20313 read=4585"
" -> Hash Join (cost=392.97..7886.65 rows=72 width=16) (actual time=5.160..77.182 rows=3417 loops=1)"
" Hash Cond: ((p.uuid)::text = (pip.uuid)::text)"
" Buffers: shared hit=2 read=3670"
" -> Bitmap Heap Scan on publication p (cost=160.30..7643.44 rows=2618 width=49) (actual time=2.335..67.546 rows=4527 loops=1)"
" Recheck Cond: (submission_year = 2017)"
" Filter: (peer_review AND (type_classification_id = ANY ('{57360320,57360322,57360324,57360326,57360350}'::bigint[])))"
" Rows Removed by Filter: 3975"
" Heap Blocks: exact=3556"
" Buffers: shared hit=2 read=3581"
" -> Bitmap Index Scan on idx_in2ix3rvuzxxf76bsipgn4l4sy (cost=0.00..159.64 rows=8430 width=0) (actual time=1.784..1.784 rows=8502 loops=1)"
" Index Cond: (submission_year = 2017)"
" Buffers: shared read=27"
" -> Hash (cost=181.61..181.61 rows=4085 width=41) (actual time=2.787..2.787 rows=4085 loops=1)"
" Buckets: 4096 Batches: 1 Memory Usage: 324kB"
" Buffers: shared read=89"
" -> Seq Scan on bfi_2017 pip (cost=0.00..181.61 rows=4085 width=41) (actual time=0.029..2.034 rows=4085 loops=1)"
" Filter: ((bfi_score IS NOT NULL) AND (bfi_score > '0'::double precision))"
" Rows Removed by Filter: 3324"
" Buffers: shared read=89"
" -> Index Only Scan using org_ass_publication_idx on organisation_association oa (cost=0.43..11.34 rows=1 width=8) (actual time=0.011..0.012 rows=0 loops=3417)"
" Index Cond: ((publication_id = p.id) AND (organisation_id = ANY ('{249189578,249189824}'::bigint[])))"
" Heap Fetches: 712"
" Buffers: shared hit=20311 read=915"
" -> Materialize (cost=0.00..53679.95 rows=845773 width=12) (actual time=0.012..93.456 rows=852969 loops=712)"
" Buffers: shared read=20873, temp read=1774656 written=2496"
" -> Seq Scan on metric_value mv_s (cost=0.00..45321.09 rows=845773 width=12) (actual time=0.043..470.590 rows=852969 loops=1)"
" Filter: ((name)::text = 'citations'::text)"
" Rows Removed by Filter: 1102878"
" Buffers: shared read=20873"
" SubPlan 1"
" -> Nested Loop (cost=0.85..16.91 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=607313928)"
" Buffers: shared read=1953"
" -> Index Scan using idx_w4wbsbxcqvjmqu64ubjlmqywdy on publication_metrics pm_s (cost=0.43..8.45 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=607313928)"
" Index Cond: (metrics_id = mv_s.metrics_id)"
" Filter: (publication_id = p.id)"
" Rows Removed by Filter: 1"
" -> Index Scan using metrics_pkey on metrics m_s (cost=0.43..8.45 rows=1 width=8) (actual time=0.027..0.027 rows=0 loops=3108)"
" Index Cond: (id = mv_s.metrics_id)"
" Filter: (source_id = 210247389)"
" Rows Removed by Filter: 1"
" Buffers: shared hit=10496 read=1953"
"Planning Time: 1.833 ms"
"Execution Time: 1549621.523 ms"

PostgreSQL Query performance slower with Text column

We are using PostgreSQL 9.5.2.
We have 11 tables with an average of around 10K records each.
One of the tables contains a text column whose maximum content size is 12K characters.
When we exclude the text column from the SELECT statement, the result comes back in around 5 seconds; when we include it, it takes around 55 seconds. Selecting any other column from the same table works fine, but as soon as we include the text column, performance drops sharply.
All tables are inner joined.
Can you please suggest how to solve this?
The EXPLAIN output shows 378 ms, but in practice it takes around 1 minute to get the data.
When we exclude the text column of the "ic" table, we get the result in 4-5 seconds.
"Nested Loop Left Join (cost=4.04..156.40 rows=10 width=616) (actual time=3.092..377.128 rows=24118 loops=1)"
" -> Nested Loop Left Join (cost=3.90..59.92 rows=7 width=603) (actual time=2.834..110.842 rows=14325 loops=1)"
" -> Nested Loop Left Join (cost=3.76..58.56 rows=7 width=604) (actual time=2.832..101.481 rows=12340 loops=1)"
" -> Nested Loop (cost=3.62..57.19 rows=7 width=590) (actual time=2.830..90.614 rows=8436 loops=1)"
" Join Filter: (i."Id" = ic."ImId")"
" -> Nested Loop (cost=3.33..51.42 rows=7 width=210) (actual time=2.807..65.782 rows=8436 loops=1)"
" -> Nested Loop (cost=3.19..50.21 rows=7 width=187) (actual time=2.424..54.596 rows=8436 loops=1)"
" -> Nested Loop (cost=2.77..46.16 rows=7 width=175) (actual time=1.944..32.056 rows=8436 loops=1)"
" -> Nested Loop (cost=2.35..23.66 rows=5 width=87) (actual time=1.750..1.877 rows=4 loops=1)"
" -> Hash Join (cost=2.22..22.84 rows=5 width=55) (actual time=1.492..1.605 rows=4 loops=1)"
" Hash Cond: (i."ImtypId" = it."Id")"
" -> Nested Loop (cost=0.84..21.29 rows=34 width=51) (actual time=1.408..1.507 rows=30 loops=1)"
" -> Nested Loop (cost=0.56..9.68 rows=34 width=35) (actual time=1.038..1.053 rows=30 loops=1)"
" -> Index Only Scan using ev_query on "table_Ev" e (cost=0.28..4.29 rows=1 width=31) (actual time=0.523..0.523 rows=1 loops=1)"
" Index Cond: ("Id" = 1301)"
" Heap Fetches: 0"
" -> Index Only Scan using asmitm_query on "table_AsmItm" ai (cost=0.28..5.07 rows=31 width=8) (actual time=0.499..0.508 rows=30 loops=1)"
" Index Cond: (("AsmId" = e."AsmId") AND ("IsActive" = true))"
" Filter: "IsActive""
" Heap Fetches: 0"
" -> Index Only Scan using itm_query on "table_Itm" i (cost=0.28..0.33 rows=1 width=16) (actual time=0.014..0.014 rows=1 loops=30)"
" Index Cond: ("Id" = ai."ImId")"
" Heap Fetches: 0"
" -> Hash (cost=1.33..1.33 rows=4 width=12) (actual time=0.026..0.026 rows=4 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" -> Seq Scan on "ItmTyp" it (cost=0.00..1.33 rows=4 width=12) (actual time=0.013..0.018 rows=4 loops=1)"
" Filter: ("ParentId" = 12)"
" Rows Removed by Filter: 22"
" -> Index Only Scan using jur_query on "table_Jur" j (cost=0.14..0.15 rows=1 width=36) (actual time=0.065..0.066 rows=1 loops=4)"
" Index Cond: ("Id" = i."JurId")"
" Heap Fetches: 4"
" -> Index Scan using pwsres_evid_ImId_canid_query on "table_PwsRes" p (cost=0.42..3.78 rows=72 width=92) (actual time=0.056..6.562 rows=2109 loops=4)"
" Index Cond: (("EvId" = 1301) AND ("ImId" = i."Id"))"
" -> Index Only Scan using user_query on "table_User" u (cost=0.42..0.57 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=8436)"
" Index Cond: ("Id" = p."CanId")"
" Heap Fetches: 0"
" -> Index Only Scan using ins_query on "table_Ins" ins (cost=0.14..0.16 rows=1 width=31) (actual time=0.001..0.001 rows=1 loops=8436)"
" Index Cond: ("Id" = u."InsId")"
" Heap Fetches: 0"
" -> Index Scan using "IX_ItmCont_ImId" on "table_ItmCont" ic (cost=0.29..0.81 rows=1 width=392) (actual time=0.002..0.002 rows=1 loops=8436)"
" Index Cond: ("ImId" = p."ImId")"
" Filter: ("ContTyp" = 'CP'::text)"
" Rows Removed by Filter: 1"
" -> Index Scan using "IX_FreDetail_FreId" on "table_FreDetail" f (cost=0.14..0.18 rows=2 width=22) (actual time=0.000..0.001 rows=1 loops=8436)"
" Index Cond: ("FreId" = p."FreId")"
" -> Index Scan using "IX_DurDetail_DurId" on "table_DurDetail" d (cost=0.14..0.17 rows=2 width=7) (actual time=0.000..0.000 rows=0 loops=12340)"
" Index Cond: ("DurId" = p."DurId")"
" -> Index Scan using "IX_DruConsRouteDetail_DruConsRouId" on "table_DruConsRouDetail" dr (cost=0.14..0.18 rows=2 width=21) (actual time=0.001..0.001 rows=1 loops=14325)"
" Index Cond: ("DruConsRouteId" = p."RouteId")"
" SubPlan 1"
" -> Index Only Scan using asm_query on "table_Asm" (cost=0.14..8.16 rows=1 width=26) (actual time=0.001..0.001 rows=1 loops=24118)"
" Index Cond: ("Id" = e."AsmId")"
" Heap Fetches: 24118"
" SubPlan 2"
" -> Seq Scan on "ItmTyp" ity (cost=0.00..1.33 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=24118)"
" Filter: ("Id" = it."ParentId")"
" Rows Removed by Filter: 25"
"Planning time: 47.056 ms"
"Execution time: 378.229 ms"

If EXPLAIN ANALYZE reports 378 ms, that is how long the query itself takes to execute, and there's probably not a lot of room for improvement there. If it takes 1 minute to transfer and load the data, that is the end you need to work on.
If you're trying to view very wide rows in psql or pgAdmin, it can take some time to calculate the row widths or render the HTML, but that has nothing to do with query performance.
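
One way to check where the time goes, sketched for psql. EXPLAIN ANALYZE executes the query but discards the result rows, so on this version its timing does not include converting the rows to output form or sending them to the client. The original query is not shown, so the query below is a stand-in, and the column name "Content" is a placeholder for the text column.

```sql
-- psql meta-commands: \timing reports elapsed time per statement as seen
-- by the client, \o redirects query output to a file.
\timing on

-- Server-side execution only; the result rows are discarded.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ic."ImId", ic."Content"      -- "Content" is a hypothetical column name
FROM "table_ItmCont" ic
WHERE ic."ContTyp" = 'CP';

-- Same query, this time fetching the rows to the client but not printing
-- them to the terminal.
\o /dev/null
SELECT ic."ImId", ic."Content"
FROM "table_ItmCont" ic
WHERE ic."ContTyp" = 'CP';
\o
```

If the second timing is close to the EXPLAIN ANALYZE figure, the delay is probably in how the client tool displays the rows. If it is much larger, the time is likely going into fetching and transferring the text values.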

Query without LIMIT runs faster than query with LIMIT

Why does the same query run slower with LIMIT 100 than without it? Both queries run against the same database, and the result set has fewer than 100 rows.
The original query was generated by Hibernate and had some extra joins. Based on the feedback I got, I simplified the query and ran
VACUUM FULL ANALYZE events
VACUUM FULL ANALYZE resources
But the problem still exists.
Thanks!
explain ANALYZE
SELECT e.id
FROM events e,
resources r
WHERE e.resource_id = r.id
AND (resource_type_id = '19872817' OR resource_type_id = '282')
ORDER BY occurrence_date DESC LIMIT 100
outputs...
"Limit (cost=0.98..86362.46 rows=100 width=12) (actual time=61958.090..185854.425 rows=22 loops=1)"
" -> Nested Loop (cost=0.98..16791263.94 rows=19443 width=12) (actual time=61958.087..185854.392 rows=22 loops=1)"
" -> Index Scan using eventoccurrencedateindex on events e (cost=0.56..2295556.29 rows=31819630 width=16) (actual time=0.028..31770.948 rows=31819491 loops=1)"
" -> Index Scan using resources_pkey on resources r (cost=0.42..0.45 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=31819491)"
" Index Cond: (id = e.resource_id)"
" Filter: ((resource_type_id = 19872817) OR (resource_type_id = 282))"
" Rows Removed by Filter: 1"
"Total runtime: 185854.569 ms"
and
explain ANALYZE
SELECT e.id
FROM events e,
resources r
WHERE e.resource_id = r.id
AND (resource_type_id = '19872817' OR resource_type_id = '282')
ORDER BY occurrence_date DESC
outputs...
"Sort (cost=455353.69..455402.30 rows=19443 width=12) (actual time=1.942..1.947 rows=22 loops=1)"
" Sort Key: e.occurrence_date"
" Sort Method: quicksort Memory: 26kB"
" -> Nested Loop (cost=42.30..453968.67 rows=19443 width=12) (actual time=0.720..1.900 rows=22 loops=1)"
" -> Bitmap Heap Scan on resources r (cost=9.53..309.53 rows=86 width=4) (actual time=0.120..0.306 rows=34 loops=1)"
" Recheck Cond: ((resource_type_id = 19872817) OR (resource_type_id = 282))"
" -> BitmapOr (cost=9.53..9.53 rows=86 width=0) (actual time=0.109..0.109 rows=0 loops=1)"
" -> Bitmap Index Scan on resources_type_fk_index (cost=0.00..4.74 rows=43 width=0) (actual time=0.016..0.016 rows=0 loops=1)"
" Index Cond: (resource_type_id = 19872817)"
" -> Bitmap Index Scan on resources_type_fk_index (cost=0.00..4.74 rows=43 width=0) (actual time=0.092..0.092 rows=34 loops=1)"
" Index Cond: (resource_type_id = 282)"
" -> Bitmap Heap Scan on events e (cost=32.78..5259.29 rows=1582 width=16) (actual time=0.041..0.043 rows=1 loops=34)"
" Recheck Cond: (resource_id = r.id)"
" -> Bitmap Index Scan on events_resource_fk_index (cost=0.00..32.38 rows=1582 width=0) (actual time=0.037..0.037 rows=1 loops=34)"
" Index Cond: (resource_id = r.id)"
"Total runtime: 2.054 ms"

Increasing the limit size to 1000 caused Postgres to use a different plan which worked much faster.
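
For reference, the change described above amounts to the following. The query is the one from the question, rewritten with an explicit JOIN and IN for readability, which is an equivalent form rather than the original text. The plans in the question show that resource_type_id is filtered on resources and occurrence_date is indexed on events, which is why the columns are qualified that way here.

```sql
-- Same query as in the question, with the LIMIT raised from 100 to 1000.
-- With LIMIT 100 the planner chose to walk the occurrence_date index and
-- stop early; it expected 19443 matches but only 22 exist, so it ended up
-- reading all 31.8 million events.
EXPLAIN ANALYZE
SELECT e.id
FROM events e
JOIN resources r ON e.resource_id = r.id
WHERE r.resource_type_id IN (19872817, 282)
ORDER BY e.occurrence_date DESC
LIMIT 1000;
```

The LIMIT at which the planner switches plans depends on its row estimates, so the value 1000 is specific to this data set and may not carry over to other tables.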
