How to improve the efficiency of a query by removing the sub-query that produces a field? - postgresql
I have the following tables and sample data:
https://www.db-fiddle.com/f/hxQ7BGdgDJtQcv5xTChY9u/0
I would like to improve the performance of the query contained in the db-fiddle, ideally by removing the sub-query that produces the success field (the query was taken from ChatGPT output, but ChatGPT was unable to remove this sub-query without destroying the results). How can I do this?
My question to ChatGPT was this:
using the tables in <db-fiddle link>, write a select sql query to
return all cfg_commissioning_tags columns and
dat_commissioning_test_log.success. These tables are joined by
cfg_commissioning_tags.id = dat_commissioning_test_log.tag_id. If the
tag_source is 'plc' and any rows have success = true, return true for
the success field for all matching type_id and relative_tag_path rows.
To the query ChatGPT produced, I added the AND ct_2.device_name != ct_1.device_name condition to the sub-query, which is also required.
The current query, table creation queries, and the current query results are all copied below for posterity:
SELECT
ct_1.device_parent_path
,ct_1.device_name
,ct_1.relative_tag_path
,ct_1.tag_source
,ct_1.type_id
,CASE
WHEN EXISTS (
SELECT 1
FROM dat_commissioning_test_log ctl_2
JOIN cfg_commissioning_tags ct_2 ON ct_2.id = ctl_2.tag_id
WHERE
ct_2.type_id = ct_1.type_id
AND ct_2.relative_tag_path = ct_1.relative_tag_path
AND ct_2.device_name != ct_1.device_name -- without this, it runs super fast, but I need this
AND ctl_2.success = TRUE
AND ct_2.tag_source = 'plc'
) THEN 'true*'
ELSE CASE ctl_1.success WHEN true THEN 'true' ELSE 'false' END
END AS success
FROM cfg_commissioning_tags ct_1
LEFT JOIN dat_commissioning_test_log ctl_1 ON ct_1.id = ctl_1.tag_id
ORDER BY type_id, relative_tag_path;
CREATE TABLE dat_commissioning_test_log
(
id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
tag_id integer,
-- tested_on timestamp with time zone,
success boolean,
-- username character varying(50) COLLATE pg_catalog."default",
-- note character varying(300) COLLATE pg_catalog."default",
CONSTRAINT dat_commissioning_test_log_pkey PRIMARY KEY (id)
);
CREATE TABLE cfg_commissioning_tags
(
id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
-- full_path character varying(400) COLLATE pg_catalog."default",
device_name character varying(50) COLLATE pg_catalog."default",
device_parent_path character varying(400) COLLATE pg_catalog."default",
-- added_on timestamp with time zone,
relative_tag_path character varying(100) COLLATE pg_catalog."default",
-- retired_on timestamp with time zone,
tag_source character varying(10) COLLATE pg_catalog."default",
type_id character varying(400) COLLATE pg_catalog."default",
CONSTRAINT cfg_commissioning_tags_pkey PRIMARY KEY (id)
);
INSERT INTO cfg_commissioning_tags (id, device_name, device_parent_path, relative_tag_path, tag_source, type_id) VALUES
(1, 'PC13A','DUMMY','Run Mins','plc','DOL'),
(2, 'PC12A','DUMMY','Run Mins','plc','DOL'),
(3, 'PC11A','DUMMY','Run Mins','plc','DOL'),
(4, 'PC11A','DUMMY','Status','io','DOL'),
(5, 'PC11A','DUMMY','Alarms/Isolator Tripped','io','DOL'),
(6, 'PC12A','DUMMY','Status','io','DOL');
INSERT INTO dat_commissioning_test_log (tag_id, success) VALUES
(1, true),
(6, true);
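A minimal sanity check over this sample data (my own addition, not part of the fiddle): list which (type_id, relative_tag_path) groups contain at least one successful test on a 'plc'-sourced tag. On the data above it returns only ('DOL', 'Run Mins'), via tag 1 (PC13A), which is why both the PC11A and PC12A 'Run Mins' rows come back as true* in the results below:
-- Sanity check (not part of the fiddle): which (type_id, relative_tag_path)
-- groups have at least one successful test on a 'plc'-sourced tag?
SELECT ct.type_id, ct.relative_tag_path, bool_or(ctl.success) AS any_plc_success
FROM cfg_commissioning_tags ct
JOIN dat_commissioning_test_log ctl ON ctl.tag_id = ct.id
WHERE ct.tag_source = 'plc'
GROUP BY ct.type_id, ct.relative_tag_path;
-- On the sample data this returns a single row: ('DOL', 'Run Mins', true).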
These are the results of the query:
device_parent_path | device_name | relative_tag_path       | tag_source | type_id | success
-------------------+-------------+-------------------------+------------+---------+--------
DUMMY              | PC11A       | Alarms/Isolator Tripped | io         | DOL     | FALSE
DUMMY              | PC13A       | Run Mins                | plc        | DOL     | TRUE
DUMMY              | PC12A       | Run Mins                | plc        | DOL     | true*
DUMMY              | PC11A       | Run Mins                | plc        | DOL     | true*
DUMMY              | PC12A       | Status                  | io         | DOL     | TRUE
DUMMY              | PC11A       | Status                  | io         | DOL     | FALSE
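For reference, one possible direction for eliminating the correlated sub-query entirely is a window-aggregate rewrite. The sketch below is only verified against the sample data above, so treat it as an assumption rather than a drop-in replacement: it flags each joined row that represents a successful 'plc' test, then compares two window sums. If the group total exceeds the row's own device's total, some other device in the same (type_id, relative_tag_path) group has a successful 'plc' test, which is exactly what the EXISTS checks, and each table is scanned once instead of once per output row:
SELECT
     device_parent_path
    ,device_name
    ,relative_tag_path
    ,tag_source
    ,type_id
    ,CASE
        -- successful 'plc' tests in the whole group, minus those belonging
        -- to this row's own device: > 0 means some OTHER device succeeded
        WHEN SUM(plc_success) OVER (PARTITION BY type_id, relative_tag_path)
           - SUM(plc_success) OVER (PARTITION BY type_id, relative_tag_path, device_name) > 0
            THEN 'true*'
        ELSE CASE success WHEN true THEN 'true' ELSE 'false' END
     END AS success
FROM (
    SELECT
         ct_1.device_parent_path
        ,ct_1.device_name
        ,ct_1.relative_tag_path
        ,ct_1.tag_source
        ,ct_1.type_id
        ,ctl_1.success
        -- 1 when this joined row is a successful test on a 'plc' tag
        ,CASE WHEN ct_1.tag_source = 'plc' AND ctl_1.success THEN 1 ELSE 0 END AS plc_success
    FROM cfg_commissioning_tags ct_1
    LEFT JOIN dat_commissioning_test_log ctl_1 ON ct_1.id = ctl_1.tag_id
) s
ORDER BY type_id, relative_tag_path;
One caveat: the window arithmetic treats NULL device_name values as their own partition, whereas device_name != device_name in the original EXISTS never matches a NULL, so the two forms differ if device_name can be NULL.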
Edit:
Here is the EXPLAIN (ANALYZE, VERBOSE, BUFFERS) result:
"Sort (cost=4368188.41..4368208.24 rows=7932 width=188) (actual time=10378.617..10378.916 rows=8108 loops=1)"
" Output: ct_1.id, ct_1.full_path, ct_1.device_name, ct_1.device_parent_path, ct_1.added_on, ct_1.relative_tag_path, ct_1.retired_on, ct_1.tag_source, ct_1.type_id, (CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE ctl_1.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END)"
" Sort Key: ct_1.type_id, ct_1.relative_tag_path"
" Sort Method: quicksort Memory: 2350kB"
" Buffers: shared hit=2895186"
" -> Hash Left Join (cost=60.69..4367674.67 rows=7932 width=188) (actual time=1.991..10357.671 rows=8108 loops=1)"
" Output: ct_1.id, ct_1.full_path, ct_1.device_name, ct_1.device_parent_path, ct_1.added_on, ct_1.relative_tag_path, ct_1.retired_on, ct_1.tag_source, ct_1.type_id, CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE ctl_1.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END"
" Hash Cond: (ct_1.id = ctl_1.tag_id)"
" Buffers: shared hit=2895186"
" -> Seq Scan on public.cfg_commissioning_tags ct_1 (cost=0.00..426.32 rows=7932 width=156) (actual time=0.013..1.313 rows=7932 loops=1)"
" Output: ct_1.id, ct_1.full_path, ct_1.device_name, ct_1.device_parent_path, ct_1.added_on, ct_1.relative_tag_path, ct_1.retired_on, ct_1.tag_source, ct_1.type_id"
" Buffers: shared hit=347"
" -> Hash (cost=40.86..40.86 rows=1586 width=5) (actual time=0.326..0.326 rows=1593 loops=1)"
" Output: ctl_1.success, ctl_1.tag_id"
" Buckets: 2048 Batches: 1 Memory Usage: 79kB"
" Buffers: shared hit=25"
" -> Seq Scan on public.dat_commissioning_test_log ctl_1 (cost=0.00..40.86 rows=1586 width=5) (actual time=0.012..0.171 rows=1593 loops=1)"
" Output: ctl_1.success, ctl_1.tag_id"
" Buffers: shared hit=25"
" SubPlan 1"
" -> Hash Join (cost=505.71..550.57 rows=1 width=0) (actual time=1.267..1.267 rows=0 loops=8108)"
" Inner Unique: true"
" Hash Cond: (ctl_2.tag_id = ct_2.id)"
" Buffers: shared hit=2894814"
" -> Seq Scan on public.dat_commissioning_test_log ctl_2 (cost=0.00..40.86 rows=1521 width=4) (actual time=0.003..0.112 rows=1300 loops=3800)"
" Output: ctl_2.id, ctl_2.tag_id, ctl_2.tested_on, ctl_2.success, ctl_2.username, ctl_2.note"
" Filter: ctl_2.success"
" Rows Removed by Filter: 56"
" Buffers: shared hit=81338"
" -> Hash (cost=505.64..505.64 rows=6 width=4) (actual time=1.183..1.183 rows=98 loops=8108)"
" Output: ct_2.id"
" Buckets: 1024 Batches: 1 Memory Usage: 8kB"
" Buffers: shared hit=2813476"
" -> Seq Scan on public.cfg_commissioning_tags ct_2 (cost=0.00..505.64 rows=6 width=4) (actual time=0.620..1.169 rows=98 loops=8108)"
" Output: ct_2.id"
" Filter: (((ct_2.device_name)::text <> (ct_1.device_name)::text) AND ((ct_2.type_id)::text = (ct_1.type_id)::text) AND ((ct_2.relative_tag_path)::text = (ct_1.relative_tag_path)::text) AND ((ct_2.tag_source)::text = 'plc'::text))"
" Rows Removed by Filter: 7834"
" Buffers: shared hit=2813476"
"Planning Time: 0.382 ms"
"Execution Time: 10379.346 ms"
Edit 2:
EXPLAIN after adding compound indexes:
"Sort (cost=540847.20..540867.03 rows=7932 width=198) (actual time=1142.282..1142.843 rows=7932 loops=1)"
" Output: ct_1.full_path, ct_1.device_parent_path, ct_1.device_name, ct_1.relative_tag_path, ct_1.tag_source, ct_1.type_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, (CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE dat_commissioning_test_log.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END)"
" Sort Key: ct_1.full_path"
" Sort Method: quicksort Memory: 2290kB"
" Buffers: shared hit=778254"
" -> Hash Left Join (cost=149.19..540333.47 rows=7932 width=198) (actual time=1.775..1108.469 rows=7932 loops=1)"
" Output: ct_1.full_path, ct_1.device_parent_path, ct_1.device_name, ct_1.relative_tag_path, ct_1.tag_source, ct_1.type_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE dat_commissioning_test_log.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END"
" Hash Cond: (ct_1.id = dat_commissioning_test_log.tag_id)"
" Buffers: shared hit=778254"
" -> Seq Scan on public.cfg_commissioning_tags ct_1 (cost=0.00..426.32 rows=7932 width=140) (actual time=0.011..0.837 rows=7932 loops=1)"
" Output: ct_1.id, ct_1.full_path, ct_1.device_name, ct_1.device_parent_path, ct_1.added_on, ct_1.relative_tag_path, ct_1.retired_on, ct_1.tag_source, ct_1.type_id"
" Buffers: shared hit=347"
" -> Hash (cost=139.24..139.24 rows=796 width=35) (actual time=1.404..1.404 rows=1417 loops=1)"
" Output: dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, dat_commissioning_test_log.success, dat_commissioning_test_log.tag_id"
" Buckets: 2048 (originally 1024) Batches: 1 (originally 1) Memory Usage: 83kB"
" Buffers: shared hit=50"
" -> Hash Join (cost=85.28..139.24 rows=796 width=35) (actual time=0.938..1.249 rows=1417 loops=1)"
" Output: dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, dat_commissioning_test_log.success, dat_commissioning_test_log.tag_id"
" Inner Unique: true"
" Hash Cond: (dat_commissioning_test_log.id = "ANY_subquery".max)"
" Buffers: shared hit=50"
" -> Seq Scan on public.dat_commissioning_test_log (cost=0.00..40.93 rows=1593 width=39) (actual time=0.009..0.089 rows=1593 loops=1)"
" Output: dat_commissioning_test_log.id, dat_commissioning_test_log.tag_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.success, dat_commissioning_test_log.username, dat_commissioning_test_log.note"
" Buffers: shared hit=25"
" -> Hash (cost=82.78..82.78 rows=200 width=4) (actual time=0.926..0.926 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Buckets: 2048 (originally 1024) Batches: 1 (originally 1) Memory Usage: 66kB"
" Buffers: shared hit=25"
" -> HashAggregate (cost=80.78..82.78 rows=200 width=4) (actual time=0.710..0.804 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Group Key: "ANY_subquery".max"
" Buffers: shared hit=25"
" -> Subquery Scan on "ANY_subquery" (cost=48.90..77.23 rows=1417 width=4) (actual time=0.297..0.475 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Buffers: shared hit=25"
" -> HashAggregate (cost=48.90..63.07 rows=1417 width=8) (actual time=0.297..0.413 rows=1417 loops=1)"
" Output: max(dat_commissioning_test_log_1.id), dat_commissioning_test_log_1.tag_id"
" Group Key: dat_commissioning_test_log_1.tag_id"
" Buffers: shared hit=25"
" -> Seq Scan on public.dat_commissioning_test_log dat_commissioning_test_log_1 (cost=0.00..40.93 rows=1593 width=8) (actual time=0.006..0.090 rows=1593 loops=1)"
" Output: dat_commissioning_test_log_1.id, dat_commissioning_test_log_1.tag_id, dat_commissioning_test_log_1.tested_on, dat_commissioning_test_log_1.success, dat_commissioning_test_log_1.username, dat_commissioning_test_log_1.note"
" Buffers: shared hit=25"
" SubPlan 1"
" -> Hash Join (cost=23.10..68.04 rows=1 width=0) (actual time=0.133..0.133 rows=0 loops=7932)"
" Inner Unique: true"
" Hash Cond: (ctl_2.tag_id = ct_2.id)"
" Buffers: shared hit=777857"
" -> Seq Scan on public.dat_commissioning_test_log ctl_2 (cost=0.00..40.93 rows=1528 width=4) (actual time=0.002..0.098 rows=1301 loops=3796)"
" Output: ctl_2.id, ctl_2.tag_id, ctl_2.tested_on, ctl_2.success, ctl_2.username, ctl_2.note"
" Filter: ctl_2.success"
" Rows Removed by Filter: 56"
" Buffers: shared hit=81286"
" -> Hash (cost=23.02..23.02 rows=6 width=4) (actual time=0.057..0.057 rows=100 loops=7932)"
" Output: ct_2.id"
" Buckets: 1024 Batches: 1 Memory Usage: 8kB"
" Buffers: shared hit=696571"
" -> Index Scan using cfg_commissioning_tags_idx on public.cfg_commissioning_tags ct_2 (cost=0.41..23.02 rows=6 width=4) (actual time=0.016..0.049 rows=100 loops=7932)"
" Output: ct_2.id"
" Index Cond: (((ct_2.type_id)::text = (ct_1.type_id)::text) AND ((ct_2.relative_tag_path)::text = (ct_1.relative_tag_path)::text) AND ((ct_2.tag_source)::text = 'plc'::text))"
" Filter: ((ct_2.device_name)::text <> (ct_1.device_name)::text)"
" Rows Removed by Filter: 1"
" Buffers: shared hit=696571"
"Planning Time: 0.550 ms"
"Execution Time: 1143.359 ms"
Edit 3:
EXPLAIN after replacing the compound index on cfg_commissioning_tags with a covering index:
"Sort (cost=540847.20..540867.03 rows=7932 width=198) (actual time=1152.113..1152.682 rows=7932 loops=1)"
" Output: ct_1.full_path, ct_1.device_parent_path, ct_1.device_name, ct_1.relative_tag_path, ct_1.tag_source, ct_1.type_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, (CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE dat_commissioning_test_log.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END)"
" Sort Key: ct_1.full_path"
" Sort Method: quicksort Memory: 2290kB"
" Buffers: shared hit=784891"
" -> Hash Left Join (cost=149.19..540333.47 rows=7932 width=198) (actual time=2.016..1115.111 rows=7932 loops=1)"
" Output: ct_1.full_path, ct_1.device_parent_path, ct_1.device_name, ct_1.relative_tag_path, ct_1.tag_source, ct_1.type_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, CASE WHEN (SubPlan 1) THEN 'true*'::text ELSE CASE dat_commissioning_test_log.success WHEN CASE_TEST_EXPR THEN 'true'::text ELSE 'false'::text END END"
" Hash Cond: (ct_1.id = dat_commissioning_test_log.tag_id)"
" Buffers: shared hit=784891"
" -> Seq Scan on public.cfg_commissioning_tags ct_1 (cost=0.00..426.32 rows=7932 width=140) (actual time=0.014..0.755 rows=7932 loops=1)"
" Output: ct_1.id, ct_1.full_path, ct_1.device_name, ct_1.device_parent_path, ct_1.added_on, ct_1.relative_tag_path, ct_1.retired_on, ct_1.tag_source, ct_1.type_id"
" Buffers: shared hit=347"
" -> Hash (cost=139.24..139.24 rows=796 width=35) (actual time=1.613..1.613 rows=1417 loops=1)"
" Output: dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, dat_commissioning_test_log.success, dat_commissioning_test_log.tag_id"
" Buckets: 2048 (originally 1024) Batches: 1 (originally 1) Memory Usage: 83kB"
" Buffers: shared hit=50"
" -> Hash Join (cost=85.28..139.24 rows=796 width=35) (actual time=1.117..1.449 rows=1417 loops=1)"
" Output: dat_commissioning_test_log.tested_on, dat_commissioning_test_log.note, dat_commissioning_test_log.success, dat_commissioning_test_log.tag_id"
" Inner Unique: true"
" Hash Cond: (dat_commissioning_test_log.id = "ANY_subquery".max)"
" Buffers: shared hit=50"
" -> Seq Scan on public.dat_commissioning_test_log (cost=0.00..40.93 rows=1593 width=39) (actual time=0.010..0.100 rows=1593 loops=1)"
" Output: dat_commissioning_test_log.id, dat_commissioning_test_log.tag_id, dat_commissioning_test_log.tested_on, dat_commissioning_test_log.success, dat_commissioning_test_log.username, dat_commissioning_test_log.note"
" Buffers: shared hit=25"
" -> Hash (cost=82.78..82.78 rows=200 width=4) (actual time=1.103..1.103 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Buckets: 2048 (originally 1024) Batches: 1 (originally 1) Memory Usage: 66kB"
" Buffers: shared hit=25"
" -> HashAggregate (cost=80.78..82.78 rows=200 width=4) (actual time=0.798..0.940 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Group Key: "ANY_subquery".max"
" Buffers: shared hit=25"
" -> Subquery Scan on "ANY_subquery" (cost=48.90..77.23 rows=1417 width=4) (actual time=0.332..0.549 rows=1417 loops=1)"
" Output: "ANY_subquery".max"
" Buffers: shared hit=25"
" -> HashAggregate (cost=48.90..63.07 rows=1417 width=8) (actual time=0.331..0.482 rows=1417 loops=1)"
" Output: max(dat_commissioning_test_log_1.id), dat_commissioning_test_log_1.tag_id"
" Group Key: dat_commissioning_test_log_1.tag_id"
" Buffers: shared hit=25"
" -> Seq Scan on public.dat_commissioning_test_log dat_commissioning_test_log_1 (cost=0.00..40.93 rows=1593 width=8) (actual time=0.006..0.095 rows=1593 loops=1)"
" Output: dat_commissioning_test_log_1.id, dat_commissioning_test_log_1.tag_id, dat_commissioning_test_log_1.tested_on, dat_commissioning_test_log_1.success, dat_commissioning_test_log_1.username, dat_commissioning_test_log_1.note"
" Buffers: shared hit=25"
" SubPlan 1"
" -> Hash Join (cost=23.10..68.04 rows=1 width=0) (actual time=0.134..0.134 rows=0 loops=7932)"
" Inner Unique: true"
" Hash Cond: (ctl_2.tag_id = ct_2.id)"
" Buffers: shared hit=784494"
" -> Seq Scan on public.dat_commissioning_test_log ctl_2 (cost=0.00..40.93 rows=1528 width=4) (actual time=0.002..0.098 rows=1301 loops=3796)"
" Output: ctl_2.id, ctl_2.tag_id, ctl_2.tested_on, ctl_2.success, ctl_2.username, ctl_2.note"
" Filter: ctl_2.success"
" Rows Removed by Filter: 56"
" Buffers: shared hit=81286"
" -> Hash (cost=23.02..23.02 rows=6 width=4) (actual time=0.057..0.057 rows=100 loops=7932)"
" Output: ct_2.id"
" Buckets: 1024 Batches: 1 Memory Usage: 8kB"
" Buffers: shared hit=703208"
" -> Index Scan using cfg_commissioning_tags_idx on public.cfg_commissioning_tags ct_2 (cost=0.41..23.02 rows=6 width=4) (actual time=0.015..0.049 rows=100 loops=7932)"
" Output: ct_2.id"
" Index Cond: (((ct_2.type_id)::text = (ct_1.type_id)::text) AND ((ct_2.relative_tag_path)::text = (ct_1.relative_tag_path)::text) AND ((ct_2.tag_source)::text = 'plc'::text))"
" Filter: ((ct_2.device_name)::text <> (ct_1.device_name)::text)"
" Rows Removed by Filter: 1"
" Buffers: shared hit=703208"
"Planning Time: 0.514 ms"
"Execution Time: 1153.156 ms"
You need these compound indexes:
CREATE index cfg_commissioning_tags_idx on cfg_commissioning_tags(type_id, relative_tag_path, device_name, tag_source);
CREATE index dat_commissioning_test_log_idx on dat_commissioning_test_log(tag_id, success);
You can extend the first index to include, in addition to the columns used in the WHERE clause and the join, the columns used in the SELECT (in this case we can add device_parent_path to the index). This type of index is called a covering index.
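A sketch of what such a covering index could look like (the exact column list and index name are my assumptions based on the sub-query above; the INCLUDE clause requires PostgreSQL 11+):
-- Assumed shape of a covering index for the EXISTS sub-query: the key
-- columns serve the index condition, and the INCLUDEd payload columns let
-- the planner answer the probe with an index-only scan (no heap fetches).
CREATE INDEX cfg_commissioning_tags_covering_idx
    ON cfg_commissioning_tags (type_id, relative_tag_path, tag_source)
    INCLUDE (device_name, id);
To also cover the outer query, device_parent_path could be added to the INCLUDE list, as mentioned above.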