not using index on more interval postgrasql - postgresql

I have 3 tables the number of record in this table is below.
select count(1) from device_data.raw; -- 33,67,66,948
select count(1) from device_data.internal_sensors; --29,36,05,529
select count(1) from device_data.calibrated;--32,00,97,945
if I run the below query
SELECT
raw."time",
raw.device,
raw.therm_temp,
raw.therm_temp_sd,
raw."temp",
cal.tair,
cal.tair_version,
cal.rh,
i_s.t_u14,
i_s.t_u21from device_data.calibrated cal
JOIN device_data.raw raw ON cal."time" = raw."time"
AND cal.device = raw.device
AND cal."location" = raw."location"
JOIN device_data.internal_sensors i_s ON cal.device = i_s.device
AND cal."location" = i_s."location"
AND cal."time" = i_s.time
WHERE
cal.time >= now() - interval '100 day'
all join conditions and filter condition are pk column.
if I run the above query with interval 10 days. it will use the index and
I am getting the output immediately.
Type: Gather; ; Cost: 1001.85 - 9794496.92
Type: Nested Loop (Inner); ; Cost: 1.85 - 9793496.82
Type: Nested Loop (Inner); ; Cost: 1.15 - 9793488.56
Type: Parallel Index Scan; Rel: calibrated ; Cost: 0.58 - 2553907.35
Type: Index Scan; Rel: internal_sensors ; Cost: 0.57 - 8.11
Type: Index Scan; Rel: raw ; Cost: 0.70 - 8.24`
if I run the same query with interval 100 or above it is not using index.
Type: Gather; ; Cost: 10997796.65 - 22328719.05
Type: Nested Loop (Inner); ; Cost: 10996796.65 - 22327718.95
Type: Parallel Hash Join (Inner); ; Cost: 10996795.96 - 22327702.46
Type: Parallel Seq Scan; Rel: internal_sensors ; Cost: 0.00 - 11039189.85
Type: Parallel Hash; ; Cost: 10831915.70 - 10831915.70
Type: Parallel Seq Scan; Rel: calibrated ; Cost: 0.00 - 10831915.70
Type: Index Scan; Rel: raw ; Cost: 0.70 - 8.24`
I am getting output after 10 min if I give the interval 360 days
takes more than 5 hours.
please help me on how to solve this
explain plan for 10 day interval
Gather (cost=1001.85..9670267.42 rows=1 width=74) (actual time=0.509..745226.202 rows=9067714 loops=1)
Output: raw."time", raw.device, raw.therm_temp, raw.therm_temp_sd, raw.temp, cal.tair, cal.tair_version, cal.rh, i_s.t_u14, i_s.t_u21
Workers Planned: 5
Workers Launched: 5
Buffers: shared hit=99973969 read=4810639
I/O Timings: read=33748395.560
-> Nested Loop (cost=1.85..9669267.32 rows=1 width=74) (actual time=0.200..743715.367 rows=1511286 loops=6)
Output: raw."time", raw.device, raw.therm_temp, raw.therm_temp_sd, raw.temp, cal.tair, cal.tair_version, cal.rh, i_s.t_u14, i_s.t_u21
Inner Unique: true
Join Filter: ((cal."time" = raw."time") AND (cal.device = raw.device) AND (cal.location = raw.location))
Buffers: shared hit=99973969 read=4810639
I/O Timings: read=33748395.560
Worker 0: actual time=0.101..744096.095 rows=1512339 loops=1
Buffers: shared hit=16671138 read=801662
I/O Timings: read=5690629.974
Worker 1: actual time=0.108..744178.103 rows=1512591 loops=1
Buffers: shared hit=16677843 read=802658
I/O Timings: read=5550919.933
Worker 2: actual time=0.712..744165.762 rows=1512841 loops=1
Buffers: shared hit=16679836 read=801947
I/O Timings: read=5665561.534
Worker 3: actual time=0.114..744080.538 rows=1515365 loops=1
Buffers: shared hit=16707275 read=803410
I/O Timings: read=5617567.579
Worker 4: actual time=0.101..744133.690 rows=1505928 loops=1
Buffers: shared hit=16602313 read=802074
I/O Timings: read=5531045.600
-> Nested Loop (cost=1.15..9669259.07 rows=1 width=115) (actual time=0.158..352297.308 rows=1511286 loops=6)
Output: cal.tair, cal.tair_version, cal.rh, cal."time", cal.device, cal.location, i_s.t_u14, i_s.t_u21, i_s.device, i_s.location, i_s."time"
Inner Unique: true
Buffers: shared hit=47300501 read=2905220
I/O Timings: read=7712080.552
Worker 0: actual time=0.079..352191.498 rows=1512340 loops=1
Buffers: shared hit=7886345 read=483831
I/O Timings: read=1308459.669
Worker 1: actual time=0.088..352358.497 rows=1512591 loops=1
Buffers: shared hit=7891226 read=484819
I/O Timings: read=1194998.833
Worker 2: actual time=0.561..352774.681 rows=1512841 loops=1
Buffers: shared hit=7891350 read=484415
I/O Timings: read=1248616.239
Worker 3: actual time=0.089..352490.425 rows=1515366 loops=1
Buffers: shared hit=7904173 read=485468
I/O Timings: read=1349107.087
Worker 4: actual time=0.081..352461.215 rows=1505928 loops=1
Buffers: shared hit=7855768 read=484320
I/O Timings: read=1262171.500
-> Parallel Index Scan Backward using idx_caltimedesc on device_data.calibrated cal (cost=0.58..2521559.02 rows=881210 width=59) (actual time=0.134..15419.582 rows=1547766 loops=6)
Output: cal.tair, cal.tair_version, cal.rh, cal."time", cal.device, cal.location
Index Cond: (cal."time" >= (now() - '10 days'::interval))
Buffers: shared hit=418910 read=1256690
I/O Timings: read=5809154.203
Worker 0: actual time=0.056..15531.181 rows=1548555 loops=1
Buffers: shared hit=69465 read=209576
I/O Timings: read=991509.608
Worker 1: actual time=0.062..15490.237 rows=1549215 loops=1
Buffers: shared hit=70003 read=209910
I/O Timings: read=877883.648
Worker 2: actual time=0.532..15342.439 rows=1549390 loops=1
Buffers: shared hit=69837 read=209064
I/O Timings: read=930936.107
Worker 3: actual time=0.065..15353.238 rows=1551856 loops=1
Buffers: shared hit=70498 read=210653
I/O Timings: read=1031704.646
Worker 4: actual time=0.057..15392.558 rows=1542332 loops=1
Buffers: shared hit=69263 read=209137
I/O Timings: read=944736.112
-> Index Scan using idx_time_dev_loc on device_data.internal_sensors i_s (cost=0.57..8.11 rows=1 width=56) (actual time=0.215..0.215 rows=1 loops=9286595)
Output: i_s.t_u14, i_s.t_u21, i_s.device, i_s.location, i_s."time"
Index Cond: ((i_s."time" = cal."time") AND ((i_s.device)::text = cal.device) AND ((i_s.location)::text = cal.location))
Buffers: shared hit=45733772 read=1648530
I/O Timings: read=1902926.349
Worker 0: actual time=0.215..0.215 rows=1 loops=1548555
Buffers: shared hit=7626337 read=274255
I/O Timings: read=316950.061
Worker 1: actual time=0.215..0.215 rows=1 loops=1549215
Buffers: shared hit=7629547 read=274909
I/O Timings: read=317115.184
Worker 2: actual time=0.216..0.216 rows=1 loops=1549390
Buffers: shared hit=7629938 read=275351
I/O Timings: read=317680.132
Worker 3: actual time=0.215..0.215 rows=1 loops=1551856
Buffers: shared hit=7642542 read=274815
I/O Timings: read=317402.441
Worker 4: actual time=0.216..0.216 rows=1 loops=1542332
Buffers: shared hit=7595028 read=275183
I/O Timings: read=317435.388
-> Index Scan using raw_pk on device_data.raw (cost=0.70..8.24 rows=1 width=64) (actual time=0.256..0.256 rows=1 loops=9067717)
Output: raw."time", raw.device, raw.therm_temp, raw.therm_temp_sd, raw.temp, raw.location
Index Cond: ((raw."time" = i_s."time") AND (raw.device = (i_s.device)::text) AND (raw.location = (i_s.location)::text))
Buffers: shared hit=52673468 read=1905419
I/O Timings: read=26036315.008
Worker 0: actual time=0.256..0.256 rows=1 loops=1512340
Buffers: shared hit=8784793 read=317831
I/O Timings: read=4382170.305
Worker 1: actual time=0.256..0.256 rows=1 loops=1512591
Buffers: shared hit=8786617 read=317839
I/O Timings: read=4355921.100
Worker 2: actual time=0.256..0.256 rows=1 loops=1512841
Buffers: shared hit=8788486 read=317532
I/O Timings: read=4416945.295
Worker 3: actual time=0.256..0.256 rows=1 loops=1515366
Buffers: shared hit=8803102 read=317942
I/O Timings: read=4268460.492
Worker 4: actual time=0.257..0.257 rows=1 loops=1505928
Buffers: shared hit=8746545 read=317754
I/O Timings: read=4268874.101
Planning Time: 4.538 ms
If I run this query it is running only I am not getting any result.(almost it is running 5 hours).
explain (analyse,verbose,buffers)
select raw."time",raw.device,raw.therm_temp,raw.therm_temp_sd,
raw."temp",cal.tair,cal.tair_version,cal.rh
,i_s.t_u14,i_s.t_u21
from device_data.calibrated cal
join device_data.raw raw on cal."time"=raw."time" and cal.device =raw.device and cal."location" =raw."location"
join device_data.internal_sensors i_s on cal.device =i_s.device and cal."location"=i_s."location" and cal."time"=i_s.time
where cal.time >= now()-interval '100 day'

Related

Sudden drop in postgreSQL Memory

Version: 13.6
Hi ,
This is a 2 days old issue. Our Postgresql is hosted on AWS RDS Postgresql. Suddenly our free memory dropped from 1300 MB to 23 MB and swap memory increased from 150 mb to 12OO. MB. CPU was little high, read and write iops were high for 2-3 hours , post that low.
Average row size from tbl1 was 65 bytes and from tbl2 was 350 bytes.
In performance insight we had only one query which was having most wait on CPU. Below is explain plan of the query. I wanted to understand why this query lead to so much of free memory drop and swap memory increase?
Once we created a combined index on (hotel_id, checkin) include(booking_id), free memory became stable.
Query:
database1 => EXPLAIN(ANALYZE, VERBOSE, BUFFERS, COSTS, TIMING)
database1-> select l.*, o.* FROM tbl1 o inner join tbl2 l on o.booking_id = l.booking_id where o.checkin = '2022-09-09' and o.hotel_id = 95315 and l.status = 'YES_RM';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=3014.65..3077.08 rows=1 width=270) (actual time=71.157..71.159 rows=0 loops=1)
Output: l.*,o.*
Inner Unique: true
Buffers: shared hit=3721 read=170 dirtied=4
I/O Timings: read=8.173
-> Bitmap Heap Scan on public.tbl1 o (cost=3014.08..3034.13 rows=5 width=74) (actual time=69.447..70.814 rows=8 loops=1)
Output: o.booking_id, o.checkin, o.created_at, o.hotel_id, o.hotel_time_zone, o.invoice, o.booking_created_at, o.updated_at, o.user_profile_id
Recheck Cond: ((o.hotel_id = 95315) AND (o.checkin = '2022-09-09'::date))
Rows Removed by Index Recheck: 508
Heap Blocks: exact=512
Buffers: shared hit=3683 read=163
I/O Timings: read=8.148
-> BitmapAnd (cost=3014.08..3014.08 rows=5 width=0) (actual time=69.182..69.183 rows=0 loops=1)
Buffers: shared hit=3312 read=19
I/O Timings: read=7.812
-> Bitmap Index Scan on idx33t1pn8h284cj2rg6pa33bxqb (cost=0.00..167.50 rows=6790 width=0) (actual time=8.550..8.550 rows=3909 loops=1)
Index Cond: (o.hotel_id = 95315)
Buffers: shared hit=5 read=19
I/O Timings: read=7.812
-> Bitmap Index Scan on idx6xuah53fdtx4p5afo5cbgul47 (cost=0.00..2846.33 rows=114368 width=0) (actual time=59.391..59.391 rows=297558 loops=1)
Index Cond: (o.checkin = '2022-09-09'::date)
Buffers: shared hit=3307
-> Index Scan using idxicjnq3084849d8vvmnhrornn7 on public. tbl2 l (cost=0.57..8.59 rows=1 width=204) (actual time=0.042..0.042 rows=0 loops=8)
Output: l.*
Index Cond: (l.booking_id = o.booking_id)
Filter: ((l.status)::text = 'YES_RM'::text)
Rows Removed by Filter: 1
Buffers: shared hit=38 read=7 dirtied=4
I/O Timings: read=0.024
Planning:
Buffers: shared hit=335 read=57
I/O Timings: read=0.180
Planning Time: 1.472 ms
Execution Time: 71.254 ms
(34 rows)
After Index creation:
Nested Loop (cost=1.14..65.02 rows=1 width=270) (actual time=6.640..6.640 rows=0 loops=1)
Output: l.*,o.*
Inner Unique: true
Buffers: shared hit=54 read=16 dirtied=4
I/O Timings: read=5.554
-> Index Scan using hotel_id_check_in_index on public.tbl1 (cost=0.57..22.07 rows=5 width=74) (actual time=2.105..5.361 rows=11 loops=1)
Output: o.*
Index Cond: ((o.hotel_id = 95315) AND (o.checkin = '2022-09-09'::date))
Buffers: shared hit=8 read=13 dirtied=4
I/O Timings: read=4.354
-> Index Scan using idxicjnq3084849d8vvmnhrornn7 on public.tbl2 l (cost=0.57..8.59 rows=1 width=204) (actual time=0.116..0.116 rows=0 loops=11)
Output: l.*
Index Cond: (l.booking_id = o.booking_id)
Filter: ((l.tatus)::text = 'YES_RM'::text)
Rows Removed by Filter: 0
Buffers: shared hit=46 read=3
I/O Timings: read=1.199
Planning:
Buffers: shared hit=416
Planning Time: 1.323 ms
Execution Time: 6.667 ms
(21 rows)

Understanding why my Postgres queries are faster without indexes

We've inherited a Postgresql driven datawarehouse with some serious performance issues. We've selected a database for one of our customers and are benchmarking queries. We've picked into the queries and found a couple of common tables which are selected from in all queries, which we believe is at the heart of our poor performance. We're concentrating particularly on cold start performance, when no data is loaded in to the shared buffers as this is common scenario for our customers. We took a large query and stripped it down to its slowest part;
explain (analyze, buffers, costs, format json)
select
poi."SKUId" as "SKUId",
poi."ConvertedLineTotal" as "totalrevenue",
poi."TotalDispachCost" as "totaldispachcost",
poi."Quantity" as "quantitysold"
from public.processedorder o
join public.processedorder_item poi on poi."OrderId" = o."OrderId"
WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01';
And in fact, stripped this down further to a single table which seems particularly slow;
explain (analyze, buffers, costs, format json)
select o."OrderId", o."ChannelId"
from public.processedorder o
WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01'
This processedorders table has around 2.1 million rows with around 35/40k rows being added each month.
The table looks like this- it's fairly wide;
CREATE TABLE public.processedorder (
"OrderId" int4 NOT NULL,
"ChannelId" int4 NOT NULL,
"ShippingId" int4 NOT NULL,
"CountryId" int4 NOT NULL,
"LocationId" int4 NOT NULL,
"PackagingId" int4 NOT NULL,
"ConvertedTotal" numeric(18, 6) NOT NULL,
"ConvertedSubtotal" numeric(18, 6) NOT NULL,
"ConvertedShippingCost" numeric(18, 6) NOT NULL,
"ConvertedShippingTax" numeric(18, 6) NOT NULL,
"ConvertedTax" numeric(18, 6) NOT NULL,
"ConvertedDiscount" numeric(18, 6) NOT NULL,
"ConversionRate" numeric(18, 6) NOT NULL,
"Currency" varchar(3) NOT NULL,
"OriginalTotal" numeric(18, 6) NOT NULL,
"OriginalSubtotal" numeric(18, 6) NOT NULL,
"OriginalShippingCost" numeric(18, 6) NOT NULL,
"OrignalShippingTax" numeric(18, 6) NOT NULL,
"OriginalTax" numeric(18, 6) NOT NULL,
"OriginalDiscount" numeric(18, 6) NOT NULL,
"ReceivedDate" timestamp NOT NULL,
"DispatchByDate" timestamp NOT NULL,
"ProcessedDate" timestamp NOT NULL,
"HoldOrCancel" bool NOT NULL,
"CustomerHash" varchar(100) NOT NULL,
"EmailHash" varchar(100) NOT NULL,
"GetPostalCode" varchar(10) NOT NULL,
"TagId" uuid NOT NULL,
"timestamp" timestamp NOT NULL,
"IsRMA" bool NOT NULL DEFAULT false,
"ConversionType" int4 NOT NULL DEFAULT 0,
"ItemWeight" numeric(18, 6) NULL,
"TotalWeight" numeric(18, 6) NULL,
"PackageWeight" numeric(18, 6) NULL,
"PackageCount" int4 NULL,
CONSTRAINT processedorder_tagid_unique UNIQUE ("TagId")
)
WITH (
fillfactor=50
);
The confusion we have is, on a local copy of the database we run the smallest query with a simple index on receivedDate and it returns the results in 4 seconds-
create INDEX if not exists ix_processedorder_btree_receieveddate ON public.processedorder USING btree ("ReceivedDate" DESC);
execution plan can be seen here https://explain.tensor.ru/archive/explain/639d403ef7bf772f698502ed98ae3f63:0:2021-12-08#explain
Hash Join (cost=166953.18..286705.36 rows=201855 width=19) (actual time=3078.441..4176.251 rows=198552 loops=1)
Hash Cond: (poi."OrderId" = o."OrderId")
Buffers: shared hit=3 read=160623
-> Seq Scan on processedorder_item poi (cost=0.00..108605.28 rows=2434228 width=23) (actual time=0.158..667.435 rows=2434228 loops=1)
Buffers: shared read=84263
-> Hash (cost=164773.85..164773.85 rows=174346 width=4) (actual time=3077.990..3077.991 rows=173668 loops=1)
Buckets: 262144 (originally 262144) Batches: 1 (originally 1) Memory Usage: 8154kB
Buffers: shared read=76360
-> Bitmap Heap Scan on processedorder o (cost=3703.48..164773.85 rows=174346 width=4) (actual time=27.285..3028.721 rows=173668 loops=1)
Recheck Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Heap Blocks: exact=75882
Buffers: shared read=76360
-> Bitmap Index Scan on ix_receiveddate (cost=0.00..3659.89 rows=174346 width=0) (actual time=17.815..17.815 rows=173668 loops=1)
Index Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared read=478
We then apply this index to our staging server (same copy of the DB) and run the query, but this time it takes 44 seconds;
https://explain.tensor.ru/archive/explain/9610c603972ba89aac4e223072f27575:0:2021-12-08
Gather (cost=112168.21..275565.45 rows=174360 width=19) (actual time=42401.776..44549.996 rows=145082 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=149058 read=50771
-> Hash Join (cost=111168.21..257129.45 rows=72650 width=19) (actual time=42397.903..44518.824 rows=48361 loops=3)
Hash Cond: (poi."OrderId" = o."OrderId")
Inner Unique: true
Buffers: shared hit=445001 read=161024
-> Parallel Seq Scan on processedorder_item poi (cost=0.00..117223.50 rows=880850 width=23) (actual time=0.302..1426.753 rows=702469 loops=3)
Filter: ((NOT "ContainsComposites") AND ("SKUId" <> 0))
Rows Removed by Filter: 108940
Buffers: shared read=84260
-> Hash (cost=105532.45..105532.45 rows=173408 width=4) (actual time=42396.156..42396.156 rows=173668 loops=3)
Buckets: 262144 (originally 262144) Batches: 1 (originally 1) Memory Usage: 8154kB
Buffers: shared hit=444920 read=76764
-> Index Scan using ix_processedorder_receieveddate on processedorder o (cost=0.43..105532.45 rows=173408 width=4) (actual time=0.827..42152.428 rows=173668 loops=3)
Index Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared hit=444920 read=76764
Finally, with common sense apparently not working, we just remove the index on the staging server and find it returns the data in 4 seconds (just like our local machine with the index)
https://explain.tensor.ru/archive/explain/486512e19b45d5cbe4b893fdecc434b8:0:2021-12-08
Gather (cost=1000.00..200395.65 rows=177078 width=4) (actual time=2.556..4695.986 rows=173668 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=41868
-> Parallel Seq Scan on processedorder o (cost=0.00..181687.85 rows=73782 width=4) (actual time=0.796..4663.511 rows=57889 loops=3)
Filter: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 642941
Buffers: shared read=151028
Before each query run I execute from SSH:
echo 3 > /proc/sys/vm/drop_caches; service postgresql restart;
We also ran a vacuum full; analyze; before testing.
Is anyone able to explain why this is happening as it makes no sense to us- I would expect the query to perform fastest with the index, given that we are querying a small portion of the data (order records span 9 years and we are selecting just 3 months).
The server itself is Postgres 10.4 running on Amazon AWS E2 i3.2xlarge instance with a couple io2 EBS block store drives running in RAID 0 hosting the psql data.
work_mem is 150MB
shared_buffers is set to 15Gb (60gb server total ram)
effective_io_concurrency = 256
effective_cache_size=45GB
----- Update 1
As per franks suggestion we tried adding a new index which didn't seem to help
Gather (cost=1000.86..218445.54 rows=201063 width=19) (actual time=4.412..61616.676 rows=198552 loops=1)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=239331 read=45293
-> Nested Loop (cost=0.86..197339.24 rows=83776 width=19) (actual time=3.214..61548.378 rows=66184 loops=3)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Buffers: shared hit=748815 read=136548
Worker 0: actual time=2.494..61658.415 rows=65876 loops=1
Buffers: shared hit=247252 read=45606
Worker 1: actual time=3.033..61667.490 rows=69100 loops=1
Buffers: shared hit=262232 read=45649
-> Parallel Index Only Scan using ix_processedorder_btree_receieveddate_orderid on public.processedorder o (cost=0.43..112293.34 rows=72359 width=4) (actual time=1.811..40429.474 rows=57889 loops=3)
Output: o."ReceivedDate", o."OrderId"
Index Cond: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared hit=97131 read=76571
Worker 0: actual time=1.195..40625.809 rows=57420 loops=1
Buffers: shared hit=31847 read=25583
Worker 1: actual time=1.850..40463.813 rows=60469 loops=1
Buffers: shared hit=34898 read=25584
-> Index Scan using ix_processedorder_item_orderid on public.processedorder_item poi (cost=0.43..1.12 rows=2 width=23) (actual time=0.316..0.361 rows=1 loops=173668)
Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
Index Cond: (poi."OrderId" = o."OrderId")
Buffers: shared hit=651684 read=59977
Worker 0: actual time=0.316..0.362 rows=1 loops=57420
Buffers: shared hit=215405 read=20023
Worker 1: actual time=0.303..0.347 rows=1 loops=60469
Buffers: shared hit=227334 read=20065
---- update 2
We've rerun the larger of the two queries
explain (analyze, buffers, verbose, costs, format json)
select
poi."SKUId" as "SKUId",
poi."ConvertedLineTotal" as "totalrevenue",
poi."TotalDispachCost" as "totaldispachcost",
poi."Quantity" as "quantitysold"
from public.processedorder o
join public.processedorder_item poi on poi."OrderId" = o."OrderId"
WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01';
this time with track io timings enabled- we ran this twice, once with an index, and again without the index- these are the plans;
With Index:
https://explain.tensor.ru/archive/explain/d763d7e1754c4ddac8bb61e403b135d2:0:2021-12-09
Gather (cost=111153.17..252254.44 rows=201029 width=19) (actual time=36978.100..39083.705 rows=198552 loops=1)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=138354 read=60159
I/O Timings: read=16453.227
-> Hash Join (cost=110153.17..231151.54 rows=83762 width=19) (actual time=36974.257..39044.116 rows=66184 loops=3)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Hash Cond: (poi."OrderId" = o."OrderId")
Buffers: shared hit=444230 read=160640
I/O Timings: read=37254.816
Worker 0: actual time=36972.516..39140.086 rows=79787 loops=1
Buffers: shared hit=155175 read=48507
I/O Timings: read=9382.648
Worker 1: actual time=36972.438..39138.754 rows=77496 loops=1
Buffers: shared hit=150701 read=51974
I/O Timings: read=11418.941
-> Parallel Seq Scan on public.processedorder_item poi (cost=0.00..114682.68 rows=1014089 width=23) (actual time=0.262..1212.102 rows=811409 loops=3)
Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
Buffers: shared read=84260
I/O Timings: read=1574.316
Worker 0: actual time=0.073..1258.453 rows=870378 loops=1
Buffers: shared read=30133
I/O Timings: read=535.914
Worker 1: actual time=0.021..1256.769 rows=841299 loops=1
Buffers: shared read=29126
I/O Timings: read=538.814
-> Hash (cost=104509.15..104509.15 rows=173662 width=4) (actual time=36972.511..36972.511 rows=173668 loops=3)
Output: o."OrderId"
Buckets: 262144 (originally 262144) Batches: 1 (originally 1) Memory Usage: 8154kB
Buffers: shared hit=444149 read=76380
I/O Timings: read=35680.500
Worker 0: actual time=36970.996..36970.996 rows=173668 loops=1
Buffers: shared hit=155136 read=18374
I/O Timings: read=8846.735
Worker 1: actual time=36970.783..36970.783 rows=173668 loops=1
Buffers: shared hit=150662 read=22848
I/O Timings: read=10880.127
-> Index Scan using ix_processedorder_btree_receieveddate on public.processedorder o (cost=0.43..104509.15 rows=173662 width=4) (actual time=0.617..36736.881 rows=173668 loops=3)
Output: o."OrderId"
Index Cond: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared hit=444149 read=76380
I/O Timings: read=35680.500
Worker 0: actual time=0.018..36741.158 rows=173668 loops=1
Buffers: shared hit=155136 read=18374
I/O Timings: read=8846.735
Worker 1: actual time=0.035..36733.684 rows=173668 loops=1
Buffers: shared hit=150662 read=22848
I/O Timings: read=10880.127
And then, the faster run, without indexes;
https://explain.tensor.ru/archive/explain/d0815f4b0c9baf3bdd512bc94051e768:0:2021-12-09
Gather (cost=231259.20..372360.47 rows=201029 width=19) (actual time=4829.302..7920.614 rows=198552 loops=1)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=106720 read=69983
I/O Timings: read=2431.673
-> Hash Join (cost=230259.20..351257.57 rows=83762 width=19) (actual time=4825.285..7877.981 rows=66184 loops=3)
Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
Hash Cond: (poi."OrderId" = o."OrderId")
Buffers: shared hit=302179 read=235288
I/O Timings: read=7545.729
Worker 0: actual time=4823.181..7978.501 rows=81394 loops=1
Buffers: shared hit=87613 read=93424
I/O Timings: read=2556.079
Worker 1: actual time=4823.645..7979.241 rows=76022 loops=1
Buffers: shared hit=107846 read=71881
I/O Timings: read=2557.977
-> Parallel Seq Scan on public.processedorder_item poi (cost=0.00..114682.68 rows=1014089 width=23) (actual time=0.267..2122.529 rows=811409 loops=3)
Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
Buffers: shared read=84260
I/O Timings: read=4135.860
Worker 0: actual time=0.034..2171.677 rows=865244 loops=1
Buffers: shared read=29949
I/O Timings: read=1394.407
Worker 1: actual time=0.068..2174.408 rows=827170 loops=1
Buffers: shared read=28639
I/O Timings: read=1395.723
-> Hash (cost=224615.18..224615.18 rows=173662 width=4) (actual time=4823.318..4823.318 rows=173668 loops=3)
Output: o."OrderId"
Buckets: 262144 (originally 262144) Batches: 1 (originally 1) Memory Usage: 8154kB
Buffers: shared hit=302056 read=151028
I/O Timings: read=3409.869
Worker 0: actual time=4820.884..4820.884 rows=173668 loops=1
Buffers: shared hit=87553 read=63475
I/O Timings: read=1161.672
Worker 1: actual time=4822.104..4822.104 rows=173668 loops=1
Buffers: shared hit=107786 read=43242
I/O Timings: read=1162.254
-> Seq Scan on public.processedorder o (cost=0.00..224615.18 rows=173662 width=4) (actual time=0.744..4644.291 rows=173668 loops=3)
Output: o."OrderId"
Filter: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 1928823
Buffers: shared hit=302056 read=151028
I/O Timings: read=3409.869
Worker 0: actual time=0.040..4651.203 rows=173668 loops=1
Buffers: shared hit=87553 read=63475
I/O Timings: read=1161.672
Worker 1: actual time=0.035..4637.415 rows=173668 loops=1
Buffers: shared hit=107786 read=43242
I/O Timings: read=1162.254
From your last two execution plans, I gather the following:
the slow plan using the index reads 22848 blocks from disk for processedorder, which takes 10.88 seconds
the fast plan using the sequential scan reads 43242 blocks from disk for processedorder, which takes 1.162 seconds
So there can be two explanations:
in the fast case, these blocks are actually cached in the kernel cache, so it is a caching effect
Experiment by clearing the kernel cache.
random I/O is more expensive than the optimizer thinks it is
In that case, you could consider raising random_page_cost.
That first query would benefit from an index on the data and the id in a single index:
CREATE INDEX ix_processedorder_btree_receieveddate ON public.processedorder USING btree ("ReceivedDate" DESC, "OrderId");
Does that change the query plan?

SELECT query waiting on DataFileWrite in Postgres

I have a partitioned table on which I run both queries and INSERTs simultaneously from multiple sessions. Database is Postgres 14.1.
Running select wait_event_type, wait_event, state, query from pg_stat_activity often reports my queries (plain SELECT on an indexed column) are waiting on DataFileWrite wait_event, type IO.
I don't understand, why would a query wait on file write?
Limit (cost=210.00..210.00 rows=1 width=14) (actual time=3.279..3.281 rows=0 loops=1)
Buffers: shared hit=27 read=7
I/O Timings: read=3.044
-> Sort (cost=210.00..210.11 rows=43 width=14) (actual time=3.278..3.280 rows=0 loops=1)
Sort Key: mldata.sinceepoch DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=27 read=7
I/O Timings: read=3.044
-> Append (cost=0.56..209.79 rows=43 width=14) (actual time=3.274..3.276 rows=0 loops=1)
Buffers: shared hit=27 read=7
I/O Timings: read=3.044
-> Index Scan using mldata_313_destination_idx on mldata_313 mldata_1 (cost=0.56..20.67 rows=4 width=14) (actual time=0.056..0.056 rows=0 loops=1)
Index Cond: (destination = 154328765)
Filter: (day = ANY ('{313,314,315,316,317,318,319,320}'::integer[]))
Buffers: shared hit=6 read=1
I/O Timings: read=0.010
....... a few more of IndexScan .........
Planning:
Buffers: shared hit=1340 read=136 written=11
I/O Timings: read=0.910 **write=0.138**
Planning Time: 3.920 ms
Execution Time: 3.338 ms

Postgresql Limit 1 causing large number of loops

We have a transactions table, approx. 10m rows and growing. Each customer we have specifies many rules which group certain transactions together based on locations, associated products, sale customer, etc. Off of these rules we produce reports each night which allows them to see the price the customer is paying for products vs their purchase prices from different price lists, these lists are changing daily and each date on the transaction we have to either find their set yearly price or the price that was effective at the date of the transaction.
These price lists can change historically and do all the time as do new historic transactions which are added so within each financial year we have to continue to regenerate these reports.
We are having a problem with the two types of price list/price joins we have to do. The first is on the set yearly price list.
I have removed the queries which bring the transactions in and put into a table called transaction_data_6787.
EXPLAIN analyze
SELECT *
FROM transaction_data_6787 t
inner JOIN LATERAL
(
SELECT p."Price"
FROM "Prices" p
INNER JOIN "PriceLists" pl on p."PriceListId" = pl."Id"
WHERE (pl."CustomerId" = 20)
AND (pl."Year" = 2020)
AND (pl."PriceListTypeId" = 2)
AND p."ProductId" = t.product_id
limit 1
) AS prices ON true
Nested Loop (cost=0.70..133877.20 rows=5394 width=165) (actual time=0.521..193.638 rows=5394 loops=1) -> Seq Scan on transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=145) (actual time=0.005..0.593 rows=5394 loops=1) -> Limit (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
-> Nested Loop (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
-> Index Scan using ix_prices_covering on "Prices" p (cost=0.42..8.44 rows=1 width=16) (actual time=0.006..0.015 rows=23 loops=5394)
Index Cond: (("ProductId" = t.product_id))
-> Index Scan using ix_pricelists_covering on "PriceLists" pl (cost=0.28..8.30 rows=1 width=12) (actual time=0.001..0.001 rows=0 loops=122443)
Index Cond: (("Id" = p."PriceListId") AND ("CustomerId" = 20) AND ("PriceListTypeId" = 2))
Filter: ("Year" = 2020)
Rows Removed by Filter: 0 Planning Time: 0.307 ms Execution Time: 193.982 ms
If I remove the LIMIT 1, the execution time drops to 3ms and the 122443 loops on ix_pricelists_covering don't happen. The reason we are doing a lateral join is the price query is dynamically built and sometimes when not joining on the annual price list we join on the effective price lists. This looks like the below:
EXPLAIN analyze
SELECT *
FROM transaction_data_6787 t
inner JOIN LATERAL
(
SELECT p."Price"
FROM "Prices" p
INNER JOIN "PriceLists" pl on p."PriceListId" = pl."Id"
WHERE (pl."CustomerId" = 20)
AND (pl."PriceListTypeId" = 1)
AND p."ProductId" = t.product_id
and pl."ValidFromDate" <= t.transaction_date
ORDER BY pl."ValidFromDate" desc
limit 1
) AS prices ON true
This is killing our performance, some queries are taking 20 seconds plus when we don't order by date desc/limit 1 it completes in ms but we could get duplicate prices back.
We are happy to rewrite if a better way of joining the most recent record. We have thousands of price lists and 100k or prices and there could be 100s if not 1000s of effective prices for each transaction and we need to ensure we get the one which was most recently effective for a product at the date of the transaction.
I have found if I denormalise the price lists/prices into a single table and add an index with ValidFromDate DESC it seems to eliminate the loops but I am hesitant to denormalise and have to maintain that data, these reports can be run adhoc as well as batch jobs and we would have to maintain that data in real time.
Updated Explain/Analyze:
I've added below the query which joins on prices which need to get the most recently effective for the transaction date. I see now that when <= date clause and limit 1 is removed it's actually spinning up multiple workers which is why it seems faster.
I am still seeing the slower query doing a large number of loops, 200k+ (when limit 1/<= date is included).
Maybe the better question is what can we do instead of a lateral join which will allow us to join the effective prices for transactions in the most efficient/performant way possible. I am hoping to avoid denormalising and maintainint that data but if it is the only way we'll do it. If there is a way to rewrite this and not denormalise then I'd really appreciate any insight.
Nested Loop (cost=14.21..76965.60 rows=5394 width=10) (actual time=408.948..408.950 rows=0 loops=1)
Output: t.transaction_id, pr."Price"
Buffers: shared hit=688022
-> Seq Scan on public.transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=29) (actual time=0.018..0.682 rows=5394 loops=1)
Output: t.transaction_id
Buffers: shared hit=106
-> Limit (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
Output: pr."Price", pl."ValidFromDate"
Buffers: shared hit=687916
-> Sort (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
Output: pr."Price", pl."ValidFromDate"
Sort Key: pl."ValidFromDate" DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=687916
-> Nested Loop (cost=0.70..14.20 rows=1 width=10) (actual time=0.074..0.074 rows=0 loops=5394)
Output: pr."Price", pl."ValidFromDate"
Inner Unique: true
Buffers: shared hit=687916
-> Index Only Scan using ix_prices_covering on public."Prices" pr (cost=0.42..4.44 rows=1 width=10) (actual time=0.007..0.019 rows=51 loops=5394)
Output: pr."ProductId", pr."ValidFromDate", pr."Id", pr."Price", pr."PriceListId"
Index Cond: (pr."ProductId" = t.product_id)
Heap Fetches: 0
Buffers: shared hit=17291
-> Index Scan using ix_pricelists_covering on public."PriceLists" pl (cost=0.28..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=273678)
Output: pl."Id", pl."Name", pl."CustomerId", pl."ValidFromDate", pl."PriceListTypeId"
Index Cond: ((pl."Id" = pr."PriceListId") AND (pl."CustomerId" = 20) AND (pl."PriceListTypeId" = 1))
Filter: (pl."ValidFromDate" <= t.transaction_date)
Rows Removed by Filter: 0
Buffers: shared hit=670625
Planning Time: 1.254 ms
Execution Time: 409.088 ms
Gather (cost=6395.67..7011.99 rows=68 width=10) (actual time=92.481..92.554 rows=0 loops=1)
Output: t.transaction_id, pr."Price"
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1466 read=2
-> Hash Join (cost=5395.67..6005.19 rows=28 width=10) (actual time=75.126..75.129 rows=0 loops=3)
Output: t.transaction_id, pr."Price"
Inner Unique: true
Hash Cond: (pr."PriceListId" = pl."Id")
Join Filter: (pl."ValidFromDate" <= t.transaction_date)
Rows Removed by Join Filter: 41090
Buffers: shared hit=1466 read=2
Worker 0: actual time=64.707..64.709 rows=0 loops=1
Buffers: shared hit=462
Worker 1: actual time=72.545..72.547 rows=0 loops=1
Buffers: shared hit=550 read=1
-> Merge Join (cost=5374.09..5973.85 rows=3712 width=18) (actual time=26.804..61.492 rows=91226 loops=3)
Output: t.transaction_id, t.transaction_date, pr."Price", pr."PriceListId"
Merge Cond: (pr."ProductId" = t.product_id)
Buffers: shared hit=1325 read=2
Worker 0: actual time=17.677..51.590 rows=83365 loops=1
Buffers: shared hit=400
Worker 1: actual time=24.995..59.395 rows=103814 loops=1
Buffers: shared hit=488 read=1
-> Parallel Index Only Scan using ix_prices_covering on public."Prices" pr (cost=0.42..7678.38 rows=79544 width=29) (actual time=0.036..12.136 rows=42281 loops=3)
Output: pr."ProductId", pr."ValidFromDate", pr."Id", pr."Price", pr."PriceListId"
Heap Fetches: 0
Buffers: shared hit=989 read=2
Worker 0: actual time=0.037..9.660 rows=36873 loops=1
Buffers: shared hit=285
Worker 1: actual time=0.058..13.459 rows=47708 loops=1
Buffers: shared hit=373 read=1
-> Sort (cost=494.29..507.78 rows=5394 width=29) (actual time=9.037..14.700 rows=94555 loops=3)
Output: t.transaction_id, t.product_id, t.transaction_date
Sort Key: t.product_id
Sort Method: quicksort Memory: 614kB
Worker 0: Sort Method: quicksort Memory: 614kB
Worker 1: Sort Method: quicksort Memory: 614kB
Buffers: shared hit=336
Worker 0: actual time=6.608..12.034 rows=86577 loops=1
Buffers: shared hit=115
Worker 1: actual time=8.973..14.598 rows=107126 loops=1
Buffers: shared hit=115
-> Seq Scan on public.transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=29) (actual time=0.020..2.948 rows=5394 loops=3)
Output: t.transaction_id, t.product_id, t.transaction_date
Buffers: shared hit=318
Worker 0: actual time=0.017..2.078 rows=5394 loops=1
Buffers: shared hit=106
Worker 1: actual time=0.027..2.976 rows=5394 loops=1
Buffers: shared hit=106
-> Hash (cost=21.21..21.21 rows=30 width=8) (actual time=0.145..0.145 rows=35 loops=3)
Output: pl."Id", pl."ValidFromDate"
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=53
Worker 0: actual time=0.137..0.138 rows=35 loops=1
Buffers: shared hit=18
Worker 1: actual time=0.149..0.150 rows=35 loops=1
Buffers: shared hit=18
-> Bitmap Heap Scan on public."PriceLists" pl (cost=4.59..21.21 rows=30 width=8) (actual time=0.067..0.114 rows=35 loops=3)
Output: pl."Id", pl."ValidFromDate"
Recheck Cond: (pl."CustomerId" = 20)
Filter: (pl."PriceListTypeId" = 1)
Rows Removed by Filter: 6
Heap Blocks: exact=15
Buffers: shared hit=53
Worker 0: actual time=0.068..0.108 rows=35 loops=1
Buffers: shared hit=18
Worker 1: actual time=0.066..0.117 rows=35 loops=1
Buffers: shared hit=18
-> Bitmap Index Scan on "IX_PriceLists_CustomerId" (cost=0.00..4.58 rows=41 width=0) (actual time=0.049..0.049 rows=41 loops=3)
Index Cond: (pl."CustomerId" = 20)
Buffers: shared hit=8
Worker 0: actual time=0.053..0.054 rows=41 loops=1
Buffers: shared hit=3
Worker 1: actual time=0.048..0.048 rows=41 loops=1
Buffers: shared hit=3
Planning Time: 2.236 ms
Execution Time: 92.814 ms

Simple equijoin between postgreSQL partitions is taking a long time (10 minutes)

We are using postgreSQL database and I am hitting some limits. I have partitions on company_sale_account table
based on company name
We generate a report on accounts matched between the two. Below is the query:
SELECT cpsa1.*
FROM company_sale_account cpsa1
JOIN company_sale_account cpsa2 ON cpsa1.sale_account_id = cpsa2.sale_account_id
WHERE cpsa1.company_name = 'company_a'
AND cpsa2.company_name = 'company_b'
We have setup BTREE indexes on sale_account_id column on both the tables.
This worked fine till recently. Now, we have 10 million rows in
company_a partition and 7 million rows in company_b partition. This query is taking
more than 10 minutes.
Below is the explain plan output for it:
Buffers: shared hit=20125996 read=47811 dirtied=75, temp read=1333427 written=1333427 I/O Timings: read=19619.322
Sort (cost=167950986.43..168904299.23 rows=381325118 width=132) (actual time=517017.334..603691.048 rows=16854094 loops=1)
Sort Key: cpsa1.crm_account_id, ((cpsa1.account_name)::text),((cpsa1.account_owner)::text), ((cpsa1.account_type)::text), cpsa1.is_customer, ((date_part('epoch'::text,cpsa1.created_date))::integer),((hstore_to_json(cpsa1.custom_crm_fields))::tex (...)
Sort Method: external merge Disk: 2862656kB
Buffers: shared hit=20125996 read=47811 dirtied=75, temp read=1333427 written=1333427
I/O Timings: read=19619.322
- Nested Loop (cost=0.00..9331268.39 rows=381325118 width=132) (actual time=1.680..118698.570 rows=16854094 loops=1)
Buffers: shared hit=20125977 read=47811 dirtied=75
I/O Timings: read=19619.322
- Append (cost=0.00..100718.94 rows=2033676 width=33) (actual time=0.014..1783.243 rows=2033675 loops=1)
Buffers: shared hit=75298 dirtied=75
- Seq Scan on company_sale_account cpsa2 (cost=0.00..0.00 rows=1 width=516) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((company_name)::text = 'company_b'::text)
- Seq Scan on company_sale_account_concur cpsa2_1 (cost=0.00..100718.94 rows=2033675 width=33) (actual time=0.013..938.145 rows=2033675 loops=1)
Filter: ((company_name)::text = 'company_b'::text)
Buffers: shared hit=75298 dirtied=75
- Append (cost=0.00..1.97 rows=23 width=355) (actual time=0.034..0.047 rows=8 loops=2033675)
Buffers: shared hit=20050679 read=47811
I/O Timings: read=19619.322
- Seq Scan on company_sale_account cpsa1 (cost=0.00..0.00 rows=1 width=4525) (actual time=0.000..0.000 rows=0 loops=2033675)
Filter: (((company_name)::text = 'company_a'::text) AND ((cpsa2.sale_account_id)::text = (sale_account_id)::text))
- Index Scan using ix_csa_adp_sale_account on company_sale_account_adp cpsa1_1 (cost=0.56..1.97 rows=22 width=165) (actual time=0.033..0.042 rows=8 loops=2033675)
Index Cond: ((sale_account_id)::text = (cpsa2.sale_account_id)::text)
Filter: ((company_name)::text = 'company_a'::text)
Buffers: shared hit=20050679 read=47811
I/O Timings: read=19619.322
Planning time: 30.853 ms
Execution time: 618218.321ms
Do you have any suggestion on how to tune postgres.
Please share your thoughts. It would be a great help to me.