Inordinately slow Nested Loop with join on simple query - postgresql

I'm running the query below against the primary key lt_id (no other index bar the pkey btree) and joining against 1000 ids.
It might just be my lack of experience with Postgres, but it seems to be about an order of magnitude too slow. There are 800k rows in the table in total.
This is a low-spec machine (4 GB of memory), but I still thought it should be faster. The CPU is idle.
EXPLAIN (ANALYZE,BUFFERS) SELECT lt_id FROM "mytable" d INNER JOIN ( VALUES (1839147),(...998 more rows here...),(1756908)) v(id) ON (d.lt_id = v.id);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.42..7743.00 rows=1000 width=4) (actual time=69.852..20743.393 rows=1000 loops=1)
Buffers: shared hit=2395 read=1607
-> Values Scan on "*VALUES*" (cost=0.00..12.50 rows=1000 width=4) (actual time=0.004..4.770 rows=1000 loops=1)
-> Index Only Scan using lt_id_idx on mytable d (cost=0.42..7.73 rows=1 width=4) (actual time=20.732..20.732 rows=1 loops=1000)
Index Cond: (lt_id = "*VALUES*".column1)
Heap Fetches: 1000
Buffers: shared hit=2395 read=1607
Planning Time: 86.284 ms
Execution Time: 20744.223 ms
(9 rows)
psql 11.7. I was using 9 but upgraded to 11.7, with no real difference in speed observed.
free
              total        used        free      shared  buff/cache   available
Mem:        3783732      158076     3400932       55420      224724     3366832
Swap:             0           0           0
Even though it's low spec should it really be taking 20 seconds? In fact many other queries are taking twice as long or more. 20 seconds seems to be the best case scenario. There are a couple of other text columns in the table with some small text articles which I doubt is the issue.
I was previously using the IN operator but observed similar or worse speeds.
I also made a couple of small changes from the default config, but it doesn't seem to make much difference.
work_mem = 32MB
shared_buffers = 512MB
Any ideas if this is expected performance given the machine? Or is there something else I can try?
Edit: I guess what I'm curious about is the time in the actual loop:
actual time=20.732..20.732 rows=1 loops=1000
It seems like the actual time is less than or equal to 1 ms per loop, which in the worst case would be less than 1 second for 1000 iterations, and the other operations also seem negligible. Does this mean the issue is simply I/O, i.e. a slow disk? What would typically be the situation here?
I notice that if I run the query on my desktop, which only has 8 GB of RAM but uses an SSD, the query is massively faster.
Using an SSD is fine of course, but I'd like to know if something in my config or query/setup is not optimal.

As @pifor suggested, I set track_io_timing = on and can see that this is indeed almost entirely I/O slowness:
Nested Loop (cost=0.42..7743.00 rows=1000 width=69) (actual time=0.026..14901.004 rows=1000 loops=1)
Buffers: shared hit=2859 read=1145
I/O Timings: read=14861.578
-> Values Scan on "*VALUES*" (cost=0.00..12.50 rows=1000 width=4) (actual time=0.002..5.497 rows=1000 loops=1)
-> Index Scan using mytable_pkey on mytable d (cost=0.42..7.73 rows=1 width=69) (actual time=14.888..14.888 rows=1 loops=1000)
Index Cond: (lt_id = "*VALUES*".column1)
Buffers: shared hit=2859 read=1145
I/O Timings: read=14861.578
Planning Time: 0.420 ms
Execution Time: 14901.734 ms
(10 rows)
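To put numbers on that, here is a rough sketch in Python that only does arithmetic on the figures from the plan above; nothing in it talks to the database, and the variable names are mine. It assumes that "actual time" on the inner index scan is a per-loop average (which is how EXPLAIN ANALYZE reports it), so it has to be multiplied by loops to get the total.

```python
# Figures copied from the EXPLAIN (ANALYZE, BUFFERS) output above.
io_read_ms = 14861.578    # "I/O Timings: read=" on the Nested Loop node
blocks_read = 1145        # "read=" in "Buffers: shared hit=2859 read=1145"
per_loop_ms = 14.888      # "actual time" of the index scan (average per loop)
loops = 1000              # the index scan runs once per VALUES row
execution_ms = 14901.734  # "Execution Time"

# Average time spent waiting for one block that was not in shared_buffers.
ms_per_read = io_read_ms / blocks_read

# Per-loop time multiplied by loops gives the total time in the index scan.
index_scan_total_ms = per_loop_ms * loops

# Fraction of the whole execution that was spent waiting on reads.
io_share = io_read_ms / execution_ms

print(f"average latency per block read: {ms_per_read:.1f} ms")
print(f"index scan total: {index_scan_total_ms / 1000:.1f} s")
print(f"share of runtime spent waiting on reads: {io_share:.1%}")
```

That works out to roughly 13 ms per block read and more than 99% of the runtime spent waiting on reads. A latency of that size per random read is in the range usually associated with a spinning disk, which would fit with the SSD machine being so much faster.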


Improve Postgres performance

I am new to Postgres and sure I’m doing something wrong.
So I just wondered if anybody had experienced something similar to my experiences below or could point me in the right direction to improve Postgres performance.
My initial goal was to speed up the analytical processing of my Datamarts in various Dashboards by moving from MS SQL Server to Postgres.
To get a sample query to compare speeds I ran query profiler on MS SQL Server whilst referencing a BI dashboard, which produced something similar to this (I know there are redundant columns in the sub query):
SELECT COUNT(*)
FROM (
SELECT
BM.Key_Date, BM.[Actual Date], BM.[Month]
,BM.[Month Number], BM.[Month Year], BM.[No of Working Days]
,SDI.Key_Delivery, SDI.[Order Number], SDI.[Quantity SKU]
,SDI.[Quantity Sales Unit], SDI.[FactSales - GBP], SDI.[NNSA Capsules]
,SFI.[Ship-to], SFI.[Sold-to], SFI.[Sales Force Type], SFI.Region
,SFI.[Top Level Account], SFI.[Customer Organisation]
,EX.Rate
,PDI.[Product Description], PDI.[Product Type Group], PDI.[Product Type],
PDI.[Main Product Categories], PDI.Section, PDI.Family
FROM Fact.SalesDataInvoiced AS SDI
JOIN Dimension.SalesforceInvoiced AS SFI
ON SDI.[Key_Ship-to]=SFI.[Key_Ship-to]
JOIN Dimension.BillingMonth AS BM
ON SDI.[Key_Billing Month]=BM.Key_Date
JOIN Dimension.ProductDataInvoiced AS PDI
ON SDI.[Key_Product Code]=PDI.[Key_Product Code]
CROSS JOIN Dimension.Exchange AS EX
WHERE BM.[Actual Date] BETWEEN '20160101' AND '20211001'
) AS a
GROUP BY [Product Type], [Product Type Group],[Main Product Categories]
I then installed Postgres 14 (on CentOS 8) and MS SQL Server Developer 2017 (on Windows 10) on separate identical laptops and created a database and tables from the same CSV data files to enable the replication of the above query.
Running a Postgres query with indexing performs massively slower than MS SQL without indexing.
Adding indexes to MS SQL produces results almost instantly.
Because of the difference in processing time I even installed Citus with Postgres14 and created Fact.SalesDataInvoiced as a columnar table (This made the processing time worse).
I have played about with memory settings in postgresql.conf but nothing seems to enable speeds comparable to MSSQL.
Explain Analyze shows that despite the indexes it always runs a sequential scan of all tables. Forcing indexed scans doesn't make any difference to processing time.
Would I be right in thinking Postgres would perform significantly better using a cluster and partitioning? Even if this is the case surely a simple query like the one I'm trying to run on a stand alone machine should be faster?
TABLE DETAILS
Dimension.BillingMonth
Records 120,
Primary Key is KeyDate,
Clustered Unique Index on KeyDate
Dimension.Exchange
Records 1
Dimension.ProductDataInvoiced
Records 275563,
Primary Key is KeyProduct,
Clustered Unique Index on KeyProduct
Dimension.SalesforceInvoiced
Records 377414,
Primary Key is KeyShipTo,
Clustered Unique Index on KeyShipTo
Fact.SalesDataInvoiced
Records 43807943,
Non-Clustered Unique Index on KeyShipTo, KeyProduct, KeyBillingMonth
Any help would be appreciated. As previously mentioned, I'm sure I must be missing something obvious.
Many thanks in advance.
David
Thank you for the responses. I have placed additional info below.
I forgot to add that my Postgres performance woes were after I'd carried out a full vacuum and reindex. I performed these maintenance tasks after I had imported the data and created my indexes.
Output after querying pg_indexes
tablename | indexname | indexdef
---------------------+--------------------------+---------------------------------------------------------------
BillingMonth | BillingMonth_pkey | CREATE UNIQUE INDEX BillingMonth_pkey ON public.BillingMonth USING btree (KeyDate)
ProductDataInvoiced | ProductDataInvoiced_pkey | CREATE UNIQUE INDEX ProductDataInvoiced_pkey ON public.ProductDataInvoiced USING btree (KeyProductCode)
SalesforceInvoiced | SalesforceInvoiced_pkey | CREATE UNIQUE INDEX SalesforceInvoiced_pkey ON public.SalesforceInvoiced USING btree (KeyShipTo)
SalesDataInvoiced | CI_SalesData | CREATE INDEX CI_SalesData ON public.SalesDataInvoiced USING btree (KeyShipTo, KeyProductCode, KeyBillingMonth)
Output After running EXPLAIN (ANALYZE, BUFFERS)
Finalize GroupAggregate (cost=1435439.30..1435565.71 rows=480 width=53) (actual time=25960.468..25973.326 rows=31 loops=1)
Group Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Buffers: shared hit=71246 read=859119
-> Gather Merge (cost=1435439.30..1435551.31 rows=960 width=53) (actual time=25960.458..25973.282 rows=89 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=71246 read=859119
-> Sort (cost=1434439.28..1434440.48 rows=480 width=53) (actual time=25956.982..25956.989 rows=30 loops=3)
Sort Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=71246 read=859119
Worker 0: Sort Method: quicksort Memory: 29kB
Worker 1: Sort Method: quicksort Memory: 29kB
-> Partial HashAggregate (cost=1434413.10..1434417.90 rows=480 width=53) (actual time=25956.878..25956.895 rows=30 loops=3)
Group Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Batches: 1 Memory Usage: 49kB
Buffers: shared hit=71230 read=859119
Worker 0: Batches: 1 Memory Usage: 49kB
Worker 1: Batches: 1 Memory Usage: 49kB
-> Parallel Hash Join (cost=62124.74..1327935.46 rows=10647764 width=45) (actual time=285.864..19240.004 rows=14602648 loops=3)
Hash Cond: (sdi."KeyShipTo" = sfi."KeyShipTo")
Buffers: shared hit=71230 read=859119
-> Hash Join (cost=19648.48..1257508.51 rows=10647764 width=49) (actual time=204.794..12862.063 rows=14602648 loops=3)
Hash Cond: (sdi."KeyProductCode" = pdi."KeyProductCode")
Buffers: shared hit=32264 read=859119
-> Hash Join (cost=3.67..1091456.95 rows=10647764 width=8) (actual time=0.143..7076.104 rows=14602648 loops=3)
Hash Cond: (sdi."KeyBillingMonth" = bm."KeyDate")
Buffers: shared hit=197 read=859119
-> Parallel Seq Scan on "SalesData_Invoiced" sdi (cost=0.00..1041846.10 rows=18253310 width=12) (actual time=0.071..2585.596 rows=14602648 loops=3)
Buffers: shared hit=194 read=859119
-> Hash (cost=2.80..2.80 rows=70 width=4) (actual time=0.049..0.050 rows=70 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=3
-> Seq Scan on "BillingMonth" bm (cost=0.00..2.80 rows=70 width=4) (actual time=0.012..0.028 rows=70 loops=3)
Filter: (("ActualDate" >= '2016-01-01'::date) AND ("ActualDate" <= '2021-10-01'::date))
Rows Removed by Filter: 50
Buffers: shared hit=3
-> Hash (cost=16200.27..16200.27 rows=275563 width=49) (actual time=203.237..203.238 rows=275563 loops=3)
Buckets: 524288 Batches: 1 Memory Usage: 26832kB
Buffers: shared hit=32067
-> Nested Loop (cost=0.00..16200.27 rows=275563 width=49) (actual time=0.034..104.143 rows=275563 loops=3)
Buffers: shared hit=32067
-> Seq Scan on "Exchange" ex (cost=0.00..1.01 rows=1 width=0) (actual time=0.024..0.024 rows=1 loops=3)
Buffers: shared hit=3
-> Seq Scan on "ProductData_Invoiced" pdi (cost=0.00..13443.63 rows=275563 width=49) (actual time=0.007..48.176 rows=275563 loops=3)
Buffers: shared hit=32064
-> Parallel Hash (cost=40510.56..40510.56 rows=157256 width=4) (actual time=79.536..79.536 rows=125805 loops=3)
Buckets: 524288 Batches: 1 Memory Usage: 18912kB
Buffers: shared hit=38938
-> Parallel Seq Scan on "Salesforce_Invoiced" sfi (cost=0.00..40510.56 rows=157256 width=4) (actual time=0.011..42.968 rows=125805 loops=3)
Buffers: shared hit=38938
Planning:
Buffers: shared hit=426
Planning Time: 1.936 ms
Execution Time: 25973.709 ms
(55 rows)
Firstly, remember to run VACUUM ANALYZE after rebuilding indexes, or after importing a large amount of data. (VACUUM FULL is mainly useful for letting the OS reclaim disk space, and you'd still need to analyse afterwards, especially after rebuilding indexes.)
It seems from your query that your main table is SalesDataInvoiced (SDI) and that you'd want to use an index on KeyBillingMonth if possible (since it's the main restriction you're placing). In general, you'd also want indexes, at least on the other tables on the columns that are used for the joins.
As the documentation for multi-column indexes in PostgreSQL says:
A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned. For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they'd still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.
In your example, the main column you'd want to use a constraint on (KeyBillingMonth) is in third position, so it's unlikely to be used.
CREATE INDEX CI_SalesData ON public.SalesDataInvoiced
USING btree (KeyShipTo, KeyProductCode, KeyBillingMonth)
Creating this should make it more likely to be used:
CREATE INDEX ON SalesDataInvoiced(KeyBillingMonth);
Then, run VACUUM ANALYZE and try your query again.
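The leading-column rule quoted above can be illustrated with a toy model. This is only a sketch in Python, not how PostgreSQL implements B-trees: a sorted list of tuples stands in for an index on (KeyShipTo, KeyProductCode, KeyBillingMonth), the key ranges are made up, and the function names are hypothetical.

```python
from bisect import bisect_left
from itertools import product

# Toy stand-in for a B-tree on (ship_to, product, month): sorted tuples.
# 100 ship-to keys x 10 product keys x 12 months = 12,000 entries (made up).
index = sorted(product(range(100), range(10), range(12)))

def scan_leading(ship_to):
    """Equality on the leading column: binary-search to one contiguous slice."""
    lo = bisect_left(index, (ship_to,))
    hi = bisect_left(index, (ship_to + 1,))
    return index[lo:hi], hi - lo  # (matching entries, entries visited)

def scan_trailing(month):
    """Constraint only on the last column: every entry must be inspected."""
    matches = [entry for entry in index if entry[2] == month]
    return matches, len(index)  # (matching entries, entries visited)

_, visited_leading = scan_leading(5)
_, visited_trailing = scan_trailing(3)
print("entries visited with a constraint on the first column:", visited_leading)
print("entries visited with a constraint on the third column:", visited_trailing)
```

With a constraint on the first column only the 120 matching entries are visited; with a constraint on the third column alone, all 12,000 are. That is the situation the planner is in with KeyBillingMonth in third position, and why a separate index led by KeyBillingMonth gives it something it can actually use.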
You may also want an index on BillingMonth(ActualDate), but that's not necessarily useful since there seem to be few rows (and most of them are returned in your query).
It's not clear what the BillingMonth table is for. If it's basically about truncating the ActualDate to have the first day of the month, you could for example get rid of the join on BillingMonth and use the constraint on SalesDataInvoiced.KeyBillingMonth directly. For example ... WHERE SDI.KeyBillingMonth BETWEEN '2016-01-01' AND '2021-10-01' ....
As a side note, as far as I know, BETWEEN is inclusive of its upper bound. I'd imagine a query like this is meant to produce monthly statistics, so it probably should not include 2021-10-01 (as written, rows dated 2021-10-01 are included, but the rest of that month is not).

Very slow query planning time with many indexes

I have a table "nodes" with a JSONB-column "data", in which I store various types of information.
The JSON includes pieces of text in different languages, that need to be frequently searched on by end-users. Per language, I therefore create about 4 indices similar to the following (usually with a separate search dictionary for that language)
CREATE INDEX nodes_label_sv_idx
ON nodes
USING GIN (to_tsvector('swedish_text', data #>> '{label,sv}'));
This works fine when only 2 or 3 languages are present, but when adding 20 more languages (each with 4 indices for that language's path into the JSON), the query planner becomes extremely slow for some queries (180 ms), even though those queries are still executing very fast (less than 1ms). The table currently contains about 50K records.
Weird thing is, those queries are simple joins on other columns of the table (unrelated to the "data" column), so the language-related indices are completely irrelevant. Also, the more language-related indices I drop, the faster the planner becomes again.
I completely understand that the planner needs to check all (150+) indices for potential relevance, but 180ms is extreme. Anybody have a suggestion? By the way, the problem only seems to occur when using a view (not when directly using the query underlying the view).
I am using PostgreSQL 13 on Mac & Linux.
Edit:
query:
EXPLAIN (ANALYZE, BUFFERS)
select 1
from ca_nodes can
where (can.owner_id = 168 or can.aco_id = 0)
limit 1;
underlying view:
CREATE VIEW ca_nodes AS
SELECT n.nid, n.owner_id, c.aco_id
FROM nodes n inner join acos c on n.nid = c.nid;
explain (analyze, buffers) output:
Limit (cost=0.58..32.45 rows=1 width=4) (actual time=0.038..0.039 rows=1 loops=1)
Buffers: shared hit=6
-> Merge Join (cost=0.58..15136.78 rows=475 width=4) (actual time=0.037..0.037 rows=1 loops=1)
Merge Cond: (n.nid = c.nid)
Join Filter: ((n.owner_id = 168) OR (c.aco_id = 0))
Buffers: shared hit=6
-> Index Scan using nodes_pkey on nodes n (cost=0.29..12094.35 rows=47604 width=8) (actual time=0.017..0.017 rows=1 loops=1)
Buffers: shared hit=3
-> Index Scan using acos_nid_idx on precalculated_acos c (cost=0.29..2090.35 rows=47604 width=8) (actual time=0.014..0.014 rows=1 loops=1)
Buffers: shared hit=3
Planning:
Buffers: shared hit=83
Planning Time: 180.392 ms
Execution Time: 0.079 ms

Can I have a feedback about my Postgres performance?

this is the query I performed in pgAdmin4:
update point
set grid_id_new=g.grid_id
from grid as g
where (point.region='EMILIA-ROMAGNA' and st_within(point.geom,g.geom))
Point is a 34-million-record table describing point geometries (16 GB, 20 columns).
Grid is a 10-million-record table describing multipolygon geometries (the grid) (4 GB).
I want each row in my point table to be associated with the ID of the grid cell it lies in. The query updated 2.5 million records (since I filter by region) in 24 minutes.
I feel like it took too much time.
These are my computer specs:
Windows 10 PRO/Intel(R) Core(TM) i9-10920X CPU @ 3.50 GHz/RAM 128 GB/953GB SSD(C)+3.4TB HDD(F)
I have installed Postgres 13 and the data folder is on F (I know this may be wrong, so I am planning to move it).
I have also tried to tune the postgresql.conf file but got poor results.
Can someone please explain whether my Postgres performance is as poor as I think? And, if so, how can I make it better? Also, what would be a good postgresql.conf configuration for my hardware?
Update
@jjanes Hi there! It took 8 minutes to run the query you wrote, and this is the result:
QUERY PLAN
Gather (cost=1363.89..273178616690.49 rows=23057026760 width=28) (actual time=76.935..503830.684 rows=2335279 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=18634521 read=2426823
-> Nested Loop (cost=363.89..270872913014.49 rows=9607094483 width=28) (actual time=157.628..503021.991 rows=778426 loops=3)
Buffers: shared hit=18634521 read=2426823
-> Parallel Seq Scan on egon_geom_new (cost=0.00..2657488.69 rows=1064319 width=59) (actual time=1.575..8642.488 rows=855390 loops=3)
Filter: (dsxreg = 'EMILIA-ROMAGNA'::text)
Rows Removed by Filter: 10581246
Buffers: shared hit=259223 read=2225262
-> Bitmap Heap Scan on "6_emilia_grid" (cost=363.89..254491.98 rows=903 width=148) (actual time=0.573..0.573 rows=1 loops=2566171)
Filter: st_within((egon_geom_new.geom_new)::geometry, geom)
Heap Blocks: exact=784879
Buffers: shared hit=18375298 read=201561
-> Bitmap Index Scan on emilia_idx (cost=0.00..363.66 rows=9027 width=0) (actual time=0.283..0.283 rows=1 loops=2566171)
Index Cond: (geom ~ (egon_geom_new.geom_new)::geometry)
Buffers: shared hit=16167046 read=74534
Planning:
Buffers: shared hit=130 read=3 dirtied=2
Planning Time: 22.756 ms
Execution Time: 504042.609 ms
Thanks!
You can create a GiST index on one of the geometry columns; that will speed up the nested loop join. But you cannot use another join strategy, because the join condition is not using the equality operator (=), so it will always be slow to join two big tables.

Why do seq/index scans take so long when running query after a while? How to make it fast?

Problem:
I have a query that joins three tables. Whenever I run this query after a while (say 24hrs), it would take a lot of time to execute. But from that time onwards, it would execute really fast (~ 70x faster). I wanted to know what's the problem that it takes so long to execute for the first time, and how to solve it.
Table conditions:
The tables are: property_2, property_attribute_2, and property_address_2. Each of which is a partition of a bigger table (i.e. property, property_attribute, and property_address). Also, rows in property_attribute_2 and property_address_2 have reference key to property_2 using column property_id. These columns (property.id, property_attribute_2.property_id, and property_address_2.property_id) are all indexed.
The query is:
select * from public.property_2 a
inner join public.property_attribute_2 b on a.id = b.property_id
left join public.property_address_2 c on a.id=c.property_id
The query plan when I run it after a while is:
Hash Right Join (cost=670010.33..983391.75 rows=2477776 width=185) (actual time=804159.499..1065892.338 rows=2477924 loops=1)
Hash Cond: (c.property_id = a.id)
-> Seq Scan on property_address_2 c (cost=0.00..131660.48 rows=4257948 width=72) (actual time=289.781..247906.955 rows=4257973 loops=1)
-> Hash (cost=595483.13..595483.13 rows=2477776 width=117) (actual time=803833.183..803833.185 rows=2477921 loops=1)
Buckets: 32768 Batches: 128 Memory Usage: 3165kB
-> Hash Join (cost=94193.96..595483.13 rows=2477776 width=117) (actual time=98061.326..802753.642 rows=2477921 loops=1)
Hash Cond: (a.id = b.property_id)
-> Seq Scan on property_2 a (cost=0.00..265463.84 rows=6176884 width=105) (actual time=1349.284..696922.438 rows=4272433 loops=1)
-> Hash (cost=48702.76..48702.76 rows=2477776 width=20) (actual time=95497.307..95497.308 rows=2477921 loops=1)
Buckets: 65536 Batches: 64 Memory Usage: 2624kB
-> Seq Scan on property_attribute_2 b (cost=0.00..48702.76 rows=2477776 width=20) (actual time=464.476..94126.890 rows=2477921 loops=1)
Planning time: 4.034 ms
Execution time: 1065995.827 ms
And the query plan after the first run is:
Hash Right Join (cost=670010.33..983391.75 rows=2477776 width=185) (actual time=8828.873..13764.283 rows=2477924 loops=1)
Hash Cond: (c.property_id = a.id)
-> Seq Scan on property_address_2 c (cost=0.00..131660.48 rows=4257948 width=72) (actual time=0.050..1411.877 rows=4257973 loops=1)
-> Hash (cost=595483.13..595483.13 rows=2477776 width=117) (actual time=8826.620..8826.623 rows=2477921 loops=1)
Buckets: 32768 Batches: 128 Memory Usage: 3165kB
-> Hash Join (cost=94193.96..595483.13 rows=2477776 width=117) (actual time=1356.224..7925.850 rows=2477921 loops=1)
Hash Cond: (a.id = b.property_id)
-> Seq Scan on property_2 a (cost=0.00..265463.84 rows=6176884 width=105) (actual time=0.034..2652.013 rows=4272433 loops=1)
-> Hash (cost=48702.76..48702.76 rows=2477776 width=20) (actual time=1354.828..1354.829 rows=2477921 loops=1)
Buckets: 65536 Batches: 64 Memory Usage: 2624kB
-> Seq Scan on property_attribute_2 b (cost=0.00..48702.76 rows=2477776 width=20) (actual time=0.023..630.081 rows=2477921 loops=1)
Planning time: 1.181 ms
Execution time: 13872.977 ms
Also worth noting that I have a couple of other Postgres databases on this machine and different jobs use different tables on these databases on a regular basis.
If cold cache is the problem, as it seems to be the case, you can warm it up before running the query. Postgres ships with the additional module pg_prewarm providing a range of tools to populate the cache.
Instructions on how to set it up are here:
PostgreSQL: Force data into memory
Then you run something like:
SELECT pg_prewarm('public.property_2', 'prefetch');
SELECT pg_prewarm('public.property_attribute_2', 'prefetch');
SELECT pg_prewarm('public.property_address_2', 'prefetch');
Of course, if you always run the same SELECT query without filter predicates, you might as well just run the same query to populate the cache, without using the fancy module. Possibly scheduled with a cron job?
... are all indexed.
As you can see in the EXPLAIN output, your indexes go unused. You fetch all rows without filter predicate, so indexes typically won't help. And you do it with SELECT *, i.e. get all columns from all joined tables, so index-only scans are out, too. You might improve performance by listing only the columns you actually need in the SELECT list.
Obviously, more RAM (and proper configuration for PostgreSQL buffer cache) can help, too.
Or you might be able to reduce RAM requirements with VACUUM (FULL) or, possibly, with an optimized table definition with proper column types and order. Not just for the tables at hand, also for other tables competing for the same resources (thereby evicting "your" blocks from the cache). See:
Calculating and saving space in PostgreSQL
The difference must be caching: the first time, the data are read from disk, in subsequent runs they are found in RAM. Run EXPLAIN (ANALYZE, BUFFERS) with track_io_timing = on to confirm that.
However, it seems that either your I/O system is really slow or your tables are quite bloated. EXPLAIN (ANALYZE, BUFFERS) would show how many blocks are read, so you would know.
If bloat is indeed your problem, VACUUM (FULL) would help.
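The plans in the question were run without BUFFERS, but a rough read rate for the cold run can still be backed out of them. The sketch below is plain Python arithmetic and rests on assumptions I cannot check from here: default planner constants (seq_page_cost = 1, cpu_tuple_cost = 0.01), the default 8 kB block size, and sequential scans with no filter, in which case the estimated cost is pages * seq_page_cost + rows * cpu_tuple_cost. The function name is mine.

```python
BLOCK_BYTES = 8192     # default PostgreSQL block size (assumed)
SEQ_PAGE_COST = 1.0    # default seq_page_cost (assumed)
CPU_TUPLE_COST = 0.01  # default cpu_tuple_cost (assumed)

# (estimated total cost, estimated rows, cold-run scan time in ms),
# copied from the Seq Scan nodes of the first plan in the question.
scans = {
    "property_2": (265463.84, 6176884, 696922.438),
    "property_address_2": (131660.48, 4257948, 247906.955),
    "property_attribute_2": (48702.76, 2477776, 94126.890),
}

def estimate(cost, rows, elapsed_ms):
    """Back out the page count from the cost, then size and read rate."""
    pages = round((cost - rows * CPU_TUPLE_COST) / SEQ_PAGE_COST)
    mib = pages * BLOCK_BYTES / 2**20
    return pages, mib, mib / (elapsed_ms / 1000)

for name, figures in scans.items():
    pages, mib, rate = estimate(*figures)
    print(f"{name}: ~{pages} pages, ~{mib:.0f} MiB, ~{rate:.1f} MiB/s cold")
```

Under those assumptions all three tables come out at only a few MiB/s on the cold run, which points at the I/O system. The BUFFERS output with track_io_timing = on remains the proper way to confirm it, since the planner's page counts are only as fresh as the last ANALYZE.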

Postgres not using Covering Index with Aggregate

Engine version: 12.4
Postgres wasn't using an index-only scan, so I ran vacuum analyze verbose table_name. After that it started using an index-only scan. Earlier, when I had run analyze verbose table_name without vacuum, the index-only scan wasn't used.
So it seems there is a very heavy dependency on vacuum for getting an index-only plan. Is there any way to eliminate this dependency, or should we schedule vacuum very regularly, e.g. daily?
Our objective is to reduce CPU usage. Overall machine CPU usage is 10%-15% throughout the day, but when this query runs the CPU goes very high (the query runs in multiple threads at the same time with different values).
EXPLAIN ANALYZE SELECT COALESCE(requested_debit, 0) AS requestedDebit, COALESCE(requested_credit, 0) AS requestedCredit
FROM (SELECT COALESCE(Sum(le.credit), 0) AS requested_credit, COALESCE(Sum(le.debit), 0) AS requested_debit
FROM ledger_entries le
WHERE le.accounting_entity_id = 1
AND le.general_ledger_id = 503
AND le.post_date BETWEEN '2020-09-10' AND '2020-11-30') AS requested_le;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on requested_le (cost=66602.65..66602.67 rows=1 width=64) (actual time=81.263..81.352 rows=1 loops=1)
-> Finalize Aggregate (cost=66602.65..66602.66 rows=1 width=64) (actual time=81.261..81.348 rows=1 loops=1)
-> Gather (cost=66602.41..66602.62 rows=2 width=64) (actual time=79.485..81.331 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=65602.41..65602.42 rows=1 width=64) (actual time=74.293..74.294 rows=1 loops=3)
-> Parallel Index Only Scan using post_date_gl_id_accounting_entity_id_include_idx on ledger_entries le (cost=0.56..65203.73 rows=79735 width=8) (actual time=47.874..74.212 rows=197 loops=3)
Index Cond: ((post_date >= '2020-09-10'::date) AND (post_date <= '2020-11-30'::date) AND (general_ledger_id = 503) AND (accounting_entity_id = 1))
Heap Fetches: 0
Planning Time: 0.211 ms
Execution Time: 81.395 ms
(11 rows)
There is a very strong connection between VACUUM and index-only scans in PostgreSQL: an index-only scan can only skip fetching the table row (to check for visibility) if the block containing the row is marked all-visible in the visibility map. And the visibility map is updated by VACUUM.
So yes, you have to VACUUM often to get efficient index-only scans.
Typically, there is no need to schedule a manual VACUUM, you can simply
ALTER TABLE mytab SET (autovacuum_vacuum_scale_factor = 0.01);
(or a similar value) and let autovacuum do the job.
The only case where this is problematic is insert-only tables, because for them autovacuum won't be triggered in PostgreSQL versions below v13. In v13, you can simply change autovacuum_vacuum_insert_scale_factor, while in older versions you will have to set autovacuum_freeze_max_age to a low value for that table.
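To see what lowering autovacuum_vacuum_scale_factor does in practice, here is a small sketch in Python of the documented trigger rule: autovacuum vacuums a table once the number of dead tuples exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of rows). The defaults used are 50 and 0.2; the 10-million-row table size and the function name are made up for illustration.

```python
def vacuum_trigger(reltuples, scale_factor=0.2, threshold=50):
    """Dead tuples needed before autovacuum vacuums the table."""
    return threshold + scale_factor * reltuples

rows = 10_000_000  # hypothetical table size

default_trigger = vacuum_trigger(rows)                   # default scale factor 0.2
tuned_trigger = vacuum_trigger(rows, scale_factor=0.01)  # value suggested above

print(f"default settings: vacuum after ~{default_trigger:,.0f} dead tuples")
print(f"scale factor 0.01: vacuum after ~{tuned_trigger:,.0f} dead tuples")
```

For a table of that size the trigger point drops from about two million dead tuples to about a hundred thousand, so the visibility map is refreshed far more often and index-only scans have fewer heap fetches to do.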