Slow on first query - postgresql

I'm having trouble when I run the first query on a table. Subsequent queries are much faster, even if I change the date range to search. I assume that PostgreSQL implements a caching mechanism that makes the subsequent queries much faster. I could try to warm up the cache so that the first user request hits it. However, I think I can also improve the following query:
SELECT
y.id, y.title, x.visits, x.score
FROM (
SELECT
article_id, visits,
COALESCE(ROUND((visits / NULLIF(hits ,0)::float)::numeric, 4), 0) score
FROM (
SELECT
article_id, SUM(visits) visits, SUM(hits) hits
FROM
article_reports a
WHERE
a.site_id = 'XYZ' AND a.date >= '2017-04-13' AND a.date <= '2017-06-28'
GROUP BY
article_id
) q ORDER BY score DESC, visits DESC LIMIT(20)
) x
INNER JOIN
articles y ON x.article_id = y.id
Any ideas on how I can improve this? The following is the result of EXPLAIN ANALYZE:
Nested Loop (cost=84859.76..85028.54 rows=20 width=272) (actual time=12612.596..12612.836 rows=20 loops=1)
-> Limit (cost=84859.34..84859.39 rows=20 width=52) (actual time=12612.502..12612.517 rows=20 loops=1)
-> Sort (cost=84859.34..84880.26 rows=8371 width=52) (actual time=12612.499..12612.503 rows=20 loops=1)
Sort Key: q.score DESC, q.visits DESC
Sort Method: top-N heapsort Memory: 27kB
-> Subquery Scan on q (cost=84218.04..84636.59 rows=8371 width=52) (actual time=12513.168..12602.649 rows=28965 loops=1)
-> HashAggregate (cost=84218.04..84301.75 rows=8371 width=36) (actual time=12513.093..12536.823 rows=28965 loops=1)
Group Key: a.article_id
-> Bitmap Heap Scan on article_reports a (cost=20122.78..77122.91 rows=405436 width=36) (actual time=135.588..11974.774 rows=398242 loops=1)
Recheck Cond: (((site_id)::text = 'XYZ'::text) AND (date >= '2017-04-13'::date) AND (date <= '2017-06-28'::date))
Heap Blocks: exact=36911
-> Bitmap Index Scan on index_article_reports_on_site_id_and_article_id_and_date (cost=0.00..20021.42 rows=405436 width=0) (actual time=125.846..125.846 rows=398479 loops=1)
Index Cond: (((site_id)::text = 'XYZ'::text) AND (date >= '2017-04-13'::date) AND (date <= '2017-06-28'::date))
-> Index Scan using articles_pkey on articles y (cost=0.42..8.44 rows=1 width=128) (actual time=0.014..0.014 rows=1 loops=20)
Index Cond: (id = q.article_id)
Planning time: 1.443 ms
Execution time: 12613.689 ms
Thanks in advance

There are two levels of "cache" that Postgres uses:
OS file cache
shared buffers.
Important: Postgres directly controls only the second one and relies on the first one, which is under the OS's control.
The first things I would check are these two settings in postgresql.conf:
effective_cache_size - I usually set it to ~3/4 of all available RAM. Note that this setting does not tell Postgres how to allocate memory; it is just a hint that gives the planner an estimate of the OS file cache size.
shared_buffers - I usually set it to 1/4 of RAM. This one is an allocation setting.
Also, I'd check other memory-related settings (work_mem, maintenance_work_mem) to understand how much RAM might be consumed, and therefore whether my effective_cache_size estimate will be correct most of the time.
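As a rough illustration of the rule of thumb above, here is a minimal Python sketch that turns a machine's total RAM into starting values for the two settings. The function name and the GB-based interface are hypothetical, and the 1/4 and 3/4 fractions are only this answer's habit, not official PostgreSQL limits.

```python
def suggest_memory_settings(total_ram_gb):
    """Starting values following the 1/4 and 3/4 rule of thumb (an assumption, not a limit)."""
    return {
        # Memory Postgres actually allocates for its own page cache.
        "shared_buffers_gb": total_ram_gb / 4,
        # Planner hint only: an estimate of how much data the OS file cache can hold.
        "effective_cache_size_gb": total_ram_gb * 3 / 4,
    }


if __name__ == "__main__":
    for name, value in suggest_memory_settings(16).items():
        print(f"{name} = {value:g}")
```

Treat the result as a first guess to be checked against work_mem, maintenance_work_mem and whatever else runs on the same machine.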
But if you have just started Postgres, the first queries will most probably be slow because there is no data yet in the OS file cache or in shared buffers. You can check this with the advanced EXPLAIN options:
EXPLAIN (ANALYZE, BUFFERS) SELECT ...
-- you will see how many buffers were fetched from disk ("read") or from cache ("hit")
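To make those numbers easier to read, here is a small Python sketch that pulls the shared "hit" and "read" counters out of one Buffers line of EXPLAIN (ANALYZE, BUFFERS) output and computes the share of pages found in shared buffers. The helper name is hypothetical. Keep in mind that "read" only means the page was not in shared buffers; it may still have been served from the OS file cache rather than from disk.

```python
import re


def shared_buffer_stats(buffers_line):
    """Return (hit, read, hit_ratio) for the 'shared' part of a Buffers line."""
    # Only look at the segment after 'shared' and before the next comma,
    # so that 'temp read=...' counters are not mixed in.
    shared = re.search(r"shared([^,]*)", buffers_line)
    segment = shared.group(1) if shared else ""
    counts = dict(re.findall(r"(\w+)=(\d+)", segment))
    hit = int(counts.get("hit", 0))
    read = int(counts.get("read", 0))
    total = hit + read
    ratio = hit / total if total else 0.0
    return hit, read, ratio


if __name__ == "__main__":
    hit, read, ratio = shared_buffer_stats("Buffers: shared hit=71246 read=859119")
    print(f"hit={hit} read={read} hit ratio={ratio:.1%}")
```

A low ratio on the first run that rises on later runs is the signature of a cold cache.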
Here you can find good material on using EXPLAIN: http://www.dalibo.org/_media/understanding_explain.pdf
Additionally, there is an extension aimed at the "cold cache" problem: pg_prewarm https://www.postgresql.org/docs/current/static/pgprewarm.html
Also, working with SSD disks instead of magnetic ones will mean that disk reads will be much faster.
Have fun and well working Postgres :-)

If this is the first query after inserting many rows, you should run
ANALYZE
on the whole database or on the tables involved. Try executing it at the database level.

Improve Postgres performance

I am new to Postgres and sure I'm doing something wrong.
So I just wondered if anybody had experienced something similar to my experiences below or could point me in the right direction to improve Postgres performance.
My initial goal was to speed up the analytical processing of my Datamarts in various Dashboards by moving from MS SQL Server to Postgres.
To get a sample query to compare speeds I ran query profiler on MS SQL Server whilst referencing a BI dashboard, which produced something similar to this (I know there are redundant columns in the sub query):
SELECT COUNT(*)
FROM (
SELECT
BM.Key_Date, BM.[Actual Date], BM.[Month]
,BM.[Month Number], BM.[Month Year], BM.[No of Working Days]
,SDI.Key_Delivery, SDI.[Order Number], SDI.[Quantity SKU]
,SDI.[Quantity Sales Unit], SDI.[FactSales - GBP], SDI.[NNSA Capsules]
,SFI.[Ship-to], SFI.[Sold-to], SFI.[Sales Force Type], SFI.Region
,SFI.[Top Level Account], SFI.[Customer Organisation]
,EX.Rate
,PDI.[Product Description], PDI.[Product Type Group], PDI.[Product Type],
PDI.[Main Product Categories], PDI.Section, PDI.Family
FROM Fact.SalesDataInvoiced AS SDI
JOIN Dimension.SalesforceInvoiced AS SFI
ON SDI.[Key_Ship-to]=SFI.[Key_Ship-to]
JOIN Dimension.BillingMonth AS BM
ON SDI.[Key_Billing Month]=BM.Key_Date
JOIN Dimension.ProductDataInvoiced AS PDI
ON SDI.[Key_Product Code]=PDI.[Key_Product Code]
CROSS JOIN Dimension.Exchange AS EX
WHERE BM.[Actual Date] BETWEEN '20160101' AND '20211001'
) AS a
GROUP BY [Product Type], [Product Type Group],[Main Product Categories]
I then installed Postgres 14 (on Centos 8) and MS SQL Server Developer 2017 (on windows 10) on separate identical laptops and created a Database and tables from the same csv data files to enable the replication of the above query.
Running a Postgres query with indexing performs massively slower than MS SQL without indexing.
Adding indexes to MS SQL produces results almost instantly.
Because of the difference in processing time I even installed Citus with Postgres14 and created Fact.SalesDataInvoiced as a columnar table (This made the processing time worse).
I have played about with memory settings in postgresql.conf but nothing seems to enable speeds comparable to MSSQL.
Explain Analyze shows that despite the indexes it always runs a sequential scan of all tables. Forcing indexed scans doesn't make any difference to processing time.
Would I be right in thinking Postgres would perform significantly better using a cluster and partitioning? Even if this is the case surely a simple query like the one I'm trying to run on a stand alone machine should be faster?
TABLE DETAILS
Dimension.BillingMonth
Records 120,
Primary Key is KeyDate,
Clustered Unique Index on KeyDate
Dimension.Exchange
Records 1
Dimension.ProductDataInvoiced
Records 275563,
Primary Key is KeyProduct,
Clustered Unique Index on KeyProduct
Dimension.SalesforceInvoiced
Records 377414,
Primary Key is KeyShipTo,
Clustered Unique Index on KeyShipTo
Fact.SalesDataInvoiced
Records 43807943,
Non-Clustered Unique Index on KeyShipTo, KeyProduct, KeyBillingMonth
Any help would be appreciated as previously mentioned I'm sure I must be missing something obvious.
Many thanks in advance.
David
Thank you for the responses. I have placed additional info below.
I forgot to add that my Postgres performance woes occurred after I'd carried out a VACUUM FULL and a REINDEX. I performed these maintenance tasks after I had imported the data and created my indexes.
Output after querying pg_indexes
tablename | indexname | indexdef
BillingMonth | BillingMonth_pkey | CREATE UNIQUE INDEX BillingMonth_pkey ON public.BillingMonth USING btree (KeyDate)
ProductDataInvoiced | ProductDataInvoiced_pkey | CREATE UNIQUE INDEX ProductDataInvoiced_pkey ON public.ProductDataInvoiced USING btree (KeyProductCode)
SalesforceInvoiced | SalesforceInvoiced_pkey | CREATE UNIQUE INDEX SalesforceInvoiced_pkey ON public.SalesforceInvoiced USING btree (KeyShipTo)
SalesDataInvoiced | CI_SalesData | CREATE INDEX CI_SalesData ON public.SalesDataInvoiced USING btree (KeyShipTo, KeyProductCode, KeyBillingMonth)
Output After running EXPLAIN (ANALYZE, BUFFERS)
Finalize GroupAggregate (cost=1435439.30..1435565.71 rows=480 width=53) (actual time=25960.468..25973.326 rows=31 loops=1)
Group Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Buffers: shared hit=71246 read=859119
-> Gather Merge (cost=1435439.30..1435551.31 rows=960 width=53) (actual time=25960.458..25973.282 rows=89 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=71246 read=859119
-> Sort (cost=1434439.28..1434440.48 rows=480 width=53) (actual time=25956.982..25956.989 rows=30 loops=3)
Sort Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=71246 read=859119
Worker 0: Sort Method: quicksort Memory: 29kB
Worker 1: Sort Method: quicksort Memory: 29kB
-> Partial HashAggregate (cost=1434413.10..1434417.90 rows=480 width=53) (actual time=25956.878..25956.895 rows=30 loops=3)
Group Key: pdi."ProductType", pdi."ProductTypeGroup", pdi."MainProductCategories"
Batches: 1 Memory Usage: 49kB
Buffers: shared hit=71230 read=859119
Worker 0: Batches: 1 Memory Usage: 49kB
Worker 1: Batches: 1 Memory Usage: 49kB
-> Parallel Hash Join (cost=62124.74..1327935.46 rows=10647764 width=45) (actual time=285.864..19240.004 rows=14602648 loops=3)
Hash Cond: (sdi."KeyShipTo" = sfi."KeyShipTo")
Buffers: shared hit=71230 read=859119
-> Hash Join (cost=19648.48..1257508.51 rows=10647764 width=49) (actual time=204.794..12862.063 rows=14602648 loops=3)
Hash Cond: (sdi."KeyProductCode" = pdi."KeyProductCode")
Buffers: shared hit=32264 read=859119
-> Hash Join (cost=3.67..1091456.95 rows=10647764 width=8) (actual time=0.143..7076.104 rows=14602648 loops=3)
Hash Cond: (sdi."KeyBillingMonth" = bm."KeyDate")
Buffers: shared hit=197 read=859119
-> Parallel Seq Scan on "SalesData_Invoiced" sdi (cost=0.00..1041846.10 rows=18253310 width=12) (actual
time=0.071..2585.596 rows=14602648 loops=3)
Buffers: shared hit=194 read=859119
-> Hash (cost=2.80..2.80 rows=70 width=4) (actual time=0.049..0.050 rows=70 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=3
-> Seq Scan on "BillingMonth" bm (cost=0.00..2.80 rows=70 width=4) (actual time=0.012..0.028
rows=70 loops=3)
Filter: (("ActualDate" >= '2016-01-01'::date) AND ("ActualDate" <= '2021-10-01'::date))
Rows Removed by Filter: 50
Buffers: shared hit=3
-> Hash (cost=16200.27..16200.27 rows=275563 width=49) (actual time=203.237..203.238 rows=275563 loops=3)
Buckets: 524288 Batches: 1 Memory Usage: 26832kB
Buffers: shared hit=32067
-> Nested Loop (cost=0.00..16200.27 rows=275563 width=49) (actual time=0.034..104.143 rows=275563 loops=3)
Buffers: shared hit=32067
-> Seq Scan on "Exchange" ex (cost=0.00..1.01 rows=1 width=0) (actual time=0.024..0.024 rows=
1 loops=3)
Buffers: shared hit=3
-> Seq Scan on "ProductData_Invoiced" pdi (cost=0.00..13443.63 rows=275563 width=49) (actual
time=0.007..48.176 rows=275563 loops=3)
Buffers: shared hit=32064
-> Parallel Hash (cost=40510.56..40510.56 rows=157256 width=4) (actual time=79.536..79.536 rows=125805 loops=3)
Buckets: 524288 Batches: 1 Memory Usage: 18912kB
Buffers: shared hit=38938
-> Parallel Seq Scan on "Salesforce_Invoiced" sfi (cost=0.00..40510.56 rows=157256 width=4) (actual time=
0.011..42.968 rows=125805 loops=3)
Buffers: shared hit=38938
Planning:
Buffers: shared hit=426
Planning Time: 1.936 ms
Execution Time: 25973.709 ms
(55 rows)
Firstly, remember to run VACUUM ANALYZE after rebuilding indexes, or sometimes after importing a large amount of data. (VACUUM FULL is mainly useful for returning disk space to the OS, and you'd still need to analyse afterwards, especially after rebuilding indexes.)
It seems from your query that your main table is SalesDataInvoiced (SDI) and that you'd want to use an index on KeyBillingMonth if possible (since it's the main restriction you're placing). In general, you'd also want indexes, at least on the other tables on the columns that are used for the joins.
As the documentation for multi-column indexes in PostgreSQL says:
A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned. For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they'd still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a - but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.
In your example, the main column you'd want to use a constraint on (KeyBillingMonth) is in third position, so it's unlikely to be used.
CREATE INDEX CI_SalesData ON public.SalesDataInvoiced
USING btree (KeyShipTo, KeyProductCode, KeyBillingMonth)
Creating this should make it more likely to be used:
CREATE INDEX ON SalesDataInvoiced(KeyBillingMonth);
Then, run VACUUM ANALYZE and try your query again.
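The leading-column rule quoted from the documentation can be illustrated with a toy model. The Python sketch below is not how PostgreSQL stores a B-tree; it only uses a sorted list of (a, b, c) tuples as a stand-in to show that a constraint on the first column narrows the scan to one contiguous slice, while a constraint on the third column alone forces a walk over every entry. All names and the data set are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_index(n=10):
    """Sorted (a, b, c) tuples, standing in for the entries of an index on (a, b, c)."""
    return sorted((a, b, c) for a in range(n) for b in range(n) for c in range(n))


def scan_with_leading_constraint(index, a):
    """WHERE a = ?: binary search finds one contiguous slice of entries."""
    lo = bisect_left(index, (a,))
    hi = bisect_right(index, (a, float("inf")))
    scanned = hi - lo
    matched = scanned  # every entry in the slice satisfies a = ?
    return scanned, matched


def scan_with_trailing_constraint(index, c):
    """WHERE c = ?: matches are scattered, so every entry has to be visited."""
    matched = sum(1 for entry in index if entry[2] == c)
    return len(index), matched


if __name__ == "__main__":
    idx = build_index()
    print("a = 5 ->", scan_with_leading_constraint(idx, 5))
    print("c = 7 ->", scan_with_trailing_constraint(idx, 7))
```

Both conditions match the same number of entries in this toy data, but the second one visits ten times as many, which is why a separate index that leads with KeyBillingMonth is the more promising option here.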
You may also want an index on BillingMonth(ActualDate), but that's not necessarily useful since there seem to be few rows (and most of them are returned by your query).
It's not clear what the BillingMonth table is for. If it basically truncates ActualDate to the first day of the month, you could, for example, drop the join on BillingMonth and put the constraint on SalesDataInvoiced.KeyBillingMonth directly, for example: ... WHERE SDI.KeyBillingMonth BETWEEN '2016-01-01' AND '2021-10-01' ...
As a side note, BETWEEN is inclusive of its upper bound. I'd imagine a query like this is meant to produce monthly statistics, so it should probably not include 2021-10-01 on its own without the rest of that month.
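To make the side note concrete: x BETWEEN lo AND hi means lo <= x AND x <= hi, so the upper bound itself is included. The Python sketch below mirrors both forms with plain date comparisons; the function names are hypothetical, and the half-open variant is one common way to express "everything before October", not something taken from the original query.

```python
from datetime import date


def between_inclusive(d, lo, hi):
    """Mirrors SQL 'd BETWEEN lo AND hi' (both bounds included)."""
    return lo <= d <= hi


def half_open(d, lo, hi):
    """Mirrors 'd >= lo AND d < hi' (upper bound excluded)."""
    return lo <= d < hi


if __name__ == "__main__":
    lo, hi = date(2016, 1, 1), date(2021, 10, 1)
    for d in (date(2021, 9, 30), date(2021, 10, 1), date(2021, 10, 2)):
        print(d, between_inclusive(d, lo, hi), half_open(d, lo, hi))
```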

LockRows plan node taking long time

I have the following query in Postgres (emulating a work queue):
DELETE FROM work_queue
WHERE id IN ( SELECT l.id
FROM work_queue l
WHERE l.delivered = 'f' and l.error = 'f' and l.archived = 'f'
ORDER BY created_at
LIMIT 5000
FOR UPDATE SKIP LOCKED );
While running the above concurrently (4 processes per second) along with a concurrent ingest of 10K records/second into work_queue, the query effectively bottlenecks on the LockRows node.
Query plan output:
Delete on work_queue (cost=478.39..39609.09 rows=5000 width=67) (actual time=38734.995..38734.998 rows=0 loops=1)
-> Nested Loop (cost=478.39..39609.09 rows=5000 width=67) (actual time=36654.711..38507.393 rows=5000 loops=1)
-> HashAggregate (cost=477.96..527.96 rows=5000 width=98) (actual time=36654.690..36658.495 rows=5000 loops=1)
Group Key: "ANY_subquery".id
-> Subquery Scan on "ANY_subquery" (cost=0.43..465.46 rows=5000 width=98) (actual time=36600.963..36638.250 rows=5000 loops=1)
-> Limit (cost=0.43..415.46 rows=5000 width=51) (actual time=36600.958..36635.886 rows=5000 loops=1)
-> LockRows (cost=0.43..111701.83 rows=1345680 width=51) (actual time=36600.956..36635.039 rows=5000 loops=1)
-> Index Scan using work_queue_created_at_idx on work_queue l (cost=0.43..98245.03 rows=1345680 width=51) (actual time=779.706..2690.340 rows=250692 loops=1)
Filter: ((NOT delivered) AND (NOT error) AND (NOT archived))
-> Index Scan using work_queue_pkey on work_queue (cost=0.43..7.84 rows=1 width=43) (actual time=0.364..0.364 rows=1 loops=5000)
Index Cond: (id = "ANY_subquery".id)
Planning Time: 8.424 ms
Trigger for constraint work_queue_logs_work_queue_id_fkey: time=5490.925 calls=5000
Trigger work_queue_locked_trigger: time=2119.540 calls=1
Execution Time: 46346.471 ms
(corresponding visualization: https://explain.dalibo.com/plan/ZaZ)
Any ideas on improving this? Why should locking rows take so long in the presence of concurrent inserts? Note that if I do not have concurrent inserts into the work_queue table, the query is super fast.
We can see that the index scan returned 250692 rows in order to find 5000 to lock. So apparently we had to skip over 49 other queries' worth of locked rows. That is not going to be very efficient, although if static it shouldn't be as slow as you see here. But it has to acquire a transient exclusive lock on a section of memory for each attempt. If it is fighting with many other processes for those locks, you can get a cascading collapse of performance.
If you are launching 4 such statements per second with no cap and without waiting for any previous ones to finish, then you have an unstable situation. The more you have running at one time, the more they fight each other and slow down. If the completion rate goes down but the launch interval does not, then you just get more processes fighting with more other processes and each getting slower. So once you get shoved over the edge, it might never recover on its own.
The role of concurrent insertions is probably just to provide enough noisy load on the system to give the collapse a chance to take a foothold. And of course without concurrent insertion, your deletes are going to run out of things to delete pretty soon, at which point they will be very fast.
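The arithmetic and the instability argument above can be sketched in a few lines of Python. The first function only restates the numbers from the plan. The second is a deliberately crude, hypothetical model (statement duration grows linearly with the number of statements running at once, with made-up parameters base_seconds and contention) and is not a description of PostgreSQL's locking internals.

```python
def skipped_batches(rows_scanned, batch_size):
    """How many other batches' worth of locked rows were stepped over."""
    return (rows_scanned - batch_size) / batch_size


def steady_state_concurrency(launch_rate, base_seconds, contention):
    """Toy model: duration(n) = base_seconds * (1 + contention * n).

    With a fixed launch rate, the number in flight is n = launch_rate * duration(n).
    Solving for n gives a finite answer only while
    launch_rate * base_seconds * contention < 1; otherwise the backlog
    keeps growing and the function returns None.
    """
    load = launch_rate * base_seconds
    feedback = load * contention
    if feedback >= 1:
        return None  # no stable operating point
    return load / (1 - feedback)


if __name__ == "__main__":
    print(skipped_batches(250692, 5000))
    print(steady_state_concurrency(4, 0.1, 0.5))  # fast statements: stable
    print(steady_state_concurrency(4, 1.0, 0.5))  # slow statements: unstable
```

In this toy model the usual remedies are to cap the number of concurrent workers or to wait for a statement to finish before launching the next one, so that the launch rate falls when completions slow down.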

Large COUNT DISTINCT performs slowly in postgresql

I am running a big count(DISTINCT) GROUP BY query against a table on PostgreSQL 12. The table is roughly 32GB, 300MM rows. It is partitioned by YEAR. The groups are more or less evenly distributed:
EXPLAIN (ANALYZE,BUFFERS)
SELECT
date_trunc('month', condition_start_date::timestamp) as dt,
condition_source_value,
COUNT(DISTINCT person_id)
FROM synpuf5.condition_occurrence_yrpart
GROUP BY date_trunc('month', condition_start_date::timestamp), condition_source_value
ORDER BY COUNT(DISTINCT person_id) DESC LIMIT 10;
Here is the output of the query planner:
QUERY PLAN
Limit (cost=50052765961.82..50052765961.85 rows=10 width=21) (actual time=691022.306..691022.308 rows=10 loops=1)
Buffers: shared hit=3062256 read=222453
-> Sort (cost=50052765961.82..50052777188.87 rows=4490820 width=21) (actual time=690786.364..690786.364 rows=10 loops=1)
Sort Key: (count(DISTINCT condition_occurrence_yrpart_2007.person_id)) DESC
Sort Method: top-N heapsort Memory: 26kB
Buffers: shared hit=3062256 read=222453
-> GroupAggregate (cost=50049709699.80..50052668916.82 rows=4490820 width=21) (actual time=567099.326..690705.612 rows=360849 loops=1)
Group Key: (date_trunc('month'::text, (condition_occurrence_yrpart_2007.condition_start_date)::timestamp without time zone)), condition_occurrence_yrpart_2007.condition_source_value
Buffers: shared hit=3062253 read=222453
-> Sort (cost=50049709699.80..50050432663.48 rows=289185472 width=17) (actual time=567098.345..619461.044 rows=289182385 loops=1)
Sort Key: (date_trunc('month'::text, (condition_occurrence_yrpart_2007.condition_start_date)::timestamp without time zone)), condition_occurrence_yrpart_2007.condition_source_value
Sort Method: quicksort Memory: 30333184kB
Buffers: shared hit=3062246 read=222453
-> Append (cost=10000000000.00..50009068412.44 rows=289185472 width=17) (actual time=0.065..74222.771 rows=289182385 loops=1)
Buffers: shared hit=3062240 read=222453
-> Seq Scan on condition_occurrence_yrpart_2007 (cost=10000000000.00..10000001125.61 rows=42774 width=17) (actual time=0.064..13.756 rows=42774 loops=1)
Buffers: shared read=484
-> Seq Scan on condition_occurrence_yrpart_2008 (cost=10000000000.00..10002732063.72 rows=103678448 width=17) (actual time=0.039..21209.532 rows=103676930 loops=1)
Buffers: shared hit=954918 read=221969
-> Seq Scan on condition_occurrence_yrpart_2009 (cost=10000000000.00..10003024874.44 rows=114743696 width=17) (actual time=0.142..20191.131 rows=114743002 loops=1)
Buffers: shared hit=1303719
-> Seq Scan on condition_occurrence_yrpart_2010 (cost=10000000000.00..10001864406.36 rows=70720224 width=17) (actual time=0.050..12464.117 rows=70719679 loops=1)
Buffers: shared hit=803603
-> Seq Scan on condition_occurrence_yrpart_2011 (cost=10000000000.00..10000000014.95 rows=330 width=17) (actual time=0.022..0.022 rows=0 loops=1)
I have also heavily configured my postgresql to attempt to fit all the data in memory, including:
shared_buffers = 80GB
work_mem = 32GB
max_worker_processes = 32
max_parallel_workers_per_gather = 16
max_parallel_workers = 32
wal_compression = on
max_wal_size = 8GB
enable_seqscan = off
enable_partitionwise_join = on
enable_partitionwise_aggregate = on
parallel_tuple_cost = 0.01
parallel_setup_cost = 100.0
shared_preload_libraries = 'pg_prewarm'
effective_cache_size = 192GB
The VM I am running is quite a behemoth: 256 GB RAM, 32 cores, and an SSD, which is where the postgres directory is housed...
Several questions here:
Why is it so slow?
Why is it not operating in parallel?
Why does performance not increase when I run again despite pg_prewarm?
Why is memory released when my session ends? I am using prewarm?
Why is it so slow?
Sorting 300 million rows takes a while, even with generous work_mem. More than 9 minutes of the query execution time are spent sorting for the GROUP BY.
Why is it not operating in parallel?
Because sorting cannot be parallelized in PostgreSQL.
Why does performance not increase when I run again despite pg_prewarm?
Because everything is already cached.
Why is memory released when my session ends? I am using prewarm?
The memory that your backend uses certainly will be freed when your session ends. The memory used for shared_buffers will not be freed, because that is the cache shared by all processes in the database. You don't want that memory freed.
This is a heavy query, and it takes some time. I don't think that can be improved.
You don't tell us what the partitioning expression is, but since it probably is not date_trunc('month', condition_start_date::timestamp), you don't get partitionwise aggregation despite enable_partitionwise_aggregate = on. PostgreSQL is not smart enough to infer that it could actually do that (assuming you partition on condition_start_date).
It is slow because doing stuff with 300 million rows takes some time.
It is not operating in parallel I think because COUNT(DISTINCT...) code is very old and has not seen much attention lately. It doesn't know how to use hash aggregation, nor operate in parallel. (In my hands, if I lower parallel_tuple_cost all the way to zero, it does operate in parallel, but the gather is below the massive sort and doesn't do any good. But I'm not working with your real data, so could get different results.)
You can get around the inflexibility of COUNT(DISTINCT...) by doing the DISTINCT and the COUNT in separate steps:
select dt, condition_source_value, count(person_id) from (
SELECT distinct
date_trunc('month', condition_start_date::timestamp) as dt,
condition_source_value,
person_id
FROM condition_occurrence_yrpart
) foo
GROUP BY dt, condition_source_value
ORDER BY COUNT(person_id) DESC LIMIT 10;
It still might not do the parallelization in the right spot, though.
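To check that splitting the DISTINCT from the COUNT returns the same numbers, here is a small Python sketch that runs both query shapes on a tiny in-memory table. It uses SQLite purely as a stand-in, so it says nothing about the plan PostgreSQL would pick; the table name occ, its columns and the sample rows are hypothetical.

```python
import sqlite3

DIRECT = """
    SELECT dt, code, COUNT(DISTINCT person_id)
    FROM occ GROUP BY dt, code ORDER BY dt, code
"""

TWO_STEP = """
    SELECT dt, code, COUNT(person_id)
    FROM (SELECT DISTINCT dt, code, person_id FROM occ) AS foo
    GROUP BY dt, code ORDER BY dt, code
"""


def compare_counts(rows):
    """Run both query shapes on the same rows and return both result sets."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE occ (dt TEXT, code TEXT, person_id INTEGER)")
        conn.executemany("INSERT INTO occ VALUES (?, ?, ?)", rows)
        return conn.execute(DIRECT).fetchall(), conn.execute(TWO_STEP).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("2008-01", "A", 1), ("2008-01", "A", 1), ("2008-01", "A", 2),
        ("2008-02", "A", 1), ("2008-01", "B", 3),
    ]
    direct, two_step = compare_counts(sample)
    print(direct == two_step, direct)
```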
I would rewrite the query like this:
SELECT date_trunc('month', condition_start_date::timestamp) as dt
, condition_source_value
, CNT
FROM (
SELECT CONDITION_START_DATE
, condition_source_value
, COUNT(PERSON_ID) AS CNT
FROM (SELECT CONDITION_START_DATE, CONDITION_SOURCE_VALUE, PERSON_ID
FROM synpuf5.condition_occurrence_yrpart
GROUP BY CONDITION_START_DATE, CONDITION_SOURCE_VALUE, PERSON_ID) A
GROUP BY CONDITION_START_DATE, CONDITION_SOURCE_VALUE
ORDER BY COUNT(PERSON_ID) DESC
LIMIT 10) B;
My answer is almost the same as JJANES' answer, but I think you should reduce the number of DATE_TRUNC calls. In the revised query, PostgreSQL runs the DATE_TRUNC function just 10 times.
The following is the URL explaining the tuning concept of this query.
https://blog.naver.com/naivety1/222371990077

Postgresql 9.x: Index to optimize `xpath_exists` (XMLEXISTS) queries

We have queries of the form
select sum(acol)
where xpath_exists('/Root/KeyValue[Key="val"]/Value//text()', xmlcol)
What index can be built to speed up the where clause?
A btree index created using
create index idx_01 using btree(xpath_exists('/Root/KeyValue[Key="val"]/Value//text()', xmlcol))
does not seem to be used at all.
EDIT
Setting enable_seqscan to off, the query using xpath_exists is much faster (one order of magnitude) and clearly shows using the corresponding index (the btree index built with xpath_exists).
Any clue why PostgreSQL would not use the index and would attempt a much slower sequential scan instead?
Since I do not want to disable sequential scanning globally, I am back to square one and I am happily welcoming suggestions.
EDIT 2 - Explain plans
See below - Cost of first plan (seqscan off) is slightly higher but processing time much faster
b2box=# set enable_seqscan=off;
SET
b2box=# explain analyze
Select count(*)
from B2HEAD.item
where cluster = 'B2BOX' and ( ( xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()', content) ) ) offset 0 limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=22766.63..22766.64 rows=1 width=0) (actual time=606.042..606.042 rows=1 loops=1)
-> Aggregate (cost=22766.63..22766.64 rows=1 width=0) (actual time=606.039..606.039 rows=1 loops=1)
-> Bitmap Heap Scan on item (cost=1058.65..22701.38 rows=26102 width=0) (actual time=3.290..603.823 rows=4085 loops=1)
Filter: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) AND ((cluster)::text = 'B2BOX'::text))
-> Bitmap Index Scan on item_counter_01 (cost=0.00..1052.13 rows=56515 width=0) (actual time=2.283..2.283 rows=4085 loops=1)
Index Cond: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) = true)
Total runtime: 606.136 ms
(7 rows)
plan on explain.depesz.com
b2box=# set enable_seqscan=on;
SET
b2box=# explain analyze
Select count(*)
from B2HEAD.item
where cluster = 'B2BOX' and ( ( xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()', content) ) ) offset 0 limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=22555.71..22555.72 rows=1 width=0) (actual time=10864.163..10864.163 rows=1 loops=1)
-> Aggregate (cost=22555.71..22555.72 rows=1 width=0) (actual time=10864.160..10864.160 rows=1 loops=1)
-> Seq Scan on item (cost=0.00..22490.45 rows=26102 width=0) (actual time=33.574..10861.672 rows=4085 loops=1)
Filter: (xpath_exists('/MessageInfo[FinalRecipient="ABigBank"]//text()'::text, content, '{}'::text[]) AND ((cluster)::text = 'B2BOX'::text))
Rows Removed by Filter: 108945
Total runtime: 10864.242 ms
(6 rows)
plan on explain.depesz.com
Planner cost parameters
Cost of first plan (seqscan off) is slightly higher but processing time much faster
This tells me that your random_page_cost and seq_page_cost are probably wrong. You're likely on storage with fast random I/O - either because most of the database is cached in RAM or because you're using SSD, SAN with cache, or other storage where random I/O is inherently fast.
Try:
SET random_page_cost = 1;
SET seq_page_cost = 1.1;
to greatly reduce the cost param differences and then re-run. If that does the job, consider changing those params in postgresql.conf.
Your row-count estimates are reasonable, so it doesn't look like a planner mis-estimation problem or a problem with bad table statistics.
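The mismatch that points at the cost parameters is easy to quantify from the two plans in the question. A quick Python calculation using only the top-level numbers reported above:

```python
# Top-level estimated cost and measured runtime, copied from the two plans.
index_plan = {"cost": 22766.64, "runtime_ms": 606.136}   # enable_seqscan = off
seq_plan = {"cost": 22555.72, "runtime_ms": 10864.242}   # enable_seqscan = on

# How much more expensive the planner thinks the index plan is.
cost_ratio = index_plan["cost"] / seq_plan["cost"]
# How much slower the sequential plan actually ran.
runtime_ratio = seq_plan["runtime_ms"] / index_plan["runtime_ms"]

if __name__ == "__main__":
    print(f"estimated cost: index plan is {cost_ratio:.4f}x the seq scan plan")
    print(f"measured time:  seq scan plan is {runtime_ratio:.1f}x the index plan")
```

The planner rates the two plans as nearly equal, while the measured runtimes differ by more than an order of magnitude, so the cost model rather than the row estimates is the likely culprit.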
Incorrect query
Your query is also incorrect. OFFSET 0 LIMIT 1 without an ORDER BY will produce unpredictable results unless you're guaranteed to have exactly one match, in which case the OFFSET ... LIMIT ... clauses are unnecessary and can be removed entirely.
You're usually much better off phrasing such queries as SELECT max(...) or SELECT min(...) where possible; PostgreSQL will tend to be able to use an index to just pluck off the desired value without doing an expensive table scan or an index scan and sort.
Tips
BTW, for future questions the PostgreSQL wiki has some good information in the performance category and a guide to asking Slow query questions.

PostgreSQL chooses not to use index despite improved performance

I had a DB in MySQL and am in the process of moving to PostgreSQL with a Django front-end.
I have a table of 650k-750k rows on which I perform the following query:
SELECT "MMG", "Gene", COUNT(*) FROM at_summary_typing WHERE "MMG" != '' GROUP BY "MMG", "Gene" ORDER BY COUNT(*);
In MySQL this returns in ~0.5s. However, when I switched to PostgreSQL the same query takes ~3s. I have put an index on MMG and Gene together to try to speed it up, but when using EXPLAIN (analyse, buffers, verbose) the output shows the index is not used:
Sort (cost=59013.54..59053.36 rows=15927 width=14) (actual time=2880.222..2885.475 rows=39314 loops=1)
Output: "MMG", "Gene", (count(*))
Sort Key: (count(*))
Sort Method: external merge Disk: 3280kB
Buffers: shared hit=16093 read=11482, temp read=2230 written=2230
-> GroupAggregate (cost=55915.50..57901.90 rows=15927 width=14) (actual time=2179.809..2861.679 rows=39314 loops=1)
Output: "MMG", "Gene", count(*)
Buffers: shared hit=16093 read=11482, temp read=1819 written=1819
-> Sort (cost=55915.50..56372.29 rows=182713 width=14) (actual time=2179.782..2830.232 rows=180657 loops=1)
Output: "MMG", "Gene"
Sort Key: at_summary_typing."MMG", at_summary_typing."Gene"
Sort Method: external merge Disk: 8168kB
Buffers: shared hit=16093 read=11482, temp read=1819 written=1819
-> Seq Scan on public.at_summary_typing (cost=0.00..36821.60 rows=182713 width=14) (actual time=0.010..224.658 rows=180657 loops=1)
Output: "MMG", "Gene"
Filter: ((at_summary_typing."MMG")::text <> ''::text)
Rows Removed by Filter: 559071
Buffers: shared hit=16093 read=11482
Total runtime: 2888.804 ms
After some searching I found that I could force the use of the index by setting SET enable_seqscan = OFF; and the EXPLAIN now shows the following:
Sort (cost=1181591.18..1181631.00 rows=15927 width=14) (actual time=555.546..560.839 rows=39314 loops=1)
Output: "MMG", "Gene", (count(*))
Sort Key: (count(*))
Sort Method: external merge Disk: 3280kB
Buffers: shared hit=173219 read=87094 written=7, temp read=411 written=411
-> GroupAggregate (cost=0.42..1180479.54 rows=15927 width=14) (actual time=247.546..533.202 rows=39314 loops=1)
Output: "MMG", "Gene", count(*)
Buffers: shared hit=173219 read=87094 written=7
-> Index Only Scan using mm_gene_idx on public.at_summary_typing (cost=0.42..1178949.93 rows=182713 width=14) (actual time=247.533..497.771 rows=180657 loops=1)
Output: "MMG", "Gene"
Filter: ((at_summary_typing."MMG")::text <> ''::text)
Rows Removed by Filter: 559071
Heap Fetches: 739728
Buffers: shared hit=173219 read=87094 written=7
Total runtime: 562.735 ms
Performance is now comparable with MySQL.
The problem is that I understand that setting this is bad practice and that I should try and find a way to improve my query/encourage the use of the index automatically. However I'm very inexperienced with PostgreSQL and cannot work out how or why it is choosing to use a Seq Scan over an Index Scan in the first place.
why it is choosing to use a Seq Scan over an Index Scan in the first place
Because the seq scan is actually twice as fast as the index scan (224ms vs. 497ms) despite the fact that the index was nearly completely in the cache, but the table was not.
So choosing the seq scan was the right thing to do.
The bottleneck in the first plan is the sorting and grouping that needs to be done on disk.
The better strategy would be to increase work_mem to something more realistic than the really small default of 4MB. You might want to start with something like 16MB, by running
set work_mem='16MB';
before running your query. If that doesn't remove the "Sort Method: external merge Disk" steps, increase work_mem further.
Increasing work_mem may also let Postgres switch to a hash aggregate instead of the sorting it currently does, which will probably be faster anyway (but isn't feasible if not enough memory is available).
Once you find a good value, you might want to make that permanent by putting the new value into postgresql.conf
Don't set this too high: that memory may be requested multiple times for each query.
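Two quick checks can support this tuning loop; the Python sketch below shows both, with hypothetical helper names. The first finds the sorts that spilled to disk in a piece of EXPLAIN ANALYZE output. The second multiplies out a worst case for the warning above, assuming (as a simplification) that every sort or hash node in every concurrent query uses its full work_mem at the same time. Note that the on-disk size of a spilled sort is only a lower hint: the same sort usually needs more than that to fit in memory.

```python
import re

DISK_SORT_RE = re.compile(r"Sort Method: external merge\s+Disk: (\d+)kB")


def disk_sorts_kb(plan_text):
    """Sizes (in kB) of all sorts that spilled to disk in the given plan text."""
    return [int(kb) for kb in DISK_SORT_RE.findall(plan_text)]


def worst_case_work_mem_mb(work_mem_mb, nodes_per_query, concurrent_queries):
    """Upper bound if every sort/hash node in every query fills its work_mem."""
    return work_mem_mb * nodes_per_query * concurrent_queries


if __name__ == "__main__":
    plan = (
        "Sort Method: external merge Disk: 3280kB\n"
        "Sort Method: external merge Disk: 8168kB\n"
    )
    print(disk_sorts_kb(plan))
    print(worst_case_work_mem_mb(16, 2, 50), "MB")
```

An empty list from the first helper means no sort in that plan went to disk.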
If your where condition is static, you could also create a partial index matching that condition:
create index on at_summary_typing ("MMG", "Gene")
where "MMG" <> '';
Don't forget to analyze the table to update the statistics.