Postgres optimize/replace DISTINCT - postgresql

Trying to select users with most "followed_by" joining to filter by "tag". Both tables have millions of records. Using distinct to only select unique users.
select distinct u.*
from users u join posts p
on u.id=p.user_id
where p.tags #> ARRAY['love']
order by u.followed_by desc nulls last limit 21
It runs over 16s, seems because of the 'distinct' causing a Seq Scan over 6+ million users. Here is the explain analyse
Limit (cost=15509958.30..15509959.09 rows=21 width=292) (actual time=16882.861..16883.753 rows=21 loops=1)
-> Unique (cost=15509958.30..15595560.30 rows=2282720 width=292) (actual time=16882.859..16883.749 rows=21 loops=1)
-> Sort (cost=15509958.30..15515665.10 rows=2282720 width=292) (actual time=16882.857..16883.424 rows=525 loops=1)
Sort Key: u.followed_by DESC NULLS LAST, u.id, u.username, u.fullna
Sort Method: external merge Disk: 583064kBme, u.follows, u
-> Gather (cost=1000.57..14956785.06 rows=2282720 width=292) (actual time=0.377..11506.001 rows=1680890 loops=1).media, u.profile_pic_url_hd, u.is_private, u.is_verified, u.biography, u.external_url, u.updated, u.location_id, u.final_post
Workers Planned: 9
Workers Launched: 9
-> Nested Loop (cost=0.57..14727513.06 rows=253636 width=292) (actual time=1.013..12031.634 rows=168089 loops=10)
-> Parallel Seq Scan on posts p (cost=0.00..13187797.79 rows=253636 width=8) (actual time=0.940..10872.630 rows=168089 loops=10)
Filter: (tags #> '{love}'::text[])
Rows Removed by Filter: 6251355
-> Index Scan using user_pk on users u (cost=0.57..6.06 rows=1 width=292) (actual time=0.006..0.006 rows=1 loops=1680890)
Index Cond: (id = p.user_id)
Planning time: 1.276 ms
Execution time: 16964.271 ms
Would appreciate tips on how to make this fast.
Update
Thanks to #a_horse_with_no_name, for "love" tags it became really fast
Limit (cost=1.14..4293986.91 rows=21 width=292) (actual time=1.735..31.613 rows=21 loops=1)
-> Nested Loop Semi Join (cost=1.14..10959887484.70 rows=53600 width=292) (actual time=1.733..31.607 rows=21 loops=1)
-> Index Scan using idx_followed_by on users u (cost=0.57..322693786.19 rows=232404560 width=292) (actual time=0.011..0.103 rows=32 loops=1)
-> Index Scan using fki_user_fk1 on posts p (cost=0.57..1943.85 rows=43 width=8) (actual time=0.983..0.983 rows=1 loops=32)
Index Cond: (user_id = u.id)
Filter: (tags #> '{love}'::text[])
Rows Removed by Filter: 1699
Planning time: 1.322 ms
Execution time: 31.656 ms
However for some other tags like "beautiful" it's better, but still little slow. It also takes a different execution path
Limit (cost=3893365.84..3893365.89 rows=21 width=292) (actual time=2813.876..2813.892 rows=21 loops=1)
-> Sort (cost=3893365.84..3893499.84 rows=53600 width=292) (actual time=2813.874..2813.887 rows=21 loops=1)
Sort Key: u.followed_by DESC NULLS LAST
Sort Method: top-N heapsort Memory: 34kB
-> Nested Loop (cost=3437011.27..3891920.70 rows=53600 width=292) (actual time=1130.847..2779.928 rows=35230 loops=1)
-> HashAggregate (cost=3437010.70..3437546.70 rows=53600 width=8) (actual time=1130.809..1148.209 rows=35230 loops=1)
Group Key: p.user_id
-> Bitmap Heap Scan on posts p (cost=10484.20..3434173.21 rows=1134993 width=8) (actual time=268.602..972.390 rows=814919 loops=1)
Recheck Cond: (tags #> '{beautiful}'::text[])
Heap Blocks: exact=347002
-> Bitmap Index Scan on idx_tags (cost=0.00..10200.45 rows=1134993 width=0) (actual time=168.453..168.453 rows=814919 loops=1)
Index Cond: (tags #> '{beautiful}'::text[])
-> Index Scan using user_pk on users u (cost=0.57..8.47 rows=1 width=292) (actual time=0.045..0.046 rows=1 loops=35230)
Index Cond: (id = p.user_id)
Planning time: 1.388 ms
Execution time: 2814.132 ms
I did have a gin index for 'tags' already in place

This should be faster:
select *
from users u
where exists (select *
from posts p
where u.id=p.user_id
and p.tags #> ARRAY['love'])
order by u.followed_by desc nulls last
limit 21;
If there are only a few (<10%) posts with that tag, an index on posts.tags should help as well:
create index using gin on posts (tags);

Related

Planner not using index order to sort the records using CTE

I am trying to pass some ids into an in-clause on a sorted index with the same order by condition but the query planner is explicitly sorting the data after performing index search. below are my queries.
Generate a temporary table.
SELECT a.n/20 as n, md5(a.n::TEXT) as b INTO temp_table
From generate_series(1, 100000) as a(n);
create an index
CREATE INDEX idx_temp_table ON temp_table(n ASC, b ASC);
In below query, planner uses index ordering and doesn't explicitly sorts the data.(expected)
EXPLAIN ANALYSE
SELECT * from
temp_table WHERE n = 10
ORDER BY n, b
limit 5;
Query Plan
QUERY PLAN Limit (cost=0.42..16.07 rows=5 width=36) (actual time=0.098..0.101 rows=5 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..1565.17 rows=500 width=36) (actual time=0.095..0.098 rows=5 loops=1)
Index Cond: (n = 10)
Heap Fetches: 5 Planning time: 0.551 ms Execution time: 0.128 ms
but when i use one or more ids from a cte and pass them in clause then planner only uses index to fetch the values but explicitly sorts them afterwards (not expected).
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
then planner uses below query plan
QUERY PLAN
QUERY PLAN
Limit (cost=85.18..85.20 rows=5 width=37) (actual time=0.073..0.075 rows=5 loops=1)
CTE cte
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)
-> Sort (cost=85.16..85.26 rows=40 width=37) (actual time=0.072..0.073 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.47..84.50 rows=40 width=37) (actual time=0.037..0.056 rows=40 loops=1)
-> Unique (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1)
-> Sort (cost=0.05..0.06 rows=2 width=4) (actual time=0.009..0.010 rows=2 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.04 rows=2 width=4) (actual time=0.004..0.005 rows=2 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.012..0.018 rows=20 loops=2)
Index Cond: (n = cte.x)
Heap Fetches: 40
Planning time: 0.166 ms
Execution time: 0.101 ms
I tried putting an explicit sorting while passing the ids in where clause so that sorted order in ids is maintained but still planner sorted explicitly
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
Query plan
QUERY PLAN
Limit (cost=42.62..42.63 rows=5 width=37) (actual time=0.042..0.044 rows=5 loops=1)
CTE cte
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
-> Sort (cost=42.61..42.66 rows=20 width=37) (actual time=0.042..0.042 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.46..42.28 rows=20 width=37) (actual time=0.025..0.033 rows=20 loops=1)
-> HashAggregate (cost=0.05..0.06 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
Group Key: cte.x
-> Sort (cost=0.03..0.04 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..42.02 rows=20 width=37) (actual time=0.014..0.020 rows=20 loops=1)
Index Cond: (n = cte.x)
Heap Fetches: 20
Planning time: 0.167 ms
Execution time: 0.074 ms
Can anyone explain why planner is using an explicit sort on the data? Is there a way to by pass this and make planner use the index sorting order so additional sorting on the records can be saved. In production, we have similar case but size of our selection is too big but only a handful of records needs to fetched with pagination. Thanks in anticipation!
It is actually a decision made by the planner, with a larger set of values(), Postgres will switch to a smarter plan, with the sort done before the merge.
select version();
\echo +++++ Original
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10))
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo +++++ TEN Values
EXPLAIN ANALYSE
WITH cte(x) AS (VALUES (10),(11),(12),(13),(14),(15),(16),(17),(18),(19)
)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo ++++++++ one row from table
EXPLAIN ANALYSE
WITH cte(x) AS (SELECT n FROM temp_table WHERE n = 10)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
\echo ++++++++ one row from table TWO ctes
EXPLAIN ANALYSE
WITH val(x) AS (VALUES (10))
, cte(x) AS (
SELECT n FROM temp_table WHERE n IN (select x from val)
)
SELECT * from temp_table
WHERE n IN ( SELECT x from cte)
ORDER BY n, b
limit 5;
Resulting plans:
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)
+++++ Original
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=13.72..13.73 rows=5 width=37) (actual time=0.197..0.200 rows=5 loops=1)
CTE cte
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
-> Sort (cost=13.71..13.76 rows=20 width=37) (actual time=0.194..0.194 rows=5 loops=1)
Sort Key: temp_table.n, temp_table.b
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.44..13.37 rows=20 width=37) (actual time=0.083..0.097 rows=20 loops=1)
-> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1)
Group Key: cte.x
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..13.14 rows=20 width=37) (actual time=0.058..0.068 rows=20 loops=1)
Index Cond: (n = cte.x)
Heap Fetches: 20
Planning Time: 1.328 ms
Execution Time: 0.360 ms
(15 rows)
+++++ TEN Values
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.91..89.11 rows=5 width=37) (actual time=0.179..0.183 rows=5 loops=1)
CTE cte
-> Values Scan on "*VALUES*" (cost=0.00..0.12 rows=10 width=4) (actual time=0.001..0.007 rows=10 loops=1)
-> Merge Semi Join (cost=0.78..3528.72 rows=200 width=37) (actual time=0.178..0.181 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.030..0.123 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.37..0.39 rows=10 width=4) (actual time=0.023..0.023 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.20 rows=10 width=4) (actual time=0.003..0.013 rows=10 loops=1)
Planning Time: 0.197 ms
Execution Time: 0.226 ms
(13 rows)
++++++++ one row from table
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=14.39..58.52 rows=5 width=37) (actual time=0.168..0.173 rows=5 loops=1)
CTE cte
-> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.010..0.020 rows=20 loops=1)
Index Cond: (n = 10)
Heap Fetches: 20
-> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.167..0.170 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.025..0.101 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.83..0.88 rows=20 width=4) (actual time=0.039..0.039 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.012..0.031 rows=20 loops=1)
Planning Time: 0.243 ms
Execution Time: 0.211 ms
(15 rows)
++++++++ one row from table TWO ctes
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=14.63..58.76 rows=5 width=37) (actual time=0.224..0.229 rows=5 loops=1)
CTE val
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
CTE cte
-> Nested Loop (cost=0.44..13.37 rows=20 width=4) (actual time=0.038..0.052 rows=20 loops=1)
-> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=1)
Group Key: val.x
-> CTE Scan on val (cost=0.00..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
-> Index Only Scan using idx_temp_table on temp_table temp_table_1 (cost=0.42..13.14 rows=20 width=4) (actual time=0.029..0.038 rows=20 loops=1)
Index Cond: (n = val.x)
Heap Fetches: 20
-> Merge Semi Join (cost=1.25..3531.24 rows=400 width=37) (actual time=0.223..0.226 rows=5 loops=1)
Merge Cond: (temp_table.n = cte.x)
-> Index Only Scan using idx_temp_table on temp_table (cost=0.42..3276.30 rows=100000 width=37) (actual time=0.038..0.114 rows=204 loops=1)
Heap Fetches: 204
-> Sort (cost=0.83..0.88 rows=20 width=4) (actual time=0.082..0.082 rows=1 loops=1)
Sort Key: cte.x
Sort Method: quicksort Memory: 25kB
-> CTE Scan on cte (cost=0.00..0.40 rows=20 width=4) (actual time=0.040..0.062 rows=20 loops=1)
Planning Time: 0.362 ms
Execution Time: 0.313 ms
(21 rows)
Beware of CTEs!.
For the planner, CTEs are more or less black boxes, and very little is known about expected number of rows, statistics distribution, or ordering inside.
In cases where CTEs result in a bad plan (the original question is not such a case), a CTE can often be replaced by a (temp) view, which is seen by the planner in its full naked glory.
Update
Starting with version 11, CTEs are handled differently by the planner: if they do not have side effects, they are candidates for being merged with the main query. (but is still a good idea to check your query plans)
The optimizet isn't aware that the CTE is sorted. If you scan an index for multiple values and have an ORDER BY, PostgreSQL will always sort.
The only thing that comes to my mind is to create a temporary table with the values from the IN list and put an index on that temporary table. Then when you join with that table, PostgreSQL will be aware of the ordering and might for example choose a merge join that can use the indexes.
Of course that means a lot of overhead, and it could easily be that the original sort wins out.

Analyze: Why a query taking could take so long, seems costs are low?

I am having these results for analyze for a simple query that does not return more than 150 records from tables less than 200 records most of them, as I have a table that stores latest value and the other fields are FK of the data.
Update: see the new results from same query some our later. The site is not public and/or there should be not users right now as it is in development.
explain analyze
SELECT lv.station_id,
s.name AS station_name,
s.latitude,
s.longitude,
s.elevation,
lv.element_id,
e.symbol AS element_symbol,
u.symbol,
e.name AS element_name,
lv.last_datetime AS datetime,
lv.last_value AS valor,
s.basin_id,
s.municipality_id
FROM (((element_station lv /*350 records*/
JOIN stations s ON ((lv.station_id = s.id))) /*40 records*/
JOIN elements e ON ((lv.element_id = e.id))) /*103 records*/
JOIN units u ON ((e.unit_id = u.id))) /* 32 records */
WHERE s.id = lv.station_id AND e.id = lv.element_id AND lv.interval_id = 6 and
lv.last_datetime >= ((now() - '06:00:00'::interval) - '01:00:00'::interval)
I have already tried VACUUM and after that some is saved, but again after some times it goes up. I have implemented an index on the fields.
Nested Loop (cost=0.29..2654.66 rows=1 width=92) (actual time=1219.390..35296.253 rows=157 loops=1)
Join Filter: (e.unit_id = u.id)
Rows Removed by Join Filter: 4867
-> Nested Loop (cost=0.29..2652.93 rows=1 width=92) (actual time=1219.383..35294.083 rows=157 loops=1)
Join Filter: (lv.element_id = e.id)
Rows Removed by Join Filter: 16014
-> Nested Loop (cost=0.29..2648.62 rows=1 width=61) (actual time=1219.301..35132.373 rows=157 loops=1)
-> Seq Scan on element_station lv (cost=0.00..2640.30 rows=1 width=20) (actual time=1219.248..1385.517 rows=157 loops=1)
Filter: ((interval_id = 6) AND (last_datetime >= ((now() - '06:00:00'::interval) - '01:00:00'::interval)))
Rows Removed by Filter: 168
-> Index Scan using stations_pkey on stations s (cost=0.29..8.31 rows=1 width=45) (actual time=3.471..214.941 rows=1 loops=157)
Index Cond: (id = lv.station_id)
-> Seq Scan on elements e (cost=0.00..3.03 rows=103 width=35) (actual time=0.003..0.999 rows=103 loops=157)
-> Seq Scan on units u (cost=0.00..1.32 rows=32 width=8) (actual time=0.002..0.005 rows=32 loops=157)
Planning time: 8.312 ms
Execution time: 35296.427 ms
update, same query running it tonight; no changes:
Sort (cost=601.74..601.88 rows=55 width=92) (actual time=1.822..1.841 rows=172 loops=1)
Sort Key: lv.last_datetime DESC
Sort Method: quicksort Memory: 52kB
-> Nested Loop (cost=11.60..600.15 rows=55 width=92) (actual time=0.287..1.680 rows=172 loops=1)
-> Hash Join (cost=11.31..248.15 rows=55 width=51) (actual time=0.263..0.616 rows=172 loops=1)
Hash Cond: (e.unit_id = u.id)
-> Hash Join (cost=9.59..245.60 rows=75 width=51) (actual time=0.225..0.528 rows=172 loops=1)
Hash Cond: (lv.element_id = e.id)
-> Bitmap Heap Scan on element_station lv (cost=5.27..240.25 rows=75 width=20) (actual time=0.150..0.359 rows=172 loops=1)
Recheck Cond: ((last_datetime >= ((now() - '06:00:00'::interval) - '01:00:00'::interval)) AND (interval_id = 6))
Heap Blocks: exact=22
-> Bitmap Index Scan on element_station_latest (cost=0.00..5.25 rows=75 width=0) (actual time=0.136..0.136 rows=226 loops=1)
Index Cond: ((last_datetime >= ((now() - '06:00:00'::interval) - '01:00:00'::interval)) AND (interval_id = 6))
-> Hash (cost=3.03..3.03 rows=103 width=35) (actual time=0.062..0.062 rows=103 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
-> Seq Scan on elements e (cost=0.00..3.03 rows=103 width=35) (actual time=0.006..0.031 rows=103 loops=1)
-> Hash (cost=1.32..1.32 rows=32 width=8) (actual time=0.019..0.019 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on units u (cost=0.00..1.32 rows=32 width=8) (actual time=0.003..0.005 rows=32 loops=1)
-> Index Scan using stations_pkey on stations s (cost=0.29..6.39 rows=1 width=45) (actual time=0.005..0.006 rows=1 loops=172)
Index Cond: (id = lv.station_id)
Planning time: 2.390 ms
Execution time: 2.009 ms
The problem is the misestimate of the number of rows in the sequential scan on element_station. Either autoanalyze has kicked in and calculated new statistics for the table or the data changed.
The problem is probably that PostgreSQL doesn't know the result of
((now() - '06:00:00'::interval) - '01:00:00'::interval)
at query planning time.
If that is possible for you, do it in two steps: First, calculate the expression above (either in PostgreSQL or on the client side). Then run the query with the result as a constant. That will make it easier for PostgreSQL to estimate the result count.

Postgres Table Slow Performance

We have a Product table in postgres DB. This is hosted on Heroku. We have 8 GB RAM and 250 GB disk space. 1000 IPOP allowed.
We are having proper indexes on columns.
Platform
PostgreSQL 9.5.12 on x86_64-pc-linux-gnu (Ubuntu 9.5.12-1.pgdg14.04+1), compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
We are running a keywords search query on this table. We are having 2.8 millions records in this table. Our search query is too slow. Its giving us result in about 50 seconds. Which is too slow.
Query
SELECT
P .sfid AS prodsfid,
P .image_url__c image,
P .productcode sku,
P .Short_Description__c shortDesc,
P . NAME pname,
P .category__c,
P .price__c price,
P .description,
P .vendor_name__c vname,
P .vendor__c supSfid
FROM
staging.product2 P
JOIN (
SELECT
p1.sfid
FROM
staging.product2 p1
WHERE
p1. NAME ILIKE '%s%'
OR p1.productcode ILIKE '%s%'
) AS TEMP ON (P .sfid = TEMP .sfid)
WHERE
P .status__c = 'Available'
AND LOWER (
P .vendor_shipping_country__c
) = ANY (
VALUES
('us'),
('usa'),
('united states'),
('united states of america')
)
AND P .vendor_catalog_tier__c = ANY (
VALUES
('a1c37000000oljnAAA'),
('a1c37000000oljQAAQ'),
('a1c37000000oljQAAQ'),
('a1c37000000pT7IAAU'),
('a1c37000000omDjAAI'),
('a1c37000000oljMAAQ'),
('a1c37000000oljaAAA'),
('a1c37000000pT7SAAU'),
('a1c0R000000AFcVQAW'),
('a1c0R000000A1HAQA0'),
('a1c0R0000000OpWQAU'),
('a1c0R0000005TZMQA2'),
('a1c37000000oljdAAA'),
('a1c37000000ooTqAAI'),
('a1c37000000omLBAAY'),
('a1c0R0000005N8GQAU')
)
Here is the explain plan:
Nested Loop (cost=31.85..33886.54 rows=3681 width=750)
-> Hash Join (cost=31.77..31433.07 rows=4415 width=750)
Hash Cond: (lower((p.vendor_shipping_country__c)::text) = "*VALUES*".column1)
-> Nested Loop (cost=31.73..31423.67 rows=8830 width=761)
-> HashAggregate (cost=0.06..0.11 rows=16 width=32)
Group Key: "*VALUES*_1".column1
-> Values Scan on "*VALUES*_1" (cost=0.00..0.06 rows=16 width=32)
-> Bitmap Heap Scan on product2 p (cost=31.66..1962.32 rows=552 width=780)
Recheck Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
Filter: ((status__c)::text = 'Available'::text)
-> Bitmap Index Scan on vendor_catalog_tier_prd_idx (cost=0.00..31.64 rows=1016 width=0)
Index Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
-> Hash (cost=0.03..0.03 rows=4 width=32)
-> Unique (cost=0.02..0.03 rows=4 width=32)
-> Sort (cost=0.02..0.02 rows=4 width=32)
Sort Key: "*VALUES*".column1
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=4 width=32)
-> Index Scan using sfid_prd_idx on product2 p1 (cost=0.09..0.55 rows=1 width=19)
Index Cond: ((sfid)::text = (p.sfid)::text)
Filter: (((name)::text ~~* '%s%'::text) OR ((productcode)::text ~~* '%s%'::text))
Its returning around 140,576 records. By the way we need only top 5,000 records only. Will putting Limit help here?
Let me know how to make it fast and what is causing this slow.
EXPLAIN ANALYZE
#RaymondNijland Here is the explain analyze
Nested Loop (cost=31.83..33427.28 rows=4039 width=750) (actual time=1.903..4384.221 rows=140576 loops=1)
-> Hash Join (cost=31.74..30971.32 rows=4369 width=750) (actual time=1.852..1094.964 rows=164353 loops=1)
Hash Cond: (lower((p.vendor_shipping_country__c)::text) = "*VALUES*".column1)
-> Nested Loop (cost=31.70..30962.02 rows=8738 width=761) (actual time=1.800..911.738 rows=164353 loops=1)
-> HashAggregate (cost=0.06..0.11 rows=16 width=32) (actual time=0.012..0.019 rows=15 loops=1)
Group Key: "*VALUES*_1".column1
-> Values Scan on "*VALUES*_1" (cost=0.00..0.06 rows=16 width=32) (actual time=0.004..0.005 rows=16 loops=1)
-> Bitmap Heap Scan on product2 p (cost=31.64..1933.48 rows=546 width=780) (actual time=26.004..57.290 rows=10957 loops=15)
Recheck Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
Filter: ((status__c)::text = 'Available'::text)
Rows Removed by Filter: 645
Heap Blocks: exact=88436
-> Bitmap Index Scan on vendor_catalog_tier_prd_idx (cost=0.00..31.61 rows=1000 width=0) (actual time=24.811..24.811 rows=11601 loops=15)
Index Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
-> Hash (cost=0.03..0.03 rows=4 width=32) (actual time=0.032..0.032 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Unique (cost=0.02..0.03 rows=4 width=32) (actual time=0.026..0.027 rows=4 loops=1)
-> Sort (cost=0.02..0.02 rows=4 width=32) (actual time=0.026..0.026 rows=4 loops=1)
Sort Key: "*VALUES*".column1
Sort Method: quicksort Memory: 25kB
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=4 width=32) (actual time=0.001..0.002 rows=4 loops=1)
-> Index Scan using sfid_prd_idx on product2 p1 (cost=0.09..0.56 rows=1 width=19) (actual time=0.019..0.020 rows=1 loops=164353)
Index Cond: ((sfid)::text = (p.sfid)::text)
Filter: (((name)::text ~~* '%s%'::text) OR ((productcode)::text ~~* '%s%'::text))
Rows Removed by Filter: 0
Planning time: 2.488 ms
Execution time: 4391.378 ms
Another query version, with order by , but it seems very slow as well (140 seconds)
SELECT
P .sfid AS prodsfid,
P .image_url__c image,
P .productcode sku,
P .Short_Description__c shortDesc,
P . NAME pname,
P .category__c,
P .price__c price,
P .description,
P .vendor_name__c vname,
P .vendor__c supSfid
FROM
staging.product2 P
WHERE
P .status__c = 'Available'
AND P .vendor_shipping_country__c IN (
'us',
'usa',
'united states',
'united states of america'
)
AND P .vendor_catalog_tier__c IN (
'a1c37000000omDQAAY',
'a1c37000000omDTAAY',
'a1c37000000omDXAAY',
'a1c37000000omDYAAY',
'a1c37000000omDZAAY',
'a1c37000000omDdAAI',
'a1c37000000omDfAAI',
'a1c37000000omDiAAI',
'a1c37000000oml6AAA',
'a1c37000000oljPAAQ',
'a1c37000000oljRAAQ',
'a1c37000000oljWAAQ',
'a1c37000000oljXAAQ',
'a1c37000000oljZAAQ',
'a1c37000000oljcAAA',
'a1c37000000oljdAAA',
'a1c37000000oljlAAA',
'a1c37000000oljoAAA',
'a1c37000000oljqAAA',
'a1c37000000olnvAAA',
'a1c37000000olnwAAA',
'a1c37000000olnxAAA',
'a1c37000000olnyAAA',
'a1c37000000olo0AAA',
'a1c37000000olo1AAA',
'a1c37000000olo4AAA',
'a1c37000000olo8AAA',
'a1c37000000olo9AAA',
'a1c37000000oloCAAQ',
'a1c37000000oloFAAQ',
'a1c37000000oloIAAQ',
'a1c37000000oloJAAQ',
'a1c37000000oloMAAQ',
'a1c37000000oloNAAQ',
'a1c37000000oloSAAQ',
'a1c37000000olodAAA',
'a1c37000000oloeAAA',
'a1c37000000olzCAAQ',
'a1c37000000om0xAAA',
'a1c37000000ooV1AAI',
'a1c37000000oog8AAA',
'a1c37000000oogDAAQ',
'a1c37000000oonzAAA',
'a1c37000000oluuAAA',
'a1c37000000pT7SAAU',
'a1c37000000oljnAAA',
'a1c37000000olumAAA',
'a1c37000000oljpAAA',
'a1c37000000pUm2AAE',
'a1c37000000olo3AAA',
'a1c37000000oo1MAAQ',
'a1c37000000oo1vAAA',
'a1c37000000pWxgAAE',
'a1c37000000pYJkAAM',
'a1c37000000omDjAAI',
'a1c37000000ooTgAAI',
'a1c37000000op2GAAQ',
'a1c37000000one0AAA',
'a1c37000000oljYAAQ',
'a1c37000000pUlxAAE',
'a1c37000000oo9SAAQ',
'a1c37000000pcIYAAY',
'a1c37000000pamtAAA',
'a1c37000000pd2QAAQ',
'a1c37000000pdCOAAY',
'a1c37000000OpPaAAK',
'a1c37000000OphZAAS',
'a1c37000000olNkAAI'
)
ORDER BY p.productcode asc
LIMIT 5000
Here is the explain analyse for this:
Limit (cost=0.09..45271.54 rows=5000 width=750) (actual time=48593.355..86376.864 rows=5000 loops=1)
-> Index Scan using productcode_prd_idx on product2 p (cost=0.09..743031.39 rows=82064 width=750) (actual time=48593.353..86376.283 rows=5000 loops=1)
Filter: (((status__c)::text = 'Available'::text) AND ((vendor_shipping_country__c)::text = ANY ('{us,usa,"united states","united states of america"}'::text[])) AND ((vendor_catalog_tier__c)::text = ANY ('{a1c37000000omDQAAY,a1c37000000omDTAAY,a1c37000000omDXAAY,a1c37000000omDYAAY,a1c37000000omDZAAY,a1c37000000omDdAAI,a1c37000000omDfAAI,a1c37000000omDiAAI,a1c37000000oml6AAA,a1c37000000oljPAAQ,a1c37000000oljRAAQ,a1c37000000oljWAAQ,a1c37000000oljXAAQ,a1c37000000oljZAAQ,a1c37000000oljcAAA,a1c37000000oljdAAA,a1c37000000oljlAAA,a1c37000000oljoAAA,a1c37000000oljqAAA,a1c37000000olnvAAA,a1c37000000olnwAAA,a1c37000000olnxAAA,a1c37000000olnyAAA,a1c37000000olo0AAA,a1c37000000olo1AAA,a1c37000000olo4AAA,a1c37000000olo8AAA,a1c37000000olo9AAA,a1c37000000oloCAAQ,a1c37000000oloFAAQ,a1c37000000oloIAAQ,a1c37000000oloJAAQ,a1c37000000oloMAAQ,a1c37000000oloNAAQ,a1c37000000oloSAAQ,a1c37000000olodAAA,a1c37000000oloeAAA,a1c37000000olzCAAQ,a1c37000000om0xAAA,a1c37000000ooV1AAI,a1c37000000oog8AAA,a1c37000000oogDAAQ,a1c37000000oonzAAA,a1c37000000oluuAAA,a1c37000000pT7SAAU,a1c37000000oljnAAA,a1c37000000olumAAA,a1c37000000oljpAAA,a1c37000000pUm2AAE,a1c37000000olo3AAA,a1c37000000oo1MAAQ,a1c37000000oo1vAAA,a1c37000000pWxgAAE,a1c37000000pYJkAAM,a1c37000000omDjAAI,a1c37000000ooTgAAI,a1c37000000op2GAAQ,a1c37000000one0AAA,a1c37000000oljYAAQ,a1c37000000pUlxAAE,a1c37000000oo9SAAQ,a1c37000000pcIYAAY,a1c37000000pamtAAA,a1c37000000pd2QAAQ,a1c37000000pdCOAAY,a1c37000000OpPaAAK,a1c37000000OphZAAS,a1c37000000olNkAAI}'::text[])))
Rows Removed by Filter: 1707920
Planning time: 1.685 ms
Execution time: 86377.139 ms
Thanks
Aslam Bari
You might want to consider a GIN or GIST index on your staging.product2 table. Double-sided ILIKEs are slow and difficult to improve substantially. I've seen a GIN index improve a similar query by 60-80%.
See this doc.

Explain postgres query, why is the query that much longer with WHERE and LIMIT

I'm using postgres v9.6.5. I have a query which seems not that complicated and was wondering why is it so "slow" (it's not really that slow, but I don't have a lot of data actually - like a few thousand rows).
Here is the query:
SELECT o0.*
FROM "orders" AS o0
JOIN "balances" AS b1 ON b1."id" = o0."balance_id"
JOIN "users" AS u3 ON u3."id" = b1."user_id"
WHERE (u3."partner_id" = 3)
ORDER BY o0."id" DESC LIMIT 10;
And that's query plan:
Limit (cost=0.43..12.84 rows=10 width=148) (actual time=0.062..53.866 rows=4 loops=1)
-> Nested Loop (cost=0.43..4750.03 rows=3826 width=148) (actual time=0.061..53.864 rows=4 loops=1)
Join Filter: (b1.user_id = u3.id)
Rows Removed by Join Filter: 67404
-> Nested Loop (cost=0.43..3945.32 rows=17856 width=152) (actual time=0.025..38.457 rows=16852 loops=1)
-> Index Scan Backward using orders_pkey on orders o0 (cost=0.29..897.80 rows=17856 width=148) (actual time=0.016..11.558 rows=16852 loops=1)
-> Index Scan using balances_pkey on balances b1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=16852)
Index Cond: (id = o0.balance_id)
-> Materialize (cost=0.00..1.19 rows=3 width=4) (actual time=0.000..0.000 rows=4 loops=16852)
-> Seq Scan on users u3 (cost=0.00..1.18 rows=3 width=4) (actual time=0.023..0.030 rows=4 loops=1)
Filter: (partner_id = 3)
Rows Removed by Filter: 12
Planning time: 0.780 ms
Execution time: 54.053 ms
I actually tried without LIMIT and I got quite different plan:
Sort (cost=874.23..883.80 rows=3826 width=148) (actual time=11.361..11.362 rows=4 loops=1)
Sort Key: o0.id DESC
Sort Method: quicksort Memory: 26kB
-> Hash Join (cost=3.77..646.55 rows=3826 width=148) (actual time=11.300..11.346 rows=4 loops=1)
Hash Cond: (o0.balance_id = b1.id)
-> Seq Scan on orders o0 (cost=0.00..537.56 rows=17856 width=148) (actual time=0.012..8.464 rows=16852 loops=1)
-> Hash (cost=3.55..3.55 rows=18 width=4) (actual time=0.125..0.125 rows=24 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=1.21..3.55 rows=18 width=4) (actual time=0.046..0.089 rows=24 loops=1)
Hash Cond: (b1.user_id = u3.id)
-> Seq Scan on balances b1 (cost=0.00..1.84 rows=84 width=8) (actual time=0.011..0.029 rows=96 loops=1)
-> Hash (cost=1.18..1.18 rows=3 width=4) (actual time=0.028..0.028 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on users u3 (cost=0.00..1.18 rows=3 width=4) (actual time=0.014..0.021 rows=4 loops=1)
Filter: (partner_id = 3)
Rows Removed by Filter: 12
Planning time: 0.569 ms
Execution time: 11.420 ms
And also without WHERE (but with LIMIT):
Limit (cost=0.43..4.74 rows=10 width=148) (actual time=0.023..0.066 rows=10 loops=1)
-> Nested Loop (cost=0.43..7696.26 rows=17856 width=148) (actual time=0.022..0.065 rows=10 loops=1)
Join Filter: (b1.user_id = u3.id)
Rows Removed by Join Filter: 139
-> Nested Loop (cost=0.43..3945.32 rows=17856 width=152) (actual time=0.009..0.029 rows=10 loops=1)
-> Index Scan Backward using orders_pkey on orders o0 (cost=0.29..897.80 rows=17856 width=148) (actual time=0.007..0.015 rows=10 loops=1)
-> Index Scan using balances_pkey on balances b1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=10)
Index Cond: (id = o0.balance_id)
-> Materialize (cost=0.00..1.21 rows=14 width=4) (actual time=0.001..0.001 rows=15 loops=10)
-> Seq Scan on users u3 (cost=0.00..1.14 rows=14 width=4) (actual time=0.005..0.007 rows=16 loops=1)
Planning time: 0.286 ms
Execution time: 0.097 ms
As you can see, without WHERE it's much faster. Can someone provide me with some information where can I look for explanations for those plans to better understand them? And also what can I do to make those queries faster (or I shouldn't worry cause with like 100 times more data they will still be fast enough? - 50ms is fine for me tbh)
PostgreSQL thinks that it will be fastest if it scans orders in the correct order until it finds a matching users entry that satisfies the WHERE condition.
However, it seems that the data distribution is such that it has to scan almost 17000 orders before it finds a match.
Since PostgreSQL doesn't know how values correlate across tables, there is nothing much you can do to change that.
You can force PostgreSQL to plan the query without the LIMIT clause like this:
SELECT *
FROM (<your query without ORDER BY and LIMIT> OFFSET 0) q
ORDER BY id DESC LIMIT 10;
With a top-N-sort this should perform better.

postgres two column sort low performance

I've got a query that performs multiple joins. I try to get only those positions of each keyword that are latest in results.
Here is the query:
SELECT DISTINCT ON (p.keyword_id)
a.id AS account_id,
w.parent_id AS parent_id,
w.name AS name,
p.position AS position
FROM websites w
JOIN accounts a ON w.account_id = a.id
JOIN keywords k ON k.website_id = w.parent_id
JOIN positions p ON p.website_id = w.parent_id
WHERE a.amount > 0 AND w.parent_id NOTNULL AND (round((a.amount / a.payment_renewal_period), 2) BETWEEN 1 AND 19)
ORDER BY p.keyword_id, p.created_at DESC;
Plan with costs for that query is as follows:
Unique (cost=73673.65..76630.38 rows=264 width=40) (actual time=30777.117..49143.023 rows=259 loops=1)
-> Sort (cost=73673.65..75152.02 rows=591347 width=40) (actual time=30777.116..47352.373 rows=10891486 loops=1)
Sort Key: p.keyword_id, p.created_at DESC
Sort Method: external merge Disk: 512672kB
-> Merge Join (cost=219.59..812.26 rows=591347 width=40) (actual time=3.487..3827.028 rows=10891486 loops=1)
Merge Cond: (w.parent_id = k.website_id)
-> Nested Loop (cost=128.46..597.73 rows=1268 width=44) (actual time=3.378..108.915 rows=61582 loops=1)
-> Nested Loop (cost=2.28..39.86 rows=1 width=28) (actual time=0.026..0.216 rows=7 loops=1)
-> Index Scan using index_websites_on_parent_id on websites w (cost=0.14..15.08 rows=4 width=28) (actual time=0.004..0.023 rows=7 loops=1)
Index Cond: (parent_id IS NOT NULL)
-> Bitmap Heap Scan on accounts a (cost=2.15..6.18 rows=1 width=4) (actual time=0.019..0.020 rows=1 loops=7)
Recheck Cond: (id = w.account_id)
Filter: ((amount > '0'::numeric) AND (round((amount / (payment_renewal_period)::numeric), 2) >= '1'::numeric) AND (round((amount / (payment_renewal_period)::numeric), 2) <= '19'::numeric))
Heap Blocks: exact=7
-> Bitmap Index Scan on accounts_pkey (cost=0.00..2.15 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=7)
Index Cond: (id = w.account_id)
-> Bitmap Heap Scan on positions p (cost=126.18..511.57 rows=4631 width=16) (actual time=0.994..8.226 rows=8797 loops=7)
Recheck Cond: (website_id = w.parent_id)
Heap Blocks: exact=1004
-> Bitmap Index Scan on index_positions_on_5_columns (cost=0.00..125.02 rows=4631 width=0) (actual time=0.965..0.965 rows=8797 loops=7)
Index Cond: (website_id = w.parent_id)
-> Sort (cost=18.26..18.92 rows=264 width=4) (actual time=0.106..1013.966 rows=10891487 loops=1)
Sort Key: k.website_id
Sort Method: quicksort Memory: 37kB
-> Seq Scan on keywords k (cost=0.00..7.64 rows=264 width=4) (actual time=0.005..0.039 rows=263 loops=1)
Planning time: 1.081 ms
Execution time: 49184.222 ms
The thing is when I run query with w.id instead of w.parent_id in join positions part total cost decreases to
Unique (cost=3621.07..3804.99 rows=264 width=40) (actual time=128.430..139.550 rows=259 loops=1)
-> Sort (cost=3621.07..3713.03 rows=36784 width=40) (actual time=128.429..135.444 rows=40385 loops=1)
Sort Key: p.keyword_id, p.created_at DESC
Sort Method: external sort Disk: 2000kB
-> Merge Join (cost=128.73..831.59 rows=36784 width=40) (actual time=25.521..63.299 rows=40385 loops=1)
Merge Cond: (k.website_id = w.id)
-> Index Only Scan using index_keywords_on_website_id_deleted_at on keywords k (cost=0.27..24.23 rows=264 width=4) (actual time=0.137..0.274 rows=263 loops=1)
Heap Fetches: 156
-> Materialize (cost=128.46..606.85 rows=1268 width=44) (actual time=3.772..49.587 rows=72242 loops=1)
-> Nested Loop (cost=128.46..603.68 rows=1268 width=44) (actual time=3.769..30.530 rows=61582 loops=1)
-> Nested Loop (cost=2.28..45.80 rows=1 width=32) (actual time=0.047..0.204 rows=7 loops=1)
-> Index Scan using websites_pkey on websites w (cost=0.14..21.03 rows=4 width=32) (actual time=0.007..0.026 rows=7 loops=1)
Filter: (parent_id IS NOT NULL)
Rows Removed by Filter: 4
-> Bitmap Heap Scan on accounts a (cost=2.15..6.18 rows=1 width=4) (actual time=0.018..0.019 rows=1 loops=7)
Recheck Cond: (id = w.account_id)
Filter: ((amount > '0'::numeric) AND (round((amount / (payment_renewal_period)::numeric), 2) >= '1'::numeric) AND (round((amount / (payment_renewal_period)::numeric), 2) <= '19'::numeric))
Heap Blocks: exact=7
-> Bitmap Index Scan on accounts_pkey (cost=0.00..2.15 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=7)
Index Cond: (id = w.account_id)
-> Bitmap Heap Scan on positions p (cost=126.18..511.57 rows=4631 width=16) (actual time=0.930..2.341 rows=8797 loops=7)
Recheck Cond: (website_id = w.parent_id)
Heap Blocks: exact=1004
-> Bitmap Index Scan on index_positions_on_5_columns (cost=0.00..125.02 rows=4631 width=0) (actual time=0.906..0.906 rows=8797 loops=7)
Index Cond: (website_id = w.parent_id)
Planning time: 1.124 ms
Execution time: 157.167 ms
Indexes on websites
Indexes:
"websites_pkey" PRIMARY KEY, btree (id)
"index_websites_on_account_id" btree (account_id)
"index_websites_on_deleted_at" btree (deleted_at)
"index_websites_on_domain_id" btree (domain_id)
"index_websites_on_parent_id" btree (parent_id)
"index_websites_on_processed_at" btree (processed_at)
Indexes on positions
Indexes:
"positions_pkey" PRIMARY KEY, btree (id)
"index_positions_on_5_columns" UNIQUE, btree (website_id, keyword_id, created_at, engine_id, region_id)
"overlap_index" btree (keyword_id, created_at)
The second EXPLAIN output shows more than 200 times fewer rows, so it is hardly surprising that sorting is much faster.
You will notice that the sort spills to disk in both cases (Sort Method: external merge Disk: ...kB). If you can keep the sort in memory by raising work_mem, it will be much faster.
But the first sort is so large that you won't be able to fit it in memory.
Ideas to speed up the query:
An index on (keyword_id, created_at)for positions. Not sure if that helps though.
Do the filtering first, like this:
SELECT
a.id AS account_id,
w.parent_id AS parent_id,
w.name AS name,
p.position AS position
FROM (SELECT DISTINCT ON (keyword_id)
positions,
website_id,
keyword_id,
created_at
FROM positions
ORDER BY keyword_id, created_at DESC) p
JOIN ...
WHERE ...
ORDER BY p.keyword_id, p.created_at DESC;
Remark: The DISTINCT ON is somewhat strange, since you do not ORDER BY the values of the SELECT list, so the result values are not well defined.