Postgres index not used when using BETWEEN inside join - postgresql

I have the following table in my database:
business_db_dev=# \d schedules2
Table "public.schedules2"
Column | Type | Collation | Nullable | Default
-------------+--------------------------------+-----------+----------+----------------------------------------
id | bigint | | not null | nextval('schedules2_id_seq'::regclass)
monday | boolean | | not null |
tuesday | boolean | | not null |
wednesday | boolean | | not null |
thursday | boolean | | not null |
friday | boolean | | not null |
saturday | boolean | | not null |
sunday | boolean | | not null |
start1 | time(0) without time zone | | |
end1 | time(0) without time zone | | |
start2 | time(0) without time zone | | |
end2 | time(0) without time zone | | |
user_id | bigint | | not null |
inserted_at | timestamp(0) without time zone | | not null |
updated_at | timestamp(0) without time zone | | not null |
Indexes:
"schedules2_pkey" PRIMARY KEY, btree (id)
"schedules2_start1_end1_DESC_NULLS_LAST_index" btree (start1, end1 DESC NULLS LAST)
"schedules2_start2_end2_DESC_NULLS_LAST_index" btree (start2, end2 DESC NULLS LAST)
"schedules2_user_id_index" UNIQUE, btree (user_id)
Foreign-key constraints:
"schedules2_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
I also have other tables that I use to do a join with that one (users and strategies) which I will not post here for brevity, but if it is needed you can just ask and I will update the question with their structures too.
Given this table, I'm trying to run the following query:
select u.token
from strategies as st
inner join users as u on (st.user_id = u.id)
inner join schedules2 as sc on (st.user_id = sc.user_id)
where st.exchange = 'binance'
and st.market_pair = 'btc_usdt'
and st.timeframe = 'five_minutes'
and st.name = 'stoch_oscillator'
and st.inputs = '{5,3,3,80,20}'
and (sc.start1 is null or ('13:00:01'::time between sc.start1 and sc.end1) or ('13:00:01'::time between sc.start2 and sc.end2));
Running this query with explain analyze I got this result:
Nested Loop (cost=1.27..215.56 rows=16 width=6) (actual time=0.076..6.050 rows=942 loops=1)
Join Filter: (st.user_id = u.id)
-> Nested Loop (cost=0.98..197.89 rows=17 width=16) (actual time=0.070..3.650 rows=942 loops=1)
-> Index Only Scan using unique_strategy_and_user_id on strategies st (cost=0.69..7.29 rows=80 width=8) (actual time=0.056..1.083 rows=1000 loops=1)
Index Cond: ((exchange = 'binance'::text) AND (market_pair = 'btc_usdt'::text) AND (timeframe = 'five_minutes'::text) AND (name = 'stoch_oscillator'::text) AND (inputs = '{5,3,3,80,20}'::character varying[]))
Heap Fetches: 0
-> Index Scan using schedules2_user_id_index on schedules2 sc (cost=0.29..2.38 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
Index Cond: (user_id = st.user_id)
Filter: ((start1 IS NULL) OR (('13:00:01'::time without time zone >= start1) AND ('13:00:01'::time without time zone <= end1)) OR (('13:00:01'::time without time zone >= start2) AND ('13:00:01'::time without time zone <= end2)))
Rows Removed by Filter: 0
-> Index Scan using users_pkey on users u (cost=0.29..1.03 rows=1 width=14) (actual time=0.002..0.002 rows=1 loops=942)
Index Cond: (id = sc.user_id)
Planning Time: 0.834 ms
Execution Time: 6.130 ms
The important part is this one:
Index Scan using schedules2_user_id_index on schedules2 sc (cost=0.29..2.38 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
Index Cond: (user_id = st.user_id)
Filter: ((start1 IS NULL) OR (('13:00:01'::time without time zone >= start1) AND ('13:00:01'::time without time zone <= end1)) OR (('13:00:01'::time without time zone >= start2) AND ('13:00:01'::time without time zone <= end2)))
As you can see, Postgres is using Filter to check the values start1, end1, start2 and end2, but I was expecting that Postgres would use the two indexes I created for this exact condition:
"schedules2_start1_end1_DESC_NULLS_LAST_index" btree (start1, end1 DESC NULLS LAST)
"schedules2_start2_end2_DESC_NULLS_LAST_index" btree (start2, end2 DESC NULLS LAST)
Removing the join with schedules2 table and its condition basically halves the query time.
So, my question is: why is Postgres using a Filter instead of my indexes, and how can I change the query or the indexes themselves to optimize this query?
Edit: Note that the values used in the query (like '13:00:01'::time) are just examples, in my system this can be anything.

Index Cond: (user_id = st.user_id)
Filter: ....
Rows Removed by Filter: 0
The index it is already using is already perfect. It found no extra rows which had to be removed by the filter (or at least, so few that they rounded to zero). How could that be improved upon by using more indexes?
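One way to check this yourself is to measure how selective the time predicate actually is. The following is a minimal sketch that assumes only the schedules2 table and the example time value from the question; if nearly every row passes, an index on the time columns has almost nothing to skip.

```sql
-- Compare the number of schedules that pass the time predicate
-- with the total number of rows in the table.
SELECT count(*) AS total_rows,
       count(*) FILTER (
           WHERE start1 IS NULL
              OR '13:00:01'::time BETWEEN start1 AND end1
              OR '13:00:01'::time BETWEEN start2 AND end2
       ) AS rows_passing_filter
FROM schedules2;
```

In the posted plan, 942 of the 1000 lookups on schedules2 returned a row, so the filter discards only about 6% of the candidates that the user_id index already found.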

Related

In PostgreSQL what does hashed subplan mean?

I want to know how the optimizer rewrote the query and how to read the execution plan in PostgreSQL
Here is the sample code.
DROP TABLE ords;
CREATE TABLE ords (
ORD_ID INT NOT NULL,
ORD_PROD_ID VARCHAR(2) NOT NULL,
ETC_CONTENT VARCHAR(100));
ALTER TABLE ords ADD CONSTRAINT ords_PK PRIMARY KEY(ORD_ID);
CREATE INDEX ords_X01 ON ords(ORD_PROD_ID);
INSERT INTO ords
SELECT i
,chr(64+case when i <= 10 then i else 26 end)
,rpad('x',100,'x')
FROM generate_series(1,10000) a(i);
SELECT COUNT(*) FROM ords WHERE ORD_PROD_ID IN ('A','B','C');
DROP TABLE delivery;
CREATE TABLE delivery (
ORD_ID INT NOT NULL,
VEHICLE_ID VARCHAR(2) NOT NULL,
ETC_REMARKS VARCHAR(100));
ALTER TABLE delivery ADD CONSTRAINT delivery_PK primary key (ORD_ID, VEHICLE_ID);
CREATE INDEX delivery_X01 ON delivery(VEHICLE_ID);
INSERT INTO delivery
SELECT i
, chr(88 + case when i <= 10 then mod(i,2) else 2 end)
, rpad('x',100,'x')
FROM generate_series(1,10000) a(i);
analyze ords;
analyze delivery;
This is the SQL I am interested in.
SELECT *
FROM ords a
WHERE ( EXISTS (SELECT 1
FROM delivery b
WHERE a.ORD_ID = b.ORD_ID
AND b.VEHICLE_ID IN ('X','Y')
)
OR a.ORD_PROD_ID IN ('A','B','C')
);
Here is the execution plan
| Seq Scan on portal.ords a (actual time=0.038..2.027 rows=10 loops=1) |
| Output: a.ord_id, a.ord_prod_id, a.etc_content |
| Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR ((a.ord_prod_id)::text = ANY ('{A,B,C}'::text[]))) |
| Rows Removed by Filter: 9990 |
| Buffers: shared hit=181 |
| SubPlan 1 |
| -> Index Only Scan using delivery_pk on portal.delivery b (never executed) |
| Index Cond: (b.ord_id = a.ord_id) |
| Filter: ((b.vehicle_id)::text = ANY ('{X,Y}'::text[])) |
| Heap Fetches: 0 |
| SubPlan 2 |
| -> Index Scan using delivery_x01 on portal.delivery b_1 (actual time=0.023..0.025 rows=10 loops=1) |
| Output: b_1.ord_id |
| Index Cond: ((b_1.vehicle_id)::text = ANY ('{X,Y}'::text[])) |
| Buffers: shared hit=8 |
| Planning: |
| Buffers: shared hit=78 |
| Planning Time: 0.302 ms |
| Execution Time: 2.121 ms
I don't know how the optimizer transformed the SQL.
What is the final SQL the optimizer rewrote?
I have only one EXISTS sub-query in the SQL above, why are there two sub-plans?
What does "hashed Sub-Plan 2" mean?
I would appreciate it if anyone could share a little knowledge with me.
You have the misconception that the optimizer rewrites the SQL statement. That is not the case. Rewriting the query is the job of the query rewriter, which for example replaces views with their definition. The optimizer comes up with a sequence of execution steps to compute the result. It produces a plan, not an SQL statement.
The optimizer plans two alternatives: either execute subplan 1 for each row found, or execute subplan 2 once (note that it is independent of a), build a hash table from the result and probe that hash for each row found in a.
At execution time, PostgreSQL chose the latter strategy, which is why subplan 1 is never executed.
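As a rough mental model only (the planner produces a plan, not SQL), the two alternatives behave like the two formulations below. The first is the correlated form from the question, which corresponds to running SubPlan 1 once per row of ords. The second uses an uncorrelated IN subquery, which corresponds to running SubPlan 2 once and probing its hashed result for each row. The second statement is an illustration written for this comparison, not something PostgreSQL generates.

```sql
-- Correlated form: the subquery depends on a.ord_id,
-- so it has to be evaluated for each row of ords (like SubPlan 1).
SELECT *
FROM ords a
WHERE EXISTS (SELECT 1
              FROM delivery b
              WHERE b.ord_id = a.ord_id
                AND b.vehicle_id IN ('X','Y'))
   OR a.ord_prod_id IN ('A','B','C');

-- Uncorrelated form: the subquery does not reference a,
-- so it can be evaluated once and its result hashed (like hashed SubPlan 2).
SELECT *
FROM ords a
WHERE a.ord_id IN (SELECT b.ord_id
                   FROM delivery b
                   WHERE b.vehicle_id IN ('X','Y'))
   OR a.ord_prod_id IN ('A','B','C');
```

Both statements return the same rows here because ord_id is declared NOT NULL in both tables, so the usual NULL-related differences between IN and EXISTS do not arise.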

Why is a MAX query with an equality filter on one other column so slow in Postgresql?

I'm running into an issue in PostgreSQL (version 9.6.10) with indexes not working to speed up a MAX query with a simple equality filter on another column. Logically it seems that a simple multicolumn index on (A, B DESC) should make the query super fast.
I can't for the life of me figure out why I can't get a query to be performant regardless of what indexes are defined.
The table definition has the following:
- A primary key foo VARCHAR PRIMARY KEY (not used in the query)
- A UUID field that is NOT NULL called bar UUID
- A sequential_id column that was created as a BIGSERIAL UNIQUE type
Here's what the relevant columns look like exactly (with names modified for privacy):
Table "public.foo"
Column | Type | Modifiers
----------------------+--------------------------+--------------------------------------------------------------------------------
foo_uid | character varying | not null
bar_uid | uuid | not null
sequential_id | bigint | not null default nextval('foo_sequential_id_seq'::regclass)
Indexes:
"foo_pkey" PRIMARY KEY, btree (foo_uid)
"foo_bar_uid_sequential_id_idx", btree (bar_uid, sequential_id DESC)
"foo_sequential_id_key" UNIQUE CONSTRAINT, btree (sequential_id)
Despite having the index listed above on (bar_uid, sequential_id DESC), the following query requires an index scan and takes 100-300ms with a few million rows in the database.
The Query (get the max sequential_id for a given bar_uid):
SELECT MAX(sequential_id)
FROM foo
WHERE bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f';
The EXPLAIN ANALYZE result doesn't use the proper index. Also, for some reason it checks if sequential_id IS NOT NULL even though it's declared as not null.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.75..0.76 rows=1 width=8) (actual time=321.110..321.110 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..0.75 rows=1 width=8) (actual time=321.106..321.106 rows=1 loops=1)
-> Index Scan Backward using foo_sequential_id_key on foo (cost=0.43..98936.43 rows=308401 width=8) (actual time=321.106..321.106 rows=1 loops=1)
Index Cond: (sequential_id IS NOT NULL)
Filter: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Rows Removed by Filter: 920761
Planning time: 0.196 ms
Execution time: 321.127 ms
(9 rows)
I can add a seemingly unnecessary GROUP BY to this query, and that speeds it up a bit, but it's still really slow for a query that should be near instantaneous with indexes defined:
SELECT MAX(sequential_id)
FROM foo
WHERE bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'
GROUP BY bar_uid;
The EXPLAIN (ANALYZE, BUFFERS) result:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=8510.54..65953.61 rows=6 width=24) (actual time=234.529..234.530 rows=1 loops=1)
Group Key: bar_uid
Buffers: shared hit=1 read=11909
-> Bitmap Heap Scan on foo (cost=8510.54..64411.55 rows=308401 width=24) (actual time=65.259..201.969 rows=309023 loops=1)
Recheck Cond: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Heap Blocks: exact=10385
Buffers: shared hit=1 read=11909
-> Bitmap Index Scan on foo_bar_uid_sequential_id_idx (cost=0.00..8433.43 rows=308401 width=0) (actual time=63.549..63.549 rows=309023 loops=1)
Index Cond: (bar_uid = 'fa61424d-389f-4e75-ba2d-b77e6bb8491f'::uuid)
Buffers: shared read=1525
Planning time: 3.067 ms
Execution time: 234.589 ms
(12 rows)
Does anyone have any idea what's blocking this query from being on the order of 10 milliseconds? This should logically be instantaneous with the right index defined. It should only require the time to follow links to the leaf value in the B-Tree.
Someone asked:
What do you get for SELECT * FROM pg_stats WHERE tablename = 'foo' and attname = 'bar_uid';?
schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram
------------+------------------------+-------------+-----------+-----------+-----------+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+------------------+-------------+-------------------+------------------------+----------------------
public | foo | bar_uir | f | 0 | 16 | 6 | {fa61424d-389f-4e75-ba2d-b77e6bb8491f,5c5dcae9-1b7e-4413-99a1-62fde2b89c32,50b1e842-fc32-4c2c-b00f-4a17c3c1c5fa,7ff1999c-c0ea-b700-343f-9a737f6ad659,f667b353-e199-4890-9ffd-4940ea11fe2c,b24ce968-29fd-4587-ba1f-227036ee3135} | {0.203733,0.203167,0.201567,0.195867,0.1952,0.000466667} | | -0.158093 | | |
(1 row)
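The first plan above walks foo_sequential_id_key backward and discards 920761 rows before it reaches one with the requested bar_uid. The following diagnostic sketch, which assumes only the foo table as shown, reports where each bar_uid's rows sit in the sequential_id range. If the requested value's highest sequential_id is far below the table's maximum, that would be consistent with the long backward walk.

```sql
-- For each bar_uid, show how many rows it has and the range of
-- sequential_id values it covers. Note: this reads the whole table.
SELECT bar_uid,
       count(*)           AS row_count,
       min(sequential_id) AS lowest_id,
       max(sequential_id) AS highest_id
FROM foo
GROUP BY bar_uid
ORDER BY highest_id DESC;
```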

slow order by "field" and limit

I have a simple query that must fetch 1 record from a table with about 14 million records:
EXPLAIN ANALYZE SELECT "projects_toolresult"."id",
"projects_toolresult"."tool_id",
"projects_toolresult"."status",
"projects_toolresult"."updated_at",
"projects_toolresult"."created_at" FROM
"projects_toolresult" WHERE
("projects_toolresult"."status" = 1 AND
"projects_toolresult"."tool_id" = 21)
ORDER BY "projects_toolresult"."updated_at"
DESC LIMIT 1;
Strangely, when I order the query by the updated_at field, it takes about 60 seconds to execute.
Limit (cost=0.43..510.94 rows=1 width=151) (actual time=56754.932..56754.932 rows=0 loops=1)
-> Index Scan using projects_to_updated_266459_idx on projects_toolresult (cost=0.43..1800549.09 rows=3527 width=151) (actual time=56754.930..56754.930 rows=0 loops=1)
Filter: ((status = 1) AND (tool_id = 21))
Rows Removed by Filter: 13709343
Planning time: 0.236 ms
Execution time: 56754.968 ms
(6 rows)
It makes no difference whether the order is ASC or DESC.
But if I use ORDER BY random() or no ordering at all:
Limit (cost=23496.10..23496.10 rows=1 width=151) (actual time=447.532..447.532 rows=0 loops=1)
-> Sort (cost=23496.10..23505.20 rows=3642 width=151) (actual time=447.530..447.530 rows=0 loops=1)
Sort Key: (random())
Sort Method: quicksort Memory: 25kB
-> Index Scan using projects_toolresult_tool_id_34a3bb16 on projects_toolresult (cost=0.56..23477.89 rows=3642 width=151) (actual time=447.513..447.513 rows=0 loops=1)
Index Cond: (tool_id = 21)
Filter: (status = 1)
Rows Removed by Filter: 6097
Planning time: 0.224 ms
Execution time: 447.571 ms
(10 rows)
It runs fast.
I have an index on the updated_at and status fields (I also tried without it). I also changed the default Postgres settings, increasing the values with this generator: https://pgtune.leopard.in.ua/#/
And this is what happens when these queries are in action.
Postgres version 9.5
My table and indexes:
id | integer | not null default nextval('projects_toolresult_id_seq'::regclass)
status | smallint | not null
object_id | integer | not null
created_at | timestamp with time zone | not null
content_type_id | integer | not null
tool_id | integer | not null
updated_at | timestamp with time zone | not null
output_data | text | not null
Indexes:
"projects_toolresult_pkey" PRIMARY KEY, btree (id)
"projects_toolresult_content_type_id_object_i_71ee2c2e_uniq" UNIQUE CONSTRAINT, btree (content_type_id, object_id, tool_id)
"projects_to_created_cee389_idx" btree (created_at)
"projects_to_tool_id_ec7856_idx" btree (tool_id, status)
"projects_to_updated_266459_idx" btree (updated_at)
"projects_toolresult_content_type_id_9924d905" btree (content_type_id)
"projects_toolresult_tool_id_34a3bb16" btree (tool_id)
Check constraints:
"projects_toolresult_object_id_check" CHECK (object_id >= 0)
"projects_toolresult_status_check" CHECK (status >= 0)
Foreign-key constraints:
"projects_toolresult_content_type_id_9924d905_fk_django_co" FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
"projects_toolresult_tool_id_34a3bb16_fk_projects_tool_id" FOREIGN KEY (tool_id) REFERENCES projects_tool(id) DEFERRABLE INITIALLY DEFERRED
You are filtering your data on status and tool_id, and sorting on updated_at but you have no single index for all three of those columns.
Add an index, like so:
CREATE INDEX ON projects_toolresult (status, tool_id, updated_at);
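A slightly fuller sketch of the same suggestion, with an explicit index name (the name is hypothetical) and a re-check of the plan. The equality columns come first and the sort column last; a B-tree index can be scanned backward, so declaring updated_at as DESC should not be necessary for this query.

```sql
-- Hypothetical index name; equality columns first, then the sort column.
CREATE INDEX projects_toolresult_status_tool_updated_idx
    ON projects_toolresult (status, tool_id, updated_at);

-- Refresh planner statistics before comparing plans.
ANALYZE projects_toolresult;

-- Re-run the original query to see whether the new index is chosen.
EXPLAIN ANALYZE
SELECT id, tool_id, status, updated_at, created_at
FROM projects_toolresult
WHERE status = 1 AND tool_id = 21
ORDER BY updated_at DESC
LIMIT 1;
```

If the index is used, the plan should show an Index Cond on status and tool_id rather than a Filter that removes millions of rows.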

PostgreSQL high CPU usage

I have a table as following schema:
location=# \d locations
Table "public.locations"
Column | Type | Modifiers
-----------+--------------------------+--------------------------------------------------------
id | integer | not null default nextval('locations_id_seq'::regclass)
phone | text | not null
longitude | text | not null
latitude | text | not null
date | text | not null
createdAt | timestamp with time zone |
updatedAt | timestamp with time zone |
Indexes:
"locations_pkey" PRIMARY KEY, btree (id)
"createdAt_idx" btree ("createdAt")
"phone_idx" btree (phone)
and it has 14928439 rows:
location=# select count(*) from locations;
count
----------
14928439
I have an HTTP API that looks up a user's most recently uploaded coordinate by phone, but the query behind it is slow: select * from "locations" where "phone" = '15828354860' order by "createdAt" desc limit 1;
And then I EXPLAIN it:
location=# EXPLAIN ANALYZE select * from "locations" where "phone" = '15828354860' order by "createdAt" desc limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..5.22 rows=1 width=74) (actual time=4779.584..4779.584 rows=1 loops=1)
-> Index Scan Backward using "createdAt_idx" on locations (cost=0.43..663339.70 rows=138739 width=74) (actual time=4779.583..4779.583 rows=1 loops=1)
Filter: (phone = '15828354860'::text)
Rows Removed by Filter: 2027962
Planning time: 0.101 ms
Execution time: 4779.612 ms
(6 rows)
It takes 4.7 s to execute. How can I improve the query speed?
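The plan walks "createdAt_idx" backward and discards 2027962 rows through the Filter on phone before it finds a match. The following is a sketch of a composite index that might avoid this, assuming the query shape stays as posted; the index name is hypothetical.

```sql
-- Hypothetical index name; the equality column comes first,
-- then the column used for ORDER BY.
CREATE INDEX locations_phone_created_at_idx
    ON locations (phone, "createdAt");

-- Re-run the original query to see whether the new index is chosen.
EXPLAIN ANALYZE
SELECT *
FROM locations
WHERE phone = '15828354860'
ORDER BY "createdAt" DESC
LIMIT 1;
```

With such an index, the newest row for a given phone sits at one end of that phone's range in the index, so the scan would not need to step over rows belonging to other phones.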

A slow SQL statement, is there any way to optimize it?

Our application has a very slow statement that takes more than 11 seconds, so I want to know whether there is any way to optimize it.
The SQL statement
SELECT id FROM mapfriends.cell_forum_topic WHERE id in (
SELECT topicid FROM mapfriends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid )
AND categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
id
---------
2471959
2382296
1535967
2432006
2367281
2159706
1501759
1549304
2179763
1598043
(10 rows)
Time: 11444.976 ms
Plan
friends=> explain SELECT id FROM friends.cell_forum_topic WHERE id in (
friends(> SELECT topicid FROM friends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid)
friends-> AND categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1443.15..1443.15 rows=2 width=12)
-> Sort (cost=1443.15..1443.15 rows=2 width=12)
Sort Key: cell_forum_topic.restoretime
-> Nested Loop (cost=1434.28..1443.14 rows=2 width=12)
-> HashAggregate (cost=1434.28..1434.30 rows=2 width=4)
-> Index Scan using cell_forum_item_idx_skyid on cell_forum_item (cost=0.00..1430.49 rows=1516 width=4)
Index Cond: (skyid = 103230293)
-> Index Scan using cell_forum_topic_pkey on cell_forum_topic (cost=0.00..4.40 rows=1 width=12)
Index Cond: (cell_forum_topic.id = cell_forum_item.topicid)
Filter: ((NOT cell_forum_topic.hidden) AND (cell_forum_topic.categoryid = 29))
(10 rows)
Time: 1.109 ms
Indexes
friends=> \d cell_forum_item
Table "friends.cell_forum_item"
Column | Type | Modifiers
---------+--------------------------------+--------------------------------------------------------------
id | integer | not null default nextval('cell_forum_item_id_seq'::regclass)
topicid | integer | not null
skyid | integer | not null
content | character varying(200) |
addtime | timestamp(0) without time zone | default now()
ischeck | boolean |
Indexes:
"cell_forum_item_pkey" PRIMARY KEY, btree (id)
"cell_forum_item_idx" btree (topicid, skyid)
"cell_forum_item_idx_1" btree (topicid, id)
"cell_forum_item_idx_skyid" btree (skyid)
friends=> \d cell_forum_topic
Table "friends.cell_forum_topic"
Column | Type | Modifiers
-------------+--------------------------------+--------------------------------------------------------------------------------------
id | integer | not null default nextval(('"friends"."cell_forum_topic_id_seq"'::text)::regclass)
categoryid | integer | not null
topic | character varying | not null
content | character varying | not null
skyid | integer | not null
addtime | timestamp(0) without time zone | default now()
reference | integer | default 0
restore | integer | default 0
restoretime | timestamp(0) without time zone | default now()
locked | boolean | default false
settop | boolean | default false
hidden | boolean | default false
feature | boolean | default false
picid | integer | default 29249
managerid | integer |
imageid | integer | default 0
pass | boolean | default false
ischeck | boolean |
Indexes:
"cell_forum_topic_pkey" PRIMARY KEY, btree (id)
"idx_cell_forum_topic_1" btree (categoryid, settop, hidden, restoretime, skyid)
"idx_cell_forum_topic_2" btree (categoryid, hidden, restoretime, skyid)
"idx_cell_forum_topic_3" btree (categoryid, hidden, restoretime)
"idx_cell_forum_topic_4" btree (categoryid, hidden, restore)
"idx_cell_forum_topic_5" btree (categoryid, hidden, restoretime, feature)
"idx_cell_forum_topic_6" btree (categoryid, settop, hidden, restoretime)
Explain analyze
mapfriends=> explain analyze SELECT id FROM mapfriends.cell_forum_topic
mapfriends-> join (SELECT topicid FROM mapfriends.cell_forum_item WHERE skyid=103230293 GROUP BY topicid) as tmp
mapfriends-> on mapfriends.cell_forum_topic.id=tmp.topicid
mapfriends-> where categoryid=29 AND hidden=false ORDER BY restoretime DESC LIMIT 10 OFFSET 0;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1446.89..1446.90 rows=2 width=12) (actual time=18016.006..18016.013 rows=10 loops=1)
-> Sort (cost=1446.89..1446.90 rows=2 width=12) (actual time=18016.001..18016.002 rows=10 loops=1)
Sort Key: cell_forum_topic.restoretime
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=1438.02..1446.88 rows=2 width=12) (actual time=16988.492..18015.869 rows=20 loops=1)
-> HashAggregate (cost=1438.02..1438.04 rows=2 width=4) (actual time=15446.735..15447.243 rows=610 loops=1)
-> Index Scan using cell_forum_item_idx_skyid on cell_forum_item (cost=0.00..1434.22 rows=1520 width=4) (actual time=302.378..15429.782 rows=7133 loops=1)
Index Cond: (skyid = 103230293)
-> Index Scan using cell_forum_topic_pkey on cell_forum_topic (cost=0.00..4.40 rows=1 width=12) (actual time=4.210..4.210 rows=0 loops=610)
Index Cond: (cell_forum_topic.id = cell_forum_item.topicid)
Filter: ((NOT cell_forum_topic.hidden) AND (cell_forum_topic.categoryid = 29))
Total runtime: 18019.461 ms
Could you give us some more information about the tables (the statistics) and the configuration?
SELECT version();
SELECT category, name, setting FROM pg_settings WHERE name IN('effective_cache_size', 'enable_seqscan', 'shared_buffers');
SELECT * FROM pg_stat_user_tables WHERE relname IN('cell_forum_topic', 'cell_forum_item');
SELECT * FROM pg_stat_user_indexes WHERE relname IN('cell_forum_topic', 'cell_forum_item');
SELECT * FROM pg_stats WHERE tablename IN('cell_forum_topic', 'cell_forum_item');
And before getting this data, use ANALYZE.
It looks like you have a problem with an index; this is where the query spends nearly all its time:
-> Index Scan using cell_forum_item_idx_skyid on cell_forum_item (cost=0.00..1434.22 rows=1520 width=4) (actual time=302.378..15429.782 rows=7133 loops=1)
If you use VACUUM FULL on a regular basis (NOT RECOMMENDED!), index bloat might be your problem. A REINDEX might be a good idea, just to be sure:
REINDEX TABLE cell_forum_item;
And speaking of indexes, you can drop a couple of them; these are redundant:
"idx_cell_forum_topic_6" btree (categoryid, settop, hidden, restoretime)
"idx_cell_forum_topic_3" btree (categoryid, hidden, restoretime)
Each one is a leading-column prefix of another index (idx_cell_forum_topic_1 and idx_cell_forum_topic_2 respectively), so the database can use those instead.
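Before dropping anything, it may be worth looking at how often each index is actually used. This is a sketch that assumes the statistics collector is enabled; the schema name in the DROP statements is taken from the question (which shows both friends and mapfriends) and may need adjusting.

```sql
-- Show scan counts for every index on the topic table.
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'cell_forum_topic'
ORDER BY indexrelname;

-- Once you are satisfied that the longer indexes cover the same queries,
-- drop the two redundant ones (adjust the schema name as needed).
DROP INDEX mapfriends.idx_cell_forum_topic_6;
DROP INDEX mapfriends.idx_cell_forum_topic_3;
```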
It looks like you have a couple of problems:
- Autovacuum is turned off or it's way behind. The last autovacuum was on 2010-12-02 and you have 256734 dead tuples in one table and 451430 dead ones in the other. You have to do something about this; it is a serious problem.
- When autovacuum is working again, you have to do a VACUUM FULL and a REINDEX to force a table rewrite and get rid of all empty space in your tables.
- After fixing the vacuum problem, you have to ANALYZE as well: the database expects 1520 results but it gets 7133 results. This could be a problem with statistics; maybe you have to increase the STATISTICS target.
- The query itself needs some rewriting as well: it gets 7133 results but it needs only 610 results. Over 90% of the results are lost, and getting these 7133 takes a lot of time, over 15 seconds. Get rid of the subquery by using a JOIN without the GROUP BY, or use EXISTS, also without the GROUP BY.
But first get autovacuum back on track, before you get new or other problems.
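The EXISTS rewrite and the statistics change mentioned above could look roughly like the sketch below. It reuses the table and column names from the question; the statistics target of 1000 is an illustrative value, not a recommendation derived from the posted data.

```sql
-- EXISTS form of the original query, without the GROUP BY subquery.
SELECT t.id
FROM mapfriends.cell_forum_topic AS t
WHERE t.categoryid = 29
  AND t.hidden = false
  AND EXISTS (
        SELECT 1
        FROM mapfriends.cell_forum_item AS i
        WHERE i.topicid = t.id
          AND i.skyid = 103230293
      )
ORDER BY t.restoretime DESC
LIMIT 10 OFFSET 0;

-- Collect more detailed statistics for skyid (illustrative target value),
-- then refresh them so the row estimate can improve.
ALTER TABLE mapfriends.cell_forum_item ALTER COLUMN skyid SET STATISTICS 1000;
ANALYZE mapfriends.cell_forum_item;
```

Compare the EXPLAIN ANALYZE output before and after each change separately, so it is clear which one actually affects the plan.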
The problem isn't a lack of query-plan caching; it is most likely the choice of plan, caused by a lack of appropriate indexes.