Seemingly random delay in queries - PostgreSQL

This is a follow-up to this issue I posted a while ago.
I have the following code:
SET work_mem = '16MB';
SELECT s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
FROM rm_o_resource_usage_instance_splits_new s
INNER JOIN rm_o_resource_usage r ON s.usage_id = r.id
INNER JOIN scheduledactivities sa ON s.activity_index = sa.activity_index AND r.schedule_id = sa.solution_id and s.solution = sa.solution_id
WHERE r.schedule_id = 10
ORDER BY r.resource_id, s.start_date
When I run EXPLAIN (ANALYZE, BUFFERS) I get the following:
Sort (cost=3724.02..3724.29 rows=105 width=89) (actual time=245.802..247.573 rows=22302 loops=1)
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6692kB
Buffers: shared hit=198702 read=5993 written=612
-> Nested Loop (cost=703.76..3720.50 rows=105 width=89) (actual time=1.898..164.741 rows=22302 loops=1)
Buffers: shared hit=198702 read=5993 written=612
-> Hash Join (cost=703.34..3558.54 rows=105 width=101) (actual time=1.815..11.259 rows=22302 loops=1)
Hash Cond: (s.usage_id = r.id)
Buffers: shared hit=3 read=397 written=2
-> Bitmap Heap Scan on rm_o_resource_usage_instance_splits_new s (cost=690.61..3486.58 rows=22477 width=69) (actual time=1.782..5.820 rows=22302 loops=1)
Recheck Cond: (solution = 10)
Heap Blocks: exact=319
Buffers: shared hit=2 read=396 written=2
-> Bitmap Index Scan on rm_o_resource_usage_instance_splits_new_solution_idx (cost=0.00..685.00 rows=22477 width=0) (actual time=1.609..1.609 rows=22302 loops=1)
Index Cond: (solution = 10)
Buffers: shared hit=2 read=77
-> Hash (cost=12.66..12.66 rows=5 width=48) (actual time=0.023..0.023 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1 read=1
-> Bitmap Heap Scan on rm_o_resource_usage r (cost=4.19..12.66 rows=5 width=48) (actual time=0.020..0.020 rows=1 loops=1)
Recheck Cond: (schedule_id = 10)
Heap Blocks: exact=1
Buffers: shared hit=1 read=1
-> Bitmap Index Scan on rm_o_resource_usage_sched (cost=0.00..4.19 rows=5 width=0) (actual time=0.017..0.017 rows=1 loops=1)
Index Cond: (schedule_id = 10)
Buffers: shared read=1
-> Index Scan using scheduledactivities_activity_index_idx on scheduledactivities sa (cost=0.42..1.53 rows=1 width=16) (actual time=0.004..0.007 rows=1 loops=22302)
Index Cond: (activity_index = s.activity_index)
Filter: (solution_id = 10)
Rows Removed by Filter: 5
Buffers: shared hit=198699 read=5596 written=610
Planning time: 7.070 ms
Execution time: 248.691 ms
Every time I run EXPLAIN, I get roughly the same results. The execution time is always between 170 ms and 250 ms, which, to me, is perfectly fine. However, when this query is run through a C++ project (using PQexec(conn, query), where conn is a dedicated connection and query is the above query), the time it takes seems to vary widely. In general the query is very quick and you don't notice a delay. The problem is that on occasion this query will take 2 to 3 minutes to complete.
If I open pgAdmin and have a look at the "server activity" for the database, there are about 30 or so connections, mostly sitting at "idle". The above query's connection is marked as "active", and will stay "active" for several minutes.
I am at a loss as to why it randomly takes several minutes to complete the same query, with no change in the data in the DB either. I have tried increasing work_mem, which didn't make any difference (nor did I really expect it to). Any help or suggestions would be greatly appreciated.
There aren't any more specific tags. I'm currently using Postgres 10.11, but it's also been an issue on other versions of 10.x. The system is a quad-core Xeon @ 3.4 GHz, with an SSD and 24 GB of memory.
Per jjanes's suggestion, I put in auto_explain, set up roughly like this (a sketch from memory; the exact thresholds may have differed):
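LOAD 'auto_explain';                        -- or shared_preload_libraries = 'auto_explain' in postgresql.conf
SET auto_explain.log_min_duration = '10s';  -- only log statements slower than this (value is a guess)
SET auto_explain.log_analyze = on;          -- include actual row counts and timings
SET auto_explain.log_buffers = on;          -- include buffer usage
SET auto_explain.log_verbose = on;          -- include the Output: column lists

Eventually I got this output: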
duration: 128057.373 ms
plan:
Query Text: SET work_mem = '32MB';SELECT s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset FROM rm_o_resource_usage_instance_splits_new s INNER JOIN rm_o_resource_usage r ON s.usage_id = r.id INNER JOIN scheduledactivities sa ON s.activity_index = sa.activity_index AND r.schedule_id = sa.solution_id and s.solution = sa.solution_id WHERE r.schedule_id = 12642 ORDER BY r.resource_id, s.start_date
Sort (cost=14.36..14.37 rows=1 width=98) (actual time=128042.083..128043.287 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6585kB
Buffers: shared hit=21198435 read=388 dirtied=119
-> Nested Loop (cost=0.85..14.35 rows=1 width=98) (actual time=4.995..127958.935 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 705476285
Buffers: shared hit=21198435 read=388 dirtied=119
-> Nested Loop (cost=0.42..9.74 rows=1 width=110) (actual time=0.091..227.705 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, s.solution, r.resource_id, r.schedule_id
Inner Unique: true
Join Filter: (s.usage_id = r.id)
Buffers: shared hit=22102 read=388 dirtied=119
-> Index Scan using rm_o_resource_usage_instance_splits_new_solution_idx on public.rm_o_resource_usage_instance_splits_new s (cost=0.42..8.44 rows=1 width=69) (actual time=0.082..17.418 rows=21899 loops=1)
Output: s.start_time, s.end_time, s.resources, s.activity_index, s.usage_id, s.start_date, s.end_date, s.solution
Index Cond: (s.solution = 12642)
Buffers: shared hit=203 read=388 dirtied=119
-> Seq Scan on public.rm_o_resource_usage r (cost=0.00..1.29 rows=1 width=57) (actual time=0.002..0.002 rows=1 loops=21899)
Output: r.id, r.schedule_id, r.resource_id
Filter: (r.schedule_id = 12642)
Rows Removed by Filter: 26
Buffers: shared hit=21899
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.60 rows=1 width=16) (actual time=0.006..4.612 rows=32216 loops=21899)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12642)
Buffers: shared hit=21176333
EDIT: Full definitions of the tables are below:
CREATE TABLE public.rm_o_resource_usage_instance_splits_new
(
start_time integer NOT NULL,
end_time integer NOT NULL,
resources jsonb NOT NULL,
activity_index integer NOT NULL,
usage_id bigint NOT NULL,
start_date text COLLATE pg_catalog."default" NOT NULL,
end_date text COLLATE pg_catalog."default" NOT NULL,
solution bigint NOT NULL,
CONSTRAINT rm_o_resource_usage_instance_splits_new_pkey PRIMARY KEY (start_time, activity_index, usage_id),
CONSTRAINT rm_o_resource_usage_instance_splits_new_solution_fkey FOREIGN KEY (solution)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE,
CONSTRAINT rm_o_resource_usage_instance_splits_new_usage_id_fkey FOREIGN KEY (usage_id)
REFERENCES public.rm_o_resource_usage (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_activity_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(activity_index ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_solution_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(solution ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_usage_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(usage_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE TABLE public.rm_o_resource_usage
(
id bigint NOT NULL DEFAULT nextval('rm_o_resource_usage_id_seq'::regclass),
schedule_id bigint NOT NULL,
resource_id text COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT rm_o_resource_usage_pkey PRIMARY KEY (id),
CONSTRAINT rm_o_resource_usage_schedule_id_fkey FOREIGN KEY (schedule_id)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_idx
ON public.rm_o_resource_usage USING btree
(id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_sched
ON public.rm_o_resource_usage USING btree
(schedule_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE TABLE public.scheduledactivities
(
id bigint NOT NULL DEFAULT nextval('scheduledactivities_id_seq'::regclass),
solution_id bigint NOT NULL,
activity_id text COLLATE pg_catalog."default" NOT NULL,
sequence_index integer,
startminute integer,
finishminute integer,
issue text COLLATE pg_catalog."default",
activity_index integer NOT NULL,
is_objective boolean NOT NULL,
usedresourceset integer DEFAULT '-1'::integer,
start timestamp without time zone,
finish timestamp without time zone,
is_ore boolean,
is_ignored boolean,
CONSTRAINT scheduled_activities_pkey PRIMARY KEY (id),
CONSTRAINT scheduledactivities_solution_id_fkey FOREIGN KEY (solution_id)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_activity_id_idx
ON public.scheduledactivities USING btree
(activity_id COLLATE pg_catalog."default" ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_id_idx
ON public.scheduledactivities USING btree
(id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_idx
ON public.scheduledactivities USING btree
(solution_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduledactivities_activity_index_idx
ON public.scheduledactivities USING btree
(activity_index ASC NULLS LAST)
TABLESPACE pg_default;
EDIT: Additional output from auto_explain after adding index on scheduledactivities (solution_id, activity_index)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6283kB
Buffers: shared hit=20159117 read=375 dirtied=190
-> Nested Loop (cost=0.85..10.76 rows=1 width=100) (actual time=5.518..122489.627 rows=20761 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 668815615
Buffers: shared hit=20159117 read=375 dirtied=190
-> Nested Loop (cost=0.42..5.80 rows=1 width=112) (actual time=0.057..217.563 rows=20761 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, s.solution, r.resource_id, r.schedule_id
Inner Unique: true
Join Filter: (s.usage_id = r.id)
Buffers: shared hit=20947 read=375 dirtied=190
-> Index Scan using rm_o_resource_usage_instance_splits_new_solution_idx on public.rm_o_resource_usage_instance_splits_new s (cost=0.42..4.44 rows=1 width=69) (actual time=0.049..17.622 rows=20761 loops=1)
Output: s.start_time, s.end_time, s.resources, s.activity_index, s.usage_id, s.start_date, s.end_date, s.solution
Index Cond: (s.solution = 12644)
Buffers: shared hit=186 read=375 dirtied=190
-> Seq Scan on public.rm_o_resource_usage r (cost=0.00..1.35 rows=1 width=59) (actual time=0.002..0.002 rows=1 loops=20761)
Output: r.id, r.schedule_id, r.resource_id
Filter: (r.schedule_id = 12644)
Rows Removed by Filter: 22
Buffers: shared hit=20761
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.94 rows=1 width=16) (actual time=0.007..4.654 rows=32216 loops=20761)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12644)
Buffers: shared hit=20138170
The easiest way to reproduce the issue is to add more values to the three tables. I didn't delete any, only did a few thousand INSERTs.
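For example, a bulk insert along these lines (values are illustrative; it assumes usage_id 1 and solution 12645 already exist in the referenced tables, because of the foreign keys) is enough to trigger it:
INSERT INTO rm_o_resource_usage_instance_splits_new
       (start_time, end_time, resources, activity_index, usage_id,
        start_date, end_date, solution)
SELECT g, g + 10, '{}'::jsonb, g, 1, '2020-01-01', '2020-01-02', 12645
FROM generate_series(1, 5000) AS g;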

-> Index Scan using .. s (cost=0.42..8.44 rows=1 width=69) (actual time=0.082..17.418 rows=21899 loops=1)
Index Cond: (s.solution = 12642)
The planner thinks it will find 1 row, and instead finds 21899. That error can pretty clearly lead to bad plans. And a single equality condition should be estimated quite accurately, so I'd say the statistics on your table are way off. It could be that the autovacuum launcher is tuned poorly so it doesn't run often enough, or it could be that certain parts of your data change very rapidly (did you just insert 21899 rows with s.solution = 12642 immediately before running the query?) and so the stats can't be kept accurate enough.
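One way to check (a sketch, not part of the original answer) is to look at when the table was last analyzed, and to run ANALYZE by hand right after a bulk insert:
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'rm_o_resource_usage_instance_splits_new';

ANALYZE rm_o_resource_usage_instance_splits_new;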
-> Nested Loop ...
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 705476285
-> ...
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.60 rows=1 width=16) (actual time=0.006..4.612 rows=32216 loops=21899)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12642)
If you can't get it to use the Hash Join, you can at least reduce the harm of the Nested Loop by building an index on scheduledactivities (solution_id, activity_index). That way the activity_index criterion could be part of the Index Condition, rather than being a Join Filter. You could probably then drop the index exclusively on solution_id, as there is little point in maintaining both indexes.
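In DDL form, that suggestion looks something like this (the index name is illustrative):
CREATE INDEX scheduledactivities_solution_activity_idx
    ON public.scheduledactivities (solution_id, activity_index);

DROP INDEX public.scheduled_activities_idx;  -- the single-column solution_id index becomes redundant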

The SQL statement of the fast plan uses WHERE r.schedule_id = 10 and returns about 22000 rows (estimated: 105).
The SQL statement of the slow plan uses WHERE r.schedule_id = 12642 and returns about 21000 rows (estimated: only 1).
The slow plan uses nested loops instead of hash joins, probably because of a bad join estimate: the estimated row count is 1, but the actual row count is 21899.
For example in this step:
Nested Loop (cost=0.42..9.74 rows=1 width=110) (actual time=0.091..227.705 rows=21899 loops=1)
If the data does not change, there may be a statistics issue (data skew) for some columns.
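If so, one possible remedy (a sketch; 1000 is just an example target) is to collect more detailed statistics for the skewed column and re-analyze:
ALTER TABLE rm_o_resource_usage_instance_splits_new
    ALTER COLUMN solution SET STATISTICS 1000;
ANALYZE rm_o_resource_usage_instance_splits_new;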

Related

PostgreSQL table indexing

I want to index my tables for the following query:
select
t.*
from main_transaction t
left join main_profile profile on profile.id = t.profile_id
left join main_customer customer on (customer.id = profile.user_id)
where
(upper(t.request_no) LIKE upper('%'||#requestNumber||'%') OR upper(customer.phone) LIKE upper('%'||#phoneNumber||'%'))
and t.service_type = 'SERVICE_1'
and t.status = 'SUCCESS'
and t.mode = 'AUTO'
and t.transaction_type = 'WITHDRAW'
and customer.client = 'corp'
and t.pub_date>='2018-09-05' and t.pub_date<='2018-11-05'
order by t.pub_date desc, t.id asc
LIMIT 1000;
This is how I tried to index my tables:
CREATE INDEX main_transaction_pr_id ON main_transaction (profile_id);
CREATE INDEX main_profile_user_id ON main_profile (user_id);
CREATE INDEX main_customer_client ON main_customer (client);
CREATE INDEX main_transaction_gin_req_no ON main_transaction USING gin (upper(request_no) gin_trgm_ops);
CREATE INDEX main_customer_gin_phone ON main_customer USING gin (upper(phone) gin_trgm_ops);
CREATE INDEX main_transaction_general ON main_transaction (service_type, status, mode, transaction_type); --> don't know if this one is true!!
After indexing as above, my query still spends over 4.5 seconds just selecting 1000 rows!
I am selecting from the following table, which has 34 columns (including 3 FOREIGN KEYs) and over 3 million rows:
CREATE TABLE main_transaction (
id integer NOT NULL DEFAULT nextval('main_transaction_id_seq'::regclass),
description character varying(255) NOT NULL,
request_no character varying(18),
account character varying(50),
service_type character varying(50),
pub_date" timestamptz(6) NOT NULL,
"service_id" varchar(50) COLLATE "pg_catalog"."default",
....
);
I am also joining two tables (main_profile, main_customer) to search customer.phone and to select customer.client. To get to the main_customer table from main_transaction, I can only go through main_profile.
My question is: how can I index my tables to increase performance for the above query?
Please do not suggest using UNION for the OR in this case, (upper(t.request_no) LIKE upper('%'||#requestNumber||'%') OR upper(customer.phone) LIKE upper('%'||#phoneNumber||'%')); can we use a CASE WHEN condition instead? I have to convert my PostgreSQL query into Hibernate JPA, and I don't know how to convert a UNION except with Hibernate native SQL, which I am not allowed to use.
Explain:
Limit (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.381 rows=1 loops=1)
-> Sort (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.380 rows=1 loops=1)
Sort Key: t.pub_date DESC, t.id
Sort Method: quicksort Memory: 27kB
-> Hash Join (cost=20817.10..411600.73 rows=38 width=1906) (actual time=3214.473..3885.369 rows=1 loops=1)
Hash Cond: (t.profile_id = profile.id)
Join Filter: ((upper((t.request_no)::text) ~~ '%20181104-2158-2723948%'::text) OR (upper((customer.phone)::text) ~~ '%20181104-2158-2723948%'::text))
Rows Removed by Join Filter: 593118
-> Seq Scan on main_transaction t (cost=0.00..288212.28 rows=205572 width=1906) (actual time=0.068..1527.677 rows=593119 loops=1)
Filter: ((pub_date >= '2016-09-05 00:00:00+05'::timestamp with time zone) AND (pub_date <= '2018-11-05 00:00:00+05'::timestamp with time zone) AND ((service_type)::text = 'SERVICE_1'::text) AND ((status)::text = 'SUCCESS'::text) AND ((mode)::text = 'AUTO'::text) AND ((transaction_type)::text = 'WITHDRAW'::text))
Rows Removed by Filter: 2132732
-> Hash (cost=17670.80..17670.80 rows=180984 width=16) (actual time=211.211..211.211 rows=181516 loops=1)
Buckets: 131072 Batches: 4 Memory Usage: 3166kB
-> Hash Join (cost=6936.09..17670.80 rows=180984 width=16) (actual time=46.846..183.689 rows=181516 loops=1)
Hash Cond: (customer.id = profile.user_id)
-> Seq Scan on main_customer customer (cost=0.00..5699.73 rows=181106 width=16) (actual time=0.013..40.866 rows=181618 loops=1)
Filter: ((client)::text = 'corp'::text)
Rows Removed by Filter: 16920
-> Hash (cost=3680.04..3680.04 rows=198404 width=8) (actual time=46.087..46.087 rows=198404 loops=1)
Buckets: 131072 Batches: 4 Memory Usage: 2966kB
-> Seq Scan on main_profile profile (cost=0.00..3680.04 rows=198404 width=8) (actual time=0.008..20.099 rows=198404 loops=1)
Planning time: 0.757 ms
Execution time: 3885.680 ms
With the restriction not to use UNION, you won't get a good plan.
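For reference, the UNION form that would let both trigram indexes be used would look roughly like this (a sketch only; UNION deduplicates rows that match both conditions):
SELECT t.*
FROM main_transaction t
JOIN main_profile profile ON profile.id = t.profile_id
JOIN main_customer customer ON customer.id = profile.user_id
WHERE upper(t.request_no) LIKE upper('%'||#requestNumber||'%')
  AND t.service_type = 'SERVICE_1' AND t.status = 'SUCCESS'
  AND t.mode = 'AUTO' AND t.transaction_type = 'WITHDRAW'
  AND customer.client = 'corp'
  AND t.pub_date >= '2018-09-05' AND t.pub_date <= '2018-11-05'
UNION
SELECT t.*
FROM main_transaction t
JOIN main_profile profile ON profile.id = t.profile_id
JOIN main_customer customer ON customer.id = profile.user_id
WHERE upper(customer.phone) LIKE upper('%'||#phoneNumber||'%')
  AND t.service_type = 'SERVICE_1' AND t.status = 'SUCCESS'
  AND t.mode = 'AUTO' AND t.transaction_type = 'WITHDRAW'
  AND customer.client = 'corp'
  AND t.pub_date >= '2018-09-05' AND t.pub_date <= '2018-11-05'
ORDER BY pub_date DESC, id ASC
LIMIT 1000;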
You can slightly speed up processing with the following indexes:
main_transaction ((service_type::text), (status::text), (mode::text),
(transaction_type::text), pub_date)
main_customer ((client::text))
These should at least get rid of the sequential scans, but the hash join that takes the lion's share of the processing time will remain.
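Spelled out as DDL, that would be something like the following (names are illustrative, and the second index may already exist as main_customer_client):
CREATE INDEX main_transaction_filter_idx
    ON main_transaction (service_type, status, mode, transaction_type, pub_date);

CREATE INDEX main_customer_client_idx
    ON main_customer (client);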

PostgreSQL: improve the performance when counting distinct values

I am currently working on improving the performance of our db. And I need some help from you.
I have a table and its index like this
CREATE TABLE public.ar
(
id integer NOT NULL DEFAULT nextval('id_seq'::regclass),
user_id integer NOT NULL,
duration double precision,
is_idle boolean NOT NULL,
activity_id integer NOT NULL,
device_id integer NOT NULL,
calendar_id integer,
on_taskness integer,
week_id integer,
some_other_column_below,
CONSTRAINT id_ PRIMARY KEY (id),
CONSTRAINT a_unique_key UNIQUE (user_id, device_id, start_time_local, start_time_utc, end_time_local, end_time_utc)
)
CREATE INDEX ar_idx
ON public.ar USING btree
(week_id, calendar_id, user_id, activity_id, duration, on_taskness, is_idle)
TABLESPACE pg_default;
Then I am trying to run a query like this
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM (
SELECT ar.user_id
FROM ar
WHERE ar.user_id = ANY(array[some_data]) -- data size is 352
AND ROUND(ar.duration) >0 AND ar.is_idle = false
AND ar.week_id = ANY(ARRAY[some_data]) -- data size is 37
AND ar.calendar_id = ANY(array[some_data]) -- data size is 16716
GROUP by ar.user_id
) tmp;
And below is the explain result
Aggregate (cost=31389954.72..31389954.73 rows=1 width=8) (actual time=252020.695..252020.695 rows=1 loops=1)
-> Group (cost=31389032.69..31389922.37 rows=2588 width=4) (actual time=251089.270..252020.659 rows=351 loops=1)
Group Key: ar.user_id
-> Sort (cost=31389032.69..31389477.53 rows=177935 width=4) (actual time=251089.268..251776.202 rows=6993358 loops=1)
Sort Key: ar.user_id
Sort Method: external merge Disk: 95672kB
-> Bitmap Heap Scan on ar (cost=609015.18..31371079.88 rows=177935 width=4) (actual time=1670.413..248939.440 rows=6993358 loops=1)
Recheck Cond: ((week_id = ANY ('{some_data}'::integer[])) AND (user_id = ANY ('{some_data}'::integer[])))
Rows Removed by Index Recheck: 2081028
Filter: ((NOT is_idle) AND (round(duration) > '0'::double precision) AND (calendar_id = ANY ('{some_data}'::integer[])))
Rows Removed by Filter: 534017
Heap Blocks: exact=29551 lossy=313127
-> BitmapAnd (cost=609015.18..609015.18 rows=1357521 width=0) (actual time=1666.334..1666.334 rows=0 loops=1)
-> Bitmap Index Scan on test_index_only_scan_idx (cost=0.00..272396.77 rows=6970353 width=0) (actual time=614.366..614.366 rows=7269830 loops=1)
Index Cond: ((week_id = ANY ('{some_data}'::integer[])) AND (is_idle = false))
-> Bitmap Index Scan on unique_key (cost=0.00..336529.20 rows=9948573 width=0) (actual time=1041.999..1041.999 rows=14959355 loops=1)
Index Cond: (user_id = ANY ('{some_data}'::integer[]))
Planning time: 25.563 ms
Execution time: 252029.237 ms
I used distinct as well, and the result is the same.
So my questions are below.
The ar_idx contains user_id, but when searching for rows, why does it use the unique_key instead of the index I created?
I thought GROUP BY would not sort (which is why I did not choose DISTINCT), so why does the sort happen in the EXPLAIN ANALYZE output?
The running time is pretty long (more than 4 minutes). How do I make it faster? Is the index wrong? Or is there anything else I can do?
Be advised, the ar table contains 51585203 rows.
Any help will be appreciated. Thx.
---------------------------update--------------------------
After I created this index, everything goes really fast now. I don't understand why; can anyone explain this to me?
CREATE INDEX ar_1_idx
ON public.ar USING btree
(calendar_id, user_id)
TABLESPACE pg_default;
And I changed the old index to
CREATE INDEX ar_idx
ON public.ar USING btree
(week_id, calendar_id, user_id, activity_index, duration, on_taskness, start_time_local, end_time_local) WHERE is_idle IS FALSE
TABLESPACE pg_default;
-----updated analyze results-----------
Aggregate (cost=31216435.97..31216435.98 rows=1 width=8) (actual time=13206.941..13206.941 rows=1 loops=1)
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Group (cost=31215436.80..31216403.88 rows=2567 width=4) (actual time=12239.336..13206.894 rows=351 loops=1)
Group Key: ar.user_id
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Sort (cost=31215436.80..31215920.34 rows=193417 width=4) (actual time=12239.334..12932.801 rows=6993358 loops=1)
Sort Key: ar.user_id
Sort Method: external merge Disk: 95664kB
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Index Scan using ar_1_idx on activity_report ar (cost=0.56..31195807.48 rows=193417 width=4) (actual time=0.275..10387.051 rows=6993358 loops=1)
Index Cond: ((calendar_id = ANY ('{some_data}'::integer[])) AND (user_id = ANY ('{some_data}'::integer[])))
Filter: ((NOT is_idle) AND (round(duration) > '0'::double precision) AND (week_id = ANY ('{some_data}'::integer[])))
Rows Removed by Filter: 590705
Buffers: shared hit=25940518 read=430315
Planning time: 25.577 ms
Execution time: 13217.611 ms

Is there a way to use pg_trgm like operator with btree indexes on PostgreSQL?

I have two tables:
table_1 with ~1 million lines, with columns id_t1: integer, c1_t1: varchar, etc.
table_2 with ~50 million lines, with columns id_t2: integer, ref_id_t1: integer, c1_t2: varchar, etc.
ref_id_t1 is filled with id_t1 values; however, they are not linked by a foreign key, as table_2 doesn't know about table_1.
I need to run a request on both tables like the following:
SELECT * FROM table_1 t1 WHERE t1.c1_t1= 'A' AND t1.id_t1 IN
(SELECT t2.ref_id_t1 FROM table_2 t2 WHERE t2.c1_t2 LIKE '%abc%');
Without any change, or with basic indexes, the request takes about a minute to complete, as a sequential scan is performed on table_2. To prevent this I created a GIN index with the gin_trgm_ops option:
CREATE EXTENSION pg_trgm;
CREATE INDEX c1_t2_gin_index ON table_2 USING gin (c1_t2 gin_trgm_ops);
However, this does not solve the problem, as the inner request still takes a very long time.
EXPLAIN ANALYSE SELECT t2.ref_id_t1 FROM table_2 t2 WHERE t2.c1_t2 LIKE '%abc%'
Gives the following
Bitmap Heap Scan on table_2 t2 (cost=664.20..189671.00 rows=65058 width=4) (actual time=5101.286..22854.838 rows=69631 loops=1)
Recheck Cond: ((c1_t2 )::text ~~ '%1.1%'::text)
Rows Removed by Index Recheck: 49069703
Heap Blocks: exact=611548
-> Bitmap Index Scan on gin_trg (cost=0.00..647.94 rows=65058 width=0) (actual time=4911.125..4911.125 rows=49139334 loops=1)
Index Cond: ((c1_t2)::text ~~ '%1.1%'::text)
Planning time: 0.529 ms
Execution time: 22863.017 ms
The Bitmap Index Scan is fast, but since we need t2.ref_id_t1, PostgreSQL has to perform a Bitmap Heap Scan, which is not quick on 65000 rows of data.
The way to avoid the Bitmap Heap Scan would be to perform an Index Only Scan. This is possible using multi-column btree indexes; see https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html
If I change the request to search only the beginning of c1_t2 and create a btree index on c1_t2 and ref_id_t1, the request takes just over a second, even with the inner request returning 90000 rows.
CREATE INDEX c1_t2_ref_id_t1_index
ON table_2 USING btree
(c1_t2 varchar_pattern_ops ASC NULLS LAST, ref_id_t1 ASC NULLS LAST)
EXPLAIN ANALYSE SELECT * FROM table_1 t1 WHERE t1.c1_t1= 'A' AND t1.id_t1 IN
(SELECT t2.ref_id_t1 FROM table_2 t2 WHERE t2.c1_t2 LIKE 'aaa%');
Hash Join (cost=56561.99..105233.96 rows=1 width=2522) (actual time=953.647..1068.488 rows=36 loops=1)
Hash Cond: (t1.id_t1 = t2.ref_id_t1)
-> Seq Scan on table_1 t1 (cost=0.00..48669.65 rows=615 width=2522) (actual time=0.088..667.576 rows=790 loops=1)
Filter: (c1_t1 = 'A')
Rows Removed by Filter: 1083798
-> Hash (cost=56553.74..56553.74 rows=660 width=4) (actual time=400.657..400.657 rows=69632 loops=1)
Buckets: 131072 (originally 1024) Batches: 1 (originally 1) Memory Usage: 3472kB
-> HashAggregate (cost=56547.14..56553.74 rows=660 width=4) (actual time=380.280..391.871 rows=69632 loops=1)
Group Key: t2.ref_id_t1
-> Index Only Scan using c1_t2_ref_id_t1_index on table_2 t2 (cost=0.56..53907.28 rows=1055943 width=4) (actual time=0.014..202.034 rows=974737 loops=1)
Index Cond: ((c1_t2 ~>=~ 'aaa'::text) AND (c1_t2 ~<~ 'chb'::text))
Filter: ((c1_t2 )::text ~~ 'aaa%'::text)
Heap Fetches: 0
Planning time: 1.512 ms
Execution time: 1069.712 ms
However, this is not possible with GIN indexes, as they don't store all the data in the key.
Is there a way to use a pg_trgm-like extension with btree indexes, so we can have index-only scans for LIKE '%abc%' requests?

Query on postgres 9.5 several times slower than on postgres 9.1

I am running performance tests on two systems that have the same postgres database,
with the same data. One system has postgres 9.1 and the other postgres 9.5.
The only slight differences in the data are caused by slightly different timings (time ordering) of concurrent
insertions, and they should not be significant (less than 0.1 percent of the rows).
The postgres configuration of each system is given below.
The query, the database schema and the query plans both on postgres 9.5 and 9.1 are also given below.
I am consistently getting several times slower query execution on postgres 9.5:
postgres 9.5: Execution time: 280.777 ms
postgres 9.1: Total runtime: 66.566 ms
which I cannot understand.
In this case postgres 9.5 is several times slower,
and it uses a different query plan than postgres 9.1.
Any help explaining why the performance is different, or
directions on where to look, will be greatly appreciated.
Postgres configuration
The configuration of postgres 9.1 is:
------------------------------------------------------------------------------
shared_buffers = 2048MB
temp_buffers = 8MB
work_mem = 18MB
maintenance_work_mem = 512MB
max_stack_depth = 2MB
------------------------------------------------------------------------------
wal_level = minimal
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_completion_target = 0.9
checkpoint_warning = 30s
------------------------------------------------------------------------------
effective_cache_size = 5120MB
The configuration of the postgres 9.5 is:
------------------------------
shared_buffers = 2048MB
temp_buffers = 8MB
work_mem = 80MB
maintenance_work_mem = 512MB
dynamic_shared_memory_type = posix
------------------------------
wal_level = minimal
wal_buffers = 8MB
max_wal_size = 5GB
wal_keep_segments = 32
effective_cache_size = 4GB
default_statistics_target = 100
------------------------------
QUERY:
EXPLAIN ANALYZE SELECT bricks.id AS gid,TIMESTAMP '2017-07-03 06:00:00' AS time,
intensity, bricks.weather_type, polygons.geometry AS geom,
starttime, endtime, ST_AsGeoJSON(polygons.geometry) as geometry_json,
weather_types.name as weather_type_literal, notification_type, temp.level as level
FROM bricks
INNER JOIN weather_types
ON weather_types.id = bricks.weather_type
INNER JOIN polygons
ON bricks.polygon_id=polygons.id
JOIN notifications
ON bricks.notification_id = notifications.id
JOIN bricks_notif_type_priority prio
ON (bricks.polygon_id = prio.polygon_id
AND bricks.weather_type = prio.weather_type
AND notifications.notification_type = prio.priority_notif_type)
JOIN (VALUES (14, 1),
(4, 2),
(5, 3),
(1, 4),
(2, 5),
(3, 6),
(13, 7),
(15,8)) as order_list (id, ordering)
ON bricks.weather_type = order_list.id
JOIN (VALUES
(15,1,1,2,'warning'::notification_type_enum),
(15,2,2,3,'warning'::notification_type_enum),
(15,3,3,999999,'warning'::notification_type_enum),
(13,1,1,2,'warning'::notification_type_enum),
(13,2,2,3,'warning'::notification_type_enum),
(13,3,3,99999,'warning'::notification_type_enum),
(5,1,1,3,'warning'::notification_type_enum),
(5,2,3,5,'warning'::notification_type_enum),
(5,3,5,99999,'warning'::notification_type_enum),
(4,1,15,25,'warning'::notification_type_enum),
(4,2,25,35,'warning'::notification_type_enum),
(4,3,35,99999,'warning'::notification_type_enum),
(3,1,75,100,'warning'::notification_type_enum),
(3,2,100,130,'warning'::notification_type_enum),
(3,3,130,99999,'warning'::notification_type_enum),
(2,1,30,50,'warning'::notification_type_enum),
(2,2,50,100,'warning'::notification_type_enum),
(2,3,100,99999,'warning'::notification_type_enum),
(1,1,18,50,'warning'::notification_type_enum),
(1,2,50,300,'warning'::notification_type_enum),
(1,3,300,999999,'warning'::notification_type_enum),
(15,1,1,2,'autowarn'::notification_type_enum),
(15,2,2,3,'autowarn'::notification_type_enum),
(15,3,3,999999,'autowarn'::notification_type_enum),
(13,1,1,2,'autowarn'::notification_type_enum),
(13,2,2,3,'autowarn'::notification_type_enum),
(13,3,3,99999,'autowarn'::notification_type_enum),
(5,1,10,20,'autowarn'::notification_type_enum),
(5,2,20,50,'autowarn'::notification_type_enum),
(5,3,50,99999,'autowarn'::notification_type_enum),
(4,1,15,25,'autowarn'::notification_type_enum),
(4,2,25,35,'autowarn'::notification_type_enum),
(4,3,35,99999,'autowarn'::notification_type_enum),
(3,1,75,100,'autowarn'::notification_type_enum),
(3,2,100,130,'autowarn'::notification_type_enum),
(3,3,130,99999,'autowarn'::notification_type_enum),
(2,1,30,50,'autowarn'::notification_type_enum),
(2,2,50,100,'autowarn'::notification_type_enum),
(2,3,100,99999,'autowarn'::notification_type_enum),
(1,1,18,50,'autowarn'::notification_type_enum),
(1,2,50,300,'autowarn'::notification_type_enum),
(1,3,300,999999,'autowarn'::notification_type_enum),
(15,1,1,2,'forewarn'::notification_type_enum),
(15,2,2,3,'forewarn'::notification_type_enum),
(15,3,3,999999,'forewarn'::notification_type_enum),
(13,1,1,2,'forewarn'::notification_type_enum),
(13,2,2,3,'forewarn'::notification_type_enum),
(13,3,3,99999,'forewarn'::notification_type_enum),
(5,1,1,3,'forewarn'::notification_type_enum),
(5,2,3,5,'forewarn'::notification_type_enum),
(5,3,5,99999,'forewarn'::notification_type_enum),
(4,1,15,25,'forewarn'::notification_type_enum),
(4,2,25,35,'forewarn'::notification_type_enum),
(4,3,35,99999,'forewarn'::notification_type_enum),
(3,1,75,100,'forewarn'::notification_type_enum),
(3,2,100,130,'forewarn'::notification_type_enum),
(3,3,130,99999,'forewarn'::notification_type_enum),
(2,1,30,50,'forewarn'::notification_type_enum),
(2,2,50,100,'forewarn'::notification_type_enum),
(2,3,100,99999,'forewarn'::notification_type_enum),
(1,1,18,50,'forewarn'::notification_type_enum),
(1,2,50,300,'forewarn'::notification_type_enum),
(1,3,300,999999,'forewarn'::notification_type_enum),
(15,1,1,2,'auto-forewarn'::notification_type_enum),
(15,2,2,3,'auto-forewarn'::notification_type_enum),
(15,3,3,999999,'auto-forewarn'::notification_type_enum),
(13,1,1,2,'auto-forewarn'::notification_type_enum),
(13,2,2,3,'auto-forewarn'::notification_type_enum),
(13,3,3,99999,'auto-forewarn'::notification_type_enum),
(5,1,10,20,'auto-forewarn'::notification_type_enum),
(5,2,20,50,'auto-forewarn'::notification_type_enum),
(5,3,50,99999,'auto-forewarn'::notification_type_enum),
(4,1,15,25,'auto-forewarn'::notification_type_enum),
(4,2,25,35,'auto-forewarn'::notification_type_enum),
(4,3,35,99999,'auto-forewarn'::notification_type_enum),
(3,1,75,100,'auto-forewarn'::notification_type_enum),
(3,2,100,130,'auto-forewarn'::notification_type_enum),
(3,3,130,99999,'auto-forewarn'::notification_type_enum),
(2,1,30,50,'auto-forewarn'::notification_type_enum),
(2,2,50,100,'auto-forewarn'::notification_type_enum),
(2,3,100,99999,'auto-forewarn'::notification_type_enum),
(1,1,18,50,'auto-forewarn'::notification_type_enum),
(1,2,50,300,'auto-forewarn'::notification_type_enum),
(1,3,300,999999,'auto-forewarn'::notification_type_enum)
) AS temp (weather_type,level,min,max,ntype)
ON bricks.weather_type = temp.weather_type
AND notifications.notification_type=temp.ntype
AND intensity >= temp.min
AND intensity < temp.max
AND temp.level in (1,2,3)
WHERE polygons.set = 0
AND '2017-07-03 06:00:00' BETWEEN starttime AND endtime
AND weather_types.name in ('rain','snowfall','storm','freezingRain','thunderstorm')
ORDER BY
(CASE notifications.notification_type
WHEN 'forewarn' THEN 0
WHEN 'auto-forewarn' then 0
ELSE 1
END) ASC,
temp.level ASC,
order_list.ordering DESC
LIMIT 10000000
TABLE information:
Table "public.notifications"
Column | Type | Modifiers
-------------------+-----------------------------+------------------------------------------------------------
id | integer | not null default nextval('notifications_id_seq'::regclass)
weather_type | integer |
macro_blob | bytea | not null
logproxy_id | integer | not null
sent_time | timestamp without time zone | not null
max_endtime | timestamp without time zone | not null
notification_type | notification_type_enum | not null
Indexes:
"notifications_pkey" PRIMARY KEY, btree (id)
"notifications_unique_logproxy_id" UNIQUE CONSTRAINT, btree (logproxy_id)
"notifications_max_endtime_idx" btree (max_endtime)
"notifications_weather_type_idx" btree (weather_type)
Foreign-key constraints:
"notifications_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
Referenced by:
TABLE "bricks_notif_extent" CONSTRAINT "bricks_notif_extent_notification_id_fkey" FOREIGN KEY (notification_id) REFERENCES notifications(id) ON DELETE CASCADE
TABLE "bricks" CONSTRAINT "bricks_notification_id_fkey" FOREIGN KEY (notification_id) REFERENCES notifications(id) ON DELETE CASCADE
Table "public.bricks"
Column | Type | Modifiers
-----------------+-----------------------------+-----------------------------------------------------
id | bigint | not null default nextval('bricks_id_seq'::regclass)
polygon_id | integer |
terrain_types | integer | not null
weather_type | integer |
intensity | integer | not null
starttime | timestamp without time zone | not null
endtime | timestamp without time zone | not null
notification_id | integer |
Indexes:
"bricks_pkey" PRIMARY KEY, btree (id)
"bricks_notification_id_idx" btree (notification_id)
"bricks_period_idx" gist (cube(date_part('epoch'::text, starttime), date_part('epoch'::text, endtime)))
"bricks_polygon_idx" btree (polygon_id)
"bricks_weather_type_idx" btree (weather_type)
Foreign-key constraints:
"bricks_notification_id_fkey" FOREIGN KEY (notification_id) REFERENCES notifications(id) ON DELETE CASCADE
"bricks_polygon_id_fkey" FOREIGN KEY (polygon_id) REFERENCES polygons(id) ON DELETE CASCADE
"bricks_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
Table "public.polygons"
Column | Type | Modifiers
----------+-----------------------+-------------------------------------------------------
id | integer | not null default nextval('polygons_id_seq'::regclass)
country | character(2) | not null
set | integer | not null
geometry | geometry |
zone_id | character varying(32) |
Indexes:
"polygons_pkey" PRIMARY KEY, btree (id)
"polygons_geometry_idx" gist (geometry)
"polygons_zone_id_idx" btree (zone_id)
Check constraints:
"enforce_dims_geometry" CHECK (st_ndims(geometry) = 2)
"enforce_geotype_geometry" CHECK (geometrytype(geometry) = 'MULTIPOLYGON'::text OR geometry IS NULL)
"enforce_srid_geometry" CHECK (st_srid(geometry) = 4326)
Referenced by:
TABLE "bricks_notif_type_priority" CONSTRAINT "bricks_notif_type_priority_polygon_id_fkey" FOREIGN KEY (polygon_id) REFERENCES polygons(id) ON DELETE CASCADE
TABLE "bricks" CONSTRAINT "bricks_polygon_id_fkey" FOREIGN KEY (polygon_id) REFERENCES polygons(id) ON DELETE CASCADE
TABLE "polygon_contains" CONSTRAINT "polygon_contains_contained_id_fkey" FOREIGN KEY (contained_id) REFERENCES polygons(id) ON DELETE CASCADE
TABLE "polygon_contains" CONSTRAINT "polygon_contains_id_fkey" FOREIGN KEY (id) REFERENCES polygons(id) ON DELETE CASCADE
Table "public.weather_types"
Column | Type | Modifiers
--------+-----------------------+------------------------------------------------------------
id | integer | not null default nextval('weather_types_id_seq'::regclass)
name | character varying(32) | not null
Indexes:
"weather_types_pkey" PRIMARY KEY, btree (id)
"weather_type_unique_name" UNIQUE CONSTRAINT, btree (name)
Referenced by:
TABLE "bricks_notif_type_priority" CONSTRAINT "bricks_notif_type_priority_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
TABLE "bricks" CONSTRAINT "bricks_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
TABLE "notifications" CONSTRAINT "notifications_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
Table "public.bricks_notif_type_priority"
Column | Type | Modifiers
---------------------+------------------------+-------------------------------------------------------------------------
id | integer | not null default nextval('bricks_notif_type_priority_id_seq'::regclass)
polygon_id | integer | not null
weather_type | integer | not null
priority_notif_type | notification_type_enum | not null
Indexes:
"bricks_notif_type_priority_pkey" PRIMARY KEY, btree (id)
"bricks_notif_type_priority_poly_idx" btree (polygon_id, weather_type)
Foreign-key constraints:
"bricks_notif_type_priority_polygon_id_fkey" FOREIGN KEY (polygon_id) REFERENCES polygons(id) ON DELETE CASCADE
"bricks_notif_type_priority_weather_type_fkey" FOREIGN KEY (weather_type) REFERENCES weather_types(id) ON DELETE CASCADE
The query plan on postgres 9.5:
Limit (cost=1339.71..1339.72 rows=1 width=2083) (actual time=280.390..280.429 rows=224 loops=1)
-> Sort (cost=1339.71..1339.72 rows=1 width=2083) (actual time=280.388..280.404 rows=224 loops=1)
Sort Key: (CASE notifications.notification_type WHEN 'forewarn'::notification_type_enum THEN 0 WHEN 'auto-forewarn'::notification_type_enum THEN 0 ELSE 1 END), "*VALUES*_1".column2, "*VALUES*".column2 DESC
Sort Method: quicksort Memory: 929kB
-> Nested Loop (cost=437.79..1339.70 rows=1 width=2083) (actual time=208.373..278.926 rows=224 loops=1)
Join Filter: (bricks.polygon_id = polygons.id)
-> Nested Loop (cost=437.50..1339.30 rows=1 width=62) (actual time=186.122..221.536 rows=307 loops=1)
Join Filter: ((weather_types.id = prio.weather_type) AND (notifications.notification_type = prio.priority_notif_type))
Rows Removed by Join Filter: 655
-> Nested Loop (cost=437.08..1336.68 rows=1 width=74) (actual time=5.522..209.237 rows=652 loops=1)
Join Filter: ("*VALUES*_1".column5 = notifications.notification_type)
Rows Removed by Join Filter: 1956
-> Merge Join (cost=436.66..1327.38 rows=4 width=74) (actual time=5.277..195.569 rows=2608 loops=1)
Merge Cond: (bricks.weather_type = weather_types.id)
-> Nested Loop (cost=435.33..1325.89 rows=28 width=60) (actual time=5.232..193.652 rows=2608 loops=1)
Join Filter: ((bricks.intensity >= "*VALUES*_1".column3) AND (bricks.intensity < "*VALUES*_1".column4))
Rows Removed by Join Filter: 5216
-> Merge Join (cost=1.61..1.67 rows=1 width=28) (actual time=0.093..0.181 rows=84 loops=1)
Merge Cond: ("*VALUES*".column1 = "*VALUES*_1".column1)
-> Sort (cost=0.22..0.24 rows=8 width=8) (actual time=0.022..0.024 rows=8 loops=1)
Sort Key: "*VALUES*".column1
Sort Method: quicksort Memory: 25kB
-> Values Scan on "*VALUES*" (cost=0.00..0.10 rows=8 width=8) (actual time=0.005..0.007 rows=8 loops=1)
-> Sort (cost=1.39..1.40 rows=3 width=20) (actual time=0.067..0.097 rows=84 loops=1)
Sort Key: "*VALUES*_1".column1
Sort Method: quicksort Memory: 31kB
-> Values Scan on "*VALUES*_1" (cost=0.00..1.36 rows=3 width=20) (actual time=0.009..0.041 rows=84 loops=1)
Filter: (column2 = ANY ('{1,2,3}'::integer[]))
-> Bitmap Heap Scan on bricks (cost=433.72..1302.96 rows=1417 width=40) (actual time=0.568..2.277 rows=93 loops=84)
Recheck Cond: (weather_type = "*VALUES*".column1)
Filter: (('2017-07-03 06:00:00'::timestamp without time zone >= starttime) AND ('2017-07-03 06:00:00'::timestamp without time zone <= endtime))
Rows Removed by Filter: 6245
Heap Blocks: exact=18528
-> Bitmap Index Scan on bricks_weather_type_idx (cost=0.00..433.37 rows=8860 width=0) (actual time=0.536..0.536 rows=6361 loops=84)
Index Cond: (weather_type = "*VALUES*".column1)
-> Sort (cost=1.33..1.35 rows=5 width=14) (actual time=0.040..0.418 rows=2609 loops=1)
Sort Key: weather_types.id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on weather_types (cost=0.00..1.28 rows=5 width=14) (actual time=0.009..0.013 rows=5 loops=1)
Filter: ((name)::text = ANY ('{rain,snowfall,storm,freezingRain,thunderstorm}'::text[]))
Rows Removed by Filter: 12
-> Index Scan using notifications_pkey on notifications (cost=0.41..2.31 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=2608)
Index Cond: (id = bricks.notification_id)
-> Index Scan using bricks_notif_type_priority_poly_idx on bricks_notif_type_priority prio (cost=0.42..2.61 rows=1 width=12) (actual time=0.017..0.017 rows=1 loops=652)
Index Cond: ((polygon_id = bricks.polygon_id) AND (weather_type = bricks.weather_type))
-> Index Scan using polygons_pkey on polygons (cost=0.29..0.38 rows=1 width=2033) (actual time=0.021..0.022 rows=1 loops=307)
Index Cond: (id = prio.polygon_id)
Filter: (set = 0)
Rows Removed by Filter: 0
Planning time: 27.326 ms
Execution time: 280.777 ms
The query plan on postgres 9.1:
Limit (cost=2486.29..2486.30 rows=1 width=8219) (actual time=66.273..66.301 rows=224 loops=1)
-> Sort (cost=2486.29..2486.30 rows=1 width=8219) (actual time=66.272..66.281 rows=224 loops=1)
Sort Key: (CASE notifications.notification_type WHEN 'forewarn'::notification_type_enum THEN 0 WHEN 'auto-forewarn'::notification_type_enum THEN 0 ELSE 1 END), "*VALUES*".column2, "*VALUES*".column2
Sort Method: quicksort Memory: 1044kB
-> Nested Loop (cost=171.27..2486.28 rows=1 width=8219) (actual time=22.064..65.335 rows=224 loops=1)
Join Filter: ((bricks.intensity >= "*VALUES*".column3) AND (bricks.intensity < "*VALUES*".column4) AND (weather_types.id = "*VALUES*".column1) AND (notifications.notification_type = "*VALUES*".column5))
-> Nested Loop (cost=171.27..2484.85 rows=1 width=8231) (actual time=16.632..24.503 rows=224 loops=1)
Join Filter: (prio.priority_notif_type = notifications.notification_type)
-> Nested Loop (cost=171.27..2482.66 rows=1 width=8231) (actual time=3.318..22.309 rows=681 loops=1)
-> Nested Loop (cost=171.27..2479.52 rows=1 width=98) (actual time=3.175..19.252 rows=962 loops=1)
-> Nested Loop (cost=171.27..2471.14 rows=1 width=86) (actual time=3.144..14.751 rows=652 loops=1)
-> Hash Join (cost=0.20..29.73 rows=1 width=46) (actual time=0.025..0.039 rows=5 loops=1)
Hash Cond: (weather_types.id = "*VALUES*".column1)
-> Seq Scan on weather_types (cost=0.00..29.50 rows=5 width=38) (actual time=0.007..0.015 rows=5 loops=1)
Filter: ((name)::text = ANY ('{rain,snowfall,storm,freezingRain,thunderstorm}'::text[]))
-> Hash (cost=0.10..0.10 rows=8 width=8) (actual time=0.006..0.006 rows=8 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Values Scan on "*VALUES*" (cost=0.00..0.10 rows=8 width=8) (actual time=0.002..0.004 rows=8 loops=1)
-> Bitmap Heap Scan on bricks (cost=171.07..2352.98 rows=7074 width=40) (actual time=0.718..2.917 rows=130 loops=5)
Recheck Cond: (weather_type = weather_types.id)
Filter: (('2017-07-03 06:00:00'::timestamp without time zone >= starttime) AND ('2017-07-03 06:00:00'::timestamp without time zone <= endtime))
-> Bitmap Index Scan on bricks_weather_type_idx (cost=0.00..170.71 rows=8861 width=0) (actual time=0.644..0.644 rows=8906 loops=5)
Index Cond: (weather_type = weather_types.id)
-> Index Scan using bricks_notif_type_priority_poly_idx on bricks_notif_type_priority prio (cost=0.00..8.37 rows=1 width=12) (actual time=0.006..0.006 rows=1 loops=652)
Index Cond: ((polygon_id = bricks.polygon_id) AND (weather_type = weather_types.id))
-> Index Scan using polygons_pkey on polygons (cost=0.00..3.13 rows=1 width=8145) (actual time=0.003..0.003 rows=1 loops=962)
Index Cond: (id = bricks.polygon_id)
Filter: (set = 0)
-> Index Scan using notifications_pkey on notifications (cost=0.00..2.17 rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=681)
Index Cond: (id = bricks.notification_id)
-> Values Scan on "*VALUES*" (cost=0.00..1.36 rows=3 width=20) (actual time=0.001..0.028 rows=84 loops=224)
Filter: (column2 = ANY ('{1,2,3}'::integer[]))
Total runtime: 66.566 ms
(33 rows)
work_mem = 80MB
By assigning more memory to this parameter you can indeed speed up some queries by performing larger operations in memory, but it can also encourage the query planner to choose less optimal paths. Try tweaking/decreasing this parameter and re-running the query.
Correspondingly, you might want to run VACUUM ANALYZE on your 9.5 database.
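A minimal sketch of both suggestions (18MB simply mirrors the 9.1 configuration; adjust as needed):
SET work_mem = '18MB';   -- session-local; mirrors the 9.1 setting
-- re-run EXPLAIN ANALYZE on the query above and compare the plans
VACUUM ANALYZE;          -- refresh planner statistics for the whole database
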
I believe you will have noticed that the query plan has changed significantly between 9.5 and 9.1.
It seems to me there are two issues with the new query plan.
Too many rows; refer to:
-> Sort (cost=1.33..1.35 rows=5 width=14) (actual time=0.040..0.418 rows=2609 loops=1)
Sort Key: weather_types.id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on weather_types (cost=0.00..1.28 rows=5 width=14) (actual time=0.009..0.013 rows=5 loops=1)
Filter: ((name)::text = ANY ('{rain,snowfall,storm,freezingRain,thunderstorm}'::text[]))
Rows Removed by Filter: 12
This leads to high I/O and high CPU usage (for the sort),
both of which are useless towards the end goal; refer to the lines:
Rows Removed by Join Filter: 655
Rows Removed by Join Filter: 1956
Rows Removed by Join Filter: 5216
There have been a few changes to query planning between 9.1 and 9.5, but I am not sure which one had an impact here.
The solution listed at https://www.datadoghq.com/blog/100x-faster-postgres-performance-by-changing-1-line/ works on 9.1 (compare the line in that post with your 9.1 query plan):
-> Values Scan on "*VALUES*"
If you can share the data set for the other tables, it would be easier to find which join/WHERE clause caused this.

PostgreSQL efficient query with filter over boolean

There's a table with 15M rows holding users' inbox data:
user_id | integer | not null
subject | character varying(255) | not null
...
last_message_id | integer |
last_message_at | timestamp with time zone |
deleted_at | timestamp with time zone |
Here's the slow query in a nutshell:
SELECT *
FROM dialogs
WHERE user_id = 1234
AND deleted_at IS NULL
LIMIT 21
Full query:
(irrelevant fields deleted)
SELECT "dialogs"."id", "dialogs"."subject", "dialogs"."product_id", "dialogs"."user_id", "dialogs"."participant_id", "dialogs"."thread_id", "dialogs"."last_message_id", "dialogs"."last_message_at", "dialogs"."read_at", "dialogs"."deleted_at", "products"."id", ... , T4."id", ... , "messages"."id", ...,
FROM "dialogs"
LEFT OUTER JOIN "products" ON ("dialogs"."product_id" = "products"."id")
INNER JOIN "auth_user" T4 ON ("dialogs"."participant_id" = T4."id")
LEFT OUTER JOIN "messages" ON ("dialogs"."last_message_id" = "messages"."id")
WHERE ("dialogs"."deleted_at" IS NULL AND "dialogs"."user_id" = 9069)
ORDER BY "dialogs"."last_message_id" DESC
LIMIT 21;
EXPLAIN:
Limit (cost=1.85..28061.24 rows=21 width=1693) (actual time=4.700..93087.871 rows=17 loops=1)
-> Nested Loop Left Join (cost=1.85..9707215.30 rows=7265 width=1693) (actual time=4.699..93087.861 rows=17 loops=1)
-> Nested Loop (cost=1.41..9647421.07 rows=7265 width=1457) (actual time=4.689..93062.481 rows=17 loops=1)
-> Nested Loop Left Join (cost=0.99..9611285.66 rows=7265 width=1115) (actual time=4.676..93062.292 rows=17 loops=1)
-> Index Scan Backward using dialogs_last_message_id on dialogs (cost=0.56..9554417.92 rows=7265 width=102) (actual time=4.629..93062.050 rows=17 loops=1)
Filter: ((deleted_at IS NULL) AND (user_id = 9069))
Rows Removed by Filter: 6852907
-> Index Scan using products_pkey on products (cost=0.43..7.82 rows=1 width=1013) (actual time=0.012..0.012 rows=1 loops=17)
Index Cond: (dialogs.product_id = id)
-> Index Scan using auth_user_pkey on auth_user t4 (cost=0.42..4.96 rows=1 width=342) (actual time=0.009..0.010 rows=1 loops=17)
Index Cond: (id = dialogs.participant_id)
-> Index Scan using messages_pkey on messages (cost=0.44..8.22 rows=1 width=236) (actual time=1.491..1.492 rows=1 loops=17)
Index Cond: (dialogs.last_message_id = id)
Total runtime: 93091.494 ms
(14 rows)
OFFSET is not used.
There's an index on the user_id field.
An index on deleted_at isn't used because the condition matches most of the table (90% of the values are actually NULL). A partial index (... WHERE deleted_at IS NULL) won't help either.
It gets especially slow if the query hits a part of the results that was created a long time ago; then the query has to filter out and discard millions of rows in between.
List of indexes:
Indexes:
"dialogs_pkey" PRIMARY KEY, btree (id)
"dialogs_deleted_at_d57b320e_uniq" btree (deleted_at) WHERE deleted_at IS NULL
"dialogs_last_message_id" btree (last_message_id)
"dialogs_participant_id" btree (participant_id)
"dialogs_product_id" btree (product_id)
"dialogs_thread_id" btree (thread_id)
"dialogs_user_id" btree (user_id)
Currently I'm thinking about querying only recent data (i.e. ... WHERE last_message_at > <date 3-6 months ago>) with an appropriate index (BRIN?).
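A minimal sketch of that idea (the index name is illustrative; BRIN requires PostgreSQL 9.5+ and only helps if rows are physically stored in roughly last_message_at order):
CREATE INDEX dialogs_last_message_at_brin
    ON dialogs USING brin (last_message_at);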
What is best practice to speed up such queries?
As posted in comments:
Start by creating a partial index on (user_id, last_message_id) with a condition WHERE deleted_at IS NULL
Per your answer, this seems to be quite effective :-)
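In DDL form, that suggestion looks something like this (the index name is illustrative):
CREATE INDEX dialogs_user_id_last_message_id
    ON dialogs (user_id, last_message_id)
    WHERE deleted_at IS NULL;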
So, here's the results of solutions I tried
1) The (user_id) WHERE deleted_at IS NULL index is used only in rare cases, depending on the particular user_id value in the WHERE user_id = ? condition. Most of the time the query has to filter out rows as before.
2) The greatest speedup was achieved using the
(user_id, last_message_id) WHERE deleted_at IS NULL index. While it's 2.5x bigger than the other tested indexes, it's used all the time and is very fast. Here's the resulting query plan:
Limit (cost=1.72..270.45 rows=11 width=1308) (actual time=0.105..0.468 rows=8 loops=1)
-> Nested Loop Left Join (cost=1.72..247038.21 rows=10112 width=1308) (actual time=0.104..0.465 rows=8 loops=1)
-> Nested Loop (cost=1.29..164532.13 rows=10112 width=1072) (actual time=0.071..0.293 rows=8 loops=1)
-> Nested Loop Left Join (cost=0.86..116292.45 rows=10112 width=736) (actual time=0.057..0.198 rows=8 loops=1)
-> Index Scan Backward using dialogs_user_id_last_message_id_d57b320e on dialogs (cost=0.43..38842.21 rows=10112 width=102) (actual time=0.038..0.084 rows=8 loops=1)
Index Cond: (user_id = 9069)
-> Index Scan using products_pkey on products (cost=0.43..7.65 rows=1 width=634) (actual time=0.012..0.012 rows=1 loops=8)
Index Cond: (dialogs.product_id = id)
-> Index Scan using auth_user_pkey on auth_user t4 (cost=0.42..4.76 rows=1 width=336) (actual time=0.010..0.010 rows=1 loops=8)
Index Cond: (id = dialogs.participant_id)
-> Index Scan using messages_pkey on messages (cost=0.44..8.15 rows=1 width=236) (actual time=0.019..0.020 rows=1 loops=8)
Index Cond: (dialogs.last_message_id = id)
Total runtime: 0.678 ms
Thanks @jcaron. Your suggestion should be the accepted answer.