I am working on improving the performance of our database and need some help.
I have a table and an index defined like this:
CREATE TABLE public.ar
(
id integer NOT NULL DEFAULT nextval('id_seq'::regclass),
user_id integer NOT NULL,
duration double precision,
is_idle boolean NOT NULL,
activity_id integer NOT NULL,
device_id integer NOT NULL,
calendar_id integer,
on_taskness integer,
week_id integer,
some_other_column_below,
CONSTRAINT id_ PRIMARY KEY (id),
CONSTRAINT a_unique_key UNIQUE (user_id, device_id, start_time_local, start_time_utc, end_time_local, end_time_utc)
)
CREATE INDEX ar_idx
ON public.ar USING btree
(week_id, calendar_id, user_id, activity_id, duration, on_taskness, is_idle)
TABLESPACE pg_default;
Then I run a query like this:
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM (
SELECT ar.user_id
FROM ar
WHERE ar.user_id = ANY(array[some_data]) -- data size is 352
AND ROUND(ar.duration) >0 AND ar.is_idle = false
AND ar.week_id = ANY(ARRAY[some_data]) -- data size is 37
AND ar.calendar_id = ANY(array[some_data]) -- data size is 16716
GROUP by ar.user_id
) tmp;
Below is the EXPLAIN ANALYZE output:
Aggregate (cost=31389954.72..31389954.73 rows=1 width=8) (actual time=252020.695..252020.695 rows=1 loops=1)
-> Group (cost=31389032.69..31389922.37 rows=2588 width=4) (actual time=251089.270..252020.659 rows=351 loops=1)
Group Key: ar.user_id
-> Sort (cost=31389032.69..31389477.53 rows=177935 width=4) (actual time=251089.268..251776.202 rows=6993358 loops=1)
Sort Key: ar.user_id
Sort Method: external merge Disk: 95672kB
-> Bitmap Heap Scan on ar (cost=609015.18..31371079.88 rows=177935 width=4) (actual time=1670.413..248939.440 rows=6993358 loops=1)
Recheck Cond: ((week_id = ANY ('{some_data}'::integer[])) AND (user_id = ANY ('{some_data}'::integer[])))
Rows Removed by Index Recheck: 2081028
Filter: ((NOT is_idle) AND (round(duration) > '0'::double precision) AND (calendar_id = ANY ('{some_data}'::integer[])))
Rows Removed by Filter: 534017
Heap Blocks: exact=29551 lossy=313127
-> BitmapAnd (cost=609015.18..609015.18 rows=1357521 width=0) (actual time=1666.334..1666.334 rows=0 loops=1)
-> Bitmap Index Scan on test_index_only_scan_idx (cost=0.00..272396.77 rows=6970353 width=0) (actual time=614.366..614.366 rows=7269830 loops=1)
Index Cond: ((week_id = ANY ('{some_data}'::integer[])) AND (is_idle = false))
-> Bitmap Index Scan on unique_key (cost=0.00..336529.20 rows=9948573 width=0) (actual time=1041.999..1041.999 rows=14959355 loops=1)
Index Cond: (user_id = ANY ('{some_data}'::integer[]))
Planning time: 25.563 ms
Execution time: 252029.237 ms
I also tried DISTINCT, and the result is the same.
My questions are:
ar_idx contains user_id, so why does the planner use unique_key for the user_id condition instead of the index I created?
I expected GROUP BY to avoid a sort (which is why I did not use DISTINCT), so why does a Sort node appear in the EXPLAIN ANALYZE output?
The query takes more than 4 minutes. How can I make it faster? Is the index wrong, or is there anything else I can do?
Note that the ar table contains 51585203 rows.
Any help will be appreciated. Thanks.
---------------------------update--------------------------
After I created the index below, the query became much faster. I don't understand why; can anyone explain this to me?
CREATE INDEX ar_1_idx
ON public.ar USING btree
(calendar_id, user_id)
TABLESPACE pg_default;
And I changed the old index to
CREATE INDEX ar_idx
ON public.ar USING btree
(week_id, calendar_id, user_id, activity_id, duration, on_taskness, start_time_local, end_time_local) WHERE is_idle IS FALSE
TABLESPACE pg_default;
-----updated analyze results-----------
Aggregate (cost=31216435.97..31216435.98 rows=1 width=8) (actual time=13206.941..13206.941 rows=1 loops=1)
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Group (cost=31215436.80..31216403.88 rows=2567 width=4) (actual time=12239.336..13206.894 rows=351 loops=1)
Group Key: ar.user_id
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Sort (cost=31215436.80..31215920.34 rows=193417 width=4) (actual time=12239.334..12932.801 rows=6993358 loops=1)
Sort Key: ar.user_id
Sort Method: external merge Disk: 95664kB
Buffers: shared hit=25940518 read=430315, temp read=31079 written=31079
-> Index Scan using ar_1_idx on activity_report ar (cost=0.56..31195807.48 rows=193417 width=4) (actual time=0.275..10387.051 rows=6993358 loops=1)
Index Cond: ((calendar_id = ANY ('{some_data}'::integer[])) AND (user_id = ANY ('{some_data}'::integer[])))
Filter: ((NOT is_idle) AND (round(duration) > '0'::double precision) AND (week_id = ANY ('{some_data}'::integer[])))
Rows Removed by Filter: 590705
Buffers: shared hit=25940518 read=430315
Planning time: 25.577 ms
Execution time: 13217.611 ms
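Both plans above report "Sort Method: external merge Disk", and the first one also reports lossy heap blocks, which suggests the sort and the bitmap did not fit in work_mem. A minimal sketch of how one might test that, assuming the session is allowed to raise work_mem; the 256MB figure is only an illustrative value, not a recommendation, and this would address only the sort and lossy-bitmap portion of the runtime:

```sql
-- Raise work_mem for this transaction only, so other sessions are unaffected.
BEGIN;
SET LOCAL work_mem = '256MB';  -- hypothetical value; size it to the RAM available per connection
SHOW work_mem;                 -- confirm the setting is active
-- Re-run the EXPLAIN ANALYZE query from the question here and check whether
-- "Sort Method" changes from "external merge Disk" to an in-memory method.
ROLLBACK;                      -- SET LOCAL is discarded at transaction end
```

If the sort then runs in memory, the setting could be applied per role or per session rather than globally.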
In some cases, PostgreSQL does not filter out window function partitions until they are calculated, while in a very similar scenario PostgreSQL filters rows before performing the window function calculation.
Tables used for the minimal steps to reproduce: log is the main data table, and each row contains either an incremental or an absolute value. An absolute value resets the current counter to a new base value. The window function needs to process all log rows for a given account_id to calculate the correct running total. The view uses a subquery to ensure that the underlying log rows are not filtered by ts; otherwise the window function would produce wrong totals.
CREATE TABLE account(
id serial,
name VARCHAR(100)
);
CREATE TABLE log(
id serial,
absolute int,
incremental int,
account_id int,
ts timestamp,
PRIMARY KEY(id),
CONSTRAINT fk_account
FOREIGN KEY(account_id)
REFERENCES account(id)
);
CREATE FUNCTION get_running_total_func(
aggregated_total int,
absolute int,
incremental int
) RETURNS int
LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT AS
$$
SELECT
CASE
WHEN absolute IS NOT NULL THEN absolute
ELSE COALESCE(aggregated_total, 0) + incremental
END
$$;
CREATE AGGREGATE get_running_total(integer, integer) (
sfunc = get_running_total_func,
stype = integer
);
Slow view:
CREATE VIEW test_view
(
log_id,
running_value,
account_id,
ts
)
AS
SELECT log_running.* FROM
(SELECT
log.id,
get_running_total(
log.absolute,
log.incremental
)
OVER(
PARTITION BY log.account_id
ORDER BY log.ts RANGE UNBOUNDED PRECEDING
),
account.id,
ts
FROM log log JOIN account account ON log.account_id=account.id
) AS log_running;
CREATE VIEW
postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on log_running (cost=12734.02..15981.48 rows=1 width=20) (actual time=7510.851..16122.404 rows=20 loops=1)
Filter: (log_running.id_1 = 1)
Rows Removed by Filter: 99902
-> WindowAgg (cost=12734.02..14732.46 rows=99922 width=32) (actual time=7510.830..14438.783 rows=99922 loops=1)
-> Sort (cost=12734.02..12983.82 rows=99922 width=28) (actual time=7510.628..9312.399 rows=99922 loops=1)
Sort Key: log.account_id, log.ts
Sort Method: external merge Disk: 3328kB
-> Hash Join (cost=143.50..2042.24 rows=99922 width=28) (actual time=169.941..5431.650 rows=99922 loops=1)
Hash Cond: (log.account_id = account.id)
-> Seq Scan on log (cost=0.00..1636.22 rows=99922 width=24) (actual time=0.063..1697.802 rows=99922 loops=1)
-> Hash (cost=81.00..81.00 rows=5000 width=4) (actual time=169.837..169.865 rows=5000 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 240kB
-> Seq Scan on account (cost=0.00..81.00 rows=5000 width=4) (actual time=0.017..84.639 rows=5000 loops=1)
Planning Time: 0.199 ms
Execution Time: 16127.275 ms
(15 rows)
Fast view - only change is account.id -> log.account_id (!):
CREATE VIEW test_view
(
log_id,
running_value,
account_id,
ts
)
AS
SELECT log_running.* FROM
(SELECT
log.id,
get_running_total(
log.absolute,
log.incremental
)
OVER(
PARTITION BY log.account_id
ORDER BY log.ts RANGE UNBOUNDED PRECEDING
),
log.account_id,
ts
FROM log log JOIN account account ON log.account_id=account.id
) AS log_running;
CREATE VIEW
postgres=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE account_id=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on log_running (cost=1894.96..1895.56 rows=20 width=20) (actual time=34.718..45.958 rows=20 loops=1)
-> WindowAgg (cost=1894.96..1895.36 rows=20 width=28) (actual time=34.691..45.307 rows=20 loops=1)
-> Sort (cost=1894.96..1895.01 rows=20 width=24) (actual time=34.367..35.925 rows=20 loops=1)
Sort Key: log.ts
Sort Method: quicksort Memory: 26kB
-> Nested Loop (cost=0.28..1894.53 rows=20 width=24) (actual time=0.542..34.066 rows=20 loops=1)
-> Index Only Scan using account_pkey on account (cost=0.28..8.30 rows=1 width=4) (actual time=0.025..0.054 rows=1 loops=1)
Index Cond: (id = 1)
Heap Fetches: 1
-> Seq Scan on log (cost=0.00..1886.03 rows=20 width=24) (actual time=0.195..32.937 rows=20 loops=1)
Filter: (account_id = 1)
Rows Removed by Filter: 99902
Planning Time: 0.297 ms
Execution Time: 47.300 ms
(14 rows)
Is this a bug in the PostgreSQL implementation? It seems that this change in the view definition shouldn't affect performance at all; PostgreSQL should be able to filter the rows before applying the window function to the whole data set.
A SELECT query scans all the child tables. Can we make the query optimizer scan only the right child table?
Example:
I created a parent table and two child tables using inheritance in Postgres 9.6, leaving out constraints to keep it simple:
create table student(id INTEGER, name varchar(10), result varchar(1) );
create table student_pass() inherits (student);
create table student_fail() inherits (student);
Index
create index student_result_idx on student (result);
create index student_result_idx2 on student_pass (result) where result='P';
create index student_result_idx3 on student_fail (result) where result='F';
Procedure
CREATE OR REPLACE FUNCTION student_partition()
RETURNS TRIGGER AS $$
BEGIN
IF (new.result = 'P')THEN
INSERT INTO student_pass VALUES (NEW.*);
ELSE
INSERT INTO student_fail VALUES (NEW.*);
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
Trigger
CREATE TRIGGER insert_trigger BEFORE INSERT ON student
FOR EACH ROW EXECUTE procedure student_partition();
Insert
insert into student values
(1,'aaa','P'),
(2,'bbb','F');
The inserts go into their respective tables as expected.
Select
select * from student where result='P';
The problem is that the SELECT scans all the tables. How can I make the query optimizer smart enough to pick only the right child table?
Do we need the WHERE condition in the indexes, given that each child table will contain only 'P' or only 'F'?
Output of EXPLAIN(analyze, buffers) select * from student where result='P'
Append (cost=0.00..37.94 rows=11 width=50) (actual time=0.016..0.042 rows=2 loops=1)
Buffers: shared hit=4
-> Seq Scan on student (cost=0.00..2.30 rows=1 width=50) (actual time=0.015..0.017 rows=1 loops=1)
Filter: ((result)::text = 'P'::text)
Rows Removed by Filter: 1
Buffers: shared hit=1
-> Bitmap Heap Scan on student_pass (cost=4.17..12.64 rows=5 width=50) (actual time=0.013..0.014 rows=1 loops=1)
Recheck Cond: ((result)::text = 'P'::text)
Heap Blocks: exact=1
Buffers: shared hit=2
-> Bitmap Index Scan on student_result_idx2 (cost=0.00..4.17 rows=5 width=0) (actual time=0.007..0.007 rows=1 loops=1)
Buffers: shared hit=1
-> Seq Scan on student_fail (cost=0.00..23.00 rows=5 width=50) (actual time=0.007..0.007 rows=0 loops=1)
Filter: ((result)::text = 'P'::text)
Rows Removed by Filter: 1
Buffers: shared hit=1
Planning time: 0.447 ms
Execution time: 0.120 ms
Adding constraints helps
alter table student_pass add constraint pass_cst check (result ='P');
alter table student_fail add constraint fail_cst check (result not in ('P'));
Output of EXPLAIN(analyze, buffers) select * from student where result='P'
Append (cost=0.00..23.00 rows=6 width=50) (actual time=0.299..0.303 rows=2 loops=1)
Buffers: shared read=1
-> Seq Scan on student (cost=0.00..0.00 rows=1 width=50) (actual time=0.004..0.004 rows=0 loops=1)
Filter: ((result)::text = 'P'::text)
-> Seq Scan on student_pass (cost=0.00..23.00 rows=5 width=50) (actual time=0.294..0.296 rows=2 loops=1)
Filter: ((result)::text = 'P'::text)
Buffers: shared read=1
Planning time: 10.488 ms
Execution time: 0.361 ms
The query optimizer skipped the student_fail table.
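Skipping a child table this way relies on the CHECK constraints together with the constraint_exclusion setting. A minimal sketch for verifying it, assuming the student tables and constraints defined above; the default value of the setting is partition, which covers inheritance children:

```sql
-- Show the current value; 'partition' (the default) applies constraint
-- exclusion to inheritance child tables and UNION ALL subqueries.
SHOW constraint_exclusion;

-- Set it explicitly for this session in case it was turned off.
SET constraint_exclusion = partition;

-- student_fail should no longer appear in the plan, because its CHECK
-- constraint contradicts result = 'P'. The parent table is still scanned,
-- since it carries no such constraint.
EXPLAIN SELECT * FROM student WHERE result = 'P';
```

The planner can only use this with constants known at plan time, so a comparison against another column or a volatile expression would not exclude any child.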
This is a follow up to this issue I posted a while ago.
I have the following code:
SET work_mem = '16MB';
SELECT s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
FROM rm_o_resource_usage_instance_splits_new s
INNER JOIN rm_o_resource_usage r ON s.usage_id = r.id
INNER JOIN scheduledactivities sa ON s.activity_index = sa.activity_index AND r.schedule_id = sa.solution_id and s.solution = sa.solution_id
WHERE r.schedule_id = 10
ORDER BY r.resource_id, s.start_date
When I run EXPLAIN (ANALYZE, BUFFERS) I get the following:
Sort (cost=3724.02..3724.29 rows=105 width=89) (actual time=245.802..247.573 rows=22302 loops=1)
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6692kB
Buffers: shared hit=198702 read=5993 written=612
-> Nested Loop (cost=703.76..3720.50 rows=105 width=89) (actual time=1.898..164.741 rows=22302 loops=1)
Buffers: shared hit=198702 read=5993 written=612
-> Hash Join (cost=703.34..3558.54 rows=105 width=101) (actual time=1.815..11.259 rows=22302 loops=1)
Hash Cond: (s.usage_id = r.id)
Buffers: shared hit=3 read=397 written=2
-> Bitmap Heap Scan on rm_o_resource_usage_instance_splits_new s (cost=690.61..3486.58 rows=22477 width=69) (actual time=1.782..5.820 rows=22302 loops=1)
Recheck Cond: (solution = 10)
Heap Blocks: exact=319
Buffers: shared hit=2 read=396 written=2
-> Bitmap Index Scan on rm_o_resource_usage_instance_splits_new_solution_idx (cost=0.00..685.00 rows=22477 width=0) (actual time=1.609..1.609 rows=22302 loops=1)
Index Cond: (solution = 10)
Buffers: shared hit=2 read=77
-> Hash (cost=12.66..12.66 rows=5 width=48) (actual time=0.023..0.023 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1 read=1
-> Bitmap Heap Scan on rm_o_resource_usage r (cost=4.19..12.66 rows=5 width=48) (actual time=0.020..0.020 rows=1 loops=1)
Recheck Cond: (schedule_id = 10)
Heap Blocks: exact=1
Buffers: shared hit=1 read=1
-> Bitmap Index Scan on rm_o_resource_usage_sched (cost=0.00..4.19 rows=5 width=0) (actual time=0.017..0.017 rows=1 loops=1)
Index Cond: (schedule_id = 10)
Buffers: shared read=1
-> Index Scan using scheduledactivities_activity_index_idx on scheduledactivities sa (cost=0.42..1.53 rows=1 width=16) (actual time=0.004..0.007 rows=1 loops=22302)
Index Cond: (activity_index = s.activity_index)
Filter: (solution_id = 10)
Rows Removed by Filter: 5
Buffers: shared hit=198699 read=5596 written=610
Planning time: 7.070 ms
Execution time: 248.691 ms
Every time I run EXPLAIN, I get roughly the same results. The execution time is always between 170ms and 250ms, which, to me, is perfectly fine. However, when this query is run through a C++ project (using PQexec(conn, query), where conn is a dedicated connection and query is the above query), the time it takes varies widely. In general, the query is very quick, and you don't notice a delay. The problem is that, on occasion, this query takes 2 to 3 minutes to complete.
If I open pgAdmin and look at the "server activity" for the database, there are about 30 connections, mostly sitting at "idle". The above query's connection is marked as "active" and stays "active" for several minutes.
I am at a loss as to why the same query randomly takes several minutes to complete, with no change to the data in the DB. I have tried increasing work_mem, which didn't make any difference (nor did I really expect it to). Any help or suggestions would be greatly appreciated.
There aren't any more specific tags, but I'm currently using Postgres 10.11, and it has also been an issue on other 10.x versions. The system is a quad-core Xeon @ 3.4 GHz with an SSD and 24 GB of memory.
Per jjanes's suggestion, I enabled auto_explain. Eventually I got this output:
duration: 128057.373 ms
plan:
Query Text: SET work_mem = '32MB';SELECT s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset FROM rm_o_resource_usage_instance_splits_new s INNER JOIN rm_o_resource_usage r ON s.usage_id = r.id INNER JOIN scheduledactivities sa ON s.activity_index = sa.activity_index AND r.schedule_id = sa.solution_id and s.solution = sa.solution_id WHERE r.schedule_id = 12642 ORDER BY r.resource_id, s.start_date
Sort (cost=14.36..14.37 rows=1 width=98) (actual time=128042.083..128043.287 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6585kB
Buffers: shared hit=21198435 read=388 dirtied=119
-> Nested Loop (cost=0.85..14.35 rows=1 width=98) (actual time=4.995..127958.935 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 705476285
Buffers: shared hit=21198435 read=388 dirtied=119
-> Nested Loop (cost=0.42..9.74 rows=1 width=110) (actual time=0.091..227.705 rows=21899 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, s.solution, r.resource_id, r.schedule_id
Inner Unique: true
Join Filter: (s.usage_id = r.id)
Buffers: shared hit=22102 read=388 dirtied=119
-> Index Scan using rm_o_resource_usage_instance_splits_new_solution_idx on public.rm_o_resource_usage_instance_splits_new s (cost=0.42..8.44 rows=1 width=69) (actual time=0.082..17.418 rows=21899 loops=1)
Output: s.start_time, s.end_time, s.resources, s.activity_index, s.usage_id, s.start_date, s.end_date, s.solution
Index Cond: (s.solution = 12642)
Buffers: shared hit=203 read=388 dirtied=119
-> Seq Scan on public.rm_o_resource_usage r (cost=0.00..1.29 rows=1 width=57) (actual time=0.002..0.002 rows=1 loops=21899)
Output: r.id, r.schedule_id, r.resource_id
Filter: (r.schedule_id = 12642)
Rows Removed by Filter: 26
Buffers: shared hit=21899
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.60 rows=1 width=16) (actual time=0.006..4.612 rows=32216 loops=21899)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12642)
Buffers: shared hit=21176333
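For anyone wanting to capture a plan like the one above, here is a minimal sketch of a session-level auto_explain setup. The 5-second threshold is an arbitrary value chosen for illustration, and LOAD requires superuser rights. Because the slow query here runs on the application's own connection, a real setup would more likely load the module server-wide through session_preload_libraries or shared_preload_libraries in postgresql.conf:

```sql
-- Load the module into the current session only (superuser required).
LOAD 'auto_explain';

-- Log the plan of any statement slower than the threshold (hypothetical value).
SET auto_explain.log_min_duration = '5s';

-- Include actual row counts and timings, as EXPLAIN ANALYZE would.
SET auto_explain.log_analyze = true;

-- Include buffer usage and the per-node Output lines.
SET auto_explain.log_buffers = true;
SET auto_explain.log_verbose = true;
```

Note that log_analyze adds timing overhead to every statement in the session, not only the slow ones.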
EDIT: Full definitions of the tables are below:
CREATE TABLE public.rm_o_resource_usage_instance_splits_new
(
start_time integer NOT NULL,
end_time integer NOT NULL,
resources jsonb NOT NULL,
activity_index integer NOT NULL,
usage_id bigint NOT NULL,
start_date text COLLATE pg_catalog."default" NOT NULL,
end_date text COLLATE pg_catalog."default" NOT NULL,
solution bigint NOT NULL,
CONSTRAINT rm_o_resource_usage_instance_splits_new_pkey PRIMARY KEY (start_time, activity_index, usage_id),
CONSTRAINT rm_o_resource_usage_instance_splits_new_solution_fkey FOREIGN KEY (solution)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE,
CONSTRAINT rm_o_resource_usage_instance_splits_new_usage_id_fkey FOREIGN KEY (usage_id)
REFERENCES public.rm_o_resource_usage (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_activity_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(activity_index ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_solution_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(solution ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_instance_splits_new_usage_idx
ON public.rm_o_resource_usage_instance_splits_new USING btree
(usage_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE TABLE public.rm_o_resource_usage
(
id bigint NOT NULL DEFAULT nextval('rm_o_resource_usage_id_seq'::regclass),
schedule_id bigint NOT NULL,
resource_id text COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT rm_o_resource_usage_pkey PRIMARY KEY (id),
CONSTRAINT rm_o_resource_usage_schedule_id_fkey FOREIGN KEY (schedule_id)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_idx
ON public.rm_o_resource_usage USING btree
(id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX rm_o_resource_usage_sched
ON public.rm_o_resource_usage USING btree
(schedule_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE TABLE public.scheduledactivities
(
id bigint NOT NULL DEFAULT nextval('scheduledactivities_id_seq'::regclass),
solution_id bigint NOT NULL,
activity_id text COLLATE pg_catalog."default" NOT NULL,
sequence_index integer,
startminute integer,
finishminute integer,
issue text COLLATE pg_catalog."default",
activity_index integer NOT NULL,
is_objective boolean NOT NULL,
usedresourceset integer DEFAULT '-1'::integer,
start timestamp without time zone,
finish timestamp without time zone,
is_ore boolean,
is_ignored boolean,
CONSTRAINT scheduled_activities_pkey PRIMARY KEY (id),
CONSTRAINT scheduledactivities_solution_id_fkey FOREIGN KEY (solution_id)
REFERENCES public.rm_o_schedule_stats (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_activity_id_idx
ON public.scheduledactivities USING btree
(activity_id COLLATE pg_catalog."default" ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_id_idx
ON public.scheduledactivities USING btree
(id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduled_activities_idx
ON public.scheduledactivities USING btree
(solution_id ASC NULLS LAST)
TABLESPACE pg_default;
CREATE INDEX scheduledactivities_activity_index_idx
ON public.scheduledactivities USING btree
(activity_index ASC NULLS LAST)
TABLESPACE pg_default;
EDIT: Additional output from auto_explain after adding index on scheduledactivities (solution_id, activity_index)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Sort Key: r.resource_id, s.start_date
Sort Method: quicksort Memory: 6283kB
Buffers: shared hit=20159117 read=375 dirtied=190
-> Nested Loop (cost=0.85..10.76 rows=1 width=100) (actual time=5.518..122489.627 rows=20761 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 668815615
Buffers: shared hit=20159117 read=375 dirtied=190
-> Nested Loop (cost=0.42..5.80 rows=1 width=112) (actual time=0.057..217.563 rows=20761 loops=1)
Output: s.start_date, s.end_date, s.resources, s.activity_index, s.solution, r.resource_id, r.schedule_id
Inner Unique: true
Join Filter: (s.usage_id = r.id)
Buffers: shared hit=20947 read=375 dirtied=190
-> Index Scan using rm_o_resource_usage_instance_splits_new_solution_idx on public.rm_o_resource_usage_instance_splits_new s (cost=0.42..4.44 rows=1 width=69) (actual time=0.049..17.622 rows=20761 loops=1)
Output: s.start_time, s.end_time, s.resources, s.activity_index, s.usage_id, s.start_date, s.end_date, s.solution
Index Cond: (s.solution = 12644)
Buffers: shared hit=186 read=375 dirtied=190
-> Seq Scan on public.rm_o_resource_usage r (cost=0.00..1.35 rows=1 width=59) (actual time=0.002..0.002 rows=1 loops=20761)
Output: r.id, r.schedule_id, r.resource_id
Filter: (r.schedule_id = 12644)
Rows Removed by Filter: 22
Buffers: shared hit=20761
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.94 rows=1 width=16) (actual time=0.007..4.654 rows=32216 loops=20761)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12644)
Buffers: shared hit=20138170
The easiest way to reproduce the issue is to add more values to the three tables. I didn't delete any, only did a few thousand INSERTs.
-> Index Scan using .. s (cost=0.42..8.44 rows=1 width=69) (actual time=0.082..17.418 rows=21899 loops=1)
Index Cond: (s.solution = 12642)
The planner thinks it will find 1 row, and instead finds 21899. That error can pretty clearly lead to bad plans. And a single equality condition should be estimated quite accurately, so I'd say the statistics on your table are way off. It could be that the autovac launcher is tuned poorly so it doesn't run often enough, or it could be that select parts of your data change very rapidly (did you just insert 21899 rows with s.solution = 12642 immediately before running the query?) and so the stats can't be kept accurate enough.
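A minimal sketch of how the statistics could be refreshed and checked, assuming the three tables from the question. The scale factor in the last statement is a hypothetical value, to be tuned to how fast the data actually changes:

```sql
-- Refresh planner statistics right after a bulk load.
ANALYZE rm_o_resource_usage_instance_splits_new;
ANALYZE rm_o_resource_usage;
ANALYZE scheduledactivities;

-- See when each table was last analyzed and how many rows changed since.
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname IN ('rm_o_resource_usage_instance_splits_new',
                  'rm_o_resource_usage',
                  'scheduledactivities');

-- Make autovacuum analyze this table after a smaller share of rows change
-- (0.01 is an illustrative value, not a recommendation).
ALTER TABLE rm_o_resource_usage_instance_splits_new
    SET (autovacuum_analyze_scale_factor = 0.01);
```

If the application inserts a whole new solution and queries it immediately, running ANALYZE from the application right after the inserts may be the more reliable option.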
-> Nested Loop ...
Join Filter: (s.activity_index = sa.activity_index)
Rows Removed by Join Filter: 705476285
-> ...
-> Index Scan using scheduled_activities_idx on public.scheduledactivities sa (cost=0.42..4.60 rows=1 width=16) (actual time=0.006..4.612 rows=32216 loops=21899)
Output: sa.usedresourceset, sa.activity_index, sa.solution_id
Index Cond: (sa.solution_id = 12642)
If you can't get it to use the Hash Join, you can at least reduce the harm of the Nested Loop by building an index on scheduledactivities (solution_id, activity_index). That way the activity_index criterion could be part of the Index Condition, rather than being a Join Filter. You could probably then drop the index exclusively on solution_id, as there is little point in maintaining both indexes.
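The index suggestion above could be sketched as follows. The index name is made up for illustration, and CONCURRENTLY is an assumption on my part to avoid blocking writes; it cannot be run inside a transaction block:

```sql
-- Composite index so that both solution_id and activity_index can be
-- matched as index conditions (hypothetical index name).
CREATE INDEX CONCURRENTLY scheduledactivities_solution_activity_idx
    ON public.scheduledactivities USING btree (solution_id, activity_index);

-- Once the new index is confirmed to be in use, the single-column index on
-- solution_id is redundant, since the new index leads with the same column.
DROP INDEX CONCURRENTLY public.scheduled_activities_idx;
```

Keeping solution_id as the leading column means the new index can still serve lookups on solution_id alone, including the foreign key cascade checks.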
The SQL statement of the fast plan is using WHERE r.schedule_id = 10 and returns about 22000 rows (with estimated 105).
The SQL statement of the slow plan is using WHERE r.schedule_id = 12642 and returns about 21000 rows (with estimated only 1).
The slow plan uses nested loops instead of a hash join, probably because of bad row estimates for the joins: the estimate is 1 row but the actual count is 21899.
For example in this step:
Nested Loop (cost=0.42..9.74 rows=1 width=110) (actual time=0.091..227.705 rows=21899 loops=1)
If the data does not change, there may be a statistics issue (skewed data) in some columns.
How can I optimize this query?
I want to find all rows from firms2branches for a given project_id whose firm exists in firms and accounts_premium. My tables:
-- 50000 rows
CREATE TABLE firms
(
id bigserial NOT NULL,
firm_id bigint NOT NULL,
CONSTRAINT firms_pkey PRIMARY KEY (id)
)
-- 2 300 000 rows
CREATE TABLE firms2branches
(
firm_id bigint NOT NULL,
branch_id bigint NOT NULL,
project_id bigint NOT NULL
)
CREATE INDEX firms2branches_firm_id_idx ON firms2branches USING btree(firm_id);
-- 6500 rows
CREATE TABLE accounts_premium
(
firm_id bigint NOT NULL,
is_active boolean NOT NULL DEFAULT false,
CONSTRAINT accounts_premium_pkey PRIMARY KEY (firm_id)
)
CREATE INDEX accounts_premium_is_active_idx ON accounts_premium USING btree(is_active);
Query (with cold cache):
EXPLAIN (ANALYZE)
SELECT firms2branches.branch_id,
firms2branches.firm_id
FROM firms2branches
JOIN firms ON firms.firm_id = firms2branches.firm_id
JOIN accounts_premium ON accounts_premium.firm_id = firms.id AND accounts_premium.is_active = TRUE
WHERE firms2branches.project_id = 21
Result (https://explain.depesz.com/s/oVNH):
Nested Loop (cost=22.12..6958.10 rows=355 width=16) (actual time=151.123..417.764 rows=31 loops=1)
Buffers: shared hit=7176 read=3371
-> Nested Loop (cost=21.69..3100.40 rows=1435 width=8) (actual time=0.905..58.314 rows=1378 loops=1)
Buffers: shared hit=3250 read=961
-> Bitmap Heap Scan on accounts_premium (cost=21.40..226.90 rows=1435 width=8) (actual time=0.615..1.211 rows=1378 loops=1)
Filter: is_active
Heap Blocks: exact=61
Buffers: shared hit=61 read=6
-> Bitmap Index Scan on accounts_premium_is_active_idx (cost=0.00..21.04 rows=1435 width=0) (actual time=0.594..0.594 rows=1435 loops=1)
Index Cond: (is_active = true)
Buffers: shared read=6
-> Index Scan using firms_pkey on firms (cost=0.29..1.90 rows=1 width=16) (actual time=0.040..0.041 rows=1 loops=1378)
Index Cond: (id = accounts_premium.firm_id)
Buffers: shared hit=3189 read=955
-> Index Scan using firms2branches_firm_id_idx on firms2branches (cost=0.43..2.59 rows=1 width=16) (actual time=0.259..0.260 rows=0 loops=1378)
Index Cond: (firm_id = firms.firm_id)
Filter: (project_id = 21::bigint)
Rows Removed by Filter: 2
Buffers: shared hit=3926 read=2410
Planning time: 6.164 ms
Execution time: 417.843 ms
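In the plan above, the project_id = 21 condition is applied as a filter after each of the 1378 index lookups on firm_id. One thing that might be worth trying is a composite index, so that project_id can be part of the index condition. This is only a sketch: the index name is made up, and whether the planner actually prefers it depends on the data distribution:

```sql
-- Composite index covering both the WHERE column and the join column
-- (hypothetical index name).
CREATE INDEX firms2branches_project_firm_idx
    ON firms2branches USING btree (project_id, firm_id);

-- Refresh statistics so the planner can cost the new index properly.
ANALYZE firms2branches;
```

With project_id leading, the planner could also start from the firms2branches rows of one project and join outward, rather than probing once per premium account.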
I have several large tables in Postgres 9.2 (millions of rows) where I need to generate a unique code based on the combination of two fields, 'source' (varchar) and 'id' (int). I can do this by generating row_numbers over the result of:
SELECT source,id FROM tablename GROUP BY source,id
but the results can take a while to process. It has been recommended that if the fields are indexed, and there are a proportionally small number of index values (which is my case), that a loose index scan may be a better option: http://wiki.postgresql.org/wiki/Loose_indexscan
WITH RECURSIVE
t AS (SELECT min(col) AS col FROM tablename
UNION ALL
SELECT (SELECT min(col) FROM tablename WHERE col > t.col) FROM t WHERE t.col IS NOT NULL)
SELECT col FROM t WHERE col IS NOT NULL
UNION ALL
SELECT NULL WHERE EXISTS(SELECT * FROM tablename WHERE col IS NULL);
The example operates on a single field though. Trying to return more than one field generates an error: subquery must return only one column. One possibility might be to try retrieving an entire ROW - e.g. SELECT ROW(min(source),min(id)..., but then I'm not sure what the syntax of the WHERE statement would need to look like to work with individual row elements.
The question is: can the recursion-based code be modified to work with more than one column, and if so, how? I'm committed to using Postgres, but it looks like MySQL has implemented loose index scans for more than one column: http://dev.mysql.com/doc/refman/5.1/en/group-by-optimization.html
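One way a two-column version could look is sketched below. It relies on LATERAL, which needs PostgreSQL 9.3 or later, so it would not run on 9.2 as-is, and it is not necessarily the form used in any of the answers. It assumes the existing btree index tablename_sid on (source, id) and skips rows where source is NULL:

```sql
WITH RECURSIVE t AS (
    -- Anchor: the smallest (source, id) pair in index order.
    (SELECT source, id
     FROM tablename
     WHERE source IS NOT NULL
     ORDER BY source, id
     LIMIT 1)
    UNION ALL
    -- Step: jump to the next pair strictly greater than the previous one.
    -- The row comparison can be served by the (source, id) index.
    SELECT nxt.source, nxt.id
    FROM t
    CROSS JOIN LATERAL (
        SELECT source, id
        FROM tablename
        WHERE (source, id) > (t.source, t.id)
        ORDER BY source, id
        LIMIT 1
    ) AS nxt
)
SELECT source, id FROM t;
```

Each recursion step costs one index probe, so this only pays off when the number of distinct pairs is small compared to the table size.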
As recommended, I'm attaching my EXPLAIN ANALYZE results.
For my situation - where I'm selecting distinct values for 2 columns using GROUP BY, it's the following:
HashAggregate (cost=1645408.44..1654099.65 rows=869121 width=34) (actual time=35411.889..36008.475 rows=1233080 loops=1)
-> Seq Scan on tablename (cost=0.00..1535284.96 rows=22024696 width=34) (actual time=4413.311..25450.840 rows=22025768 loops=1)
Total runtime: 36127.789 ms
(3 rows)
I don't know how to do a 2-column index scan (that's the question), but for purposes of comparison, using a GROUP BY on one column, I get:
HashAggregate (cost=1590346.70..1590347.69 rows=99 width=8) (actual time=32310.706..32310.722 rows=100 loops=1)
-> Seq Scan on tablename (cost=0.00..1535284.96 rows=22024696 width=8) (actual time=4764.609..26941.832 rows=22025768 loops=1)
Total runtime: 32350.899 ms
(3 rows)
But for a loose index scan on one column, I get:
Result (cost=181.28..198.07 rows=101 width=8) (actual time=0.069..1.935 rows=100 loops=1)
CTE t
-> Recursive Union (cost=1.74..181.28 rows=101 width=8) (actual time=0.062..1.855 rows=101 loops=1)
-> Result (cost=1.74..1.75 rows=1 width=0) (actual time=0.061..0.061 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1.74 rows=1 width=8) (actual time=0.057..0.057 rows=1 loops=1)
-> Index Only Scan using tablename_id on tablename (cost=0.00..38379014.12 rows=22024696 width=8) (actual time=0.055..0.055 rows=1 loops=1)
Index Cond: (id IS NOT NULL)
Heap Fetches: 0
-> WorkTable Scan on t (cost=0.00..17.75 rows=10 width=8) (actual time=0.017..0.017 rows=1 loops=101)
Filter: (id IS NOT NULL)
Rows Removed by Filter: 0
SubPlan 3
-> Result (cost=1.75..1.76 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=100)
InitPlan 2 (returns $3)
-> Limit (cost=0.00..1.75 rows=1 width=8) (actual time=0.016..0.016 rows=1 loops=100)
-> Index Only Scan using tablename_id on tablename (cost=0.00..12811462.41 rows=7341565 width=8) (actual time=0.015..0.015 rows=1 loops=100)
Index Cond: ((id IS NOT NULL) AND (id > t.id))
Heap Fetches: 0
-> Append (cost=0.00..16.79 rows=101 width=8) (actual time=0.067..1.918 rows=100 loops=1)
-> CTE Scan on t (cost=0.00..2.02 rows=100 width=8) (actual time=0.067..1.899 rows=100 loops=1)
Filter: (id IS NOT NULL)
Rows Removed by Filter: 1
-> Result (cost=13.75..13.76 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
One-Time Filter: $5
InitPlan 5 (returns $5)
-> Index Only Scan using tablename_id on tablename (cost=0.00..13.75 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
Index Cond: (id IS NULL)
Heap Fetches: 0
Total runtime: 2.040 ms
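The query behind the plan above is not shown in the post. A plan of this shape (a Recursive Union over repeated min() lookups, plus a separate probe for NULLs) is what the usual recursive-CTE emulation of a loose index scan produces. The following is a sketch of that pattern, assuming the column is `id` on `tablename` with the `tablename_id` index; it is not necessarily the exact statement that was run.

```sql
-- Sketch: emulate a loose index scan for the distinct values of one column.
-- Assumes a btree index on tablename(id).
WITH RECURSIVE t AS (
    -- seed: the smallest id in the index
    SELECT min(id) AS id FROM tablename
    UNION ALL
    -- step: the next id strictly greater than the previous one;
    -- min() returns NULL once nothing is left
    SELECT (SELECT min(id) FROM tablename WHERE id > t.id)
    FROM t
    WHERE t.id IS NOT NULL
)
SELECT id FROM t WHERE id IS NOT NULL
UNION ALL
-- add a single NULL if the column contains any NULLs
SELECT NULL WHERE EXISTS (SELECT 1 FROM tablename WHERE id IS NULL);
```

Each recursion step is one short index probe, which is why this wins when the column has few distinct values (100 here) and loses when it has many.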
The full table definition looks like this:
CREATE TABLE tablename
(
source character(25),
id bigint NOT NULL,
time_ timestamp without time zone,
height numeric,
lon numeric,
lat numeric,
distance numeric,
status character(3),
geom geometry(PointZ,4326),
relid bigint
)
WITH (
OIDS=FALSE
);
CREATE INDEX tablename_height
ON public.tablename
USING btree
(height);
CREATE INDEX tablename_geom
ON public.tablename
USING gist
(geom);
CREATE INDEX tablename_id
ON public.tablename
USING btree
(id);
CREATE INDEX tablename_lat
ON public.tablename
USING btree
(lat);
CREATE INDEX tablename_lon
ON public.tablename
USING btree
(lon);
CREATE INDEX tablename_relid
ON public.tablename
USING btree
(relid);
CREATE INDEX tablename_sid
ON public.tablename
USING btree
(source COLLATE pg_catalog."default", id);
CREATE INDEX tablename_source
ON public.tablename
USING btree
(source COLLATE pg_catalog."default");
CREATE INDEX tablename_time
ON public.tablename
USING btree
(time_);
Answer selection:
I took some time to compare the approaches that were provided. It's at times like this that I wish more than one answer could be accepted, but in this case I'm giving the tick to @jjanes. His solution matches the question as originally posed more closely, and it gave me some insight into the form of the required WHERE clause. In the end, the HashAggregate is actually the fastest approach (for me), but that's due to the nature of my data, not any problem with the algorithms. I've attached the EXPLAIN ANALYZE output for the different approaches below, and will be giving +1 to both jjanes and joop.
HashAggregate:
HashAggregate (cost=1018669.72..1029722.08 rows=1105236 width=34) (actual time=24164.735..24686.394 rows=1233080 loops=1)
-> Seq Scan on tablename (cost=0.00..908548.48 rows=22024248 width=34) (actual time=0.054..14639.931 rows=22024982 loops=1)
Total runtime: 24787.292 ms
Loose Index Scan modification:
CTE Scan on t (cost=13.84..15.86 rows=100 width=112) (actual time=0.916..250311.164 rows=1233080 loops=1)
Filter: (source IS NOT NULL)
Rows Removed by Filter: 1
CTE t
-> Recursive Union (cost=0.00..13.84 rows=101 width=112) (actual time=0.911..249295.872 rows=1233081 loops=1)
-> Limit (cost=0.00..0.04 rows=1 width=34) (actual time=0.910..0.911 rows=1 loops=1)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..965442.32 rows=22024248 width=34) (actual time=0.908..0.908 rows=1 loops=1)
Heap Fetches: 0
-> WorkTable Scan on t (cost=0.00..1.18 rows=10 width=112) (actual time=0.201..0.201 rows=1 loops=1233081)
Filter: (source IS NOT NULL)
Rows Removed by Filter: 0
SubPlan 1
-> Limit (cost=0.00..0.05 rows=1 width=34) (actual time=0.100..0.100 rows=1 loops=1233080)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..340173.38 rows=7341416 width=34) (actual time=0.100..0.100 rows=1 loops=1233080)
Index Cond: (ROW(source, id) > ROW(t.source, t.id))
Heap Fetches: 0
SubPlan 2
-> Limit (cost=0.00..0.05 rows=1 width=34) (actual time=0.099..0.099 rows=1 loops=1233080)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..340173.38 rows=7341416 width=34) (actual time=0.098..0.098 rows=1 loops=1233080)
Index Cond: (ROW(source, id) > ROW(t.source, t.id))
Heap Fetches: 0
Total runtime: 250491.559 ms
Merge Anti Join:
Merge Anti Join (cost=0.00..12099015.26 rows=14682832 width=42) (actual time=48.710..541624.677 rows=1233080 loops=1)
Merge Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 363464177
-> Index Only Scan using tablename_pkey on tablename src (cost=0.00..1060195.27 rows=22024248 width=42) (actual time=48.566..5064.551 rows=22024982 loops=1)
Heap Fetches: 0
-> Materialize (cost=0.00..1115255.89 rows=22024248 width=42) (actual time=0.011..40551.997 rows=363464177 loops=1)
-> Index Only Scan using tablename_pkey on tablename nx (cost=0.00..1060195.27 rows=22024248 width=42) (actual time=0.008..8258.890 rows=22024982 loops=1)
Heap Fetches: 0
Total runtime: 541750.026 ms
Rather hideous, but this seems to work:
WITH RECURSIVE
t AS (
select a,b from (select a,b from foo order by a,b limit 1) asdf union all
select (select a from foo where (a,b) > (t.a,t.b) order by a,b limit 1),
(select b from foo where (a,b) > (t.a,t.b) order by a,b limit 1)
from t where t.a is not null)
select * from t where t.a is not null;
I don't really understand why the "is not null" checks are needed; where do the NULLs come from in the first place?
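One likely source of the NULLs, shown as a small sketch on a made-up table (`foo_null_demo` is hypothetical, not from the question): a scalar subquery in the SELECT list that matches no row evaluates to NULL instead of producing no row. Once the recursion passes the last (a,b) pair, both subqueries return NULL, and the `t.a is not null` test in the recursive term is what stops the recursion at that point.

```sql
-- Hypothetical three-row table, just for illustration.
CREATE TEMP TABLE foo_null_demo (a int NOT NULL, b int NOT NULL);
INSERT INTO foo_null_demo VALUES (1, 1), (1, 2), (2, 1);

-- (2,1) is the largest pair, so no row qualifies. The scalar subquery
-- still yields a value for the outer row: NULL.
SELECT (SELECT a
        FROM foo_null_demo
        WHERE (a, b) > (2, 1)
        ORDER BY a, b
        LIMIT 1) AS next_a;
```

The statement returns a single row whose `next_a` is NULL. The outer `where t.a is not null` then removes that one trailing NULL row from the result.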
DROP SCHEMA zooi CASCADE;
CREATE SCHEMA zooi ;
SET search_path=zooi,public,pg_catalog;
CREATE TABLE tablename
( source character(25) NOT NULL
, id bigint NOT NULL
, time_ timestamp without time zone NOT NULL
, height numeric
, lon numeric
, lat numeric
, distance numeric
, status character(3)
, geom geometry(PointZ,4326)
, relid bigint
, PRIMARY KEY (source,id,time_) -- <<-- Primary key here
) WITH ( OIDS=FALSE);
-- invent some bogus data
INSERT INTO tablename(source,id,time_)
SELECT 'SRC_'|| (gs%10)::text
,gs/10
,gt
FROM generate_series(1,1000) gs
, generate_series('2013-12-01', '2013-12-07', '1hour'::interval) gt
;
Select unique values for two key fields:
VACUUM ANALYZE tablename;
EXPLAIN ANALYZE
SELECT source,id,time_
FROM tablename src
WHERE NOT EXISTS (
SELECT * FROM tablename nx
WHERE nx.source =src.source
AND nx.id = src.id
AND nx.time_ > src.time_
)
;
This generates the following plan here (PostgreSQL 9.3):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Hash Anti Join (cost=4981.00..12837.82 rows=96667 width=42) (actual time=547.218..1194.335 rows=1000 loops=1)
Hash Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 145000
-> Seq Scan on tablename src (cost=0.00..2806.00 rows=145000 width=42) (actual time=0.010..210.810 rows=145000 loops=1)
-> Hash (cost=2806.00..2806.00 rows=145000 width=42) (actual time=546.497..546.497 rows=145000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 9063kB
-> Seq Scan on tablename nx (cost=0.00..2806.00 rows=145000 width=42) (actual time=0.006..259.864 rows=145000 loops=1)
Total runtime: 1197.374 ms
(9 rows)
The hash joins will probably disappear once the data outgrows work_mem, giving a merge anti join instead:
Merge Anti Join (cost=0.83..8779.56 rows=29832 width=120) (actual time=0.981..2508.912 rows=1000 loops=1)
Merge Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 184051
-> Index Scan using tablename_sid on tablename src (cost=0.41..4061.57 rows=32544 width=120) (actual time=0.055..250.621 rows=145000 loops=1)
-> Index Scan using tablename_sid on tablename nx (cost=0.41..4061.57 rows=32544 width=120) (actual time=0.008..603.403 rows=328906 loops=1)
Total runtime: 2510.505 ms
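To compare the two join strategies on the small test table without loading more data, one option is to discourage hash joins for the current session. This is only a sketch: `enable_hashjoin` is a standard planner setting, but whether the planner then picks a merge anti join depends on the available indexes and statistics.

```sql
-- Session-local: discourage hash joins, then re-run the same query.
SET enable_hashjoin = off;

EXPLAIN ANALYZE
SELECT source, id, time_
FROM tablename src
WHERE NOT EXISTS (
    SELECT 1 FROM tablename nx
    WHERE nx.source = src.source
      AND nx.id = src.id
      AND nx.time_ > src.time_
);

-- Restore the default afterwards.
RESET enable_hashjoin;
```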
Lateral joins give you cleaner code for selecting multiple columns in a nested select, and no NULL checks are needed because there are no subqueries in the SELECT clause.
-- Assuming you want to get one '(a,b)' for every 'a'.
with recursive t as (
(select a, b from foo order by a, b limit 1)
union all
(select s.* from t, lateral(
select a, b from foo f
where f.a > t.a
order by a, b limit 1) s)
)
select * from t;
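If you want every distinct (a,b) pair instead, as in the original question, the same lateral pattern should work with a row comparison in place of `f.a > t.a`. A minimal sketch, assuming a hypothetical `foo_pairs` table whose columns are NOT NULL and that has a btree index on (a, b):

```sql
-- Hypothetical demo table with duplicate pairs.
CREATE TEMP TABLE foo_pairs (a int NOT NULL, b int NOT NULL);
CREATE INDEX foo_pairs_a_b ON foo_pairs (a, b);
INSERT INTO foo_pairs VALUES (1, 1), (1, 1), (1, 2), (2, 1), (2, 1);

WITH RECURSIVE t AS (
    -- seed: the smallest (a,b) pair
    (SELECT a, b FROM foo_pairs ORDER BY a, b LIMIT 1)
    UNION ALL
    -- step: the next pair strictly greater than the previous one;
    -- the lateral subquery returns no row at the end, which ends the recursion
    (SELECT s.*
     FROM t, LATERAL (
         SELECT f.a, f.b
         FROM foo_pairs f
         WHERE (f.a, f.b) > (t.a, t.b)
         ORDER BY f.a, f.b
         LIMIT 1) s)
)
SELECT * FROM t;
```

On this data the query yields the three distinct pairs (1,1), (1,2) and (2,1). As the timings in the question show, this kind of recursion only pays off when the number of distinct pairs is small relative to the table size.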