How does postgres query planner work on partitioned tables? - postgresql

We have a postgres 11 database with tables having a quite big number of rows, so we use postgres declerative partitioning to ensure query performance.
Today while wirting a database function I noticed some strange behaviour of the postgres query planner:
In this particular case we have two tables track.track and sensor.location.
The function shall return all locations for a given track.
The relation between track.track and sensor.location is given by a user_vehicle_id and a time range.
sensor.location is partitioned monthly by range using the column time
A query for this problem could look like this:
WITH single_track AS (
SELECT
start_time, stop_time, user_vehicle_id
FROM
track.track
WHERE
id = 1350000744800)
SELECT *
FROM
sensor.location as l, single_track as t
WHERE
l.time >= t.start_time AND
l.time <= t.stop_time AND
l.user_vehicle_id = t.user_vehicle_id
I would expect that the query planer only looks at those partitions of location which match the given time frame from start_time to stop_time.
Instead it performs a Bitmap Heap/Index scan on all partitions:
Nested Loop (cost=8.59..9308018.00 rows=722021 width=106) (actual time=1.796..2.296 rows=1025 loops=1)
CTE single_track
-> Index Scan using track_pkey on track (cost=0.42..8.44 rows=1 width=24) (actual time=0.023..0.024 rows=1 loops=1)
Index Cond: (id = '1350000744800'::bigint)
-> CTE Scan on single_track t (cost=0.00..0.02 rows=1 width=24) (actual time=0.027..0.029 rows=1 loops=1)
-> Append (cost=0.15..9286171.84 rows=2183770 width=82) (actual time=1.750..1.998 rows=1025 loops=1)
-> Index Scan using location_p2011_01_pkey on location_p2011_01 l (cost=0.15..8.83 rows=1 width=136) (never executed)
Index Cond: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (user_vehicle_id = t.user_vehicle_id))
-> Seq Scan on location_p2011_02 l_1 (cost=0.00..7.71 rows=1 width=82) (never executed)
Filter: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (t.user_vehicle_id = user_vehicle_id))
-> Bitmap Heap Scan on location_p2011_03 l_2 (cost=643.94..3370.03 rows=2087 width=114) (never executed)
Recheck Cond: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (user_vehicle_id = t.user_vehicle_id))
...
-> Index Scan using location_p2020_10_pkey on location_p2020_10 l_117 (cost=0.15..8.83 rows=1 width=136) (never executed)
Index Cond: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (user_vehicle_id = t.user_vehicle_id))
-> Index Scan using location_p2020_11_pkey on location_p2020_11 l_118 (cost=0.15..8.83 rows=1 width=136) (never executed)
Index Cond: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (user_vehicle_id = t.user_vehicle_id))
-> Index Scan using location_p2020_12_pkey on location_p2020_12 l_119 (cost=0.15..8.83 rows=1 width=136) (never executed)
Index Cond: (("time" >= t.start_time) AND ("time" <= t.stop_time) AND (user_vehicle_id = t.user_vehicle_id))
Planning Time: 11.046 ms
Execution Time: 4.144 ms
While playing around I discoverd, that using the same query but passing the times explicitly:
EXPLAIN ANALYSE
WITH single_track AS (
SELECT
start_time,
stop_time,
user_vehicle_id
FROM
track.track
WHERE
id = 1350000744800)
SELECT *
FROM
sensor.location as l, single_track as t
WHERE
l.time >= '2016-04-12 18:04:59' AND
l.time <= '2016-04-12 18:22:49' AND
l.user_vehicle_id = t.user_vehicle_id
produces the expected behaviour:
Nested Loop (cost=9.00..2111.73 rows=141 width=102) (actual time=0.085..2.408 rows=1025 loops=1)
CTE single_track
-> Index Scan using track_pkey on track (cost=0.42..8.44 rows=1 width=24) (actual time=0.017..0.018 rows=1 loops=1)
Index Cond: (id = '1350000744800'::bigint)
-> CTE Scan on single_track t (cost=0.00..0.02 rows=1 width=24) (actual time=0.021..0.022 rows=1 loops=1)
-> Append (cost=0.56..2099.99 rows=328 width=78) (actual time=0.060..2.081 rows=1025 loops=1)
-> Index Scan using location_p2016_04_pkey on location_p2016_04 l (cost=0.56..2098.35 rows=328 width=78) (actual time=0.058..1.994 rows=1025 loops=1)
Index Cond: (("time" >= '2016-04-12 18:04:59'::timestamp without time zone) AND ("time" <= '2016-04-12 18:22:49'::timestamp without time zone) AND (user_vehicle_id = t.user_vehicle_id))
Planning Time: 4.709 ms
Execution Time: 2.494 ms
Can anyone explain this behaviour and help me how to overcome this issue?

This seems to be too complicated for the PostgreSQL executor.
I'd suggest trying
SELECT *
FROM
sensor.location as l, single_track as t
WHERE
l.time >= (SELECT start_time FROM track.track WHERE id = 1350000744800) AND
l.time <= (SELECT stop_time FROM track.track WHERE id = 1350000744800) AND
l.user_vehicle_id = (SELECT user_vehicle_id FROM track.track WHERE id = 1350000744800)
Then PostgreSQL knows at least that there will only be a single value.
If that still doesn't work, split the query in two parts:
First, get the values from track.track.
Then construct a query using the results and execute that.

I also tried this, which may be close to what Laurenz Albe suggested. I wasn't able to confirm that this causes the right behavior, as EXPLAIN ANALYSE does not show the query plan of a psql funtction.
CREATE OR REPLACE FUNCTION location_from_track_id(
_track_id bigint)
RETURNS SETOF sensor.location
LANGUAGE 'plpgsql'
AS
$BODY$
DECLARE
_user_vehicle_id bigint;
_start_time timestamp without time zone;
_stop_time timestamp without time zone;
BEGIN
SELECT
user_vehicle_id,
start_time,
stop_time
INTO
_user_vehicle_id,
_start_time,
_stop_time
FROM
track.track
WHERE id=_track_id;
RETURN QUERY
SELECT *
FROM sensor.location
WHERE
time BETWEEN _start_time AND _stop_time AND
user_vehicle_id = _user_vehicle_id;
END;
$BODY$;

Related

PostgreSQL: Scan relevant partitions only with subquery

I have a table foundation.data which is partitioned over an asset_id with hash partitioning.
Whenever I do a query such as
with deleted_unprocessed_data as (
delete from foundation.unprocessed_ids d
where id = any(select id from foundation.unprocessed_ids up order by up.asset_id, up.data_point_timestamp asc limit 1000)
returning id, asset_id, data_point_timestamp
)
, ids_to_process as
(
insert into foundation.processed_ids select * from deleted_unprocessed_data
returning id, asset_id
)
select jh.asset_id,
min(jh.data_point_timestamp) as minborder,
max(jh.data_point_timestamp) as maxborder
from
(
select id, asset_id, data_point_timestamp FROM
foundation.DATA fd
where id = any(select id from ids_to_process)
and fd.asset_id = any(select asset_id from ids_to_process)
) jh
group by asset_id;
the explain will show that it accesses all partitions:
...
-> Append (cost=0.42..1822.75 rows=256 width=20) (actual time=0.285..0.617 rows=1 loops=1000)
-> Index Scan using asset_id_hash_0_id_idx on asset_id_hash_0 fd (cost=0.42..7.10 rows=1 width=20) (actual time=0.002..0.002 rows=0 loops=1000)
Index Cond: (id = ids_to_process.id)
-> Index Scan using asset_id_hash_1_id_idx on asset_id_hash_1 fd_1 (cost=0.42..6.66 rows=1 width=20) (actual time=0.002..0.002 rows=0 loops=1000)
Index Cond: (id = ids_to_process.id)
...
-> Index Scan using asset_id_hash_72_id_idx on asset_id_hash_72 fd_72 (cost=0.43..8.35 rows=1 width=20) (actual time=0.002..0.002 rows=0 loops=1000)
Index Cond: (id = ids_to_process.id)
...
-> Index Scan using asset_id_hash_255_id_idx on asset_id_hash_255 fd_255 (cost=0.42..6.87 rows=1 width=20) (actual time=0.002..0.002 rows=0 loops=1000)
Index Cond: (id = ids_to_process.id)
How do force the planner to only access the relevant partitions?
The subquery is accessing only one partition:
-> Hash (cost=145.14..145.14 rows=200 width=4)
-> Group (cost=0.00..143.14 rows=200 width=4)
Group Key: asset_7800_1.asset_id
-> Seq Scan on asset_7800 asset_7800_1 (cost=0.00..135.11 rows=3209 width=4)
Filter: (asset_id = 7800)
The outer query on the other hand is accessing all partitions. I think this is because the planner can't be sure that the subquery will return rows belonging to only one partition.
Don't think there is an obvious way to do this unfortunately. Try running the explain with the ANALYZE option. That will tell you what Postgres actually did and then you might see that it actually only ended up scanning one partition.
Answering my own question here but do not understand the dynamics behind so happy for hints on your side as to why this changes so much.
I used a join instead:
with deleted_unprocessed_data as (
delete from foundation.unprocessed_ids d
where id = any(select id from foundation.unprocessed_ids up order by up.asset_id, up.data_point_timestamp asc limit 1000
-- case
-- when numberOfItems<1000 then numberOfItems
-- when numberOfItems>999 then 1000
-- end
)
returning id, asset_id, data_point_timestamp
)
, ids_to_process as
(
insert into foundation.processed_ids select * from deleted_unprocessed_data
returning id, asset_id
)
select jh.asset_id,
min(jh.data_point_timestamp) as minborder,
max(jh.data_point_timestamp) as maxborder
from
(
select fd.id, fd.asset_id, fd.data_point_timestamp FROM
foundation.DATA fd
**join** ids_to_process itp
on fd.asset_id = itp.asset_id
and fd.id = itp.id
) jh
group by asset_id;
Then I get the following explain analyze which specifies that all index scans but on one partition were never executed:
...
-> Append (cost=0.42..1823.39 rows=256 width=20) (actual time=0.002..0.003 rows=1 loops=1000)
-> Index Scan using asset_id_hash_0_id_idx on asset_id_hash_0 fd (cost=0.42..7.10 rows=1 width=20) (never executed)
Index Cond: (id = itp.id)
Filter: (itp.asset_id = asset_id)
-> Index Scan using asset_id_hash_1_id_idx on asset_id_hash_1 fd_1 (cost=0.42..6.67 rows=1 width=20) (never executed)
Index Cond: (id = itp.id)
Filter: (itp.asset_id = asset_id)
...
-> Index Scan using asset_id_hash_72_id_idx on asset_id_hash_72 fd_72 (cost=0.43..8.35 rows=1 width=20) (actual time=0.002..0.002 rows=1 loops=1000)
Index Cond: (id = itp.id)
Filter: (itp.asset_id = asset_id)
...
-> Index Scan using asset_id_hash_255_id_idx on asset_id_hash_255 fd_255 (cost=0.42..6.87 rows=1 width=20) (never executed)
Index Cond: (id = itp.id)
Filter: (itp.asset_id = asset_id)

Need to reduce the query optimization time in postgres

Use Case: Need to find the index and totalCount of the particular id in the table
I am having a table ann_details which has 60 million records and based on where condition I need to retrieve the rows along with index of that id
Query:
with a as (
select an.id, row_number() over (partition by created_at) as rn
from annotation an
where ( an.layer_id = '47afb169-aed2-4378-ab13-897836275da3' or an.job_id = '' or an.task_id = '') and
an.category_id in (10019)
) select (select count(1) from a ) as totalCount , rn-1 as index from a where a.id= '47afb169-aed2-4378-ab13-897836275da3_a93f0758-8fe0-4c76-992f-0be17e5618bf_484484101';
Output:
totalCount index
1797124,1791143
Execution Time: 5 sec 487 ms
explain and analyze
CTE Scan on a (cost=872778.54..907545.00 rows=7722 width=16) (actual time=5734.572..5735.989 rows=1 loops=1)
Filter: ((id)::text = '47afb169-aed2-4378-ab13-897836275da3_a93f0758-8fe0-4c76-992f-0be17e5618bf_484484101'::text)
Rows Removed by Filter: 1797123
CTE a
-> WindowAgg (cost=0.68..838031.38 rows=1544318 width=97) (actual time=133.660..3831.998 rows=1797124 loops=1)
-> Index Only Scan using test_index_test_2 on annotation an (cost=0.68..814866.61 rows=1544318 width=89) (actual time=133.647..2660.009 rows=1797124 loops=1)
Index Cond: (category_id = 10019)
Filter: (((layer_id)::text = '47afb169-aed2-4378-ab13-897836275da3'::text) OR ((job_id)::text = ''::text) OR ((task_id)::text = ''::text))
Rows Removed by Filter: 3773007
Heap Fetches: 101650
InitPlan 2 (returns $1)
-> Aggregate (cost=34747.15..34747.17 rows=1 width=8) (actual time=2397.391..2397.392 rows=1 loops=1)
-> CTE Scan on a a_1 (cost=0.00..30886.36 rows=1544318 width=0) (actual time=0.017..2156.210 rows=1797124 loops=1)
Planning time: 0.487 ms
Execution time: 5771.080 ms
Index:
CREATE INDEX test_index_test_2 ON public.annotation USING btree (category_id,created_at,layer_id,job_id,task_id,id);
From application we will be passing the job_id or task_id or layer_id and rest 2 will be passed as empty
Need help in optimizing the query to get the response in 2 sec
Query Plan: https://explain.depesz.com/s/mXme

Postgres' planning takes unproportional time for execution

postgres 9.6 running on amazon RDS.
I have 2 tables:
aggregate events - big table with 6 keys (ids)
campaign metadata - small table with campaign definition.
I join the 2 in order to filter on metadata like campaign-name.
The query is in order to get a report of displayed breakdown by campaign channel and date ( date is daily ).
No FK and not null. The report table has multiple lines per day per campaigns ( because the aggregation is based on 6 attribute key ).
When i join , query plan grow to 10s ( vs 300ms)
explain analyze select c.campaign_channel as channel,date as day , sum( displayed ) as displayed
from report_campaigns c
left join events_daily r on r.campaign_id = c.c_id
where provider_id = 7726 and c.p_id = 7726 and c.campaign_name <> 'test'
and date >= '20170513 12:00' and date <= '20170515 12:00'
group by c.campaign_channel,date;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=71461.93..71466.51 rows=229 width=22) (actual time=104.189..114.788 rows=6 loops=1)
Group Key: c.campaign_channel, r.date
-> Sort (cost=71461.93..71462.51 rows=229 width=18) (actual time=100.263..106.402 rows=31205 loops=1)
Sort Key: c.campaign_channel, r.date
Sort Method: quicksort Memory: 3206kB
-> Hash Join (cost=1092.52..71452.96 rows=229 width=18) (actual time=22.149..86.955 rows=31205 loops=1)
Hash Cond: (r.campaign_id = c.c_id)
-> Append (cost=0.00..70245.84 rows=29948 width=20) (actual time=21.318..71.315 rows=31205 loops=1)
-> Seq Scan on events_daily r (cost=0.00..0.00 rows=1 width=20) (actual time=0.005..0.005 rows=0 loops=1)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone) AND (provider_id =
-> Bitmap Heap Scan on events_daily_20170513 r_1 (cost=685.36..23913.63 rows=1 width=20) (actual time=17.230..17.230 rows=0 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Rows Removed by Filter: 13769
Heap Blocks: exact=10276
-> Bitmap Index Scan on events_daily_20170513_full_idx (cost=0.00..685.36 rows=14525 width=0) (actual time=2.356..2.356 rows=13769 loops=1)
Index Cond: (provider_id = 7726)
-> Bitmap Heap Scan on events_daily_20170514 r_2 (cost=689.08..22203.52 rows=14537 width=20) (actual time=4.082..21.389 rows=15281 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Heap Blocks: exact=10490
-> Bitmap Index Scan on events_daily_20170514_full_idx (cost=0.00..685.45 rows=14537 width=0) (actual time=2.428..2.428 rows=15281 loops=1)
Index Cond: (provider_id = 7726)
-> Bitmap Heap Scan on events_daily_20170515 r_3 (cost=731.84..24128.69 rows=15409 width=20) (actual time=4.297..22.662 rows=15924 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Heap Blocks: exact=11318
-> Bitmap Index Scan on events_daily_20170515_full_idx (cost=0.00..727.99 rows=15409 width=0) (actual time=2.506..2.506 rows=15924 loops=1)
Index Cond: (provider_id = 7726)
-> Hash (cost=1085.35..1085.35 rows=574 width=14) (actual time=0.815..0.815 rows=582 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 37kB
-> Bitmap Heap Scan on report_campaigns c (cost=12.76..1085.35 rows=574 width=14) (actual time=0.090..0.627 rows=582 loops=1)
Recheck Cond: (p_id = 7726)
Filter: ((campaign_name)::text <> 'test'::text)
Heap Blocks: exact=240
-> Bitmap Index Scan on report_campaigns_provider_id (cost=0.00..12.62 rows=577 width=0) (actual time=0.062..0.062 rows=582 loops=1)
Index Cond: (p_id = 7726)
Planning time: 9651.605 ms
Execution time: 115.092 ms
result:
channel | day | displayed
----------+---------------------+-----------
Pin | 2017-05-14 00:00:00 | 43434
Pin | 2017-05-15 00:00:00 | 3325325235
I seems to me this is because of summation forcing pre-computation before left joining.
Solution could be to impose filtering WHERE clauses in two nested sub-SELECT prior to left-joining and summation.
Hope this works:
SELECT channel, day, sum( displayed )
FROM
(SELECT campaign_channel AS channel, date AS day, displayed, p_id AS c_id
FROM report_campaigns WHERE p_id = 7726 AND campaign_name <> 'test' AND date >= '20170513 12:00' AND date <= '20170515 12:00') AS c,
(SELECT * FROM events_daily WHERE campaign_id = 7726) AS r
LEFT JOIN r.campaign_id = c.c_id
GROUP BY channel, day;

Loose index scan in Postgres on more than one field?

I have several large tables in Postgres 9.2 (millions of rows) where I need to generate a unique code based on the combination of two fields, 'source' (varchar) and 'id' (int). I can do this by generating row_numbers over the result of:
SELECT source,id FROM tablename GROUP BY source,id
but the results can take a while to process. It has been recommended that if the fields are indexed, and there are a proportionally small number of index values (which is my case), that a loose index scan may be a better option: http://wiki.postgresql.org/wiki/Loose_indexscan
WITH RECURSIVE
t AS (SELECT min(col) AS col FROM tablename
UNION ALL
SELECT (SELECT min(col) FROM tablename WHERE col > t.col) FROM t WHERE t.col IS NOT NULL)
SELECT col FROM t WHERE col IS NOT NULL
UNION ALL
SELECT NULL WHERE EXISTS(SELECT * FROM tablename WHERE col IS NULL);
The example operates on a single field though. Trying to return more than one field generates an error: subquery must return only one column. One possibility might be to try retrieving an entire ROW - e.g. SELECT ROW(min(source),min(id)..., but then I'm not sure what the syntax of the WHERE statement would need to look like to work with individual row elements.
The question is: can the recursion-based code be modified to work with more than one column, and if so, how? I'm committed to using Postgres, but it looks like MySQL has implemented loose index scans for more than one column: http://dev.mysql.com/doc/refman/5.1/en/group-by-optimization.html
As recommended, I'm attaching my EXPLAIN ANALYZE results.
For my situation - where I'm selecting distinct values for 2 columns using GROUP BY, it's the following:
HashAggregate (cost=1645408.44..1654099.65 rows=869121 width=34) (actual time=35411.889..36008.475 rows=1233080 loops=1)
-> Seq Scan on tablename (cost=0.00..1535284.96 rows=22024696 width=34) (actual time=4413.311..25450.840 rows=22025768 loops=1)
Total runtime: 36127.789 ms
(3 rows)
I don't know how to do a 2-column index scan (that's the question), but for purposes of comparison, using a GROUP BY on one column, I get:
HashAggregate (cost=1590346.70..1590347.69 rows=99 width=8) (actual time=32310.706..32310.722 rows=100 loops=1)
-> Seq Scan on tablename (cost=0.00..1535284.96 rows=22024696 width=8) (actual time=4764.609..26941.832 rows=22025768 loops=1)
Total runtime: 32350.899 ms
(3 rows)
But for a loose index scan on one column, I get:
Result (cost=181.28..198.07 rows=101 width=8) (actual time=0.069..1.935 rows=100 loops=1)
CTE t
-> Recursive Union (cost=1.74..181.28 rows=101 width=8) (actual time=0.062..1.855 rows=101 loops=1)
-> Result (cost=1.74..1.75 rows=1 width=0) (actual time=0.061..0.061 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1.74 rows=1 width=8) (actual time=0.057..0.057 rows=1 loops=1)
-> Index Only Scan using tablename_id on tablename (cost=0.00..38379014.12 rows=22024696 width=8) (actual time=0.055..0.055 rows=1 loops=1)
Index Cond: (id IS NOT NULL)
Heap Fetches: 0
-> WorkTable Scan on t (cost=0.00..17.75 rows=10 width=8) (actual time=0.017..0.017 rows=1 loops=101)
Filter: (id IS NOT NULL)
Rows Removed by Filter: 0
SubPlan 3
-> Result (cost=1.75..1.76 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=100)
InitPlan 2 (returns $3)
-> Limit (cost=0.00..1.75 rows=1 width=8) (actual time=0.016..0.016 rows=1 loops=100)
-> Index Only Scan using tablename_id on tablename (cost=0.00..12811462.41 rows=7341565 width=8) (actual time=0.015..0.015 rows=1 loops=100)
Index Cond: ((id IS NOT NULL) AND (id > t.id))
Heap Fetches: 0
-> Append (cost=0.00..16.79 rows=101 width=8) (actual time=0.067..1.918 rows=100 loops=1)
-> CTE Scan on t (cost=0.00..2.02 rows=100 width=8) (actual time=0.067..1.899 rows=100 loops=1)
Filter: (id IS NOT NULL)
Rows Removed by Filter: 1
-> Result (cost=13.75..13.76 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
One-Time Filter: $5
InitPlan 5 (returns $5)
-> Index Only Scan using tablename_id on tablename (cost=0.00..13.75 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
Index Cond: (id IS NULL)
Heap Fetches: 0
Total runtime: 2.040 ms
The full table definition looks like this:
CREATE TABLE tablename
(
source character(25),
id bigint NOT NULL,
time_ timestamp without time zone,
height numeric,
lon numeric,
lat numeric,
distance numeric,
status character(3),
geom geometry(PointZ,4326),
relid bigint
)
WITH (
OIDS=FALSE
);
CREATE INDEX tablename_height
ON public.tablename
USING btree
(height);
CREATE INDEX tablename_geom
ON public.tablename
USING gist
(geom);
CREATE INDEX tablename_id
ON public.tablename
USING btree
(id);
CREATE INDEX tablename_lat
ON public.tablename
USING btree
(lat);
CREATE INDEX tablename_lon
ON public.tablename
USING btree
(lon);
CREATE INDEX tablename_relid
ON public.tablename
USING btree
(relid);
CREATE INDEX tablename_sid
ON public.tablename
USING btree
(source COLLATE pg_catalog."default", id);
CREATE INDEX tablename_source
ON public.tablename
USING btree
(source COLLATE pg_catalog."default");
CREATE INDEX tablename_time
ON public.tablename
USING btree
(time_);
Answer selection:
I took some time in comparing the approaches that were provided. It's at times like this that I wish that more than one answer could be accepted, but in this case, I'm giving the tick to #jjanes. The reason for this is that his solution matches the question as originally posed more closely, and I was able to get some insights as to the form of the required WHERE statement. In the end, the HashAggregate is actually the fastest approach (for me), but that's due to the nature of my data, not any problems with the algorithms. I've attached the EXPLAIN ANALYZE for the different approaches below, and will be giving +1 to both jjanes and joop.
HashAggregate:
HashAggregate (cost=1018669.72..1029722.08 rows=1105236 width=34) (actual time=24164.735..24686.394 rows=1233080 loops=1)
-> Seq Scan on tablename (cost=0.00..908548.48 rows=22024248 width=34) (actual time=0.054..14639.931 rows=22024982 loops=1)
Total runtime: 24787.292 ms
Loose Index Scan modification
CTE Scan on t (cost=13.84..15.86 rows=100 width=112) (actual time=0.916..250311.164 rows=1233080 loops=1)
Filter: (source IS NOT NULL)
Rows Removed by Filter: 1
CTE t
-> Recursive Union (cost=0.00..13.84 rows=101 width=112) (actual time=0.911..249295.872 rows=1233081 loops=1)
-> Limit (cost=0.00..0.04 rows=1 width=34) (actual time=0.910..0.911 rows=1 loops=1)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..965442.32 rows=22024248 width=34) (actual time=0.908..0.908 rows=1 loops=1)
Heap Fetches: 0
-> WorkTable Scan on t (cost=0.00..1.18 rows=10 width=112) (actual time=0.201..0.201 rows=1 loops=1233081)
Filter: (source IS NOT NULL)
Rows Removed by Filter: 0
SubPlan 1
-> Limit (cost=0.00..0.05 rows=1 width=34) (actual time=0.100..0.100 rows=1 loops=1233080)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..340173.38 rows=7341416 width=34) (actual time=0.100..0.100 rows=1 loops=1233080)
Index Cond: (ROW(source, id) > ROW(t.source, t.id))
Heap Fetches: 0
SubPlan 2
-> Limit (cost=0.00..0.05 rows=1 width=34) (actual time=0.099..0.099 rows=1 loops=1233080)
-> Index Only Scan using tablename_sid on tablename (cost=0.00..340173.38 rows=7341416 width=34) (actual time=0.098..0.098 rows=1 loops=1233080)
Index Cond: (ROW(source, id) > ROW(t.source, t.id))
Heap Fetches: 0
Total runtime: 250491.559 ms
Merge Anti Join
Merge Anti Join (cost=0.00..12099015.26 rows=14682832 width=42) (actual time=48.710..541624.677 rows=1233080 loops=1)
Merge Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 363464177
-> Index Only Scan using tablename_pkey on tablename src (cost=0.00..1060195.27 rows=22024248 width=42) (actual time=48.566..5064.551 rows=22024982 loops=1)
Heap Fetches: 0
-> Materialize (cost=0.00..1115255.89 rows=22024248 width=42) (actual time=0.011..40551.997 rows=363464177 loops=1)
-> Index Only Scan using tablename_pkey on tablename nx (cost=0.00..1060195.27 rows=22024248 width=42) (actual time=0.008..8258.890 rows=22024982 loops=1)
Heap Fetches: 0
Total runtime: 541750.026 ms
Rather hideous, but this seems to work:
WITH RECURSIVE
t AS (
select a,b from (select a,b from foo order by a,b limit 1) asdf union all
select (select a from foo where (a,b) > (t.a,t.b) order by a,b limit 1),
(select b from foo where (a,b) > (t.a,t.b) order by a,b limit 1)
from t where t.a is not null)
select * from t where t.a is not null;
I don't really understand why the "is not nulls" are needed, as where do the nulls come from in the first place?
DROP SCHEMA zooi CASCADE;
CREATE SCHEMA zooi ;
SET search_path=zooi,public,pg_catalog;
CREATE TABLE tablename
( source character(25) NOT NULL
, id bigint NOT NULL
, time_ timestamp without time zone NOT NULL
, height numeric
, lon numeric
, lat numeric
, distance numeric
, status character(3)
, geom geometry(PointZ,4326)
, relid bigint
, PRIMARY KEY (source,id,time_) -- <<-- Primary key here
) WITH ( OIDS=FALSE);
-- invent some bogus data
INSERT INTO tablename(source,id,time_)
SELECT 'SRC_'|| (gs%10)::text
,gs/10
,gt
FROM generate_series(1,1000) gs
, generate_series('2013-12-01', '2013-12-07', '1hour'::interval) gt
;
Select unique values for two key fields:
VACUUM ANALYZE tablename;
EXPLAIN ANALYZE
SELECT source,id,time_
FROM tablename src
WHERE NOT EXISTS (
SELECT * FROM tablename nx
WHERE nx.source =src.source
AND nx.id = src.id
AND time_ > src.time_
)
;
Generates this plan here (Pg-9.3):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Hash Anti Join (cost=4981.00..12837.82 rows=96667 width=42) (actual time=547.218..1194.335 rows=1000 loops=1)
Hash Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 145000
-> Seq Scan on tablename src (cost=0.00..2806.00 rows=145000 width=42) (actual time=0.010..210.810 rows=145000 loops=1)
-> Hash (cost=2806.00..2806.00 rows=145000 width=42) (actual time=546.497..546.497 rows=145000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 9063kB
-> Seq Scan on tablename nx (cost=0.00..2806.00 rows=145000 width=42) (actual time=0.006..259.864 rows=145000 loops=1)
Total runtime: 1197.374 ms
(9 rows)
The hash-joins will probably disappear once the data outgrows the work_mem:
Merge Anti Join (cost=0.83..8779.56 rows=29832 width=120) (actual time=0.981..2508.912 rows=1000 loops=1)
Merge Cond: ((src.source = nx.source) AND (src.id = nx.id))
Join Filter: (nx.time_ > src.time_)
Rows Removed by Join Filter: 184051
-> Index Scan using tablename_sid on tablename src (cost=0.41..4061.57 rows=32544 width=120) (actual time=0.055..250.621 rows=145000 loops=1)
-> Index Scan using tablename_sid on tablename nx (cost=0.41..4061.57 rows=32544 width=120) (actual time=0.008..603.403 rows=328906 loops=1)
Total runtime: 2510.505 ms
Lateral joins can give you a clean code to select multiple columns in nested selects, without checking for null as no subqueries in select clause.
-- Assuming you want to get one '(a,b)' for every 'a'.
with recursive t as (
(select a, b from foo order by a, b limit 1)
union all
(select s.* from t, lateral(
select a, b from foo f
where f.a > t.a
order by a, b limit 1) s)
)
select * from t;

Postgresql COALESCE performance problem

I have this table in Postgresql:
CREATE TABLE my_table
(
id bigint NOT NULL,
value bigint,
CONSTRAINT my_table_pkey PRIMARY KEY (id)
);
There are ~50000 rows in my_table.
The question is, why the query:
SELECT * FROM my_table WHERE id = COALESCE(null, id) and value = ?
is slower than this one:
SELECT * FROM my_table WHERE value = ?
Is there any solution, other than optimizing the query string in app-layer?
EDIT: Practically, the question is how to rewrite the query select * from my_table where id=coalesce(?, id) and value=? to have worst case performance not less than that of select * from my_table where value=? in Postgresql 9.0
Try rewriting the query of the form
SELECT *
FROM my_table
WHERE value = ?
AND (? IS NULL OR id = ?)
From my own quick tests
INSERT INTO my_table select generate_series(1,50000),1;
UPDATE my_table SET value = id%17;
CREATE INDEX val_idx ON my_table(value);
VACUUM ANALYZE my_table;
\set idval 17
\set pval 0
explain analyze
SELECT *
FROM my_table
WHERE value = :pval
AND (:idval IS NULL OR id = :idval);
Index Scan using my_table_pkey on my_table (cost=0.00..8.29 rows=1 width=16) (actual time=0.034..0.035 rows=1 loops=1)
Index Cond: (id = 17)
Filter: (value = 0)
Total runtime: 0.064 ms
\set idval null
explain analyze
SELECT *
FROM my_table
WHERE value = :pval
AND (:idval IS NULL OR id = :idval);
Bitmap Heap Scan on my_table (cost=58.59..635.62 rows=2882 width=16) (actual time=0.373..1.594 rows=2941 loops=1)
Recheck Cond: (value = 0)
-> Bitmap Index Scan on validx (cost=0.00..57.87 rows=2882 width=0) (actual time=0.324..0.324 rows=2941 loops=1)
Index Cond: (value = 0)
Total runtime: 1.811 ms
From creating a similar table, populating it, updating statistics, and finally looking at the output of EXPLAIN ANALYZE, the only difference I see is that the first query filters like this:
Filter: ((id = COALESCE(id)) AND (value = 3))
and the second one filters like this:
Filter: (value = 3)
I see substantially different performance and execution plans when there's an index on the column "value". In the first case
Bitmap Heap Scan on my_table (cost=19.52..552.60 rows=5 width=16) (actual time=19.311..20.679 rows=1000 loops=1)
Recheck Cond: (value = 3)
Filter: (id = COALESCE(id))
-> Bitmap Index Scan on t2 (cost=0.00..19.52 rows=968 width=0) (actual time=19.260..19.260 rows=1000 loops=1)
Index Cond: (value = 3)
Total runtime: 22.138 ms
and in the second
Bitmap Heap Scan on my_table (cost=19.76..550.42 rows=968 width=16) (actual time=0.302..1.293 rows=1000 loops=1)
Recheck Cond: (value = 3)
-> Bitmap Index Scan on t2 (cost=0.00..19.52 rows=968 width=0) (actual time=0.276..0.276 rows=1000 loops=1)
Index Cond: (value = 3)
Total runtime: 2.174 ms
So I'd say it's slower because the db engine a) evaluates the COALESCE() expression rather than optimizing it away, and b) evaluating it involves an additional filter condition.