Partition scanning very slow - PostgreSQL

I have a large table partitioned at two levels, first by time:
CREATE TABLE IF NOT EXISTS research.ratios
(
    security_id integer NOT NULL,
    period_id smallint NOT NULL,
    classification_id smallint NOT NULL,
    dtz timestamp with time zone NOT NULL,
    create_dt timestamp with time zone NOT NULL DEFAULT now(),
    update_dt timestamp with time zone NOT NULL DEFAULT now(),
    ratio_value real,
    latest_record boolean NOT NULL DEFAULT false,
    universe_weight real,
    CONSTRAINT ratios_primarykey PRIMARY KEY (dtz, security_id, period_id),
    CONSTRAINT ratios_classification_id_fkey FOREIGN KEY (classification_id)
        REFERENCES settings.classification (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION,
    CONSTRAINT ratios_period_id_fkey FOREIGN KEY (period_id)
        REFERENCES settings.periods (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION,
    CONSTRAINT ratios_security_id_fkey FOREIGN KEY (security_id)
        REFERENCES settings.securitymaster (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
) PARTITION BY RANGE (dtz);
CREATE TABLE IF NOT EXISTS zpart.ratios_y1990 PARTITION OF research.ratios
FOR VALUES FROM ('1990-01-01 00:00:00+00') TO ('1991-01-01 00:00:00+00')
PARTITION BY LIST (period_id);
CREATE TABLE IF NOT EXISTS zpart.ratios_y1991 PARTITION OF research.ratios
FOR VALUES FROM ('1991-01-01 00:00:00+00') TO ('1992-01-01 00:00:00+00')
PARTITION BY LIST (period_id);
and so on, up to the current year.
Each of those partitions is then sub-partitioned by period_id:
CREATE TABLE IF NOT EXISTS zpart.ratios_y1990p1 PARTITION OF zpart.ratios_y1990
FOR VALUES IN ('1');
CREATE TABLE IF NOT EXISTS zpart.ratios_y1990p11 PARTITION OF zpart.ratios_y1990
FOR VALUES IN ('11');
etc
Finally, note that the actual tables have a primary key:
CONSTRAINT ratios_primarykey PRIMARY KEY (dtz, security_id, period_id)
It works well for most use cases. However, when we collect a full history for a single security_id it takes a long time, and some queries in particular make me think I have the wrong indexes.
For example, occasionally we will update all data for a security_id and we want to clear any old data that might have been invalidly added. Below is an example where I've updated a security_id in each valid partition and I want to clear data in any earlier partitions just in case:
delete from research.ratios where security_id=10450 and dtz<'2017-01-01';
In this example, there are 189 tables to check, and none of them contain any matching rows. I would have thought the query would check the primary key index of each of the 189 tables, see that there are no records for security_id 10450, and finish. However, it takes over a minute to run, which makes me think my indexes are not what I thought they were.
Here is a cut down version of the explain analyze:
QUERY PLAN
Delete on ratios (cost=8118.25..4049804.56 rows=0 width=0) (actual time=67117.836..67117.938 rows=0 loops=1)
Delete on ratios_y1990p1 ratios_1
Delete on ratios_y1990p9 ratios_2
...
Delete on ratios_y2016p15 ratios_188
Delete on ratios_y2016p17 ratios_189
-> Append (cost=8118.25..4049804.56 rows=48564 width=10) (actual time=67117.829..67117.931 rows=0 loops=1)
-> Bitmap Heap Scan on ratios_y1990p1 ratios_1 (cost=8118.25..9048.95 rows=268 width=10) (actual time=271.742..271.743 rows=0 loops=1)
Recheck Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Index Scan on ratios_y1990p1_pkey (cost=0.00..8118.18 rows=268 width=0) (actual time=271.738..271.738 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Heap Scan on ratios_y1990p9 ratios_2 (cost=6554.82..7340.83 rows=232 width=10) (actual time=9.389..9.390 rows=0 loops=1)
Recheck Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Index Scan on ratios_y1990p9_pkey (cost=0.00..6554.76 rows=232 width=0) (actual time=9.384..9.384 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Heap Scan on ratios_y1990p11 ratios_3 (cost=6907.45..7694.32 rows=232 width=10) (actual time=7.762..7.762 rows=0 loops=1)
Recheck Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Index Scan on ratios_y1990p11_pkey (cost=0.00..6907.39 rows=232 width=0) (actual time=7.757..7.757 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Heap Scan on ratios_y1990p12 ratios_4 (cost=7815.19..8617.27 rows=237 width=10) (actual time=6.257..6.258 rows=0 loops=1)
Recheck Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Bitmap Index Scan on ratios_y1990p12_pkey (cost=0.00..7815.13 rows=237 width=0) (actual time=6.253..6.253 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Seq Scan on ratios_y1990p14 ratios_5 (cost=0.00..7059.28 rows=257 width=10) (actual time=23.991..23.991 rows=0 loops=1)
Filter: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
Rows Removed by Filter: 191152
-> Seq Scan on ratios_y1990p15 ratios_6 (cost=0.00..6500.28 rows=257 width=10) (actual time=23.207..23.208 rows=0 loops=1)
Filter: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
Rows Removed by Filter: 191152
-> Seq Scan on ratios_y1990p17 ratios_7 (cost=0.00..7468.97 rows=234 width=10) (actual time=28.048..28.048 rows=0 loops=1)
Filter: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
Rows Removed by Filter: 191571
...
-> Index Scan using ratios_y2013p11_pkey on ratios_y2013p11 ratios_164 (cost=0.42..23840.35 rows=261 width=10) (actual time=53.111..53.111 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2013p12_pkey on ratios_y2013p12 ratios_165 (cost=0.42..25032.72 rows=265 width=10) (actual time=642.015..642.015 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2013p14_pkey on ratios_y2013p14 ratios_166 (cost=0.42..25227.62 rows=259 width=10) (actual time=87.942..87.943 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2013p15_pkey on ratios_y2013p15 ratios_167 (cost=0.42..24860.51 rows=259 width=10) (actual time=43.079..43.080 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2013p17_pkey on ratios_y2013p17 ratios_168 (cost=0.42..23706.46 rows=258 width=10) (actual time=77.406..77.407 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p1_pkey on ratios_y2014p1 ratios_169 (cost=0.42..25432.35 rows=258 width=10) (actual time=112.056..112.056 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p9_pkey on ratios_y2014p9 ratios_170 (cost=0.42..23804.65 rows=261 width=10) (actual time=98.315..98.315 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p11_pkey on ratios_y2014p11 ratios_171 (cost=0.42..23990.53 rows=261 width=10) (actual time=78.770..78.770 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p12_pkey on ratios_y2014p12 ratios_172 (cost=0.42..24980.20 rows=257 width=10) (actual time=298.219..298.219 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p14_pkey on ratios_y2014p14 ratios_173 (cost=0.42..25452.75 rows=257 width=10) (actual time=40.636..40.636 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p15_pkey on ratios_y2014p15 ratios_174 (cost=0.42..25024.75 rows=260 width=10) (actual time=791.547..791.548 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2014p17_pkey on ratios_y2014p17 ratios_175 (cost=0.42..23898.95 rows=266 width=10) (actual time=48.073..48.073 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p1_pkey on ratios_y2015p1 ratios_176 (cost=0.42..25477.92 rows=257 width=10) (actual time=53.338..53.339 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p9_pkey on ratios_y2015p9 ratios_177 (cost=0.42..23937.18 rows=260 width=10) (actual time=45.046..45.046 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p11_pkey on ratios_y2015p11 ratios_178 (cost=0.42..23910.06 rows=258 width=10) (actual time=46.683..46.683 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p12_pkey on ratios_y2015p12 ratios_179 (cost=0.42..25184.07 rows=259 width=10) (actual time=50.504..50.504 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p14_pkey on ratios_y2015p14 ratios_180 (cost=0.42..25492.90 rows=258 width=10) (actual time=49.115..49.115 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p15_pkey on ratios_y2015p15 ratios_181 (cost=0.42..25011.61 rows=259 width=10) (actual time=375.470..375.470 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2015p17_pkey on ratios_y2015p17 ratios_182 (cost=0.42..23847.07 rows=268 width=10) (actual time=44.908..44.908 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p1_pkey on ratios_y2016p1 ratios_183 (cost=0.42..25192.72 rows=258 width=10) (actual time=57.975..57.975 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p9_pkey on ratios_y2016p9 ratios_184 (cost=0.42..24203.92 rows=264 width=10) (actual time=47.379..47.379 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p11_pkey on ratios_y2016p11 ratios_185 (cost=0.42..23961.60 rows=260 width=10) (actual time=44.402..44.402 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p12_pkey on ratios_y2016p12 ratios_186 (cost=0.42..25132.85 rows=262 width=10) (actual time=50.321..50.321 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p14_pkey on ratios_y2016p14 ratios_187 (cost=0.42..25185.69 rows=260 width=10) (actual time=46.699..46.699 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p15_pkey on ratios_y2016p15 ratios_188 (cost=0.42..25009.06 rows=260 width=10) (actual time=377.000..377.000 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
-> Index Scan using ratios_y2016p17_pkey on ratios_y2016p17 ratios_189 (cost=0.42..23785.12 rows=262 width=10) (actual time=45.472..45.472 rows=0 loops=1)
Index Cond: ((dtz < '2017-01-01 00:00:00+00'::timestamp with time zone) AND (security_id = 10450))
Planning Time: 66.162 ms
Execution Time: 67119.554 ms
Just confirming: there are zero rows affected by the above.
I'm guessing that I have set up the table/indexes incorrectly. How can I manage the partitions/indexes better so that the query can exclude partitions quickly when they have no records for a given security_id? Note there are billions of records, and the table is growing fast.
Postgres 14.1

Since the security_id is not the first column in the btree index, it can't just jump to the part of the index where all of the given security_id values are, because there is no such singular place. You would need another index leading with security_id, or rearrange the order of columns in your primary key (not an easy task to do retroactively).
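A minimal sketch of such an index, assuming you keep the current primary key (the index name is just illustrative). On a partitioned table this statement cascades to every existing and future partition; it blocks writes on each partition while it builds, so on a busy system you could instead create it ON ONLY the parent and build/attach the per-partition indexes concurrently:
-- Sketch: an extra index led by security_id, so each partition's B-tree
-- can be probed directly for a single security.
CREATE INDEX ratios_security_id_dtz_idx
    ON research.ratios (security_id, dtz);
With this in place, the per-partition probe for security_id = 10450 should be a quick index lookup that finds nothing, instead of a scan across the whole dtz range.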

Related

Slow performance and index not taken into consideration in PostgreSQL

I'll post my query plan, view and indexes at the bottom of the page so that I can keep this question as clean as possible.
The issue I have is slow performance with a view that is not using indexes the way I would expect. I have a table with around 7 million rows that I use as the source for the view below.
I have added an index on eventdate, which is used as expected, but why is the index on manufacturerkey ignored? Which indexes would be more efficient?
Also, could the to_char(fe.eventdate, 'HH24:MI'::text) AS hourminutes part be hurting performance?
Query plan: https://explain.dalibo.com/plan/Pvw
CREATE OR REPLACE VIEW public.v_test
AS SELECT df.facilityname,
dd.date,
dt.military_hour AS hour,
to_char(fe.eventdate, 'HH24:MI'::text) AS hourminutes,
df.tenantid,
df.tenantname,
dev.name AS event_type_name,
dtt.name AS ticket_type_name,
dde.name AS device_type_name,
count(*) AS count,
dl.country,
dl.state,
dl.district,
ds.systemmanufacturer
FROM fact_entriesexits fe
JOIN dim_facility df ON df.key = fe.facilitykey
JOIN dim_date dd ON dd.key = fe.datekey
JOIN dim_time dt ON dt.key = fe.timekey
LEFT JOIN dim_device dde ON dde.key = fe.devicekey
JOIN dim_eventtype dev ON dev.key = fe.eventtypekey
JOIN dim_tickettype dtt ON dtt.key = fe.tickettypekey
JOIN dim_licenseplate dl ON dl.key = fe.licenseplatekey
LEFT JOIN dim_systeminterface ds ON ds.key = fe.systeminterfacekey
WHERE fe.manufacturerkey = ANY (ARRAY[2, 1])
AND fe.eventdate >= '2022-01-01'
GROUP BY df.tenantname, df.tenantid, dl.region, dl.country, dl.state,
dl.district, df.facilityname, dev.name, dtt.name, dde.name,
ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey;
Here are the indexes the table fact_entriesexits contains:
CREATE INDEX idx_devicetype_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (devicetype)
CREATE INDEX idx_etlsource_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (etlsource)
CREATE INDEX idx_eventdate_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (eventdate)
CREATE INDEX idx_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (datekey)
CREATE INDEX idx_manufacturerkey_202008 ON public.fact_entriesexits_202008 USING btree (manufacturerkey)
Query plan:
Subquery Scan on v_lpr2 (cost=505358.60..508346.26 rows=17079 width=340) (actual time=85619.542..109797.440 rows=3008065 loops=1)
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Finalize GroupAggregate (cost=505358.60..508175.47 rows=17079 width=359) (actual time=85619.539..109097.943 rows=3008065 loops=1)
Group Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Gather Merge (cost=505358.60..507392.70 rows=14232 width=359) (actual time=85619.507..105395.429 rows=3308717 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Partial GroupAggregate (cost=504358.57..504749.95 rows=7116 width=359) (actual time=85169.770..94043.715 rows=1102906 loops=3)
Group Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Sort (cost=504358.57..504376.36 rows=7116 width=351) (actual time=85169.748..91995.088 rows=1500405 loops=3)
Sort Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Sort Method: external merge Disk: 218752kB
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Hash Left Join (cost=3904.49..503903.26 rows=7116 width=351) (actual time=52.894..46338.295 rows=1500405 loops=3)
Hash Cond: (fe.systeminterfacekey = ds.key)
Buffers: shared hit=90979 read=366546
-> Hash Join (cost=3886.89..503848.87 rows=7116 width=321) (actual time=52.458..44551.012 rows=1500405 loops=3)
Hash Cond: (fe.licenseplatekey = dl.key)
Buffers: shared hit=90943 read=366546
-> Hash Left Join (cost=3849.10..503792.31 rows=7116 width=269) (actual time=51.406..43869.673 rows=1503080 loops=3)
Hash Cond: (fe.devicekey = dde.key)
Buffers: shared hit=90870 read=366546
-> Hash Join (cost=3405.99..503330.51 rows=7116 width=255) (actual time=47.077..43258.069 rows=1503080 loops=3)
Hash Cond: (fe.timekey = dt.key)
Buffers: shared hit=90021 read=366546
-> Hash Join (cost=570.97..500476.80 rows=7116 width=257) (actual time=6.869..42345.723 rows=1503080 loops=3)
Hash Cond: (fe.datekey = dd.key)
Buffers: shared hit=87348 read=366546
-> Hash Join (cost=166.75..500053.90 rows=7116 width=257) (actual time=2.203..41799.463 rows=1503080 loops=3)
Hash Cond: (fe.facilitykey = df.key)
Buffers: shared hit=86787 read=366546
-> Hash Join (cost=2.72..499871.14 rows=7116 width=224) (actual time=0.362..41103.372 rows=1503085 loops=3)
Hash Cond: (fe.tickettypekey = dtt.key)
Buffers: shared hit=86427 read=366546
-> Hash Join (cost=1.14..499722.81 rows=54741 width=214) (actual time=0.311..40595.537 rows=1503085 loops=3)
Hash Cond: (fe.eventtypekey = dev.key)
Buffers: shared hit=86424 read=366546
-> Append (cost=0.00..494830.25 rows=1824733 width=40) (actual time=0.266..40015.860 rows=1503085 loops=3)
Buffers: shared hit=86421 read=366546
-> Parallel Seq Scan on fact_entriesexits fe (cost=0.00..0.00 rows=1 width=40) (actual time=0.001..0.001 rows=0 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202101 on fact_entriesexits_202101 fe_25 (cost=0.42..4.28 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202102 on fact_entriesexits_202102 fe_26 (cost=0.42..4.27 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202103 on fact_entriesexits_202103 fe_27 (cost=0.42..4.24 rows=1 width=40) (actual time=0.007..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202104 on fact_entriesexits_202104 fe_28 (cost=0.42..4.05 rows=1 width=40) (actual time=0.006..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202105 on fact_entriesexits_202105 fe_29 (cost=0.43..4.12 rows=1 width=40) (actual time=0.006..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202106 on fact_entriesexits_202106 fe_30 (cost=0.43..4.19 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202107 on fact_entriesexits_202107 fe_31 (cost=0.43..4.28 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202108 on fact_entriesexits_202108 fe_32 (cost=0.43..3.83 rows=1 width=40) (actual time=0.007..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202109 on fact_entriesexits_202109 fe_33 (cost=0.43..3.40 rows=1 width=40) (actual time=0.006..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202110 on fact_entriesexits_202110 fe_34 (cost=0.43..2.77 rows=1 width=40) (actual time=0.005..0.005 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202111 on fact_entriesexits_202111 fe_35 (cost=0.43..3.21 rows=1 width=40) (actual time=0.005..0.005 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202112 on fact_entriesexits_202112 fe_36 (cost=0.43..3.45 rows=1 width=40) (actual time=0.004..0.004 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Seq Scan on fact_entriesexits_202201 fe_37 (cost=0.00..382550.76 rows=445931 width=40) (actual time=0.032..39090.092 rows=379902 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 298432
Buffers: shared hit=3286 read=366546
-> Parallel Seq Scan on fact_entriesexits_202204 fe_38 (cost=0.00..39567.99 rows=469653 width=40) (actual time=0.015..242.895 rows=375639 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 158868
Buffers: shared hit=29546
-> Parallel Seq Scan on fact_entriesexits_202202 fe_39 (cost=0.00..30846.99 rows=437343 width=40) (actual time=0.019..230.952 rows=357451 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 98708
Buffers: shared hit=22294
I think you'll get the most benefit out of creating a composite index for querying with both eventdate and manufacturerkey; e.g.:
CREATE INDEX idx_manufacturerkey_eventdate_202008
ON public.fact_entriesexits_202008 USING btree (manufacturerkey, eventdate)
Since it's a composite index, put whatever column you're more likely to query by alone on the left side. You can remove the other index for that column, since it will be covered by the composite index.
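For example (a sketch, reusing the index name from the list above), once the composite index is built you could drop the now-redundant single-column index on each partition:
DROP INDEX IF EXISTS idx_manufacturerkey_202008;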
As for the to_char on eventdate: while you could create an expression index for that calculation, you might get better performance by splitting the query into a grouped CTE and a join. In other words, limit the GROUP BY to the columns that actually define your unique groups, and then join that query with the tables you need for the final selection of columns.
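A rough sketch of that idea, assuming the fact table's keys define the groups (if several keys share the same dimension labels, you would re-aggregate in the outer query); table and column names are taken from the view above:
-- Aggregate on the fact table's own keys first, then join the dimensions
-- only to label the (much smaller) aggregated result.
WITH agg AS (
    SELECT facilitykey, datekey, timekey, devicekey, eventtypekey,
           tickettypekey, licenseplatekey, systeminterfacekey,
           to_char(eventdate, 'HH24:MI') AS hourminutes,
           count(*) AS count
    FROM fact_entriesexits
    WHERE manufacturerkey = ANY (ARRAY[2, 1])
      AND eventdate >= '2022-01-01'
    GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
)
SELECT df.facilityname, dd.date, dt.military_hour AS hour, a.hourminutes,
       df.tenantid, df.tenantname,
       dev.name AS event_type_name, dtt.name AS ticket_type_name,
       dde.name AS device_type_name, a.count,
       dl.country, dl.state, dl.district, ds.systemmanufacturer
FROM agg a
JOIN dim_facility df ON df.key = a.facilitykey
JOIN dim_date dd ON dd.key = a.datekey
JOIN dim_time dt ON dt.key = a.timekey
LEFT JOIN dim_device dde ON dde.key = a.devicekey
JOIN dim_eventtype dev ON dev.key = a.eventtypekey
JOIN dim_tickettype dtt ON dtt.key = a.tickettypekey
JOIN dim_licenseplate dl ON dl.key = a.licenseplatekey
LEFT JOIN dim_systeminterface ds ON ds.key = a.systeminterfacekey;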
Your slowest seq scan step returns over half the rows of its partition, removing 298432 and returning 379902 (times roughly 3 each because of the parallel workers). An index is unlikely to help when such a large fraction of the table's rows is returned anyway.
Note that that partition also seems to be massively bloated; it is hard to see why else it would be so slow and require so many buffer reads relative to the number of rows.
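One way to check that suspicion (a sketch; the partition name is taken from the plan above):
-- Compare live/dead row counts to the on-disk size; far more pages than
-- the live rows warrant points to bloat.
SELECT relname, n_live_tup, n_dead_tup,
       pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE relname = 'fact_entriesexits_202201';
If it is bloated, VACUUM (FULL) or pg_repack can rewrite the partition, though VACUUM FULL takes an ACCESS EXCLUSIVE lock while it runs.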

Suggestions to reduce memory usage with table partitioning (PostgreSQL 11)

I have a few tables with 20-40 million rows, and my queries take a long time to execute. Are there any suggestions for troubleshooting/analyzing where most of the memory is consumed, or any other advice, before I go for partitioning?
Also, I have a few queries that are used for analysis, and these run over the whole range of dates (they have to go through all the data).
So I need an overall solution that keeps my basic queries fast and ensures the analysis queries don't fail by running out of memory or crashing the DB.
One table is nearly 120 GB; the others just have a huge number of rows.
I tried partitioning the tables on a weekly and monthly date basis, but then the queries run out of memory. The number of locks increases by a huge factor with partitions: a query on the normal table took 13 locks, while queries on the partitioned tables take 250 locks (monthly partitions) and 1000 locks (weekly partitions).
I have read that there is overhead that adds up when partitions are involved.
Analysis query:
SELECT id
from TABLE1
where id NOT IN (
SELECT DISTINCT id
FROM TABLE2
);
TABLE1 and TABLE2 are partitioned, the first by event_data_timestamp and the second by event_timestamp.
The analysis queries run out of memory and consume a huge number of locks; date-based queries are pretty fast, though.
QUERY:
EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM Table1_monthly WHERE event_timestamp > '2019-01-01' and id NOT IN (SELECT DISTINCT id FROM Table2_monthly where event_data_timestamp > '2019-01-01');
Append (cost=32731.14..653650.98 rows=4656735 width=16) (actual time=2497.747..15405.447 rows=10121827 loops=1)
Buffers: shared hit=3 read=169100
-> Seq Scan on TABLE1_monthly_2019_01_26 (cost=32731.14..77010.63 rows=683809 width=16) (actual time=2497.746..3489.767 rows=1156382 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 462851
Buffers: shared read=44559
SubPlan 1
-> HashAggregate (cost=32728.64..32730.64 rows=200 width=16) (actual time=248.084..791.054 rows=1314570 loops=6)
Group Key: TABLE2_monthly_2019_01_26.cid
Buffers: shared read=24568
-> Append (cost=0.00..32277.49 rows=180458 width=16) (actual time=22.969..766.903 rows=1314570 loops=1)
Buffers: shared read=24568
-> Seq Scan on TABLE2_monthly_2019_01_26 (cost=0.00..5587.05 rows=32135 width=16) (actual time=22.965..123.734 rows=211977 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Rows Removed by Filter: 40282
Buffers: shared read=4382
-> Seq Scan on TABLE2_monthly_2019_02_25 (cost=0.00..5573.02 rows=32054 width=16) (actual time=0.700..121.657 rows=241977 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Buffers: shared read=4371
-> Seq Scan on TABLE2_monthly_2019_03_27 (cost=0.00..5997.60 rows=34496 width=16) (actual time=0.884..123.043 rows=253901 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Buffers: shared read=4704
-> Seq Scan on TABLE2_monthly_2019_04_26 (cost=0.00..6581.55 rows=37855 width=16) (actual time=0.690..129.537 rows=282282 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Buffers: shared read=5162
-> Seq Scan on TABLE2_monthly_2019_05_26 (cost=0.00..6585.38 rows=37877 width=16) (actual time=1.248..122.794 rows=281553 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Buffers: shared read=5165
-> Seq Scan on TABLE2_monthly_2019_06_25 (cost=0.00..999.60 rows=5749 width=16) (actual time=0.750..23.020 rows=42880 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
Buffers: shared read=784
-> Seq Scan on TABLE2_monthly_2019_07_25 (cost=0.00..12.75 rows=73 width=16) (actual time=0.007..0.007 rows=0 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
-> Seq Scan on TABLE2_monthly_2019_08_24 (cost=0.00..12.75 rows=73 width=16) (actual time=0.003..0.004 rows=0 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
-> Seq Scan on TABLE2_monthly_2019_09_23 (cost=0.00..12.75 rows=73 width=16) (actual time=0.003..0.004 rows=0 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
-> Seq Scan on TABLE2_monthly_2019_10_23 (cost=0.00..12.75 rows=73 width=16) (actual time=0.007..0.007 rows=0 loops=1)
Filter: (event_data_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone)
-> Seq Scan on TABLE1_monthly_2019_02_25 (cost=32731.14..88679.16 rows=1022968 width=16) (actual time=1008.738..2341.807 rows=1803957 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 241978
Buffers: shared hit=1 read=25258
-> Seq Scan on TABLE1_monthly_2019_03_27 (cost=32731.14..97503.58 rows=1184315 width=16) (actual time=1000.795..2474.769 rows=2114729 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 253901
Buffers: shared hit=1 read=29242
-> Seq Scan on TABLE1_monthly_2019_04_26 (cost=32731.14..105933.54 rows=1338447 width=16) (actual time=892.820..2405.941 rows=2394619 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 282282
Buffers: shared hit=1 read=33048
-> Seq Scan on TABLE1_monthly_2019_05_26 (cost=32731.14..87789.65 rows=249772 width=16) (actual time=918.397..2614.059 rows=2340789 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 281553
Buffers: shared read=32579
-> Seq Scan on TABLE1_monthly_2019_06_25 (cost=32731.14..42458.60 rows=177116 width=16) (actual time=923.367..1141.672 rows=311351 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Rows Removed by Filter: 42880
Buffers: shared read=4414
-> Seq Scan on TABLE1_monthly_2019_07_25 (cost=32731.14..32748.04 rows=77 width=16) (actual time=0.008..0.008 rows=0 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
-> Seq Scan on TABLE1_monthly_2019_08_24 (cost=32731.14..32748.04 rows=77 width=16) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
-> Seq Scan on TABLE1_monthly_2019_09_23 (cost=32731.14..32748.04 rows=77 width=16) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
-> Seq Scan on TABLE1_monthly_2019_10_23 (cost=32731.14..32748.04 rows=77 width=16) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((event_timestamp > '2019-01-01 00:00:00+00'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
Planning Time: 244.669 ms
Execution Time: 15959.111 ms
(69 rows)
A query that joins two large partitioned tables to produce 10 million rows is going to consume resources, there is no way around that.
You can trade speed for memory consumption by reducing work_mem: smaller values will make your queries slower, but consume less memory.
I'd say that the best thing would be to leave work_mem high but reduce max_connections so that you don't run out of memory so fast. Also, putting more RAM into the machine is one of the cheapest hardware tuning techniques.
You can improve the query slightly (a sketch follows the two points below):
Remove the DISTINCT, which is useless, consumes CPU resources and throws your estimates off.
ANALYZE table2 so that you get better estimates.
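For example (sketch only, using the table names from the query above):
ANALYZE table2_monthly;  -- refresh statistics for better estimates

EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM table1_monthly
WHERE event_timestamp > '2019-01-01'
  AND id NOT IN (SELECT id  -- DISTINCT removed
                 FROM table2_monthly
                 WHERE event_data_timestamp > '2019-01-01');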
About partitioning: if these queries scan all partitions, the query will be slower with partitioned tables.
Whether partitioning is a good idea for you depends on whether you have other queries that benefit from it:
First and foremost, mass deletion, which is painless by dropping partitions.
Sequential scans where the partitioning key is part of the scan filter.
Contrary to popular belief, partitioning is not something you always benefit from if you have large tables: many queries become slower by partitioning.
The locks are your least worry: just increase max_locks_per_transaction.
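For example (a sketch; the value is illustrative, and the setting only takes effect after a server restart):
ALTER SYSTEM SET max_locks_per_transaction = 256;
-- restart the server, then verify:
SHOW max_locks_per_transaction;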

Postgres planning takes disproportionate time compared to execution

Postgres 9.6 running on Amazon RDS.
I have 2 tables:
aggregate events - a big table with 6 keys (ids)
campaign metadata - a small table with the campaign definitions.
I join the two in order to filter on metadata like campaign name.
The query produces a report of displayed counts broken down by campaign channel and date (the date granularity is daily).
There are no foreign keys and no NOT NULL constraints. The report table has multiple rows per day per campaign (because the aggregation is based on a 6-attribute key).
When I join, planning time grows to ~10 s (vs. 300 ms):
explain analyze select c.campaign_channel as channel,date as day , sum( displayed ) as displayed
from report_campaigns c
left join events_daily r on r.campaign_id = c.c_id
where provider_id = 7726 and c.p_id = 7726 and c.campaign_name <> 'test'
and date >= '20170513 12:00' and date <= '20170515 12:00'
group by c.campaign_channel,date;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=71461.93..71466.51 rows=229 width=22) (actual time=104.189..114.788 rows=6 loops=1)
Group Key: c.campaign_channel, r.date
-> Sort (cost=71461.93..71462.51 rows=229 width=18) (actual time=100.263..106.402 rows=31205 loops=1)
Sort Key: c.campaign_channel, r.date
Sort Method: quicksort Memory: 3206kB
-> Hash Join (cost=1092.52..71452.96 rows=229 width=18) (actual time=22.149..86.955 rows=31205 loops=1)
Hash Cond: (r.campaign_id = c.c_id)
-> Append (cost=0.00..70245.84 rows=29948 width=20) (actual time=21.318..71.315 rows=31205 loops=1)
-> Seq Scan on events_daily r (cost=0.00..0.00 rows=1 width=20) (actual time=0.005..0.005 rows=0 loops=1)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone) AND (provider_id =
-> Bitmap Heap Scan on events_daily_20170513 r_1 (cost=685.36..23913.63 rows=1 width=20) (actual time=17.230..17.230 rows=0 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Rows Removed by Filter: 13769
Heap Blocks: exact=10276
-> Bitmap Index Scan on events_daily_20170513_full_idx (cost=0.00..685.36 rows=14525 width=0) (actual time=2.356..2.356 rows=13769 loops=1)
Index Cond: (provider_id = 7726)
-> Bitmap Heap Scan on events_daily_20170514 r_2 (cost=689.08..22203.52 rows=14537 width=20) (actual time=4.082..21.389 rows=15281 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Heap Blocks: exact=10490
-> Bitmap Index Scan on events_daily_20170514_full_idx (cost=0.00..685.45 rows=14537 width=0) (actual time=2.428..2.428 rows=15281 loops=1)
Index Cond: (provider_id = 7726)
-> Bitmap Heap Scan on events_daily_20170515 r_3 (cost=731.84..24128.69 rows=15409 width=20) (actual time=4.297..22.662 rows=15924 loops=1)
Recheck Cond: (provider_id = 7726)
Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
Heap Blocks: exact=11318
-> Bitmap Index Scan on events_daily_20170515_full_idx (cost=0.00..727.99 rows=15409 width=0) (actual time=2.506..2.506 rows=15924 loops=1)
Index Cond: (provider_id = 7726)
-> Hash (cost=1085.35..1085.35 rows=574 width=14) (actual time=0.815..0.815 rows=582 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 37kB
-> Bitmap Heap Scan on report_campaigns c (cost=12.76..1085.35 rows=574 width=14) (actual time=0.090..0.627 rows=582 loops=1)
Recheck Cond: (p_id = 7726)
Filter: ((campaign_name)::text <> 'test'::text)
Heap Blocks: exact=240
-> Bitmap Index Scan on report_campaigns_provider_id (cost=0.00..12.62 rows=577 width=0) (actual time=0.062..0.062 rows=582 loops=1)
Index Cond: (p_id = 7726)
Planning time: 9651.605 ms
Execution time: 115.092 ms
result:
channel | day | displayed
----------+---------------------+-----------
Pin | 2017-05-14 00:00:00 | 43434
Pin | 2017-05-15 00:00:00 | 3325325235
It seems to me this is because the summation forces pre-computation before the left join.
A solution could be to push the filtering WHERE clauses into two nested sub-SELECTs before the left join and the summation.
Hope this works:
SELECT channel, day, sum(displayed) AS displayed
FROM
  (SELECT c_id, campaign_channel AS channel
   FROM report_campaigns
   WHERE p_id = 7726 AND campaign_name <> 'test') AS c
LEFT JOIN
  (SELECT campaign_id, date AS day, displayed
   FROM events_daily
   WHERE provider_id = 7726
     AND date >= '20170513 12:00' AND date <= '20170515 12:00') AS r
  ON r.campaign_id = c.c_id
GROUP BY channel, day;

PostgreSQL performance difference with datetime comparison

I'm trying to optimize the performance of my PostgreSQL queries. I noticed a big change in the time required to execute my query when I change a datetime in the query by one second. I'm trying to figure out why there is such a drastic change in performance for such a small change in the query. I ran EXPLAIN (ANALYZE, BUFFERS) and can see there is a difference in how the two queries operate, but I don't understand enough to determine what to do about it. Any help?
Here is the first query
SELECT avg(travel_time_all)
FROM tt_data
WHERE date_time >= '2014-01-01 08:00:00' and
date_time < '2014-01-01 8:14:13' and
(tmc = '118P04252' or tmc = '118P04253' or tmc = '118P04254' or tmc = '118P04255' or tmc = '118P04256')
group by tmc order by tmc
If I increase the later date_time by one second to 2014-01-01 08:14:14 and rerun the query, the execution time increases drastically.
Here are the results of the explain (analyze, buffers) on the two queries. First query:
GroupAggregate (cost=6251.99..6252.01 rows=1 width=14) (actual time=0.829..0.829 rows=1 loops=1)
Buffers: shared hit=506
-> Sort (cost=6251.99..6252.00 rows=1 width=14) (actual time=0.823..0.823 rows=1 loops=1)
Sort Key: tmc
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=506
-> Bitmap Heap Scan on tt_data (cost=36.29..6251.98 rows=1 width=14) (actual time=0.309..0.817 rows=1 loops=1)
Recheck Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:13'::timestamp without time zone))
Filter: ((tmc = '118P04252'::text) OR (tmc = '118P04253'::text) OR (tmc = '118P04254'::text) OR (tmc = '118P04255'::text) OR (tmc = '118P04256'::text))
Rows Removed by Filter: 989
Buffers: shared hit=506
-> Bitmap Index Scan on tt_data_2_date_time_idx (cost=0.00..36.29 rows=1572 width=0) (actual time=0.119..0.119 rows=990 loops=1)
Index Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:13'::timestamp without time zone))
Buffers: shared hit=7
Total runtime: 0.871 ms
Below is the second query:
GroupAggregate (cost=6257.31..6257.34 rows=1 width=14) (actual time=52.444..52.444 rows=1 loops=1)
Buffers: shared hit=2693
-> Sort (cost=6257.31..6257.32 rows=1 width=14) (actual time=52.438..52.438 rows=1 loops=1)
Sort Key: tmc
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=2693
-> Bitmap Heap Scan on tt_data (cost=6253.28..6257.30 rows=1 width=14) (actual time=52.427..52.431 rows=1 loops=1)
Recheck Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:14'::timestamp without time zone) AND ((tmc = '118P04252'::text) OR (tmc = '118P04253'::text) OR (tmc = '118P04254'::text) OR (...)
Rows Removed by Index Recheck: 5
Buffers: shared hit=2693
-> BitmapAnd (cost=6253.28..6253.28 rows=1 width=0) (actual time=52.410..52.410 rows=0 loops=1)
Buffers: shared hit=2689
-> Bitmap Index Scan on tt_data_2_date_time_idx (cost=0.00..36.31 rows=1574 width=0) (actual time=0.132..0.132 rows=990 loops=1)
Index Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:14'::timestamp without time zone))
Buffers: shared hit=7
-> BitmapOr (cost=6216.71..6216.71 rows=271178 width=0) (actual time=52.156..52.156 rows=0 loops=1)
Buffers: shared hit=2682
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=8.439..8.439 rows=125081 loops=1)
Index Cond: (tmc = '118P04252'::text)
Buffers: shared hit=483
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=10.257..10.257 rows=156115 loops=1)
Index Cond: (tmc = '118P04253'::text)
Buffers: shared hit=602
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=6.867..6.867 rows=102318 loops=1)
Index Cond: (tmc = '118P04254'::text)
Buffers: shared hit=396
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=13.371..13.371 rows=160566 loops=1)
Index Cond: (tmc = '118P04255'::text)
Buffers: shared hit=619
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=13.218..13.218 rows=150709 loops=1)
Index Cond: (tmc = '118P04256'::text)
Buffers: shared hit=582
Total runtime: 52.507 ms
Any advice on how to make the second query as fast as the first? I'd like to increase this time interval by a greater amount but don't want the performance to decrease.

PostgreSQL doesn't use partial index

PostgreSQL 9.3
I have a table with date_field:
date_field timestamp without time zone
CREATE INDEX ix__table__date_field ON table USING btree (date_field)
WHERE date_field IS NOT NULL;
Then I've tried to use my partial index:
EXPLAIN ANALYZE SELECT count(*) from table where date_field is not null;
Aggregate (cost=29048.22..29048.23 rows=1 width=0) (actual time=41.714..41.714 rows=1 loops=1)
-> Seq Scan on table (cost=0.00..28138.83 rows=363755 width=0) (actual time=41.711..41.711 rows=0 loops=1)
Filter: (date_field IS NOT NULL)
Rows Removed by Filter: 365583
Total runtime: 41.744 ms
But it does use the partial index when comparing dates:
EXPLAIN ANALYZE SELECT count(*) from table where date_field > '2015-1-1';
Aggregate (cost=26345.51..26345.52 rows=1 width=0) (actual time=0.006..0.007 rows=1 loops=1)
-> Bitmap Heap Scan on table (cost=34.60..26040.86 rows=121861 width=0) (actual time=0.005..0.005 rows=0 loops=1)
Recheck Cond: (date_field > '2015-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on ix__table__date_field (cost=0.00..4.13 rows=121861 width=0) (actual time=0.003..0.003 rows=0 loops=1)
Index Cond: (date_field > '2015-01-01 00:00:00'::timestamp without time zone)
Total runtime: 0.037 ms
So why doesn't it use the index for date_field IS NOT NULL?
Thanks in advance!