Indexes on the table:
create index shifts_start_at_idx
on shifts (start_at);
Query 1, with AT TIME ZONE:
SELECT shifts.id
FROM shifts
JOIN stores ON shifts.store_id = stores.id AND stores.deleted_at IS NULL
JOIN cities ON stores.city_id = cities.id
WHERE TRUE
AND (shifts.start_at >= '2022-05-06 03:00:00'::timestamp AT TIME ZONE
(EXTRACT(timezone FROM cities.time_zone) * INTERVAL '1 second'))
ORDER BY shifts.start_at DESC, shifts.end_at DESC, shifts.id DESC
LIMIT 100;
Explain query 1:
Limit (cost=0.86..298.93 rows=100 width=24) (actual time=0.143..25.257 rows=100 loops=1)
-> Nested Loop (cost=0.86..1485256.59 rows=498300 width=24) (actual time=0.131..23.317 rows=100 loops=1)
" Join Filter: (shifts.start_at >= timezone((date_part('timezone'::text, cities.time_zone) * '00:00:01'::interval), '2022-05-06 03:00:00'::timestamp without time zone))"
-> Nested Loop (cost=0.72..1209695.67 rows=1494900 width=32) (actual time=0.096..17.621 rows=100 loops=1)
-> Index Scan Backward using shifts_admin_order_by_idx on shifts (cost=0.43..291132.79 rows=3000000 width=32) (actual time=0.036..6.780 rows=205 loops=1)
-> Index Scan using stores_id_deleted_at_null_idx on stores (cost=0.29..0.31 rows=1 width=16) (actual time=0.025..0.025 rows=0 loops=205)
Index Cond: (id = shifts.store_id)
-> Index Scan using cities_pkey on cities (cost=0.14..0.16 rows=1 width=20) (actual time=0.017..0.017 rows=1 loops=100)
Index Cond: (id = stores.city_id)
Planning Time: 0.632 ms
Execution Time: 26.436 ms
Postgres doesn't use the index on start_at here.
Query 2, without AT TIME ZONE:
SELECT shifts.id
FROM shifts
JOIN stores ON shifts.store_id = stores.id AND stores.deleted_at IS NULL
JOIN cities ON stores.city_id = cities.id
WHERE TRUE
AND (shifts.start_at >= '2022-05-06 03:00:00')
ORDER BY shifts.start_at DESC, shifts.end_at DESC, shifts.id DESC
LIMIT 100;
Explain query 2:
Limit (cost=0.86..108.84 rows=100 width=24) (actual time=0.125..8.866 rows=100 loops=1)
-> Nested Loop (cost=0.86..898691.17 rows=832261 width=24) (actual time=0.115..7.886 rows=100 loops=1)
-> Nested Loop (cost=0.72..761958.37 rows=832261 width=32) (actual time=0.066..5.570 rows=100 loops=1)
-> Index Scan Backward using shifts_admin_order_by_idx on shifts (cost=0.43..248984.02 rows=1670200 width=32) (actual time=0.014..1.380 rows=205 loops=1)
Index Cond: (start_at >= '2022-05-06 03:00:00+00'::timestamp with time zone)
-> Index Scan using stores_id_deleted_at_null_idx on stores (cost=0.29..0.31 rows=1 width=16) (actual time=0.008..0.008 rows=0 loops=205)
Index Cond: (id = shifts.store_id)
-> Index Only Scan using cities_pkey on cities (cost=0.14..0.16 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=100)
Index Cond: (id = stores.city_id)
Heap Fetches: 100
Planning Time: 0.327 ms
Execution Time: 9.394 ms
It is not entirely clear why Postgres refuses to use the index once the comparison value is converted to a timestamp with a time zone.
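(A plausible reading, not stated in the original post: the right-hand side of the comparison depends on cities.time_zone, which is only known per joined row, so the planner can apply the condition only as a join filter, never as an index condition on shifts.start_at. One hedged workaround sketch: add a conservative constant lower bound, which is sargable and can drive the index, while keeping the exact per-city condition. The 14-hour slack assumes UTC offsets no wider than ±14 hours and a UTC server time zone; adjust for your data.)
SELECT shifts.id
FROM shifts
JOIN stores ON shifts.store_id = stores.id AND stores.deleted_at IS NULL
JOIN cities ON stores.city_id = cities.id
WHERE TRUE
  -- sargable constant bound: this one can become an Index Cond on start_at
  AND shifts.start_at >= '2022-05-06 03:00:00'::timestamptz - INTERVAL '14 hours'
  -- exact per-city bound: unchanged, still applied as a join filter
  AND (shifts.start_at >= '2022-05-06 03:00:00'::timestamp AT TIME ZONE
      (EXTRACT(timezone FROM cities.time_zone) * INTERVAL '1 second'))
ORDER BY shifts.start_at DESC, shifts.end_at DESC, shifts.id DESC
LIMIT 100;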
I'll post my query plan, view, and indexes at the bottom of the page to keep this question as clean as possible.
The issue I have is slow performance for a view that is not using indexes the way I would expect. I have a table with around 7 million rows that I use as the source for the view below.
I have added an index on eventdate, which is used as expected, but why is the index on manufacturerkey ignored? Which indexes would be more efficient?
Also, could the to_char(fe.eventdate, 'HH24:MI'::text) AS hourminutes part be hurting performance?
Query plan: https://explain.dalibo.com/plan/Pvw
CREATE OR REPLACE VIEW public.v_test
AS SELECT df.facilityname,
dd.date,
dt.military_hour AS hour,
to_char(fe.eventdate, 'HH24:MI'::text) AS hourminutes,
df.tenantid,
df.tenantname,
dev.name AS event_type_name,
dtt.name AS ticket_type_name,
dde.name AS device_type_name,
count(*) AS count,
dl.country,
dl.state,
dl.district,
ds.systemmanufacturer
FROM fact_entriesexits fe
JOIN dim_facility df ON df.key = fe.facilitykey
JOIN dim_date dd ON dd.key = fe.datekey
JOIN dim_time dt ON dt.key = fe.timekey
LEFT JOIN dim_device dde ON dde.key = fe.devicekey
JOIN dim_eventtype dev ON dev.key = fe.eventtypekey
JOIN dim_tickettype dtt ON dtt.key = fe.tickettypekey
JOIN dim_licenseplate dl ON dl.key = fe.licenseplatekey
LEFT JOIN dim_systeminterface ds ON ds.key = fe.systeminterfacekey
WHERE fe.manufacturerkey = ANY (ARRAY[2, 1])
AND fe.eventdate >= '2022-01-01'
GROUP BY df.tenantname, df.tenantid, dl.region, dl.country, dl.state,
dl.district, df.facilityname, dev.name, dtt.name, dde.name,
ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey;
Here are the indexes the table fact_entriesexits contains:
CREATE INDEX idx_devicetype_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (devicetype)
CREATE INDEX idx_etlsource_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (etlsource)
CREATE INDEX idx_eventdate_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (eventdate)
CREATE INDEX idx_fact_entriesexits_202008 ON public.fact_entriesexits_202008 USING btree (datekey)
CREATE INDEX idx_manufacturerkey_202008 ON public.fact_entriesexits_202008 USING btree (manufacturerkey)
Query plan:
Subquery Scan on v_lpr2 (cost=505358.60..508346.26 rows=17079 width=340) (actual time=85619.542..109797.440 rows=3008065 loops=1)
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Finalize GroupAggregate (cost=505358.60..508175.47 rows=17079 width=359) (actual time=85619.539..109097.943 rows=3008065 loops=1)
Group Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Gather Merge (cost=505358.60..507392.70 rows=14232 width=359) (actual time=85619.507..105395.429 rows=3308717 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Partial GroupAggregate (cost=504358.57..504749.95 rows=7116 width=359) (actual time=85169.770..94043.715 rows=1102906 loops=3)
Group Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Sort (cost=504358.57..504376.36 rows=7116 width=351) (actual time=85169.748..91995.088 rows=1500405 loops=3)
Sort Key: df.tenantname, df.tenantid, dl.region, dl.country, dl.state, dl.district, df.facilityname, dev.name, dtt.name, dde.name, ds.systemmanufacturer, dd.date, dt.military_hour, (to_char(fe.eventdate, 'HH24:MI'::text)), fe.licenseplatekey
Sort Method: external merge Disk: 218752kB
Buffers: shared hit=91037 read=366546, temp read=83669 written=83694
-> Hash Left Join (cost=3904.49..503903.26 rows=7116 width=351) (actual time=52.894..46338.295 rows=1500405 loops=3)
Hash Cond: (fe.systeminterfacekey = ds.key)
Buffers: shared hit=90979 read=366546
-> Hash Join (cost=3886.89..503848.87 rows=7116 width=321) (actual time=52.458..44551.012 rows=1500405 loops=3)
Hash Cond: (fe.licenseplatekey = dl.key)
Buffers: shared hit=90943 read=366546
-> Hash Left Join (cost=3849.10..503792.31 rows=7116 width=269) (actual time=51.406..43869.673 rows=1503080 loops=3)
Hash Cond: (fe.devicekey = dde.key)
Buffers: shared hit=90870 read=366546
-> Hash Join (cost=3405.99..503330.51 rows=7116 width=255) (actual time=47.077..43258.069 rows=1503080 loops=3)
Hash Cond: (fe.timekey = dt.key)
Buffers: shared hit=90021 read=366546
-> Hash Join (cost=570.97..500476.80 rows=7116 width=257) (actual time=6.869..42345.723 rows=1503080 loops=3)
Hash Cond: (fe.datekey = dd.key)
Buffers: shared hit=87348 read=366546
-> Hash Join (cost=166.75..500053.90 rows=7116 width=257) (actual time=2.203..41799.463 rows=1503080 loops=3)
Hash Cond: (fe.facilitykey = df.key)
Buffers: shared hit=86787 read=366546
-> Hash Join (cost=2.72..499871.14 rows=7116 width=224) (actual time=0.362..41103.372 rows=1503085 loops=3)
Hash Cond: (fe.tickettypekey = dtt.key)
Buffers: shared hit=86427 read=366546
-> Hash Join (cost=1.14..499722.81 rows=54741 width=214) (actual time=0.311..40595.537 rows=1503085 loops=3)
Hash Cond: (fe.eventtypekey = dev.key)
Buffers: shared hit=86424 read=366546
-> Append (cost=0.00..494830.25 rows=1824733 width=40) (actual time=0.266..40015.860 rows=1503085 loops=3)
Buffers: shared hit=86421 read=366546
-> Parallel Seq Scan on fact_entriesexits fe (cost=0.00..0.00 rows=1 width=40) (actual time=0.001..0.001 rows=0 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202101 on fact_entriesexits_202101 fe_25 (cost=0.42..4.28 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202102 on fact_entriesexits_202102 fe_26 (cost=0.42..4.27 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202103 on fact_entriesexits_202103 fe_27 (cost=0.42..4.24 rows=1 width=40) (actual time=0.007..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202104 on fact_entriesexits_202104 fe_28 (cost=0.42..4.05 rows=1 width=40) (actual time=0.006..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202105 on fact_entriesexits_202105 fe_29 (cost=0.43..4.12 rows=1 width=40) (actual time=0.006..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202106 on fact_entriesexits_202106 fe_30 (cost=0.43..4.19 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202107 on fact_entriesexits_202107 fe_31 (cost=0.43..4.28 rows=1 width=40) (actual time=0.005..0.006 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202108 on fact_entriesexits_202108 fe_32 (cost=0.43..3.83 rows=1 width=40) (actual time=0.007..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202109 on fact_entriesexits_202109 fe_33 (cost=0.43..3.40 rows=1 width=40) (actual time=0.006..0.007 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202110 on fact_entriesexits_202110 fe_34 (cost=0.43..2.77 rows=1 width=40) (actual time=0.005..0.005 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202111 on fact_entriesexits_202111 fe_35 (cost=0.43..3.21 rows=1 width=40) (actual time=0.005..0.005 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Index Scan using idx_eventdate_fact_entriesexits_202112 on fact_entriesexits_202112 fe_36 (cost=0.43..3.45 rows=1 width=40) (actual time=0.004..0.004 rows=0 loops=3)
Index Cond: (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone)
Filter: (manufacturerkey = ANY ('{2,1}'::integer[]))
Buffers: shared hit=3
-> Parallel Seq Scan on fact_entriesexits_202201 fe_37 (cost=0.00..382550.76 rows=445931 width=40) (actual time=0.032..39090.092 rows=379902 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 298432
Buffers: shared hit=3286 read=366546
-> Parallel Seq Scan on fact_entriesexits_202204 fe_38 (cost=0.00..39567.99 rows=469653 width=40) (actual time=0.015..242.895 rows=375639 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 158868
Buffers: shared hit=29546
-> Parallel Seq Scan on fact_entriesexits_202202 fe_39 (cost=0.00..30846.99 rows=437343 width=40) (actual time=0.019..230.952 rows=357451 loops=3)
Filter: ((manufacturerkey = ANY ('{2,1}'::integer[])) AND (eventdate >= '2022-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 98708
Buffers: shared hit=22294
I think you'll get the most benefit out of creating a composite index for querying with both eventdate and manufacturerkey; e.g.:
CREATE INDEX idx_manufacturerkey_eventdate_202008
ON public.fact_entriesexits_202008 USING btree (manufacturerkey, eventdate)
Since it's a composite index, put whatever column you're more likely to query by alone on the left side. You can remove the other index for that column, since it will be covered by the composite index.
As for the to_char on eventdate: while you could create an expression index for that calculation, you might get better performance by splitting the query into a grouped CTE and a join. In other words, limit the GROUP BY to the columns that actually define your unique groups, and then join that query with the tables you need to produce the final selection of columns, as in the sketch below.
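(A hedged sketch of that rewrite, using names from the view; this is not code from the original answer. Note the grain differs slightly: the CTE groups by the fact table's dimension keys, so if several keys share one display name you would need a second aggregation after the join.)
-- Aggregate the fact table by its own keys first, then join the dimensions
-- to decorate the result.
WITH grouped AS (
    SELECT fe.facilitykey, fe.datekey, fe.timekey, fe.devicekey,
           fe.eventtypekey, fe.tickettypekey, fe.licenseplatekey,
           fe.systeminterfacekey,
           to_char(fe.eventdate, 'HH24:MI') AS hourminutes,
           count(*) AS count
    FROM fact_entriesexits fe
    WHERE fe.manufacturerkey = ANY (ARRAY[2, 1])
      AND fe.eventdate >= '2022-01-01'
    GROUP BY fe.facilitykey, fe.datekey, fe.timekey, fe.devicekey,
             fe.eventtypekey, fe.tickettypekey, fe.licenseplatekey,
             fe.systeminterfacekey, to_char(fe.eventdate, 'HH24:MI')
)
SELECT df.facilityname, dd.date, dt.military_hour AS hour, g.hourminutes,
       df.tenantid, df.tenantname, dev.name AS event_type_name,
       dtt.name AS ticket_type_name, dde.name AS device_type_name,
       g.count, dl.country, dl.state, dl.district, ds.systemmanufacturer
FROM grouped g
JOIN dim_facility df ON df.key = g.facilitykey
JOIN dim_date dd ON dd.key = g.datekey
JOIN dim_time dt ON dt.key = g.timekey
LEFT JOIN dim_device dde ON dde.key = g.devicekey
JOIN dim_eventtype dev ON dev.key = g.eventtypekey
JOIN dim_tickettype dtt ON dtt.key = g.tickettypekey
JOIN dim_licenseplate dl ON dl.key = g.licenseplatekey
LEFT JOIN dim_systeminterface ds ON ds.key = g.systeminterfacekey;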
Your slowest seq scan step is returning over half the rows of its partition: 379,902 rows returned versus 298,432 removed by the filter (each figure a per-worker average across the three parallel workers). An index is unlikely to be helpful when so large a fraction of the table is returned anyway.
Note that that partition also seems to be massively bloated; it is hard to see why else it would be so slow and require so many buffer reads relative to the number of rows.
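(One hedged way to check that suspicion, assuming the pgstattuple extension can be installed; this is not part of the original answer:)
-- How much of the partition is dead tuples and free space?
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('public.fact_entriesexits_202201');
-- A large dead_tuple_percent or free_percent would explain the heavy buffer
-- reads; a VACUUM FULL (or pg_repack) of that partition should shrink it.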
I recently ran a VACUUM FULL on a range of tables, and a specific monitoring query suddenly became really slow. The query had happily run every 10 seconds for the past two months, but with the performance hit that came after the vacuum, most dashboards using it are down, and load ramps up until the server runs out of connections or resources.
Unfortunately I do not have the EXPLAIN output from before the vacuum.
Without a date restriction:
explain (analyze,timing) select min(id) from iqsim_cdrs;
Result (cost=0.64..0.65 rows=1 width=8) (actual time=6.222..6.222 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.57..0.64 rows=1 width=8) (actual time=6.216..6.217 rows=1 loops=1)
-> Index Only Scan using iqsim_cdrs_pkey on iqsim_cdrs (cost=0.57..34265771.63 rows=531041357 width=8) (actual time=6.213..6.213 rows=1 loops=1)
Index Cond: (id IS NOT NULL)
Heap Fetches: 1
Planning time: 1.876 ms
Execution time: 6.313 ms
(8 rows)
With a date restriction:
explain (analyze,timing) select min(id) from iqsim_cdrs where timestamp < '2019-01-01 00:00:00';
Result (cost=7.38..7.39 rows=1 width=8) (actual time=363763.144..363763.145 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.57..7.38 rows=1 width=8) (actual time=363763.137..363763.138 rows=1 loops=1)
-> Index Scan using iqsim_cdrs_pkey on iqsim_cdrs (cost=0.57..35593384.68 rows=5227047 width=8) (actual time=363763.133..363763.133 rows=1 loops=1)
Index Cond: (id IS NOT NULL)
Filter: ("timestamp" < '2019-01-01 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 488693105
Planning time: 7.707 ms
Execution time: 363763.219 ms
(9 rows)
I'm not sure what could have caused this; I can only presume that before the VACUUM FULL it used the index on timestamp?
* UPDATE *
As per jjanes's recommendation, here is the id+0 variant:
explain (analyze,timing) select min(id+0) from iqsim_cdrs where timestamp < '2019-01-01 00:00:00';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=377400.34..377400.35 rows=1 width=8) (actual time=109.176..109.177 rows=1 loops=1)
-> Index Scan using index_iqsim_cdrs_on_timestamp on iqsim_cdrs (cost=0.57..351196.84 rows=5240699 width=8) (actual time=0.131..108.911 rows=126 loops=1)
Index Cond: ("timestamp" < '2019-01-01 00:00:00'::timestamp without time zone)
Planning time: 4.756 ms
Execution time: 109.405 ms
(5 rows)
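(My reading, not from the original thread: min(id+0) defeats the planner's min/max optimization. The optimized plan walks the id primary key from the smallest value and filters on timestamp row by row, which here discards 488 million rows first; with the optimization disabled, the planner falls back to the far more selective timestamp index. Below is a hedged sketch of a composite index that would serve this query shape directly; the index name is made up.)
-- The range condition on "timestamp" narrows the scan, and including id
-- allows min(id) to be computed from an index-only scan of that range.
CREATE INDEX index_iqsim_cdrs_on_timestamp_id
    ON iqsim_cdrs ("timestamp", id);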
I'm trying to optimize the performance of my PostgreSQL queries. I noticed a big change in the time required to execute my query when I change the datetime in it by one second, and I'm trying to figure out why such a small change in the query causes such a drastic change in performance. I ran EXPLAIN (ANALYZE, BUFFERS) and can see a difference in how the two plans operate, but I don't understand enough to determine what to do about it. Any help?
Here is the first query
SELECT avg(travel_time_all)
FROM tt_data
WHERE date_time >= '2014-01-01 08:00:00' and
date_time < '2014-01-01 8:14:13' and
(tmc = '118P04252' or tmc = '118P04253' or tmc = '118P04254' or tmc = '118P04255' or tmc = '118P04256')
GROUP BY tmc ORDER BY tmc;
If I increase the later date_time by one second, to 2014-01-01 8:14:14, and rerun the query, the execution time increases drastically.
Here are the results of the explain (analyze, buffers) on the two queries. First query:
GroupAggregate (cost=6251.99..6252.01 rows=1 width=14) (actual time=0.829..0.829 rows=1 loops=1)
Buffers: shared hit=506
-> Sort (cost=6251.99..6252.00 rows=1 width=14) (actual time=0.823..0.823 rows=1 loops=1)
Sort Key: tmc
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=506
-> Bitmap Heap Scan on tt_data (cost=36.29..6251.98 rows=1 width=14) (actual time=0.309..0.817 rows=1 loops=1)
Recheck Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:13'::timestamp without time zone))
Filter: ((tmc = '118P04252'::text) OR (tmc = '118P04253'::text) OR (tmc = '118P04254'::text) OR (tmc = '118P04255'::text) OR (tmc = '118P04256'::text))
Rows Removed by Filter: 989
Buffers: shared hit=506
-> Bitmap Index Scan on tt_data_2_date_time_idx (cost=0.00..36.29 rows=1572 width=0) (actual time=0.119..0.119 rows=990 loops=1)
Index Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:13'::timestamp without time zone))
Buffers: shared hit=7
Total runtime: 0.871 ms
And here is the plan for the second query:
GroupAggregate (cost=6257.31..6257.34 rows=1 width=14) (actual time=52.444..52.444 rows=1 loops=1)
Buffers: shared hit=2693
-> Sort (cost=6257.31..6257.32 rows=1 width=14) (actual time=52.438..52.438 rows=1 loops=1)
Sort Key: tmc
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=2693
-> Bitmap Heap Scan on tt_data (cost=6253.28..6257.30 rows=1 width=14) (actual time=52.427..52.431 rows=1 loops=1)
Recheck Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:14'::timestamp without time zone) AND ((tmc = '118P04252'::text) OR (tmc = '118P04253'::text) OR (tmc = '118P04254'::text) OR (...)
Rows Removed by Index Recheck: 5
Buffers: shared hit=2693
-> BitmapAnd (cost=6253.28..6253.28 rows=1 width=0) (actual time=52.410..52.410 rows=0 loops=1)
Buffers: shared hit=2689
-> Bitmap Index Scan on tt_data_2_date_time_idx (cost=0.00..36.31 rows=1574 width=0) (actual time=0.132..0.132 rows=990 loops=1)
Index Cond: ((date_time >= '2014-01-01 08:00:00'::timestamp without time zone) AND (date_time < '2014-01-01 08:14:14'::timestamp without time zone))
Buffers: shared hit=7
-> BitmapOr (cost=6216.71..6216.71 rows=271178 width=0) (actual time=52.156..52.156 rows=0 loops=1)
Buffers: shared hit=2682
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=8.439..8.439 rows=125081 loops=1)
Index Cond: (tmc = '118P04252'::text)
Buffers: shared hit=483
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=10.257..10.257 rows=156115 loops=1)
Index Cond: (tmc = '118P04253'::text)
Buffers: shared hit=602
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=6.867..6.867 rows=102318 loops=1)
Index Cond: (tmc = '118P04254'::text)
Buffers: shared hit=396
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=13.371..13.371 rows=160566 loops=1)
Index Cond: (tmc = '118P04255'::text)
Buffers: shared hit=619
-> Bitmap Index Scan on tt_data_2_tmc_idx (cost=0.00..1243.34 rows=54236 width=0) (actual time=13.218..13.218 rows=150709 loops=1)
Index Cond: (tmc = '118P04256'::text)
Buffers: shared hit=582
Total runtime: 52.507 ms
Any advice on how to make the second query as fast as the first? I'd like to increase this time interval by a greater amount but don't want the performance to decrease.
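(Not from the original post, but one hedged option: the flip at 08:14:14 happens because the row estimates cross the point where the planner prefers BitmapAnd-ing the tmc index against the date_time index. A composite index with the equality column first and the range column second gives each tmc value a single narrow index range, which should stay fast as the window grows; the index name is made up.)
-- One index range per tmc value, each already ordered by date_time.
CREATE INDEX tt_data_2_tmc_date_time_idx ON tt_data (tmc, date_time);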
I have a very small table "events" with just 10,703 records.
The following query takes about 600 ms:
SELECT count(id)
FROM events
WHERE event_date > now()
AND earth_distance((select position from zips where zip='94121'), ll_to_earth(venue_lat, venue_lon))<16090;
I tried to set up a GiST index like this:
CREATE INDEX latlon_idx on events USING gist(ll_to_earth(venue_lat, venue_lon));
but it didn't change anything. I also have an index on event_date.
Here's explain analyze:
Aggregate (cost=5400.48..5400.49 rows=1 width=8) (actual time=615.479..615.479 rows=1 loops=1)
  InitPlan 1 (returns $0)
    -> Index Scan using zips_zip_idx on zips (cost=0.00..8.27 rows=1 width=56) (actual time=0.051..0.056 rows=1 loops=1)
         Index Cond: ((zip)::text = '94121'::text)
  -> Bitmap Heap Scan on events (cost=144.41..5386.03 rows=2468 width=8) (actual time=16.065..599.613 rows=3347 loops=1)
       Recheck Cond: (event_date > now())
       Filter: (sec_to_gc(cube_distance(($0)::cube, (ll_to_earth((venue_lat)::double precision, (venue_lon)::double precision))::cube)) < 16090::double precision)
       -> Bitmap Index Scan on events_date_idx (cost=0.00..143.79 rows=7405 width=0) (actual time=13.523..13.523 rows=7614 loops=1)
            Index Cond: (event_date > now())
Total runtime: 615.663 ms
(10 rows)
What else can I try to speed it up?
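(One hedged suggestion, not from the original post: the GiST index on ll_to_earth(venue_lat, venue_lon) is only usable through the cube containment operator, not through an earth_distance(...) < radius filter. The standard earthdistance pattern adds an indexable earth_box condition and keeps the exact distance check:)
SELECT count(id)
FROM events
WHERE event_date > now()
  -- indexable: matches the GiST index on ll_to_earth(venue_lat, venue_lon)
  AND earth_box((SELECT position FROM zips WHERE zip = '94121'), 16090)
      @> ll_to_earth(venue_lat, venue_lon)
  -- exact: earth_box is only a bounding box, so re-check the true distance
  AND earth_distance((SELECT position FROM zips WHERE zip = '94121'),
                     ll_to_earth(venue_lat, venue_lon)) < 16090;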