Postgres query is sometimes taking a very long time - postgresql

I have implemented an FTS engine on my website using a GIN-indexed tsvector and it works quite well, but occasionally it takes a very long time for no apparent reason. I am copying the output of the EXPLAIN ANALYZE command below:
sitedb=# EXPLAIN ANALYZE SELECT id, title FROM post_1 WHERE search_vector @@ to_tsquery('quantum');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on post_1 (cost=315.68..105654.80 rows=32443 width=106) (actual time=76.963..17281.184 rows=31925 loops=1)
Recheck Cond: (search_vector @@ to_tsquery('quantum'::text))
Heap Blocks: exact=29259
-> Bitmap Index Scan on index1_idx (cost=0.00..307.57 rows=32443 width=0) (actual time=60.208..60.209 rows=31925 loops=1)
Index Cond: (search_vector @@ to_tsquery('quantum'::text))
Planning Time: 47.648 ms
Execution Time: 17308.511 ms
(7 rows)
I thought at some point that changing work_mem would help. I set it to 86MB, but it made no difference.
The weird thing is that if I re-run the same command right after, it is much faster. See below:
sitedb=# EXPLAIN ANALYZE SELECT id, title FROM post_1 WHERE search_vector @@ to_tsquery('quantum');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on post_1 (cost=315.68..105654.80 rows=32443 width=106) (actual time=44.542..495.525 rows=31925 loops=1)
Recheck Cond: (search_vector @@ to_tsquery('quantum'::text))
Heap Blocks: exact=29259
-> Bitmap Index Scan on index1_idx (cost=0.00..307.57 rows=32443 width=0) (actual time=29.256..29.256 rows=31925 loops=1)
Index Cond: (search_vector @@ to_tsquery('quantum'::text))
Planning Time: 0.597 ms
Execution Time: 502.296 ms
(7 rows)
Would anyone have an idea?
Thank you very much.

It is probably a cold cache: the query had to read 29,259 pages, and few of them were already in memory. The second time you run it they are in memory, so it is faster. You can help confirm this by running EXPLAIN (ANALYZE, BUFFERS) after turning track_io_timing on.
You can increase effective_io_concurrency so that PostgreSQL will have multiple IO requests outstanding at once. How effective this is will depend on your IO hardware. It should be more effective on striped RAID or JBOD, for example.
If your cache was cold because you recently restarted, well, don't restart very often, or use pg_prewarm to warm up the cache when you do. If the data can't stay in cache because your frequently-used data is too big for memory, then get more RAM, or get faster disks (like SSDs, or if they are already SSDs, then get better ones).
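A minimal sketch of both suggestions, assuming the table and index names from the question (post_1, index1_idx):

```sql
-- Enable per-query I/O timing for this session, then check the
-- "I/O Timings" lines in the plan to confirm time is spent on disk reads.
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title FROM post_1 WHERE search_vector @@ to_tsquery('quantum');

-- After a restart, warm the heap and the GIN index explicitly
-- (pg_prewarm ships as a contrib extension).
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('post_1');
SELECT pg_prewarm('index1_idx');
```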

Related

Slow postgres query even though it does bitmap index scan

I have a table with 4707838 rows. When I run the following query on this table it takes around 9 seconds to execute.
SELECT json_agg(
         json_build_object(
           'accessorId', p."accessorId",
           'mobile', json_build_object(
             'enabled', p.mobile,
             'settings', json_build_object(
               'proximityAccess', p."proximity",
               'tapToAccess', p."tapToAccess",
               'clickToAccessRange', p."clickToAccessRange",
               'remoteAccess', p."remote")),
           'card', json_build_object('enabled', p."card"),
           'fingerprint', json_build_object('enabled', p."fingerprint"))
       ) AS permissions
FROM permissions AS p
WHERE p."accessPointId" = 99
The output of explain analyze is as follows:
Aggregate (cost=49860.12..49860.13 rows=1 width=32) (actual time=9011.711..9011.712 rows=1 loops=1)
Buffers: shared read=29720
I/O Timings: read=8192.273
-> Bitmap Heap Scan on permissions p (cost=775.86..49350.25 rows=33991 width=14) (actual time=48.886..8704.470 rows=36556 loops=1)
Recheck Cond: ("accessPointId" = 99)
Heap Blocks: exact=29331
Buffers: shared read=29720
I/O Timings: read=8192.273
-> Bitmap Index Scan on composite_key_accessor_access_point (cost=0.00..767.37 rows=33991 width=0) (actual time=38.767..38.768 rows=37032 loops=1)
Index Cond: ("accessPointId" = 99)
Buffers: shared read=105
I/O Timings: read=32.592
Planning Time: 0.142 ms
Execution Time: 9012.719 ms
This table has a btree index on the accessorId column and a composite index on (accessorId, accessPointId).
Can anyone tell me why this query is slow even though it uses an index?
Over 90% of the time is spent waiting for data from disk. At about 0.28 ms per read (8,192 ms over 29,720 reads), that is pretty fast for a hard drive (suggesting that much of the data was already in the filesystem cache, or that some of the reads brought in neighboring data that was also eventually required, i.e. sequential rather than purely random reads), but slow for an SSD.
If you set enable_bitmapscan=off and clear the cache (or pick a not recently used "accessPointId" value) what performance do you get?
How big is the table? If you are reading a substantial fraction of it and think you are not getting as much benefit from sequential reads as you should, you can try making your OS's readahead setting more aggressive. On Linux that is something like sudo blockdev --setra ...
You could put all the columns referred to by the query into the index, to enable index-only scans. But given the number of columns you are using, that might be impractical. You would want "accessPointId" to be the first column in the index. By the way, is the index currently used really on (accessorId, accessPointId)? From the plan it looks like "accessPointId" is the first column in that index, not the second.
You could cluster the table by an index which has "accessPointId" as the first column. That would group the related records together for faster access. But note it is a slow operation and takes a strong lock on the table while it is running, and future data going into the table won't be clustered, only the current data.
You could try increasing effective_io_concurrency so that you can have multiple I/O requests outstanding at a time. How effective this is will depend on your hardware.
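The index and clustering suggestions above can be sketched as follows; the index name is hypothetical, and the column names are taken from the question:

```sql
-- Put "accessPointId" first so the index matches the WHERE clause directly.
CREATE INDEX permissions_access_point_idx
    ON permissions ("accessPointId", "accessorId");

-- Optionally group rows with the same "accessPointId" on disk so the
-- bitmap heap scan touches far fewer pages. Note: CLUSTER is slow, takes
-- a strong lock while running, and is not maintained for future inserts.
CLUSTER permissions USING permissions_access_point_idx;
```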

Why is the execution time reported by EXPLAIN ANALYZE (PostgreSQL) different each time I run the same query, even after resetting the cache?

My setup:
MacOS
Homebrew
Postgresql10 installed with brew
To reset the cache I run these commands:
brew services stop postgresql@10
brew services start postgresql@10
Indeed, each time I run the query I don't see any shared hit when I run EXPLAIN (ANALYZE, buffers).
This is what happens:
EXPLAIN (ANALYZE, buffers)
SELECT *
FROM my_table
WHERE key = '...';
From query plan:
Bitmap Heap Scan on mytable (cost=5.31..663.61 rows=169 width=97) (actual time=1.172..32.475 rows=221 loops=1)
Recheck Cond: ((key)::text = '...'::text)
Heap Blocks: exact=220
Buffers: shared read=222
-> Bitmap Index Scan on idx_hash_mytable_key (cost=0.00..5.27 rows=169 width=0) (actual time=0.719..0.719 rows=221 loops=1)
Index Cond: ((key)::text = '...'::text)
Buffers: shared read=2
Planning time: 6.370 ms
Execution time: 32.527 ms
After I re-start postgres and re-run the same query I have:
Bitmap Heap Scan on mytable (cost=5.31..663.61 rows=169 width=97) (actual time=0.705..42.808 rows=221 loops=1)
Recheck Cond: ((key)::text = '...'::text)
Heap Blocks: exact=220
Buffers: shared read=222
-> Bitmap Index Scan on idx_hash_mytable_key (cost=0.00..5.27 rows=169 width=0) (actual time=0.464..0.464 rows=221 loops=1)
Index Cond: ((key)::text = '...'::text)
Buffers: shared read=2
Planning time: 5.611 ms
Execution time: 42.869 ms
As you can see, the difference between the two executions is quite big: the first run is about 25% faster than the second one. Probably there are other variables that influence the final result.
Is there a way to get the same execution time every time I run the query? Maybe my current approach to resetting the cache is not correct.
The goal is to compare the performance of two indexes, but if the execution time differs between runs of the same query plan, even after resetting the cache, I cannot precisely measure the difference between the two indexes.

Inordinately slow Nested Loop with join on simple query

I'm running the query below against the primary key lt_id (no other index bar the pkey btree) and joining against 1000 ids.
It might just be my lack of experience with Postgres, but it seems like it's maybe an order of magnitude too slow. There are 800k rows in the table in total.
This is a low-spec machine (4 GB RAM), but I still thought it should be faster. The CPU is idle.
EXPLAIN (ANALYZE,BUFFERS) SELECT lt_id FROM "mytable" d INNER JOIN ( VALUES (1839147),(...998 more rows here...),(1756908)) v(id) ON (d.lt_id = v.id);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.42..7743.00 rows=1000 width=4) (actual time=69.852..20743.393 rows=1000 loops=1)
Buffers: shared hit=2395 read=1607
-> Values Scan on "*VALUES*" (cost=0.00..12.50 rows=1000 width=4) (actual time=0.004..4.770 rows=1000 loops=1)
-> Index Only Scan using lt_id_idx on mytable d (cost=0.42..7.73 rows=1 width=4) (actual time=20.732..20.732 rows=1 loops=1000)
Index Cond: (lt_id = "*VALUES*".column1)
Heap Fetches: 1000
Buffers: shared hit=2395 read=1607
Planning Time: 86.284 ms
Execution Time: 20744.223 ms
(9 rows)
PostgreSQL 11.7; I was using 9 but upgraded to 11.7, with no real difference in speed observed.
free
total used free shared buff/cache available
Mem: 3783732 158076 3400932 55420 224724 3366832
Swap: 0 0 0
Even though it's low spec, should it really be taking 20 seconds? In fact, many other queries take twice as long or more; 20 seconds seems to be the best-case scenario. There are a couple of other text columns in the table containing some small text articles, which I doubt are the issue.
I was previously using IN operator but observed similar or worse speeds.
I also made a couple of small changes from the default config, but it doesn't seem to make much difference.
work_mem = 32MB
shared_buffers = 512MB
Any ideas if this is expected performance given the machine? Or is there something else I can try?
edit: I guess what I'm curious about is the time spent in the actual loop:
actual time=20.732..20.732 rows=1 loops=1000
It seems like the actual time is less than or equal to 1 ms per loop, which in the worst case would be less than 1 second for 1000 iterations, and the other operations also seem negligible. Does this mean the issue is simply IO? A slow disk? What would typically be the situation here?
I notice that if I run the query on my desktop, which has only 8 GB of RAM but uses an SSD, the query is massively faster.
Using an SSD is fine of course, but I'd like to know if something in my config or query/setup is not optimal.
As @pifor suggested, after setting track_io_timing=on I can see that this is indeed almost entirely IO slowness:
Nested Loop (cost=0.42..7743.00 rows=1000 width=69) (actual time=0.026..14901.004 rows=1000 loops=1)
Buffers: shared hit=2859 read=1145
I/O Timings: read=14861.578
-> Values Scan on "*VALUES*" (cost=0.00..12.50 rows=1000 width=4) (actual time=0.002..5.497 rows=1000 loops=1)
-> Index Scan using mytable_pkey on mytable d (cost=0.42..7.73 rows=1 width=69) (actual time=14.888..14.888 rows=1 loops=1000)
Index Cond: (lt_id = "*VALUES*".column1)
Buffers: shared hit=2859 read=1145
I/O Timings: read=14861.578
Planning Time: 0.420 ms
Execution Time: 14901.734 ms
(10 rows)
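Since the plan shows nearly all of the 14.9 s in I/O Timings, one hedged follow-up, assuming the table and index names from the plan above, is to pre-load the index and heap and re-run:

```sql
-- pg_prewarm is a contrib extension; this pulls the pkey index and the
-- table heap into shared_buffers so the 1000 per-loop index probes stop
-- waiting on disk reads.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('mytable_pkey');
SELECT pg_prewarm('mytable');
```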

Phrase frequency counter with FULL Text Search of PostgreSQL 9.6

I need to count the number of times a phrase appears, using a tsquery against an indexed text field (tsvector data type). It works, but it is very slow because the table is huge. For single words I pre-calculated all the frequencies, but I have no idea how to speed up a phrase search.
Edit: Thank you for your reply, @jjanes.
This is my query:
SELECT substring(date_input::text, 0, 5) AS myear,
       ts_headline('simple', text_input, q, 'StartSel=<b>, StopSel=</b>, MaxWords=2, MinWords=1, ShortWord=1, HighlightAll=FALSE, MaxFragments=9999, FragmentDelimiter=" ... "') AS headline
FROM db_test,
     to_tsquery('simple', 'united<->kingdom') AS q
WHERE date_input BETWEEN '2000-01-01'::DATE AND '2019-12-31'::DATE
  AND idxfti_simple @@ q
And this is the EXPLAIN (ANALYZE, BUFFERS) output:
Nested Loop (cost=25408.33..47901.67 rows=5509 width=64) (actual time=286.536..17133.343 rows=38127 loops=1)
Buffers: shared hit=224723
-> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual time=0.005..0.007 rows=1 loops=1)
-> Append (cost=25408.33..46428.00 rows=5510 width=625) (actual time=285.372..864.868 rows=38127 loops=1)
Buffers: shared hit=165713
-> Bitmap Heap Scan on db_test (cost=25408.33..46309.01 rows=5509 width=625) (actual time=285.368..791.111 rows=38127 loops=1)
Recheck Cond: ((idxfti_simple @@ q.q) AND (date_input >= '2000-01-01'::date) AND (date_input <= '2019-12-31'::date))
Rows Removed by Index Recheck: 136
Heap Blocks: exact=29643
Buffers: shared hit=165607
-> BitmapAnd (cost=25408.33..25408.33 rows=5509 width=0) (actual time=278.370..278.371 rows=0 loops=1)
Buffers: shared hit=3838
-> Bitmap Index Scan on idxftisimple_idx (cost=0.00..1989.01 rows=35869 width=0) (actual time=67.280..67.281 rows=176654 loops=1)
Index Cond: (idxfti_simple @@ q.q)
Buffers: shared hit=611
-> Bitmap Index Scan on db_test_date_input_idx (cost=0.00..23142.24 rows=1101781 width=0) (actual time=174.711..174.712 rows=1149456 loops=1)
Index Cond: ((date_input >= '2000-01-01'::date) AND (date_input <= '2019-12-31'::date))
Buffers: shared hit=3227
-> Seq Scan on test (cost=0.00..118.98 rows=1 width=451) (actual time=0.280..0.280 rows=0 loops=1)
Filter: ((date_input >= '2000-01-01'::date) AND (date_input <= '2019-12-31'::date) AND (idxfti_simple @@ q.q))
Rows Removed by Filter: 742
Buffers: shared hit=106
Planning time: 0.332 ms
Execution time: 17176.805 ms
Sorry, I can't turn track_io_timing on. I do know that ts_headline is not recommended, but I need it to count the number of times a phrase appears in the same field.
Thank you in advance for your help.
Note that fetching the rows in Bitmap Heap Scan is quite fast, <0.8 seconds, and almost all the time is spent in the top-level node. That time is likely to be spent in ts_headline, reparsing the text_input document. As long as you keep using ts_headline, there isn't much you can do about this.
ts_headline doesn't directly give you what you want (frequency), so you must be doing some kind of post-processing of it. Maybe you could move to postprocessing the tsvector directly, so the document doesn't need to be reparsed.
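A rough sketch of that idea: count adjacent lexeme positions directly in the stored tsvector, so text_input never needs reparsing. This assumes PostgreSQL 9.6+ (for unnest(tsvector)), a hypothetical primary key column id, and that idxfti_simple was built with the 'simple' configuration:

```sql
-- Count occurrences of the phrase 'united kingdom' per row: for each
-- position of 'united', check whether 'kingdom' occupies the very next
-- position in the same tsvector.
SELECT d.id, count(*) AS phrase_count
FROM db_test d
CROSS JOIN LATERAL unnest(d.idxfti_simple) AS u(lexeme, positions, weights)
CROSS JOIN LATERAL unnest(u.positions) AS up(pos)
CROSS JOIN LATERAL unnest(d.idxfti_simple) AS k(lexeme, positions, weights)
WHERE u.lexeme = 'united'
  AND k.lexeme = 'kingdom'
  AND (up.pos + 1) = ANY (k.positions)
GROUP BY d.id;
```

Note this counts exact adjacency only, which matches the `united<->kingdom` tsquery semantics.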
Another option is to upgrade further, which could allow the work of ts_headline to be spread over multiple CPUs. PostgreSQL 9.6 was the first version which supported parallel query, and it was not mature enough in that version to be able to parallelize this type of thing. v10 is probably enough to get parallel query for this, but you might as well jump all the way to v12.
Version 9.2 is old and out of support. It didn't have native support for phrase searching in the first place (introduced in 9.6).
Please upgrade.
And if it is still slow, show us the query, and the EXPLAIN (ANALYZE, BUFFERS) for it, preferably with track_io_timing turned on.

Moving from Influx to Postgres, need tips

I used Influx to store our time series data. It was cool while it worked, then after about one month it stopped working and I couldn't figure out why. (Similar to this issue: https://github.com/influxdb/influxdb/issues/1386)
Maybe Influx will be great one day, but for now I need to use something that's more stable. I'm thinking about Postgres. Our data comes from many sensors, each sensor has a sensor id. So I'm thinking about structuring our data as this:
(pk), sensorId(string), time(timestamp), value(float)
Influx is built for time series data so it probably has some built in optimizations. Do I need to do optimizations myself to make Postgres efficient? More specifically, I have these questions:
Influx has this notion of 'series', and it's cheap to create a new series, so I had a separate series for each sensor. Should I create a separate Postgres table for each sensor?
How should I setup up indexes to make queries fast? A typical query is: select all data for sensor123 for the last 3 days.
Should I use timestamp or integer for the time column?
How do I set a retention policy? E.g. delete data that's older than one week automatically.
Will Postgres scale horizontally? Can I setup ec2 clusters for data replication and load balancing?
Can I downsample in Postgres? I have read in some articles that I can use date_trunc. But it seems that I can't date_trunc it to a specific interval e.g. 25 seconds.
Any other caveats I missed?
Thanks in advance!
Updates
Storing the time column as big integer is faster than storing it as timestamp. Am I doing something wrong?
storing it as timestamp:
postgres=# explain analyze select * from test where sensorid='sensor_0';
Bitmap Heap Scan on test (cost=3180.54..42349.98 rows=75352 width=25) (actual time=10.864..19.604 rows=51840 loops=1)
Recheck Cond: ((sensorid)::text = 'sensor_0'::text)
Heap Blocks: exact=382
-> Bitmap Index Scan on sensorindex (cost=0.00..3161.70 rows=75352 width=0) (actual time=10.794..10.794 rows=51840 loops=1)
Index Cond: ((sensorid)::text = 'sensor_0'::text)
Planning time: 0.118 ms
Execution time: 22.984 ms
postgres=# explain analyze select * from test where sensorid='sensor_0' and addedtime > to_timestamp(1430939804);
Bitmap Heap Scan on test (cost=2258.04..43170.41 rows=50486 width=25) (actual time=22.375..27.412 rows=34833 loops=1)
Recheck Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > '2015-05-06 15:16:44-04'::timestamp with time zone))
Heap Blocks: exact=257
-> Bitmap Index Scan on sensorindex (cost=0.00..2245.42 rows=50486 width=0) (actual time=22.313..22.313 rows=34833 loops=1)
Index Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > '2015-05-06 15:16:44-04'::timestamp with time zone))
Planning time: 0.362 ms
Execution time: 29.290 ms
storing it as big integer:
postgres=# explain analyze select * from test where sensorid='sensor_0';
Bitmap Heap Scan on test (cost=3620.92..42810.47 rows=85724 width=25) (actual time=12.450..19.615 rows=51840 loops=1)
Recheck Cond: ((sensorid)::text = 'sensor_0'::text)
Heap Blocks: exact=382
-> Bitmap Index Scan on sensorindex (cost=0.00..3599.49 rows=85724 width=0) (actual time=12.359..12.359 rows=51840 loops=1)
Index Cond: ((sensorid)::text = 'sensor_0'::text)
Planning time: 0.130 ms
Execution time: 22.331 ms
postgres=# explain analyze select * from test where sensorid='sensor_0' and addedtime > 1430939804472;
Bitmap Heap Scan on test (cost=2346.57..43260.12 rows=52489 width=25) (actual time=10.113..14.780 rows=31839 loops=1)
Recheck Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > 1430939804472::bigint))
Heap Blocks: exact=235
-> Bitmap Index Scan on sensorindex (cost=0.00..2333.45 rows=52489 width=0) (actual time=10.059..10.059 rows=31839 loops=1)
Index Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > 1430939804472::bigint))
Planning time: 0.154 ms
Execution time: 16.589 ms
You shouldn't create a table for each sensor. Instead you could add a field to your table that identifies what series it is in. You could also have another table that describes additional attributes about the series. If data points could belong to multiple series, then you'd need a different structure altogether.
For the query you described in Q2, an index on your recorded_at column should work (time is a SQL reserved keyword, so it's best to avoid it as a column name).
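That suggestion can be sketched as follows; table and column names are illustrative:

```sql
-- Composite index: sensor first, then time, so a per-sensor time-range
-- query becomes a tight range scan on the index.
CREATE INDEX sensor_data_sensor_time_idx
    ON sensor_data (sensor_id, recorded_at);

-- "all data for sensor123 for the last 3 days":
SELECT *
FROM sensor_data
WHERE sensor_id = 'sensor123'
  AND recorded_at >= now() - interval '3 days';
```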
You should use TIMESTAMP WITH TIME ZONE as your time data type.
Retention is up to you.
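For example, a periodic job (cron, etc.) can emulate a retention policy; this assumes an illustrative sensor_data table with a recorded_at column:

```sql
-- Run periodically to drop data older than one week.
DELETE FROM sensor_data
WHERE recorded_at < now() - interval '1 week';
```

On large tables, time-based partitions that are dropped whole are much cheaper than bulk DELETEs.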
Postgres has various options for sharding/replication. That's a big topic.
Not sure I understand your objective for #6, but I'm sure you can figure something out.
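On #6: date_trunc only handles named units, but an arbitrary interval such as 25 seconds can be bucketed by rounding the epoch; column names here are illustrative:

```sql
-- Downsample to 25-second buckets: floor the epoch to a multiple of 25,
-- turn it back into a timestamp, and aggregate per bucket.
SELECT to_timestamp(floor(extract(epoch FROM recorded_at) / 25) * 25) AS bucket,
       avg(value) AS avg_value
FROM sensor_data
GROUP BY bucket
ORDER BY bucket;
```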