Slow count(*) on postgresql 9.2 - postgresql

There are a few discussions this and there (including the official post on postgres web) about the slow count(*) prior version 9.2; somehow I did not find satisfied answer.
Basically I had postgres 9.1 installed, and I observed slow count(*) as simple as
select count(*) from restaurants;
on tables with records of 100k+. The average request is around 850ms. Well I assumed that that was the symptom people have been talking about for slow count on postgres 9.1 and below since postgres 9.2 has some new feature like index-only scan. I want to experiment this by using the same dataset from 9.1 and put it on 9.2. I call the count statement, and it still give a bad result as 9.1.
explain analyze select count(*) from restaurants;
------------------------------------------------------------------
Aggregate (cost=23510.35..23510.36 rows=1 width=0) (actual time=979.960..979.961 rows=1 loops=1)
-> Seq Scan on restaurants (cost=0.00..23214.88 rows=118188 width=0) (actual time=0.050..845.097 rows=118188 loops=1)
Total runtime: 980.037 ms
Can anyone suggest feasible solution to this problem? Do I need to configure anything on postgres to enable the feature?
P.S. where clause doesn't help in my case either.

See the index only scans wiki entries:
What types of queries may be satisfied by an index-only scan?
is count(*) much faster now?
Why isn't my query using an index-only scan?
In particular, I quote:
It is important to realise that the planner is concerned with
minimising the total cost of the query. With databases, the cost of
I/O typically dominates. For that reason, "count(*) without any
predicate" queries will only use an index-only scan if the index is
significantly smaller than its table. This typically only happens when
the table's row width is much wider than some indexes'.
See also the discussion of VACUUM and ANALYZE for maintaining the visibility map. Essentially, you probably want to make VACUUM more aggressive, and you'll want to manually VACUUM ANALYZE the table after you first load it.

This happens due to PostgreSQL's MVCC implementation. In short, in order to count the table rows, PostgreSQL needs to ensure that they exist. But given the multiple snapshots/versions of each record, PostgreSQL is unable to summarize the whole table directly. So instead PostgreSQL reads each row, performing a sequential scan.!
How to fix this?
There are different approaches to fix this issue, including a trigger-based mechanism. If it is acceptable for you to use an estimated count of the rows you can check PostgreSQL pg_class reltuples:
SELECT reltuples::BIGINT AS estimate FROM pg_class WHERE relname=<table_name>
Reltuples:
[It is the] Number of rows in the table. This is only an estimate used by the planner. — PostgreSQL: Documentation: pg_class
More info:
https://medium.com/#vinicius_edipo/handling-slow-counting-with-elixir-postgresql-f5ff47f3d5b9
http://www.varlena.com/GeneralBits/120.php
https://www.postgresql.org/docs/current/static/catalog-pg-class.html

Related

Postgres - same query two different EXPLAIN plans

I have two postgres databases (one for development, one for test). Both have the same structure.
I'm running this query on both (merges one table into another)
WITH moved_rows AS
(
DELETE FROM mybigtable_20220322
RETURNING *
)
INSERT INTO mybigtable_202203
SELECT * FROM moved_rows;
But I get slightly different results for EXPLAIN on each version.
Development (Postgres 13.1) -
Insert on mybigtable_202203 (cost=363938.39..545791.17 rows=9092639 width=429)
CTE moved_rows
-> Delete on mybigtable_20220322 (cost=0.00..363938.39 rows=9092639 width=6)
-> Seq Scan on mybigtable_20220322 (cost=0.00..363938.39 rows=9092639 width=6)
-> CTE Scan on moved_rows (cost=0.00..181852.78 rows=9092639 width=429)
Test (Postgres 14.1) -
Insert on mybigtable_202203 (cost=372561.91..558377.73 rows=0 width=0)
CTE moved_rows
-> Delete on mybigtable_20220322 (cost=0.00..372561.91 rows=9290791 width=6)
-> Seq Scan on mybigtable_20220322 (cost=0.00..372561.91 rows=9290791 width=6)
-> CTE Scan on moved_rows (cost=0.00..185815.82 rows=9290791 width=429)
The big difference is the first line, on Development I get rows=9092639 width=429 on Test I get rows=0 width=0
All the tables have the same definitions, with the same indexes (not that they seem to be used) the query succeeds on both databases, the EXPLAIN indicates similar costs on both database, and the tables on each database have a similar record count (just over 9 million rows)
In practice the difference is that on Development the query takes a few minutes, on Test is takes a few hours.
Both databases were created with the same scripts, so should be 100% identical my guess is there's some small, subtle difference that's crept in somewhere. Any suggestion on what the difference might be or how to find it? Thanks
Update
both the tables being merged (on both databases) have been VACUUM ANALYZED in similar timeframes.
I used fc to compare both DBs. There was ONE difference, on the development database the table was clustered on one of the indexes. I did similar clustering on the test table but results didn't change.
In response to the comment 'the plans are the same, only the estimated rows are different'. This difference is the only clue I currently have to an underlying problem. My development database is on a 10 year old server struggling with lack of resources, my test database is a brand new server. The former takes a few minutes to actually run the query the later takes a few hours. Whenever I post a question on the forum I'm always told 'start with the explain plan'
This change was made with commit f0f13a3a08b27, which didn't make it into the release notes, since it was considered a bug fix:
Fix estimates for ModifyTable paths without RETURNING.
In the past, we always estimated that a ModifyTable node would emit the
same number of rows as its subpaths. Without a RETURNING clause, the
correct estimate is zero. Fix, in preparation for a proposed parallel
write patch that is sensitive to that number.
A remaining problem is that for RETURNING queries, the estimated width
is based on subpath output rather than the RETURNING tlist.
Reviewed-by: Greg Nancarrow gregn4422#gmail.com
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV%3DqpFJrR3AcrTS3g%40mail.gmail.com
This change only affects EXPLAIN output, not what actually happens during data modifications.

How does postgres decide whether to use index scan or seq scan?

explain analyze shows that postgres will use index scanning for my query that fetches rows and performs filtering by date (i.e., 2017-04-14 05:27:51.039):
explain analyze select * from tbl t where updated > '2017-04-14 05:27:51.039';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using updated on tbl t (cost=0.43..7317.12 rows=10418 width=93) (actual time=0.011..0.515 rows=1179 loops=1)
Index Cond: (updated > '2017-04-14 05:27:51.039'::timestamp without time zone)
Planning time: 0.102 ms
Execution time: 0.720 ms
however running the same query but with different date filter '2016-04-14 05:27:51.039' shows that postgres will run the query using seq scan instead:
explain analyze select * from tbl t where updated > '2016-04-14 05:27:51.039';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl t (cost=0.00..176103.94 rows=5936959 width=93) (actual time=0.008..2005.455 rows=5871963 loops=1)
Filter: (updated > '2016-04-14 05:27:51.039'::timestamp without time zone)
Rows Removed by Filter: 947
Planning time: 0.100 ms
Execution time: 2910.086 ms
How does postgres decide on what to use, specifically when performing filtering by date?
The Postgres query planner bases its decisions on cost estimates and column statistics, which are gathered by ANALYZE and opportunistically by some other utility commands. That all happens automatically when autovacuum is on (by default).
The manual:
Most queries retrieve only a fraction of the rows in a table, due to
WHERE clauses that restrict the rows to be examined. The planner thus
needs to make an estimate of the selectivity of WHERE clauses, that
is, the fraction of rows that match each condition in the WHERE
clause. The information used for this task is stored in the
pg_statistic system catalog. Entries in pg_statistic are updated by
the ANALYZE and VACUUM ANALYZE commands, and are always approximate
even when freshly updated.
There is a row count (in pg_class), a list of most common values, etc.
The more rows Postgres expects to find, the more likely it will switch to a sequential scan, which is cheaper to retrieve large portions of a table.
Generally, it's index scan -> bitmap index scan -> sequential scan, the more rows are expected to be retrieved.
For your particular example, the important statistic is histogram_bounds, which give Postgres a rough idea how many rows have a greater value than the given one. There is the more convenient view pg_stats for the human eye:
SELECT histogram_bounds
FROM pg_stats
WHERE tablename = 'tbl'
AND attname = 'updated';
There is a dedicated chapter explaining row estimation in the manual.
Obviously, optimization of queries is tricky. This answer is not intended to dive into the specifics of the Postgres optimizer. Instead, it is intended to give you some background on how the decision to use an index is made.
Your first query is estimated to return 10,418 rows. When using an index, the following operations happen:
The engine uses the index to find the first value meeting the condition.
The engine then loops over the values, finishing when the condition is no longer true.
For each value in the index, the engine then looks up the data on the data page.
In other words, there is a little bit of overhead when using the index -- initializing the index and then looking up each data page individually.
When the engine does a full table scan it:
Starts with the first record on the first page
Does the comparison and accepts or rejects the record
Continues sequentially through all data pages
There is no additional overhead. Further, the engine can "pre-load" the next pages to be scanned while processing the current page. This overlap of I/O and processing is a big win.
The point I'm trying to make is that getting the balance between these two can be tricky. Somewhere between 10,418 and 5,936,959, Postgres decides that the index overhead (and fetching the pages randomly) costs more than just scanning the whole table.

Slow index scan

I have table with index:
Table:
Participates (player_id integer, other...)
Indexes:
"index_participates_on_player_id" btree (player_id)
Table contains 400kk rows.
I execute the same query two times:
Query: explain analyze select * from participates where player_id=149294217;
First time:
Index Scan using index_participates_on_player_id on participates (cost=0.57..19452.86 rows=6304 width=58) (actual time=261.061..2025.559 rows=332 loops=1)
Index Cond: (player_id = 149294217)
Total runtime: 2025.644 ms
(3 rows)
Second time:
Index Scan using index_participates_on_player_id on participates (cost=0.57..19452.86 rows=6304 width=58) (actual time=0.030..0.479 rows=332 loops=1)
Index Cond: (player_id = 149294217)
Total runtime: 0.527 ms
(3 rows)
So, first query has big actual time - how to increase speed the first execute?
UPDATE
Sorry, How to accelerate first query?)
Why index scan search so slow?
The difference in execution time is probably because the second time through, the table/index data from the first run of the query is in the shared buffers cache, and so the subsequent run of the query takes less time, since it doesn't have to go long-path to disk for that information.
Edit:
Regarding the slowness of the original query, does the table have a lot of dead tuples? Those can slow things down considerably. If so, VACUUM ANALYZE the table.
Another factor can be if there are long-ish idle transactions on the server (i.e. several minutes or more). Due to the nature of MVCC this can also slow even index-based queries down quite a bit.
Also, what the query planner is expecting for the results vs. actual is quite different, so you may want to do an ANALYZE on the query beforehand to update the stats.
1.) Take a look at http://www.postgresql.org/docs/9.3/static/runtime-config-resource.html and check out for some tuning for using more memory. This can speed up your search but will not give you a warranty (depending on the answer before)!
2.) Transfer a part of your tables/indexes to a more powerful tablespace. For example a tablespace based on SSDs.

PostgreSQL query doesn't use index

I have a very simple db schema, which has a multi column b-tree index on following columns:
PersonId, Amount, Commission
Now, if I try to select the table with following query:
explain select * from "Order" where "PersonId" = 2 AND "Commission" > 3
Pg is scanning the index and the query is very fast, but if I try the following query:
explain select * from "Order" where "PersonId" > 2 AND "Commission" > 3
It does a sequential scan, even when the index is present. Even this query
explain select * from "Order" where "Commission" > 3
does a sequential scan.
Anyone care to explain why? :-)
Thank you very much.
UPDATE
The table contains 100 million rows. I have created it just to test PostgreSQL performance against MS SQL. The table is already VACUUMED. I'm runnning Core I5 2500k quad core cpu and 8 GB of ram.
Here's the result of explain analyze for this query:
explain ANALYZE select * from "Order" where "Commission" BETWEEN 3000000 AND 3000010 LIMIT 20
Limit (cost=0.00..2218328.00 rows=1 width=24) (actual time=28043.249..28043.249 rows=0 loops=1)
-> Seq Scan on "Order" (cost=0.00..2218328.00 rows=1 width=24) (actual time=28043.247..28043.247 rows=0 loops=1)
Filter: (("Commission" >= 3000000::numeric) AND ("Commission" <= 3000010::numeric))
Total runtime: 28043.278 ms
The short answer is that when comparing the various available plans, the sequential scan is expected to be the fastest, based on the costing factors you have configured and the latest statistics available. From what little information you've provided, it seems quite likely that the planner has made the right choice. If you had three single-column indexes, it might be able to use bitmap index scans, particularly if the rows to be selected are less than about 10% of the rows in the table.
Note that with the index you describe, the entire index would need to be scanned from for all rows where "PersonId" > 2; which unless you have a lot of negative values for "PersonId" is very likely to be most of the table.
Also note that if you have a tiny table -- say a few thousand rows or less, accessing the rows through an index will rarely be faster than just scanning those few rows. Plans are sensitive to data volume, and the plan you get with a small number of rows is very unlikely to be the same plan you get with a lot of rows.
If it is, in fact, not picking the fastest plan, the odds are good that you need to adjust your cost factors to better model the costs on your machine. Another possibility is that you need to be more aggressive in your autovacuum settings, to make sure up-to-date statistics are available, or you may need to configure collection of finer-grained statistics.
People will be able to provide more specific advice if you show the table descriptions (including indexes), the EXPLAIN ANALYZE output for the query, and a description of the hardware.

How to influence the Postgres Query Analyzer when dealing with text search and geospatial data

I have a quite serious performance issue with the following statement that i can't fix myself.
Given Situation
I have a postgres 8.4 Database with Postgis 1.4 installed
I have a geospatial table with ~9 Million entries. This table has a (postgis) geometry column and a tsvector column
I have a GIST Index on the geometry and a VNAME Index on the vname column
Table is ANALYZE'd
I want to excecute a to_tsquerytext search within a subset of these geometries that should give me all affecteded ids back.
The area to search in will limit the 9 Million datasets to approximately 100.000 and the resultset of the ts_query inside this area will most likely give an output of 0..1000 Entries.
Problem
The query analyzer decides that he wants to do an Bitmap Index Scan on the vname first, and then aggreates and puts a filter on the geometry (and other conditions I have in this statement)
Query Analyzer output:
Aggregate (cost=12.35..12.62 rows=1 width=510) (actual time=5.616..5.616 rows=1 loops=1)
-> Bitmap Heap Scan on mxgeom g (cost=8.33..12.35 rows=1 width=510) (actual time=5.567..5.567 rows=0 loops=1)
Recheck Cond: (vname ## '''hemer'' & ''hauptstrasse'':*'::tsquery)
Filter: (active AND (geom && '0107000020E6100000010000000103000000010000000B0000002AFFFF5FD15B1E404AE254774BA8494096FBFF3F4CC11E40F37563BAA9A74940490200206BEC1E40466F209648A949404DF6FF1F53311F400C9623C206B2494024EBFF1F4F711F404C87835954BD4940C00000B0E7CA1E4071551679E0BD4940AD02004038991E40D35CC68418BE49408EF9FF5F297C1E404F8CFFCB5BBB4940A600006015541E40FAE6468054B8494015040060A33E1E4032E568902DAE49402AFFFF5FD15B1E404AE254774BA84940'::geometry) AND (mandator_id = ANY ('{257,1}'::bigint[])))
-> Bitmap Index Scan on gis_vname_idx (cost=0.00..8.33 rows=1 width=0) (actual time=5.566..5.566 rows=0 loops=1)
Index Cond: (vname ## '''hemer'' & ''hauptstrasse'':*'::tsquery)
which causes a LOT of I/O - AFAIK It would be smarter to limit the geometry first, and do the vname search after.
Attempted Solutions
To achieve the desired behaviour i tried to
I Put the geom ## AREA into a subselect -> Did not change the execution plan
I created a temporary view with the desired area subset -> Did not change the execution plan
I created a temporary table of the desired area -> Takes 4~6 seconds to create, so that made it even worse.
Btw, sorry for not posting the actual query: I think my boss would really be mad at me if I did, also I'm looking more for theoretical pointers for someone to fix my actual query. Please ask if you need further clarification
EDIT
Richard had a very good point: You can achieve the desired behaviour of the Query Planner with the width statement. The bad thing is that this temporary Table (or CTE) messes up the vname index, thus making the query return nothing in some cases.
I was able to fix this with creating a new vname on-the-fly with to_tsvector(), but this is (too) costly - around 300 - 500ms per query.
My Solution
I ditched the vname search and went with a simple LIKE('%query_string%') (10-20 ms/query), but this is only fast in my given environment. YMMV.
There have been some improvements in statistics handling for tsvector (and I think PostGIS too, but I don't use it). If you've got the time, it might be worth trying again with a 9.1 release and see what that does for you.
However, for this single query you might want to look at the WITH construct.
http://www.postgresql.org/docs/8.4/static/queries-with.html
If you put the geometry part as the WITH clause, it will be evaluated first (guaranteed) and then that result-set will filtered by the following SELECT. It might end up slower though, you won't know until you try.
It might be an adjustment to work_mem would help too - you can do this per-session ("SET work_mem = ...") but be careful with setting it too high - concurrent queries can quickly burn through all your RAM.
http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY