PostgreSQL: in what order should I run CLUSTER, REINDEX and ANALYZE?

If I am going to run CLUSTER, REINDEX and ANALYZE on a table, what is the best order to run them in?

There doesn't seem to be any point analyzing before clustering, since clustering will invalidate the "correlation" statistics if it rearranges the heap. So I'd say cluster first.

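CLUSTER physically reorganizes your data, so it must be the first operation. Then comes REINDEX (data placement will probably change after clustering your table), and finally ANALYZE, which gives statistics to the query planner. Note that CLUSTER itself rebuilds the table's indexes as part of the rewrite, so a separate REINDEX afterwards is normally redundant.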
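A minimal sketch of the order suggested in the answers above. The names my_table and my_table_created_idx are placeholders, not anything from the question, and the sketch relies on the documented behaviour that CLUSTER rebuilds the table's indexes while rewriting it:

```sql
-- Hypothetical names: my_table and my_table_created_idx are placeholders.

-- 1. Physically rewrite the table in the order of the chosen index.
--    CLUSTER rebuilds every index on the table as part of this rewrite,
--    so no separate REINDEX step is shown here.
CLUSTER my_table USING my_table_created_idx;

-- 2. Refresh planner statistics (including "correlation") so they
--    describe the new physical layout.
ANALYZE my_table;
```

Keep in mind that CLUSTER holds an ACCESS EXCLUSIVE lock on the table while it runs, so reads and writes on that table are blocked for the duration.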

Related

Cannot execute ANALYZE during recovery

We have an insert-only table for which we often get bad results because the query plan uses nested loops instead of hash joins. To solve this we have to run ANALYZE manually (vacuum sometimes doesn't run on insert-only tables, long story, not the point here). When I try to run ANALYZE on the replica machine, I get the error ERROR: cannot execute ANALYZE during recovery. So this made me think that maybe we don't need to execute ANALYZE on the replica.
My question is: are statistics propagated to the replica when ANALYZE is executed on the master node?
The question linked below is similar to this one, but it asks about vacuum. We are only using ANALYZE.
https://serverfault.com/questions/212219/postgresql-9-does-vacuuming-a-table-on-the-primary-replicate-on-the-mirror
Statistics are stored in a table (the pg_statistic system catalog), and this table is replicated from the primary server to the replica like any other. So you do not need to run ANALYZE on a replica, and with physical replication you cannot.
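One way to check this yourself, sketched under the assumption of a physical (streaming) replica and a placeholder table public.my_table: run ANALYZE on the primary, then read the pg_stats view on the replica once it has replayed the change.

```sql
-- Run on either server: true on a standby in recovery, false on the primary.
SELECT pg_is_in_recovery();

-- On the primary only: refresh the planner statistics.
-- (my_table is a placeholder name.)
ANALYZE my_table;

-- On the replica: the planner statistics are readable here because
-- pg_statistic is replicated along with every other table.
SELECT attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'my_table';
```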

Are PostgreSQL tables automatically re-indexed after an update?

If I index a PostgreSQL table and then update it, do I need to re-index the table or is it automatically re-indexed?
Can someone provide a link to PostgreSQL documentation for further reading? I've got this so far:
https://www.postgresql.org/docs/9.1/static/sql-createindex.html
indexes in PostgreSQL do not need maintenance or tuning
You do not need to re-index manually.
For more details, please also read
https://www.postgresql.org/docs/current/static/monitoring-stats.html
From further reading in the PostgreSQL documentation:
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks doing so would be more
efficient than a sequential table scan. But you might have to run the
ANALYZE command regularly to update statistics to allow the query
planner to make educated decisions. See Chapter 14 for information
about how to find out whether an index is used and when and why the
planner might choose not to use an index.
See:
https://www.postgresql.org/docs/current/static/indexes-intro.html
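A small sketch of what the quoted passage describes. The items table and its index are made up for illustration; the point is that no REINDEX appears anywhere, only an ANALYZE before looking at the plan:

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE items (id integer PRIMARY KEY, label text);
CREATE INDEX items_label_idx ON items (label);

INSERT INTO items (id, label)
SELECT g, 'label ' || g
FROM generate_series(1, 100000) AS g;

-- The index is maintained automatically as part of this UPDATE.
UPDATE items SET label = 'renamed' WHERE id = 42;

-- Statistics are a separate matter: refresh them so the planner can
-- judge whether the index is worth using.
ANALYZE items;

-- Shows whether the planner chooses items_label_idx for this query.
EXPLAIN SELECT * FROM items WHERE label = 'renamed';
```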

Which way is better to optimize postgresql db?

Reindex or backup/restore to optimize database? Do indexes rebuild while restoring db from backup?
If practical, a full backup and restore is better than a plain reindex, if only because you also end up with an extra backup file.
The restore process will (1) create tables, then (2) copy data in and finally (3) create indexes, apply constraints etc.
This is not the same as using CLUSTER of course, which physically re-orders a table based on one of its indexes. In some cases that can be useful.
If you are going to do this though, make sure you have good measurements before and after your "optimization" because many factors affect overall database performance and this may prove pointless.
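As one possible "before and after" measurement for the reindex route, the sketch below compares on-disk sizes around a REINDEX. The names my_table and my_table_some_idx are placeholders, and size is only one of the things worth measuring; query timings matter too.

```sql
-- Hypothetical names: my_table and my_table_some_idx are placeholders.

-- Before: size of one index, and of the table including all its indexes.
SELECT pg_size_pretty(pg_relation_size('my_table_some_idx')) AS index_size,
       pg_size_pretty(pg_total_relation_size('my_table'))    AS table_total_size;

-- Rebuild the index from scratch. Writes to the table are blocked
-- while this runs.
REINDEX INDEX my_table_some_idx;

-- After: run the same query again and compare the numbers.
SELECT pg_size_pretty(pg_relation_size('my_table_some_idx')) AS index_size,
       pg_size_pretty(pg_total_relation_size('my_table'))    AS table_total_size;
```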

postgresql 9.2 never vacuumed and analyze

I have been given a Postgres 9.2 database of around 20 GB.
I looked through the database and saw that VACUUM and/or ANALYZE has never been run on any of its tables.
Autovacuum is on and the transaction wraparound limit is very far away (only 1% of it has been used).
I know nothing about the data activity (number of deletes, inserts, updates), but I can see that it uses a lot of indexes and sequences.
My questions are:
Does the lack of vacuum and/or analyze affect data integrity (for example, a select not returning all the matching rows from a table or from an index)? The speed of queries and writes doesn't matter.
Is it possible that after the vacuum and/or analyze the same query gives a different answer than it would have given before the vacuum/analyze command?
I'm fairly new to PG, thank you for your help!!
Regards,
Figaro88
Running vacuum and/or analyze will not change the result set produced by any select operation (unless there is a bug in PostgreSQL). They may affect the order of the results if you do not supply an ORDER BY clause.
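For reference, a sketch of how the vacuum and analyze history can be inspected, and of the ORDER BY point made above. The table name my_table and the column id are placeholders:

```sql
-- When was each table last vacuumed or analyzed, manually or by autovacuum?
-- NULL means it has not happened since the statistics were last reset.
SELECT relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;

-- Hypothetical table: reclaim dead rows and refresh planner statistics.
VACUUM ANALYZE my_table;

-- The set of rows returned is the same before and after; only an
-- explicit ORDER BY guarantees the order in which they come back.
SELECT * FROM my_table ORDER BY id;
```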

How to stop clustering a table in PostgreSQL

I've clustered a table using the following:
CLUSTER foos USING idx_foos_on_bar;
Now every time I run CLUSTER it reclusters that table (and all other tables with clustering) appropriately.
Now I want to stop reordering that one table (but still reorder all the others with a single CLUSTER command).
I don't see anything in the documentation about how to uncluster. Is this possible? Or do I have to completely drop and recreate the table?
http://www.postgresql.org/docs/9.3/static/sql-cluster.html
When a table is clustered, PostgreSQL remembers which index it was
clustered by. The form CLUSTER table_name reclusters the table using
the same index as before. You can also use the CLUSTER or SET WITHOUT
CLUSTER forms of ALTER TABLE to set the index to be used for future
cluster operations, or to clear any previous setting.
I think older versions didn't support the SET WITHOUT CLUSTER option.
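Applied to the table from the question, the form mentioned in the documentation excerpt would look like the sketch below. The catalog query afterwards is just one way to confirm that no index on foos is still marked as the clustering index:

```sql
-- Forget which index foos was clustered on. The data is not touched;
-- a bare CLUSTER will simply skip this table from now on.
ALTER TABLE foos SET WITHOUT CLUSTER;

-- Check: lists the index currently marked for clustering on foos.
-- It should return no rows after the statement above.
SELECT c.relname AS clustered_index
FROM pg_index AS i
JOIN pg_class AS c ON c.oid = i.indexrelid
WHERE i.indrelid = 'foos'::regclass
  AND i.indisclustered;
```

To cluster it again later, name the index explicitly once more: CLUSTER foos USING idx_foos_on_bar;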