Are PostgreSQL tables automatically re-indexed after an update? - postgresql

If I index a PostgreSQL table and then update it, do I need to re-index the table or is it automatically re-indexed?
Can someone provide a link to PostgreSQL documentation for further reading? I've got this so far:
https://www.postgresql.org/docs/9.1/static/sql-createindex.html

Indexes in PostgreSQL are kept up to date automatically whenever the table is modified.
You do not need to re-index manually after an update.
For more details, please also read
https://www.postgresql.org/docs/current/static/monitoring-stats.html

From further reading in the PostgreSQL documentation:
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks doing so would be more
efficient than a sequential table scan. But you might have to run the
ANALYZE command regularly to update statistics to allow the query
planner to make educated decisions. See Chapter 14 for information
about how to find out whether an index is used and when and why the
planner might choose not to use an index.
See:
https://www.postgresql.org/docs/current/static/indexes-intro.html
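The behavior described in that passage can be tried out with a small sketch like the one below. The table `orders`, its columns, and the index name are hypothetical and chosen only for illustration; the plan that EXPLAIN prints depends on your data and settings.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE orders (
    id          bigint PRIMARY KEY,
    customer_id bigint NOT NULL,
    total       numeric(10, 2) NOT NULL
);

CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- Load some sample rows.
INSERT INTO orders (id, customer_id, total)
SELECT g, g % 1000, 10.00
FROM generate_series(1, 100000) AS g;

-- The index is maintained as part of this UPDATE; no REINDEX is needed.
UPDATE orders SET customer_id = 42 WHERE id = 7;

-- Refresh the planner statistics for the table.
ANALYZE orders;

-- Show the plan the planner picks for a lookup on the indexed column.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```

On most installations autovacuum also runs ANALYZE in the background, so a manual ANALYZE mainly matters right after large data changes.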

Related

What is the difference between postgresql rebuild index and recreate index, which one is better?

I have a table whose index has grown too big (about 2 GB). When I restore the database to a VM, the index is only 200 MB, so I need to rebuild/recreate the index, and I will probably do this online.
What is the difference between re-building (reindex) and re-creating the index, and which one is better when I do it online? Particularly, which option allows querying the DB during the operation?
The REINDEX command blocks writes to the table while it runs, and it takes an exclusive lock on the index being rebuilt, so queries that would use that index also stall until the command has completed. If you can afford that kind of maintenance window, it's perfectly fine.
The alternative for online rebuilding is to create a new index using CREATE INDEX CONCURRENTLY, then drop the old one. This will take longer to complete, but allows access to the table while rebuilding the index.
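A minimal sketch of that create-then-drop approach, assuming a hypothetical table `orders` with an existing bloated index `orders_customer_id_idx` on `customer_id` (both names are made up for illustration):

```sql
-- Build a fresh copy of the index without blocking writes to the table.
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY orders_customer_id_idx_new
    ON orders (customer_id);

-- Drop the old, bloated index once the new one is in place.
DROP INDEX CONCURRENTLY orders_customer_id_idx;

-- Optionally give the new index the original name.
ALTER INDEX orders_customer_id_idx_new RENAME TO orders_customer_id_idx;
```

If the concurrent build fails partway, it can leave behind an index marked invalid, which has to be dropped before retrying. Indexes that back a PRIMARY KEY or UNIQUE constraint cannot simply be dropped this way and need extra steps.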
Postgres 12 has added a REINDEX INDEX CONCURRENTLY command, which does what you want here.
https://paquier.xyz/postgresql-2/postgres-12-reindex-concurrently/
https://www.depesz.com/2019/03/29/waiting-for-postgresql-12-reindex-concurrently/
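On PostgreSQL 12 or later, the rebuild might look like the sketch below. The index and table names are hypothetical; the size query is only there to compare the index before and after.

```sql
-- Check the current on-disk size of the (hypothetical) index.
SELECT pg_size_pretty(pg_relation_size('orders_customer_id_idx'));

-- Rebuild a single index while still allowing reads and writes.
REINDEX INDEX CONCURRENTLY orders_customer_id_idx;

-- Or rebuild every index on the table the same way.
REINDEX TABLE CONCURRENTLY orders;
```

Like CREATE INDEX CONCURRENTLY, these commands cannot be run inside a transaction block.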

Does PostgreSQL automatically create an index of a table?

In PostgreSQL, when I create a table and don't create any index for it, will PostgreSQL automatically create some default index for the table?
When I later update and query the table several times, will PostgreSQL be smart enough to automatically create an index for me based on how and how often I update and query the table?
If not, what commands in PostgreSQL can help me manually choose an index that will improve the performance of the table?
Thanks.
PostgreSQL will not create indexes on its own based on how you query or update a table. Indexes have an important impact on performance (every index must be updated when records are modified), so it's your job to weigh the gain for reads against the cost for writes and make an informed decision. The only indexes created automatically are the ones that back PRIMARY KEY and UNIQUE constraints.
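To see which indexes PostgreSQL created by itself, you can query the `pg_indexes` system view. The table `users` below is a hypothetical example; by default the automatically created indexes are typically named after the table and constraint (for example `users_pkey`).

```sql
-- Hypothetical table: no explicit CREATE INDEX is issued.
CREATE TABLE users (
    id    bigint PRIMARY KEY,  -- backed by an automatically created index
    email text UNIQUE,         -- also backed by an automatically created index
    name  text                 -- no index on this column
);

-- List the indexes that exist on the table.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';
```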
The only thing your database engine will be "smart" about is when and how to use the indexes that already exist. This is the job of the query optimizer (the planner), which bases its decisions on cost estimates and internal statistics.
There are tools that analyze how the database is used and suggest indexes. But the best and simplest way is to use EXPLAIN.
https://www.postgresql.org/docs/9.5/static/sql-explain.html
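A rough sketch of using EXPLAIN to decide on an index, assuming a hypothetical table `orders` with a `customer_id` column that is filtered on often:

```sql
-- Show the estimated plan without running the query.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Run the query and show actual timings and row counts.
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- If the plan shows a sequential scan on a large table,
-- an index on the filtered column is a candidate.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- Re-check the plan to see whether the planner now uses the index.
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
```

The planner may still prefer a sequential scan when the table is small or the filter matches a large share of the rows, so compare the timings rather than relying on the plan shape alone.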

PostgreSQL - CREATE TABLE AS vs INSERT INTO performance comparison

I'm trying to insert a couple of million rows into a PostgreSQL database. I am wondering what the best way to do it is.
CREATE TABLE AS
INSERT INTO
I'd like to know which one is better, and why. I have read through some blogs but still couldn't come to a conclusion.
I think INSERT INTO is a bulk insert operation. Please correct me if I'm wrong. Is CREATE TABLE AS SELECT a bulk insert operation as well?
Please advise.
CREATE TABLE AS is a bulk insert operation as well. The main difference is that CREATE TABLE AS is easier for PostgreSQL to optimize: because the table is new and invisible to other transactions until commit, PostgreSQL can skip writing WAL for its data when wal_level is minimal (that is, when WAL archiving and streaming replication are not in use). See the wal_level documentation and Disable WAL Archival and Streaming Replication for some other cases where this optimization applies.
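For comparison, the two forms look like this. The source table `events` and the target names are hypothetical; the SHOW command reports the current wal_level setting.

```sql
-- Check the WAL level; the WAL-skipping optimization needs 'minimal'.
SHOW wal_level;

-- Option 1: create and fill the table in a single statement.
CREATE TABLE events_copy AS
SELECT * FROM events;

-- Option 2: create an empty table with the same columns first, then fill it.
CREATE TABLE events_copy2 (LIKE events INCLUDING DEFAULTS);

INSERT INTO events_copy2
SELECT * FROM events;
```

In both cases it is usually faster to create indexes on the target table after the data is loaded rather than before.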

How to add index to CitusDB's cstore_fdw?

I'm currently building an OLAP database in postgres and want to compare the performance of a column-store vs row-store database. CitusDB open-sourced its columnar-store extension cstore_fdw so I'm comparing database performance with and without this extension.
The example shows how to make a test db and query it. I have that example running. But when I try to add indices to it, I get the error ERROR: cannot create index on foreign table "table_name". It makes sense that I can't add indices to a foreign table. Yet I still need to index that table, or else there's no way it will do well slicing or drilling into the data. How do I do this?
cstore_fdw currently doesn't support PostgreSQL indexes. But it automatically stores min/max statistics for each block of column values in skip indexes, which lets it skip blocks that cannot match a filter and makes some queries much more efficient.
To learn more about how to use skip indexes, please consult the cstore_fdw documentation.
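As a hedged sketch of how skip indexes might be put to use: since they hold min/max values per block, they tend to help most when the data is loaded in the order of the column you filter on. The table and column names below are hypothetical, `events` is an assumed ordinary source table, and the options shown are taken from the cstore_fdw README as I recall it, so check them against the version you have installed.

```sql
-- One-time setup of the extension and the foreign server.
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- Hypothetical columnar table.
CREATE FOREIGN TABLE events_columnar (
    event_time  timestamp,
    customer_id bigint,
    amount      numeric
)
SERVER cstore_server
OPTIONS (compression 'pglz');

-- Load the rows sorted by the column most often used in range filters,
-- so each block covers a narrow min/max range of event_time.
INSERT INTO events_columnar
SELECT event_time, customer_id, amount
FROM events
ORDER BY event_time;

-- A range filter on the sorted column can then skip non-matching blocks.
SELECT sum(amount)
FROM events_columnar
WHERE event_time >= '2015-01-01' AND event_time < '2015-02-01';
```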

UPDATE vs INSERT statement performance in PostgreSQL

I am working with a database of approximately a million rows, using Python to parse documents and populate the table with terms. The INSERT statements work fine, but the UPDATE statements become extremely time-consuming as the table grows.
It would be great if someone could explain this phenomenon and also tell me if there is a faster way to do updates.
Thanks,
Arnav
Sounds like you have an indexing problem. Whenever I hear about problems getting worse as table size grows, it makes me wonder if you're doing a table scan whenever you interact with a table.
Check to see if you have a primary key and meaningful indexes on that table. Look at the WHERE clause you have on that UPDATE and make sure there's an index on those columns to make finding that record as fast as possible.
UPDATE: Write a SELECT query using the WHERE clause you use in the UPDATE and ask the database engine for its plan with EXPLAIN. If you see a sequential scan (Seq Scan) on a large table, you'll know what to do.
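A sketch of that check in PostgreSQL, assuming a hypothetical table `terms` with a text column `term` used in the WHERE clause and an integer column `doc_count` being updated:

```sql
-- Inspect how PostgreSQL would locate the row to update.
EXPLAIN SELECT * FROM terms WHERE term = 'example';

-- If the plan shows "Seq Scan on terms", add an index on the lookup column.
CREATE INDEX terms_term_idx ON terms (term);

-- EXPLAIN ANALYZE really executes the UPDATE, so wrap it in a
-- transaction and roll back to leave the data unchanged.
BEGIN;
EXPLAIN ANALYZE
UPDATE terms SET doc_count = doc_count + 1 WHERE term = 'example';
ROLLBACK;
```

If the column holds one row per term, declaring it UNIQUE or making it the PRIMARY KEY would create the index automatically and enforce uniqueness at the same time.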