Future index creation on partitioned tables in Postgres

I am using PostgreSQL 14.1, and I re-created my live database using partitions for some tables.
Before the server went live I could create indexes normally. Now that it is live I can only create them with CONCURRENTLY, but when I try to create an index concurrently I get an error.
running this:
create index concurrently foo on foo_table(col1,col2,col3);
provides the error:
ERROR: cannot create index on partitioned table "foo_table" concurrently
It's a live server now, so I cannot create indexes without CONCURRENTLY, and I need some indexes to improve performance. Any ideas how to do that?
thanks

No problem. First, use CREATE INDEX CONCURRENTLY to create the index on each partition. Then use CREATE INDEX to create the index on the partitioned table. That will be fast, and the indexes on the partitions will become the partitions of the index.
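A minimal sketch of that order of operations, assuming the table has two partitions with the hypothetical names foo_table_p1 and foo_table_p2 (the index names are made up as well):

```sql
-- Assumed partition names: foo_table_p1, foo_table_p2 (hypothetical).
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
-- so issue each statement on its own.
CREATE INDEX CONCURRENTLY foo_table_p1_c123_idx ON foo_table_p1 (col1, col2, col3);
CREATE INDEX CONCURRENTLY foo_table_p2_c123_idx ON foo_table_p2 (col1, col2, col3);

-- Every partition already has a matching index, so this statement only
-- creates the partitioned index and attaches the existing indexes to it.
CREATE INDEX foo_table_c123_idx ON foo_table (col1, col2, col3);
```

The last statement still takes a lock that blocks writes on the table and its partitions, but it should hold it only briefly because nothing has to be built.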

Step 1: Create an index on the partitioned (parent) table
CREATE INDEX foo_idx ON ONLY foo (col1, col2, col3);
Because of ONLY, the command does not recurse to the partitions, so no partition gets an index yet and the parent index is created in an invalid state.
Step 2: Create the index for each partition using CONCURRENTLY and attach to the parent index
CREATE INDEX CONCURRENTLY foo_idx_1
ON foo_1 (col1, col2, col3);
ALTER INDEX foo_idx
ATTACH PARTITION foo_idx_1;
Repeat this step for every partition.
Step 3: Verify that the parent index created at the beginning (Step 1) is valid. Once indexes for all partitions are attached to the parent index, the parent index is marked valid automatically.
SELECT * FROM pg_index WHERE pg_index.indisvalid = false;
The query should return zero rows. If that's not the case, check your script for mistakes.
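With many partitions, writing Step 2 by hand gets tedious. One way to generate the statements is to read the partition list from the pg_inherits catalog. The sketch below assumes a single level of partitioning, the parent table foo and parent index foo_idx from the steps above, and a made-up naming scheme for the per-partition indexes:

```sql
-- Generates one CREATE INDEX CONCURRENTLY and one ATTACH PARTITION
-- statement per direct partition of foo.
-- The '_c123_idx' suffix is an arbitrary, hypothetical naming scheme;
-- names longer than 63 bytes would be truncated by PostgreSQL.
SELECT format(
         'CREATE INDEX CONCURRENTLY %I ON %s (col1, col2, col3);',
         c.relname || '_c123_idx',
         i.inhrelid::regclass
       ) AS create_stmt,
       format(
         'ALTER INDEX foo_idx ATTACH PARTITION %I;',
         c.relname || '_c123_idx'
       ) AS attach_stmt
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'foo'::regclass
ORDER BY c.relname;
```

In psql, ending the query with \gexec instead of the semicolon should run each generated statement in turn (create, then attach, partition by partition). Reviewing the generated text first is the safer option.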

Related

What ONLY keyword really means in Postgresql CREATE INDEX command

From the docs: "Indicates not to recurse creating indexes on partitions, if the table is partitioned. The default is to recurse.".
Do I understand correctly that the index will not be created on existing partitions? What kind of index will be created then (and on what)?
The objective is to build a partitioned index with as little locking as possible.
Normally, you'd use CREATE INDEX CONCURRENTLY to create an index on each partition, then CREATE INDEX on the partitioned table. If the index definitions match, the previously created indexes will become partitions of the partitioned index. See this related question.
The potential problem with that is that all partitions will be locked at the same time. Instead, you can do it one partition at a time:
1. create the index ONLY on the partitioned table (the index will be invalid)
2. use ALTER INDEX ... ATTACH PARTITION to attach the indexes on the partitions as partitions of the index
3. once all partitions are attached, the partitioned index will become valid
When CREATE INDEX is invoked on a partitioned table, the default
behavior is to recurse to all partitions to ensure they all have
matching indexes. Each partition is first checked to determine whether
an equivalent index already exists, and if so, that index will become
attached as a partition index to the index being created, which will
become its parent index. If no matching index exists, a new index will
be created and automatically attached; the name of the new index in
each partition will be determined as if no index name had been
specified in the command. If the ONLY option is specified, no
recursion is done, and the index is marked invalid. (ALTER INDEX ...
ATTACH PARTITION marks the index valid, once all partitions acquire
matching indexes.) Note, however, that any partition that is created
in the future using CREATE TABLE ... PARTITION OF will automatically
have a matching index, regardless of whether ONLY is specified.
small demo example:
create table index_part (a int, b int) partition by range (a, b);
create table index_part1 partition of index_part for values from (0,0) to (10, 10);
create table index_part2 partition of index_part for values from (10,10) to (20, 20);
create index index_part_a_b_idx on only index_part (a, b);
now is INVALID:
\d+ index_part_a_b_idx
---
btree, for table "public.index_part", invalid
Number of partitions: 0
Access method: btree
create index idxpart1_a_b_idx on index_part1 (a, b);
alter index index_part_a_b_idx attach partition idxpart1_a_b_idx;
still INVALID.
\d+ index_part_a_b_idx
---
btree, for table "public.index_part", invalid
Partitions: idxpart1_a_b_idx
Access method: btree
then
create index idxpart2_a_b_idx on index_part2(a, b);
alter index index_part_a_b_idx attach partition idxpart2_a_b_idx;
now VALID.
select indisvalid from pg_index where indexrelid = 'index_part_a_b_idx'::regclass; -- returns true
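The last sentence of the quoted documentation, that partitions created later get a matching index automatically, can be checked by continuing the demo. This is only a sketch: index_part3 and its bounds are made up for illustration.

```sql
-- index_part3 is a new, hypothetical partition added after the demo above.
create table index_part3 partition of index_part for values from (20, 20) to (30, 30);

-- The new partition should already have an index, named as if no
-- index name had been specified.
select indexname, indexdef from pg_indexes where tablename = 'index_part3';

-- List the partitions of the parent index together with their validity.
select i.inhrelid::regclass as partition_index, x.indisvalid
from pg_inherits i
join pg_index x on x.indexrelid = i.inhrelid
where i.inhparent = 'index_part_a_b_idx'::regclass;
```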

Adding indexes to a Postgres table after it is created and has partitions

I have a Postgres table named services with columns id, mac_addr, dns_name, and hash. It is partitioned by mac_addr, so the partition names look like services_3eeeea123e3 and so on. There are around 20K partitions, one per mac_addr.
Q1: there was no index created when the tables were created. so now, when I am trying to add an index CREATE INDEX idx_services_id on services (id), it throws an error ERROR: cannot create an index on partitioned table "services"
But I am able to add indexes to individual partitioned tables CREATE INDEX idx_services_3eeeea123e3 on services_3eeeea123e3 (id).
So do I have to create an index on each partition table now? Is there a way to create an index on the base table(services) itself, which will automatically create an index on each partition table?
Q2: When I run a select query, it is fast when I use the direct partition table; however, using the base table is very slow. Any idea what could be the reason.
Fast: SELECT id, dns_name, hash from services_3eeeea123e3 where id='123232'
very slow: SELECT id,dns_name, hash from services where mac_addr='3eeeea123e3' and id='123232'

Does using 'create index concurrently' in postgres, help future rows which will be inserted to be free from locks?

Does create index concurrently work only when the table is created for the first time, or does it also apply to records that will be inserted in the future?
You are misunderstanding what concurrently does: it avoids locking the table for write access while the index is created.
Once the index is created, there is no difference between an index created with the concurrently option and one created without it. If new rows get inserted, the index is updated with the new values. Inserting new rows does not "rebuild" the entire index.
Once an index is created, inserting rows into the table does not lock the table at all, regardless of how the index was created. Non-unique indexes always allow concurrent inserts into the table.
A unique index however will block concurrent inserts for the same value(s) - but not for different values.
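The unique-index case can be tried out with two psql sessions. This is a rough sketch with a hypothetical table demo_accounts; the comments describe the behavior the answer above predicts.

```sql
-- Setup (hypothetical table and index names).
CREATE TABLE demo_accounts (id int, email text);
CREATE UNIQUE INDEX CONCURRENTLY demo_accounts_email_idx ON demo_accounts (email);

-- Session A: insert a value and keep the transaction open.
BEGIN;
INSERT INTO demo_accounts VALUES (1, 'a@example.com');

-- Session B: a different value does not have to wait.
INSERT INTO demo_accounts VALUES (2, 'b@example.com');

-- Session B: the same value waits until session A commits or rolls back.
INSERT INTO demo_accounts VALUES (3, 'a@example.com');

-- Session A: after this commit, session B's waiting insert fails
-- with a unique violation.
COMMIT;
```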

PostgreSQL - reindex when add new index

I have a table with 100k records and no indexes. I created a new index on a column that is used for a left join.
Do I need to reindex my table?
Creation of an index took a few ms. So I am guessing that query can not use this index (no data) until I reindex my table (in case I would have other indexes I would reindex only index - I read the manual).
I can't find any information about when a new index is populated with data. Is this done automatically? When?
Once CREATE INDEX has finished, the index is ready to be used. There is no need to run REINDEX after that.
From the REINDEX documentation page:
REINDEX is similar to a drop and recreate of the index in that the index contents are rebuilt from scratch. However, the locking considerations are rather different. REINDEX locks out writes but not reads of the index's parent table.
That means REINDEX behaves similarly to a DROP INDEX followed by a CREATE INDEX.
And from the CREATE INDEX documentation page:
Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table. Other transactions can still read the table, but if they try to insert, update, or delete rows in the table they will block until the index build is finished.
I think this unambiguously explains that creation implies indexation.
Whether or not a specific query uses the index depends on many different things though. If your query doesn't use the index, you need to post the query, the table definitions (e.g. as a create table statement), the index you have defined and the output of explain (analyze, verbose) of your query.
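A small way to see this for yourself, sketched with hypothetical tables demo_parent and demo_child standing in for the real ones. The index has a non-zero size as soon as CREATE INDEX returns, and EXPLAIN shows whether the planner chooses it, which depends on the data and the query.

```sql
-- Hypothetical tables standing in for the 100k-row table and its join partner.
CREATE TABLE demo_parent (id int PRIMARY KEY);
CREATE TABLE demo_child (id int, parent_id int);
INSERT INTO demo_parent SELECT g FROM generate_series(1, 1000) AS g;
INSERT INTO demo_child SELECT g, g % 1000 + 1 FROM generate_series(1, 100000) AS g;

-- The build scans the table; the index is fully populated when this returns.
CREATE INDEX demo_child_parent_id_idx ON demo_child (parent_id);

-- The index already occupies space on disk, no REINDEX required.
SELECT pg_size_pretty(pg_relation_size('demo_child_parent_id_idx'));

-- Refresh planner statistics (this is not a reindex).
ANALYZE demo_child;

-- Inspect the plan to see whether the index is used for the join.
EXPLAIN (ANALYZE, VERBOSE)
SELECT p.id, c.id
FROM demo_parent AS p
LEFT JOIN demo_child AS c ON c.parent_id = p.id
WHERE p.id = 42;
```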

Will postgresql generate index automatically?

Does PostgreSQL create indexes automatically, or do users need to create them explicitly? If there are automatic indexes, how can I view them? Thanks.
An index on the primary key and unique constraints will be made automatically. Use CREATE INDEX to make more indexes. To view existing database structure including the indexes, use \d table.
A quick example of generating an index would be:
CREATE INDEX unique_index_name ON table (column);
You can create an index on multiple columns:
CREATE INDEX unique_index_name ON table (column1, column2, column3);
Or a partial index, which only contains the rows that satisfy its WHERE condition:
CREATE INDEX unique_index_name ON table (column) WHERE column > 0;
There is a lot more you can do with them, but that is for the documentation (linked above) to tell you. Also, if you create an index on a production database, use CREATE INDEX CONCURRENTLY (it will take longer, but not lock out new writes to the table). Let me know if you have any other questions.
Update:
If you want to view indexes with pure SQL, look at the pg_catalog.pg_indexes view:
SELECT *
FROM pg_catalog.pg_indexes
WHERE schemaname='public'
AND tablename='table';
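To see the automatically created indexes in practice, here is a short sketch with a hypothetical table auto_idx_demo, assumed to be created in the public schema:

```sql
-- auto_idx_demo is a hypothetical table; no CREATE INDEX is issued for it.
CREATE TABLE auto_idx_demo (
    id    int PRIMARY KEY,
    email text UNIQUE,
    note  text
);

-- Should list one index backing the primary key and one backing the
-- unique constraint, and nothing for the plain "note" column.
SELECT indexname, indexdef
FROM pg_catalog.pg_indexes
WHERE schemaname = 'public'
  AND tablename = 'auto_idx_demo';
```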