I have to write a migration command to remove a column from an index. Say I have table1 with an index on col1 and col2.
I want to remove col1 from the index. I am looking at https://www.postgresql.org/docs/9.4/static/sql-alterindex.html but it does not seem that I can simply remove a column.
If I can, is it better to remove the column (and how), versus:
Create new Index
Drop the old index
Also, I want to be able to do the reverse for a downgrade, so I am wondering how to achieve this.
The ability to alter an index doesn't exist because in order to do so you would have to destroy and recreate the index with the new columns. By default, Postgres uses B-Trees to create indices and removing a column causes that B-Tree to become invalid. As a result the B-Tree needs to be built from scratch.
If you want some more details on how indices work under the hood, this is a good article: Postgres Indices Under the Hood
You’re right, you’ll have to create a new index on the single column and then drop the old two-column index.
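A minimal sketch of what the upgrade and downgrade steps might look like. The index names table1_col1_col2_idx and table1_col2_idx are hypothetical, since the question does not give them; substitute the real names from \d table1.

```sql
-- Upgrade: build the single-column index first, then drop the old one,
-- so queries on col2 always have an index available.
CREATE INDEX table1_col2_idx ON table1 (col2);
DROP INDEX table1_col1_col2_idx;

-- Downgrade: the same two steps in reverse.
CREATE INDEX table1_col1_col2_idx ON table1 (col1, col2);
DROP INDEX table1_col2_idx;
```

On a busy table, CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY avoid blocking writes, but neither can run inside a transaction block, which matters if the migration tool wraps each migration in a transaction.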
Related
In Oracle, if we try to create two indexes on the same column of the same table, it gives ORA-01408: such column list already indexed.
create index tmp_idx1 on student(roll_no);
create index tmp_idx2 on student(roll_no);
ORA-01408: such column list already indexed.
But if we try the same thing in Postgres, it is allowed.
How can we prevent this in Postgres and make it behave like Oracle?
I am testing some queries on the PostgreSQL extension TimescaleDB.
The table is called timestampdb and I run queries on it that look like this:
select id13 from timestampdb where timestamp1 >='2010-01-01 00:05:00' and timestamp1<='2011-01-01 00:05:00';
select avg(id13)::numeric(10,2) from timestampdb where timestamp1>='2015-01-01 00:05:00' and timestamp1<='2015-01-01 10:30:00';
When I create the hypertable I do this:
SELECT create_hypertable('timestampdb', 'timestamp1');
Now I want to create an index on id13.
Should I do something like this:
run create_hypertable('timestampdb', 'timestamp1'), import the data into the table, and then create an index on timestampdb(id13)
or something like this:
create the table timestampdb, run create_hypertable('timestampdb', 'timestamp1'), import the data, and then CREATE INDEX ON timestampdb (timestamp1, id13)
What is the correct way to do this?
You can create an index without the time dimension column, since you don't require the index to be unique. The time dimension column must be included only if the index is UNIQUE or a PRIMARY KEY, because TimescaleDB partitions a hypertable into chunks on the time dimension column, which is timestamp1 in the question. If the partitioning key also includes space dimension columns, they need to be included too.
So in your case the following should be sufficient after the migration to a hypertable:
create index on timestampdb(id13);
The question contains two queries, and neither of them needs an index on id13. The index on id13 becomes valuable only if you expect queries other than those in the question, ones that filter or join on the id13 column.
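A rough sketch of the order described above, assuming the table timestampdb already exists with the columns from the question. The last query and its filter value are made up for illustration; it stands in for the kind of query that would actually use the index.

```sql
-- Convert the (still empty) table into a hypertable partitioned on timestamp1.
SELECT create_hypertable('timestampdb', 'timestamp1');

-- Import the data here, then add the non-unique index.
-- No time column is required because the index is not UNIQUE.
CREATE INDEX ON timestampdb (id13);

-- Hypothetical query that filters on id13 and could benefit from the index
-- (assumes id13 is numeric, as the avg() in the question suggests).
SELECT timestamp1, id13 FROM timestampdb WHERE id13 = 42;
```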
I want to create a hypertable in postgres timescale.
What I do is CREATE TABLE then CREATE INDEX and finally SELECT CREATE_HYPERTABLE.
My question: is CREATE INDEX necessary, helpful, or problematic for the performance of the hypertable?
In short: you do not need to create any indexes, since TimescaleDB creates an index on the time dimension by default. Depending on your usage, you might want to create indexes to speed up select queries, and it is best to create them after creating the hypertable.
In more detail:
Creating a hypertable with the create_hypertable function replaces the original PostgreSQL table with a new table. Thus it is better to create the hypertable first and then create the index. It also works to create the index first and then call create_hypertable; in that case the existing indexes are recreated on the hypertable. It is important to remember that unique indexes and primary keys need to include the time dimension column. Note also that create_hypertable creates an index on the time dimension column by default.
In general, the considerations for creating indexes are the same as with PostgreSQL: indexes involve tradeoffs. They add overhead during data ingestion but can improve select queries significantly. I suggest checking the best practices for using indexes in TimescaleDB and the blog post about using composite indexes for time-series queries.
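As an illustration of that order, here is a small sketch. The table conditions and its columns are hypothetical, chosen only to make the example self-contained; the two-argument form of create_hypertable is the one used in the question.

```sql
-- Hypothetical table; names and columns are assumptions for illustration.
CREATE TABLE conditions (
    time        timestamptz NOT NULL,
    device_id   integer     NOT NULL,
    temperature double precision
);

-- Creates the hypertable and, by default, an index on the time column.
SELECT create_hypertable('conditions', 'time');

-- Optional extra index for queries that filter by device over a time range.
CREATE INDEX conditions_device_time_idx ON conditions (device_id, time DESC);

-- A unique index has to include the time dimension column.
CREATE UNIQUE INDEX conditions_device_time_uniq ON conditions (device_id, time);
```

Whether the extra indexes pay off depends on the queries you actually run; each one adds write overhead during ingestion.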
Is there automatic indexing in PostgreSQL, or do users need to create indexes explicitly? If there are automatic indexes, how can I view them? Thanks.
An index on the primary key and unique constraints will be made automatically. Use CREATE INDEX to make more indexes. To view existing database structure including the indexes, use \d table.
A quick example of generating an index would be:
CREATE INDEX unique_index_name ON table (column);
You can create an index on multiple columns:
CREATE INDEX unique_index_name ON table (column1, column2, column3);
Or a partial index, which only contains the rows that satisfy its condition:
CREATE INDEX unique_index_name ON table (column) WHERE column > 0;
There is a lot more you can do with them, but that is for the documentation (linked above) to tell you. Also, if you create an index on a production database, use CREATE INDEX CONCURRENTLY (it will take longer, but not lock out new writes to the table). Let me know if you have any other questions.
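For the production case, a sketch of the concurrent form might look like this. The table orders, the column customer_id, and the index name are placeholders, not names from the question.

```sql
-- Builds the index without blocking concurrent INSERT, UPDATE, or DELETE.
-- This statement cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY orders_customer_id_idx ON orders (customer_id);
```

If a concurrent build fails partway, it leaves behind an invalid index that has to be dropped before retrying.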
Update:
If you want to view indexes with pure SQL, look at the pg_catalog.pg_indexes view:
SELECT *
FROM pg_catalog.pg_indexes
WHERE schemaname='public'
AND tablename='table';
I'm not really sure how multiple indexes work on the same column.
Let's say I have an id column and a country column, with one index on id and another index on id and country. When I look at my query plan, it looks like it's using both of those indexes. How does that work? Can I force it to use just the id and country index?
Also, is it bad practice to do that? When is it a good idea to index the same column multiple times?
It is common to have indexes on both (id) and (country,id), or alternatively (country) and (country,id), if you have queries that benefit from each of them. You might also have (id) and (id,country) if you want the "covering" index on (id,country) to support index-only scans but still need the standalone index to enforce a unique constraint.
In theory you could just have (id,country) and still use it to enforce uniqueness of id, but PostgreSQL does not support that at this time.
You could also sensibly have different indexes on the same column if you need to support different collations or operator classes.
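A sketch of the cases above, using a hypothetical table places with made-up column and index names:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE places (
    id      integer NOT NULL,
    country text    NOT NULL,
    name    text
);

-- Standalone unique index that enforces uniqueness of id.
CREATE UNIQUE INDEX places_id_idx ON places (id);

-- Wider index that can serve index-only scans needing both id and country.
CREATE INDEX places_id_country_idx ON places (id, country);

-- The same column indexed twice with different operator classes:
-- the default one for equality and ordering comparisons...
CREATE INDEX places_name_idx ON places (name);
-- ...and text_pattern_ops for left-anchored LIKE 'abc%' searches
-- when the database does not use the C locale.
CREATE INDEX places_name_pattern_idx ON places (name text_pattern_ops);
```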
If you want to force PostgreSQL not to use a particular index, to see what happens without it, you can drop it inside a transaction and then roll the transaction back when done:
BEGIN;
DROP INDEX table_id_country_idx;
EXPLAIN ANALYZE SELECT * FROM ....;
ROLLBACK;