How do I create indexes on an MQT (materialized query table) in Db2? I haven't found this information in the documentation. Is the index creation syntax the same as for ordinary tables?
After you create your MQT you have to refresh the table before you can create indexes. However, at this point it's exactly the same as creating indexes on a normal table.
There are some limitations on what type of indexes you can create on an MQT. For example, it cannot be a unique index.
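As a minimal sketch of the sequence described above, the steps on Db2 LUW could look like the following. The base table sales(region, amount) and all object names are hypothetical and only serve as illustration.

```sql
-- Hypothetical MQT over an assumed base table sales(region, amount).
CREATE TABLE sales_summary AS (
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
) DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate the MQT before indexing it.
REFRESH TABLE sales_summary;

-- Same syntax as for a regular table; note the index is not UNIQUE.
CREATE INDEX ix_sales_summary_region ON sales_summary (region);
```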
I want to create a hypertable in postgres timescale.
What I do is CREATE TABLE then CREATE INDEX and finally SELECT CREATE_HYPERTABLE.
My question: is CREATE INDEX necessary, helpful or problematic for a high performance of the hypertable?
In short: you don't need to create any indexes, because TimescaleDB creates an index on the time dimension by default. Depending on your usage, you might want to create additional indexes to speed up select queries, and it is best to create them after creating the hypertable.
In more detail:
Creating a hypertable with the create_hypertable function replaces the original PostgreSQL table with a new table. Thus it is better to create the hypertable first and then create the index. It also works to create the index first and then call create_hypertable; in that case the existing indexes are recreated on the hypertable. It is important to remember that unique indexes and primary keys need to include the time dimension column. Also note that create_hypertable creates an index on the time dimension column by default.
In general, the considerations for creating indexes are the same as with PostgreSQL: there are tradeoffs in using indexes. Indexes introduce overhead during data ingestion, but they can improve select queries significantly. I suggest checking the best practices for using indexes in TimescaleDB and the blog post about using composite indexes for time-series queries.
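A minimal sketch of the recommended order (create the table, convert it to a hypertable, then add any extra index) might look like this. The table conditions, its columns, and the index name are assumptions made for illustration.

```sql
-- Hypothetical time-series table.
CREATE TABLE conditions (
    time        TIMESTAMPTZ      NOT NULL,
    device_id   TEXT             NOT NULL,
    temperature DOUBLE PRECISION
);

-- Convert to a hypertable; an index on "time" is created by default.
SELECT create_hypertable('conditions', 'time');

-- Optional composite index for queries that filter by device
-- and then read the most recent rows first.
CREATE INDEX ix_conditions_device_time
    ON conditions (device_id, time DESC);
```

Whether the extra index pays off depends on the query mix, since every additional index slows down ingestion.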
This might be an obvious and simple question.
I read through the jsonb data type documentation, but nowhere does it mention the lookup cost of a key in jsonb data.
For example, let's say I have a table with following schema:
CREATE TABLE A (id character varying (20),
info jsonb);
I want to know how postgres would parse a where query as below:
SELECT * FROM A WHERE info->>'city' = 'portland';
While going through the jsonb field of a row, is the lookup constant time (O(1)) or linear time (checking each key one by one in the row's jsonb dictionary) within that jsonb data dictionary?
My intuition is that it must be constant time (else what's the point of a dictionary style data?) but I can't see it in the official documentation to convince my team.
Any help would be great!
Thanks!
As with any WHERE condition in SQL: if there is no index, the database has to go through all rows of the table to find those that satisfy your condition.
You can either index a specific expression, or you can index the whole json value using a GIN index which then enables Postgres to use the index if any of the supported operators are used.
If you always check for the city, you can create a regular B-Tree index:
create index on a ( (info->>'city') );
If you don't know what you will be looking for, a GIN index might be a better choice:
create index on a using gin (info);
But you will need to change your query to use one of the operators that are supported by a GIN index, e.g. the contains operator @>
select *
from a
where info @> '{"city": "portland"}'::jsonb;
Note that an index lookup is not always the most efficient solution. Sometimes it's faster to simply go through all rows, sometimes the index lookup is faster.
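One way to see which strategy the planner actually picks is to inspect the execution plan. The sketch below assumes the table a and the GIN index from above. Keep in mind that the ANALYZE option really executes the query.

```sql
-- Shows whether the planner chose a sequential scan or an index scan,
-- together with actual timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM a
WHERE info @> '{"city": "portland"}'::jsonb;
```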
If you want to learn more about indexes in relational database, go through the material here: http://use-the-index-luke.com/
I've checked the index types on one of my tables and found that all indexes are of type REG (non-clustered). According to the DB2 documentation, DB2 by default uses the first index created as the clustering index if none is explicitly specified. Why is DB2 showing all of my indexes as REGULAR?
Reference: http://www-01.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/com.ibm.db2z10.doc.intro/src/tpc/db2z_clusteringindexes.dita
"When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values. The first index that you define on the table serves implicitly as the clustering index unless you explicitly specify CLUSTER when you create or alter another index. For example, if you first define a unique index on the EMPNO column of the EMP table, DB2 inserts rows into the EMP table in the order of the employee identification number unless you explicitly define another index to be the clustering index"
Here is my understanding of your question - You read on the IBM documentation website that
DB2 by default use the first index created as clustered index if not explicitly specified
and your question is why you see only REG indexes when you look at your DB2 9.7 LUW database.
@mustaccio is correct. DB2 LUW never creates clustered indexes by default.
The DB2 9.7 LUW documentation here says
clustering indexes cannot be specified as part of the table definition
used with the CREATE TABLE statement. Instead, clustering indexes are
only created by executing the CREATE INDEX statement with the CLUSTER
option specified. Then the ALTER TABLE statement should be used to add
a primary key that corresponds to the clustering index created to the
table. This clustering index will then be used as the table's primary
key index.
And @Ian Bjorhovde is also correct: you are reading the DB2 for z/OS documentation. There are many differences between DB2 LUW and DB2 for z/OS.
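For illustration, a clustering index on DB2 LUW has to be requested explicitly, and the catalog can then be used to check the index type. The sketch below reuses the EMP/EMPNO example from the quoted documentation; the index name is made up.

```sql
-- Explicitly request a clustering index (a table can have only one).
CREATE INDEX ix_emp_empno ON emp (empno) CLUSTER;

-- INDEXTYPE shows CLUS for clustering indexes and REG for regular ones.
SELECT indname, indextype
FROM syscat.indexes
WHERE tabname = 'EMP';
```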
Does PostgreSQL create indexes automatically, or do users need to create them explicitly? If there are automatic indexes, how can I view them? Thanks.
An index is created automatically for the primary key and for each unique constraint. Use CREATE INDEX to make more indexes. To view the existing database structure including the indexes, use \d table in psql.
A quick example of generating an index would be:
CREATE INDEX unique_index_name ON table (column);
You can create an index on multiple columns:
CREATE INDEX unique_index_name ON table (column1, column2, column3);
Or a partial index, which only contains the rows that satisfy a condition:
CREATE INDEX unique_index_name ON table (column) WHERE column > 0;
There is a lot more you can do with them, but that is for the documentation (linked above) to tell you. Also, if you create an index on a production database, use CREATE INDEX CONCURRENTLY (it will take longer, but not lock out new writes to the table). Let me know if you have any other questions.
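As a short sketch of the concurrent variant, assuming a hypothetical table orders with a column customer_id:

```sql
-- Builds the index without blocking concurrent INSERT/UPDATE/DELETE.
-- This statement cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY ix_orders_customer_id
    ON orders (customer_id);
```

If a concurrent build fails, it leaves behind an invalid index that has to be dropped before retrying.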
Update:
If you want to view indexes with pure SQL, look at the pg_catalog.pg_indexes view:
SELECT *
FROM pg_catalog.pg_indexes
WHERE schemaname='public'
AND tablename='table';
I have a table with hundreds of millions of rows and a schema like the one below.
table AA {
id integer primary key,
prop0 boolean not null,
prop1 boolean not null,
prop2 smallint not null,
...
}
Each "property" field (prop0, prop1, ...) has a small number of distinct values, and I usually query for "id" given conditions on the property fields. I think a bitmap index would be best for this kind of query, but PostgreSQL does not seem to support bitmap indexes.
I tried a b-tree index on each field, but according to the query's explain output these indexes are not used.
Is there a good alternative way to do this?
(I'm using PostgreSQL 9)
Your real problem is a bad schema design, not the index. The properties should be placed in a different table and your current table should link to that table using a many to many relation.
The BIT datatype might also be of use, just check the manual.
Create a multicolumn index on properties which are always or almost always queried. Or several multicolumn indexes if needed.
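A minimal sketch of such a multicolumn index, assuming prop0, prop1 and prop2 from the question are the properties that are queried together (the index name is made up):

```sql
-- One index covering the properties that are usually filtered together.
CREATE INDEX ix_aa_props ON aa (prop0, prop1, prop2);

-- A query that constrains the leading columns can use it.
SELECT id
FROM aa
WHERE prop0 = true
  AND prop1 = false
  AND prop2 = 4;
```

Column order matters here: a B-tree index is most useful when the query constrains its leading columns.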
The alternative, when you do not almost always query the same properties, is to add a tsvector column with words describing your data, maintained by a trigger. For example
prop0=true
prop1=false
prop2=4
would be
'propzero nopropone proptwo4'::tsvector
Index it using GIN and then use full text search for searching:
where tsv @@ 'propzero & nopropone & proptwo4'::tsquery
An index is only used if it actually speeds up the query which is not necessarily always the case. Especially with smallish tables (say thousands of rows) a full table scan ("seq scan" in the Postgres execution plan) might indeed be a lot faster.
How many rows did the table have when you tried the statement?
What did the query look like? Maybe there are other conditions that prevent the index usage.
Did you analyze the table to have up-to-date statistics?
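To check these points, one option is to refresh the statistics and then look at the plan. The sketch below assumes the table aa from the question and an example set of conditions.

```sql
-- Refresh planner statistics for the table.
ANALYZE aa;

-- Inspect the chosen plan. With one index per column, PostgreSQL may
-- combine them through Bitmap Index Scan nodes under a BitmapAnd,
-- or fall back to a Seq Scan if it estimates that to be cheaper.
EXPLAIN
SELECT id
FROM aa
WHERE prop0 AND NOT prop1 AND prop2 = 4;
```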