I created a database on db2 11.5, then tablespace, then create a table. Everything okay so far.
However, when i tried to create index in the newly created TABLESPACE, it complains syntax error:
CREATE INDEX SCH.TBL_CAT_ERR_NIX01 ON SCH.TBL_CAT_ERR (CAT_NO ASC, CAT_ERR_ID ASC) in TBS_CAT_SINDEX;
with error:
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0109N The statement or command was not processed because the following
clause is not supported in the context where it is used: "IN". SQLSTATE=42601
i tried these 2 but still they don't work, complaining the "PARTITION" clause is not supported
CREATE INDEX SCH.TBL_CAT_ERR_NIX01 ON SCH.TBL_CAT_ERR (CAT_NO ASC, CAT_ERR_ID ASC) partitioned in TBS_CAT_SINDEX;
CREATE INDEX SCH.TBL_CAT_ERR_NIX01 ON SCH.TBL_CAT_ERR (CAT_NO ASC, CAT_ERR_ID ASC) not partitioned in TBS_CAT_SINDEX;
Could you help me pointing out what I'm missing?
When you run your create table statement, you can optionally specify the index in clause at that time to allow indexes to use a specific tablespace that you pre-created. Additional capabilities are available for PARTITIONED tables.
The key detail you omitted from your question was whether or not your table is itself partitioned.
The documentation for create index states that the IN tablespapace-name clause can only be specified or a nonpartitioned index on a partitioned table. So if your table is not itself partitioned then you cannot use this IN clause when creating an index, and you should consider using the create table statement to identify an index tablespace for that table instead.
Related
I am using postgresql 14.1, and I re-created my live database using parititons for some tables.
since i did that, i could create index when the server wasn't live, but when it's live i can only create the using concurrently but unfortunately when I try to create an index concurrently i get an error.
running this:
create index concurrently foo on foo_table(col1,col2,col3));
provides the error:
ERROR: cannot create index on partitioned table "foo_table" concurrently
now it's a live server and i cannot create indexes not concurrently and i need to create some indexes in order to improve performance. any ideas how do to that ?
thanks
No problem. First, use CREATE INDEX CONCURRENTLY to create the index on each partition. Then use CREATE INDEX to create the index on the partitioned table. That will be fast, and the indexes on the partitions will become the partitions of the index.
Step 1: Create an index on the partitioned (parent) table
CREATE INDEX foo_idx ON ONLY foo (col1, col2, col3);
This step creates an invalid index. That way, none of the table partitions will get the index applied automatically.
Step 2: Create the index for each partition using CONCURRENTLY and attach to the parent index
CREATE INDEX CONCURRENTLY foo_idx_1
ON foo_1 (col1, col2, col3);
ALTER INDEX foo_idx
ATTACH PARTITION foo_idx_1;
Repeat this step for every partition index.
Step 3: Verify that the parent index created at the beginning (Step 1) is valid. Once indexes for all partitions are attached to the parent index, the parent index is marked valid automatically.
SELECT * FROM pg_index WHERE pg_index.indisvalid = false;
The query should return zero results. If thats not the case then check your script for mistakes.
We have got some DDLs for our schema. These DDLs create tables and indexes in a given tablespace. Something like:
CREATE TABLE mySchema.myTable
(
someField1 CHAR(2) NOT NULL ,
someField2 VARCHAR(70) NOT NULL
)
IN MY_TBSPC
INDEX IN MY_TBSPC;
We want to reuse this DDLs to run some integration tests using APACHE DERBY. The problem is that such syntax is not accepted by DERBY. Is there any way to define a kind of default tablespace for tables and indexes, so we can remove this 'IN TABLESPACE' statements.
There is no deterministic way of defining a "default" tablespace in DB2 (I'm assuming we're dealing with DB2 for LUW here). If the tablespaces are not explicitly indicated in the CREATE TABLE statement, the database manager will pick for table data the first tablespace with the suitable page size that you are authorized to use, and indexes will be stored in the same tablespace as data.
This means that if you only have one user tablespace it will always be used for both data and indexes, so in a way it becomes the default. However, if you have more than one tablespace with different page sizes you may end up with tables (and their indexes) in different tablespaces.
I've checked index type in one of my table and found that all indexes are of type REG (non clustered). As per DB2 documentation, DB2 by default use the first index created as clustered index if not explicitly specified. Why DB2 is showing all of my indexes as REGULAR?
Reference: http://www-01.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/com.ibm.db2z10.doc.intro/src/tpc/db2z_clusteringindexes.dita
"When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values. The first index that you define on the table serves implicitly as the clustering index unless you explicitly specify CLUSTER when you create or alter another index. For example, if you first define a unique index on the EMPNO column of the EMP table, DB2 inserts rows into the EMP table in the order of the employee identification number unless you explicitly define another index to be the clustering index"
Here is my understanding of your question - You read on the IBM documentation website that
DB2 by default use the first index created as clustered index if not explicitly specified
and your question is that you saw your DB2 9.7 LUW database and saw only REG indexes.
#mustaccio is correct. DB2 LUW never creates clustered indexes by default.
As per DB2 9.7 LUW documentation here, it says
clustering indexes cannot be specified as part of the table definition
used with the CREATE TABLE statement. Instead, clustering indexes are
only created by executing the CREATE INDEX statement with the CLUSTER
option specified. Then the ALTER TABLE statement should be used to add
a primary key that corresponds to the clustering index created to the
table. This clustering index will then be used as the table's primary
key index.
And #Ian Bjorhovde is also correct, you are reading DB2 for z/OS documentation. There are many differences between DB2 LUW and DB2 for z/OS
In AWS Redshift, I want to add a sort key to a table that is already created. Is there any command which can add a column and use it as sort key?
UPDATE:
Amazon Redshift now enables users to add and change sort keys of existing Redshift tables without having to re-create the table. The new capability simplifies user experience in maintaining the optimal sort order in Redshift to achieve high performance as their query patterns evolve and do it without interrupting the access to the tables.
source: https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-supports-changing-table-sort-keys-dynamically/
At the moment I think its not possible (hopefully that will change in the future). In the past when I ran into this kind of situation I created a new table and copied the data from the old one into it.
from http://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE.html:
ADD [ COLUMN ] column_name
Adds a column with the specified name to the table. You can add only one column in each ALTER TABLE statement.
You cannot add a column that is the distribution key (DISTKEY) or a sort key (SORTKEY) of the table.
You cannot use an ALTER TABLE ADD COLUMN command to modify the following table and column attributes:
UNIQUE
PRIMARY KEY
REFERENCES (foreign key)
IDENTITY
The maximum column name length is 127 characters; longer names are truncated to 127 characters. The maximum number of columns you can define in a single table is 1,600.
As Yaniv Kessler mentioned, it's not possible to add or change distkey and sort key after creating a table, and you have to recreate a table and copy all data to the new table.
You can use the following SQL format to recreate a table with a new design.
ALTER TABLE test_table RENAME TO old_test_table;
CREATE TABLE new_test_table([new table columns]);
INSERT INTO new_test_table (SELECT * FROM old_test_table);
ALTER TABLE new_test_table RENAME TO test_table;
DROP TABLE old_test_table;
In my experience, this SQL is used for not only changing distkey and sortkey, but also setting the encoding(compression) type.
To add to Yaniv's answer, the ideal way to do this is probably using the CREATE TABLE AS command. You can specify the distkey and sortkey explicitly. I.e.
CREATE TABLE test_table_with_dist
distkey(field)
sortkey(sortfield)
AS
select * from test_table
Additional examples:
http://docs.aws.amazon.com/redshift/latest/dg/r_CTAS_examples.html
EDIT
I've noticed that this method doesn't preserve encoding. Redshift only automatically encodes during a copy statement. If this is a persistent table you should redefine the table and specify the encoding.
create table test_table_with_dist(
field1 varchar encode row distkey
field2 timestam pencode delta sortkey);
insert into test_table select * from test_table;
You can figure out which encoding to use by running analyze compression test_table;
AWS now allows you to add both sortkeys and distkeys without having to recreate tables:
TO add a sortkey (or alter a sortkey):
ALTER TABLE data.engagements_bot_free_raw
ALTER SORTKEY (id)
To alter a distkey or add a distkey:
ALTER TABLE data.engagements_bot_free_raw
ALTER DISTKEY id
Interestingly, the paranthesis are mandatory on SORTKEY, but not on DISTKEY.
You still cannot inplace change the encoding of a table - that still requires the solutions where you must recreate tables.
I followed this approach for adding the sort columns to my table table_transactons its more or less same approach only less number of commands.
alter table table_transactions rename to table_transactions_backup;
create table table_transactions compound sortkey(key1, key2, key3, key4) as select * from table_transactions_backup;
drop table table_transactions_backup;
Catching this query a bit late.
I find that using 1=1 the best way to create and replicate data into another table in redshift
eg:
CREATE TABLE NEWTABLE AS SELECT * FROM OLDTABLE WHERE 1=1;
then you can drop the OLDTABLE after verifying that the data has been copied
(if you replace 1=1 with 1=2, it copies only the structure - which is good for creating staging tables)
it is now possible to alter a sort kay:
Amazon Redshift now supports changing table sort keys dynamically
Amazon Redshift now enables users to add and change sort keys of existing Redshift tables without having to re-create the table. The new capability simplifies user experience in maintaining the optimal sort order in Redshift to achieve high performance as their query patterns evolve and do it without interrupting the access to the tables.
Customers when creating Redshift tables can optionally specify one or more table columns as sort keys. The sort keys are used to maintain the sort order of the Redshift tables and allows the query engine to achieve high performance by reducing the amount of data to read from disk and to save on storage with better compression. Currently Redshift customers who desire to change the sort keys after the initial table creation will need to re-create the table with new sort key definitions.
With the new ALTER SORT KEY command, users can dynamically change the Redshift table sort keys as needed. Redshift will take care of adjusting data layout behind the scenes and table remains available for users to query. Users can modify sort keys for a given table as many times as needed and they can alter sort keys for multiple tables simultaneously.
For more information ALTER SORT KEY, please refer to the documentation.
documentation
as for the documentation itself:
ALTER DISTKEY column_name or ALTER DISTSTYLE KEY DISTKEY column_name A
clause that changes the column used as the distribution key of a
table. Consider the following:
VACUUM and ALTER DISTKEY cannot run concurrently on the same table.
If VACUUM is already running, then ALTER DISTKEY returns an error.
If ALTER DISTKEY is running, then background vacuum doesn't start on a table.
If ALTER DISTKEY is running, then foreground vacuum returns an error.
You can only run one ALTER DISTKEY command on a table at a time.
The ALTER DISTKEY command is not supported for tables with interleaved sort keys.
When specifying DISTSTYLE KEY, the data is distributed by the values in the DISTKEY column. For more information about DISTSTYLE, see CREATE TABLE.
ALTER [COMPOUND] SORTKEY ( column_name [,...] ) A clause that changes
or adds the sort key used for a table. Consider the following:
You can define a maximum of 400 columns for a sort key per table.
You can only alter a compound sort key. You can't alter an interleaved sort key.
When data is loaded into a table, the data is loaded in the order of the sort key. When you alter the sort key, Amazon Redshift reorders the data. For more information about SORTKEY, see CREATE TABLE.
According to the updated documentation it is now possible to change a sort key type with:
ALTER [COMPOUND] SORTKEY ( column_name [,...] )
For reference (https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE.html):
"You can alter an interleaved sort key to a compound sort key or no sort key. However, you can't alter a compound sort key to an interleaved sort key."
ALTER TABLE table_name ALTER SORTKEY (sortKey1, sortKey2 ...etc)
I have moved some records from my SOURCE table in DB_1 into an ARCHIVE table in another DB_2 (ie. INSERTED the records from SOURCE into ARCHIVE and then DELETED the records from SOURCE.)
My SOURCE table has the following index created as SOURCE_1:
CREATE UNIQUE NONCLUSTERED INDEX SOURCE_1
ON dbo.SOURCE(TRADE_SET_ID, ORDER_ID)
The problem is - when I try to insert the rows back into SOURCE from ARCHIVE, Sybase throws the following error:
Attempt to insert duplicate key row in object 'SOURCE' with unique index 'SOURCE_1'
And, of course, subsequently fails the insertions.
I confirmed that my SOURCE table does not have these duplicates because the following query returned empty:
select * from DB_1.dbo.SOURCE
join DB_2.dbo.ARCHIVE
on DB_1.dbo.SOURCE.TRADE_SET_ID = DB_2.dbo.ARCHIVE.TRADE_SET_ID
AND DB_1.dbo.SOURCE.ORDER_ID = DB_2.dbo.ARCHIVE.ORDER_ID
If the above query returned nothing, then that means I haven not violated my unique index constraint on the 2 columns, however Sybase complains that I have.
Does anyone have any ideas on why this is happening?
If Sybase is anything like SQL Server in this regard (Which I'm more familiar with), I would suspect that the index is blocking the insert. Try disabling the index (along with any other indexes or autoincrement columns) on your archive version before copying over to it, then re-enabling. Its probable that Sybase would try to automatically create IDs for the insertions, which would interfere with the existing records.