I work in SAS and don't have much knowledge of DB2 or Netezza.
My requirement is to migrate the code below from DB2 to Netezza.
Could you please help me out with this?
Here is my code:
CREATE TABLE acct_grp_holder (
acct_num CHAR(7) NOT NULL,
grp_num CHAR(9) NOT NULL
)
PARTITIONING KEY (grp_num)
IN ts_mdc1 /*Not aware what's the meaning of IN here*/
ORGANIZE BY (grp_num)
NOT LOGGED INITIALLY
;
Thanks in advance.
Without knowing the intended use for the table (e.g. whether this is for permanent use or part of your data preparation process for use in SAS), here is a starting point for your conversion.
CREATE TABLE acct_grp_holder (
acct_num CHAR(7) NOT NULL,
grp_num CHAR(9) NOT NULL
)
DISTRIBUTE ON (grp_num)
--DISTRIBUTE ON RANDOM
ORGANIZE ON (grp_num)
;
The PARTITIONING KEY clause is roughly equivalent to the Netezza DISTRIBUTE ON clause. However, without knowing anything about your data, we can't tell if using "DISTRIBUTE ON RANDOM" would be a better fit.
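If you do go with DISTRIBUTE ON (grp_num) and want to check for data skew after loading, one common sanity check is to count rows per data slice (a sketch; it assumes the table is already populated):
SELECT datasliceid, COUNT(*) AS row_cnt
FROM acct_grp_holder
GROUP BY datasliceid
ORDER BY row_cnt DESC;
Large differences between slices indicate skew, which would be an argument for DISTRIBUTE ON RANDOM.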
The ORGANIZE BY clause in the original indicates an MDC table. The ORGANIZE ON clause in the Netezza is a rough conceptual fit for this.
There is no need, or ability, to specify a tablespace for the table (the IN clause) or logging behavior (NOT LOGGED INITIALLY) in Netezza.
This one is most odd. I've got a DB2 instance with 50+ tables defined, and whilst I can insert and query data, DB2 is being extremely picky about formatting and keeps complaining about both table and column context whilst insisting on everything being quoted.
Most weird is that none of the tables show in the results of a 'list tables' command, whilst 2 other tables defined by API do?
Syntax I used to create the tables..
CREATE TABLE Shell.Customers
(
"idCustomers" BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT BY 1 NO CYCLE ORDER ),
"Name" VARCHAR(64) NOT NULL,
"Code" VARCHAR(6) NOT NULL,
PRIMARY KEY ("idCustomers")
) COMPRESS YES ADAPTIVE WITH RESTRICT ON DROP;
Any ideas where I messed it up?
Thanks in advance.. :)
The LIST TABLES command without a FOR clause shows tables for the current user, so your table will not be listed unless your current user name is SHELL.
Use the LIST TABLES FOR SCHEMA SHELL (or LIST TABLES FOR ALL) command to list the table you mentioned.
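For example, from the DB2 command line processor (the database name here is just a placeholder):
db2 CONNECT TO yourdb
db2 LIST TABLES FOR SCHEMA SHELL
db2 LIST TABLES FOR ALL
The FOR SCHEMA variant shows only tables in the SHELL schema, while FOR ALL shows every table you are authorized to see.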
I am currently migrating from MySQL to PostgreSQL in a Laravel application, and I noticed that when updating, the row goes to the end of the table (bottom).
In the application I know I can use ORDER BY to sort, but I am referring to the internal behavior of the database while performing the UPDATE action.
In MySQL, the row remains in the same position it occupied before the update.
Is there any way to get this behaviour in PostgreSQL? Would it be an InnoDB feature? I am using Navicat Premium 12.1 as the client.
I think this is just a cosmetic issue, but even so I would like to learn how to carry out this "permanent ordering".
The database is in UTF-8 encoding and pt_BR.UTF8 collation and ctype.
Following is the table:
CREATE TABLE `properties` (
`id` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`title` varchar(255) NOT NULL,
`description` text NOT NULL,
`name` varchar(255),
`rental_price` decimal(10, 2),
`sale_price` decimal(10, 2)
);
Thank you all!
Part 1: Generally use ORDER BY
If you do not use an ORDER BY clause, both MySQL and PostgreSQL (and, for that matter, most relational DBMSs) make no promises about the order of records.
You should refactor your application to use ORDER BY. If you want your data set to be ordered newest first, you could use something like:
SELECT * FROM yourtable ORDER BY id DESC;
SELECT * FROM yourtable ORDER BY creation_date DESC; -- if your table has such a column
Similarly, you can have oldest objects first by using one of the following:
SELECT * FROM yourtable ORDER BY id ASC;
SELECT * FROM yourtable ORDER BY creation_date ASC; -- if your table has such a column
Part 2: Looking into the mechanics
You added to your question a more detailed inquiry:
[...] I know I can use ORDER BY to sort, but I am referring to the internal behavior of the database while performing the UPDATE action.
There are multiple things that influence the sequence in which database records are displayed on your screen when performing a query. In a real-life application it is not (practically) possible to predict this sequence.
I assume this is simply an effect of PostgreSQL creating a new row version for the updated record, as described here in the Updating a Row section. I suggest not relying on this behaviour in any of your applications.
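If you want to see this effect for yourself, the hidden ctid system column shows the physical location of each row version (an illustration against the properties table from the question; this is not something to rely on in application code):
SELECT ctid, id, title FROM properties ORDER BY ctid;
UPDATE properties SET title = title WHERE id = 1;   -- writes a new row version, even though nothing changes
SELECT ctid, id, title FROM properties ORDER BY ctid;   -- id 1 now has a new ctid, typically at the end of the table
The old row version remains until VACUUM reclaims the space, which is why the updated row appears to "move" when you SELECT without ORDER BY.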
I have some T-SQL (SQL Server 2008) that I inherited and am trying to find out why some of the queries are running really slow. In the Actual Execution Plan I have three clustered index scans which are costing me 19%, 21% and 26%, so this seems to be the source of my problem.
The contents of the fields are usually numeric (but some job numbers have an alpha prefix).
The database design (vendor supplied) is pretty poor. The max length of a job number in their application is 12 chars, but in the tables that are joined it is defined as varchar(50) in some places and varchar(15) in others. My parameter is a varchar(12), but I get the same thing if I change it to a varchar(50).
The node contains this:
Predicate: [Live_Costing].[dbo].[TSTrans].[JobNo] as [sts1].[JobNo]=CONVERT_IMPLICIT(varchar(50),[#JobNo],0)
sts1 is a derived table, but the column it pulls JobNo from is a varchar(50).
I don't understand why it's doing an implicit conversion between 2 varchars. Is it just because they are different lengths?
I'm fairly new to reading execution plans.
Is there an easy way to figure out which node in the exec plan relates to which part of the query?
Is the predicate the join clause?
Regards
Mark
Variables and parameters also carry a collation (they take on the default collation of the current database), so they can be part of the mismatch.
Regardless, you need to verify your collations, which can be specified at the server, database, table, and column level.
First, check your collation between tempdb and the vendor supplied database. It should match. If it doesn't, it will tend to do implicit conversions.
Assuming you cannot modify the vendor supplied code base, one or more of the following should help you:
1) Predefine your temp tables and specify the same collation for the key field as in the db in use, rather than tempdb.
2) Provide collations when doing string comparisons (see the example after this list).
3) Specify collation for key values if using "select into" with a temp table
4) Make sure your collations on your tables and columns match your database collation (VERY important if you imported only specific tables from a vendor into an existing database.)
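For items 1 and 2, a fix typically looks something like this (the table and column names are taken from the predicate in the question, but the exact query is an assumption):
-- Pre-create the temp table with the user database's collation instead of tempdb's
CREATE TABLE #Jobs (JobNo varchar(12) COLLATE DATABASE_DEFAULT NOT NULL);
-- Or force the collation at comparison time
SELECT sts1.JobNo
FROM Live_Costing.dbo.TSTrans AS sts1
JOIN #Jobs AS j
  ON sts1.JobNo = j.JobNo COLLATE DATABASE_DEFAULT;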
If you can change the vendor supplied code base, I would suggest reviewing the cost of making all of your char keys the same length, and CHAR rather than VARCHAR. Varchar has a small storage overhead per value (two bytes to record the length). The caveat is that if you create a fixed-length character field NOT NULL, it will be padded to the right (unavoidable).
Ideally, you would have int keys, and only use varchar fields for user interaction/lookup:
create table Products(ProductID int not null identity(1,1) primary key clustered, ProductNumber varchar(50) not null)
alter table Products add constraint uckProducts_ProductNumber unique(ProductNumber)
Then do all joins on ProductID rather than ProductNumber; filtering on ProductNumber in a WHERE clause would be perfectly fine.
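For example (the OrderLines table is hypothetical, purely to show the shape of the query):
SELECT ol.OrderID, p.ProductNumber
FROM dbo.OrderLines AS ol
JOIN dbo.Products AS p
  ON p.ProductID = ol.ProductID      -- join on the narrow int key
WHERE p.ProductNumber = 'AB1234';    -- filter on the user-facing value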
I'm just getting started with PostgreSQL, and I'm new to database design.
I'm writing software in which I have various plugins that update a database. Each plugin periodically updates its own designated table in the database. So a plugin named 'KeyboardPlugin' will update the 'KeyboardTable', and 'MousePlugin' will update the 'MouseTable'. I'd like for my database to store these 'plugin-table' relationships while enforcing referential integrity. So ideally, I'd like a configuration table with the following columns:
Plugin-Name (type 'text')
Table-Name (type ?)
My software will read from this configuration table to help the plugins determine which table to update. Originally, my idea was to have the second column (Table-Name) be of type 'text'. But then, if someone mistypes the table name, or an existing relationship becomes invalid because of someone deleting a table, we have problems. I'd like for the 'Table-Name' column to act as a reference to another table, while enforcing referential integrity.
What is the best way to do this in PostgreSQL? Feel free to suggest an entirely new way to set up my database, different from what I'm currently exploring. Also, if it helps you answer my question, I'm using the pgAdmin tool to set up my database.
I appreciate your help.
I would go with your original plan to store the name as text. Possibly enhanced by additionally storing the schema name:
addin text
,sch text
,tbl text
Tables have an OID in the system catalog (pg_catalog.pg_class). You can get those with a nifty special cast:
SELECT 'myschema.mytable'::regclass
But the OID can change over a dump / restore. So just store the names as text and verify at application time that the table is there, by casting it as demonstrated above.
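The bare ::regclass cast raises an error if the table does not exist. If your PostgreSQL version has to_regclass() (9.4 or later), it returns NULL instead, which is easier to handle from application code:
SELECT to_regclass('myschema.mytable') IS NOT NULL AS table_exists;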
Of course, if each table is used by multiple addins, it might pay to make a separate table
CREATE TABLE tbl (
 tbl_id serial PRIMARY KEY
,sch text
,name text
);
and reference it in ...
CREATE TABLE addin (
 addin_id serial PRIMARY KEY
,addin text
,tbl_id integer REFERENCES tbl(tbl_id) ON UPDATE CASCADE ON DELETE CASCADE
);
Or even make it an n:m relationship if addins have multiple tables. But be aware, as @OMG_Ponies commented, that a setup like this will require you to execute a lot of dynamic SQL because you don't know the identifiers beforehand.
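A sketch of the kind of dynamic SQL this implies (the payload column and the literal values are made up for illustration):
DO $$
DECLARE
   _sch text;
   _tbl text;
BEGIN
   -- look up the target table registered under tbl_id = 1
   SELECT sch, name INTO _sch, _tbl FROM tbl WHERE tbl_id = 1;
   -- %I quotes identifiers, %L quotes literals
   EXECUTE format('INSERT INTO %I.%I (payload) VALUES (%L)', _sch, _tbl, 'some value');
END
$$;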
I guess all plugins have a set of basic attributes and then each plugin will have a set of plugin-specific attributes. If this is the case you can use a single table together with the hstore datatype (a standard extension that just needs to be installed).
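Installing the extension is a one-off per database:
CREATE EXTENSION IF NOT EXISTS hstore;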
Something like this:
CREATE TABLE plugins
(
plugin_name text not null primary key,
common_int_attribute integer not null,
common_text_attribute text not null,
plugin_attributes hstore
);
Then you can do something like this:
INSERT INTO plugins
(plugin_name, common_int_attribute, common_text_attribute, plugin_attributes)
VALUES
('plugin_1', 42, 'foobar', 'some_key => "the fish", other_key => 24'),
('plugin_2', 100, 'foobar', 'weird_key => 12345, more_info => "10.2.4"');
This creates two plugins named plugin_1 and plugin_2
Plugin_1 has the additional attributes "some_key" and "other_key", while plugin_2 stores the keys "weird_key" and "more_info".
You can index those hstore columns and query them very efficiently.
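For example, with a GIN index (hstore also supports GiST):
CREATE INDEX plugins_attributes_idx ON plugins USING gin (plugin_attributes);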
The following will select all plugins that have a key "weird_key" defined.
SELECT *
FROM plugins
WHERE plugin_attributes ? 'weird_key'
The following statement will select all plugins that have the key some_key with the value "the fish":
SELECT *
FROM plugins
WHERE plugin_attributes @> 'some_key => "the fish"'::hstore
Much more convenient than using an EAV model in my opinion (and most probably a lot faster as well).
The only drawback is that you lose type-safety with this approach (but usually you'd lose that with the EAV concept as well).
You don't need an application catalog. Just add the application name to the keys of the table. This of course assumes that all the tables have the same structure. If not, use the application name as a table name or, as others have suggested, as a schema name (which would also allow multiple tables per application).
EDIT:
But the real issue is of course that you should first model your data, and then build the applications to manipulate it. The data should not serve the code; the code should serve the data.
In a SSIS package at work there are some SQL tasks that create staging tables for holding import data. All the statements take this form:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.tbNewTable') AND type in (N'U'))
BEGIN
TRUNCATE TABLE dbo.tbNewTable
END
ELSE
BEGIN
CREATE TABLE dbo.tbNewTable (
ColumnA VARCHAR(10) NULL,
ColumnB VARCHAR(10) NULL,
ColumnC INT NULL
) ON [PRIMARY]
END
In Itzik Ben-Gan's T-SQL Fundamentals I see a different form of statement for creating a table:
IF OBJECT_ID('dbo.tbNewTable', 'U') IS NOT NULL
BEGIN
DROP TABLE dbo.tbNewTable
END
CREATE TABLE dbo.tbNewTable (
ColumnA VARCHAR(10) NULL,
ColumnB VARCHAR(10) NULL,
ColumnC INT NULL
) ON [PRIMARY]
Each of these appears to do the same thing. After execution, there will be an empty table called tbNewTable in the dbo schema.
Are there any practical or theoretical differences between the two? What implications might they have?
The first one assumes that if the table exists, it has the same columns as those it would create. The second one does not make that assumption. So if a table with that name happened to exist and had a different set of columns, the two would have very different results.
The first will not actually DROP the table -- it merely TRUNCATES all the data in said table. Hence why the CREATE is guarded.
Thus the form with the DROP will allow the subsequent CREATE to change the schema (when the new table is created) even if tbNewTable previously existed.
Because the DROP/CREATE alters the database schema, it may also not be allowed in all cases. For instance, a view created WITH SCHEMABINDING will prevent the table from being dropped. (This also holds true for more general FK relationships, should any exist.)
...when SCHEMABINDING is specified, the base table or tables cannot be modified in a way that would affect the view definition.
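For example, a schemabound view like this (hypothetical, purely to illustrate the restriction) would make the DROP TABLE in the second pattern fail:
CREATE VIEW dbo.vNewTable
WITH SCHEMABINDING
AS
SELECT ColumnA, ColumnB FROM dbo.tbNewTable;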
The TRUNCATE should be marginally faster in one of those constant "don't care" ways: there should be no performance consideration given to one over the other.
There are also permission differences. TRUNCATE only requires the ALTER permission.
The minimum permission required is ALTER on table_name. TRUNCATE TABLE permissions default to the table owner...
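So if the package runs under a restricted login, the TRUNCATE form can be enabled with a single grant (the user name is hypothetical):
GRANT ALTER ON dbo.tbNewTable TO etl_user;
whereas DROP/CREATE needs broader rights, such as membership in db_ddladmin or CREATE TABLE plus ALTER on the schema.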
Happy coding.
These are very different.
The first does an equality check against the sys.objects catalog view to see if there is a matching table name. If so, it truncates the table, removing all rows but keeping the table structure itself - i.e. the actual table is never dropped.
In the second, the check that the table exists is done implicitly via the OBJECT_ID() function. If the table exists, it is dropped completely - rows and structure.
If you have primary and foreign key constraints on the table, you'll certainly have issues dropping it completely... and if you have other tables with foreign keys pointing at the table you are trying to 'truncate', you'll have issues there too, since TRUNCATE is not allowed on a table referenced by a foreign key (cascading deletes don't help here).
I tend to dislike either construction in an SSIS package. I create the tables in a deployment script and I want the package to fail if one of the tables I use is missing later on because then something drastically wrong has happened and I want to investigate what before I try putting data anywhere.