IBM DB2 - Do not unload DECIMAL rows exceeding a certain length

I have an IBM DB2 table that I want to UNLOAD data from so that I can LOAD it into another DB2 table.
Both tables have the same columns (and types), except for one decimal field.
It is DECIMAL(6) in the source table and DECIMAL(5) in the destination table.
Many entries in the source table use at most 5 digits in the DECIMAL field; only some use all 6.
What I want to do is copy only the entries that fit into 5 digits and drop all the others.
Can I do this using only the UNLOAD statement? That is, is there an option that tells the system: "unload column 'id' as DECIMAL(5) (although it is DECIMAL(6) in the table itself), and if an entry in that column uses all 6 digits (> 99999), do not unload that row"?
Also, how would you handle the opposite case, e.g. UNLOAD a DECIMAL(5) column from the source and LOAD it as DECIMAL(6) into the destination?
Why am I doing this? Because the destination table is an older version of the table, used by older versions of the applications. We will drop support in 6 months, but until then we need to refresh the datasets in it.
With UNLOAD and LOAD I mean the UNLOAD and LOAD utilities for Db2 for z/OS, described e.g. at
https://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/ugref/src/tpc/db2z_utl_unload.html

Without knowing your source environment: for Db2 on Linux, Unix, and Windows, the EXPORT command is used with a SELECT statement. So you'd apply whatever logic makes sense to turn a DECIMAL(6) into a DECIMAL(5), including skipping rows that need all 6 digits, since there's no way to make them fit into a 5-digit allocation. The actual preferred method in this environment is now external tables, but they work in a similar fashion.
export to 'myfile.csv' OF del select col1, dec_col6 from thetable where dec_col6 < 100000
OR
create external table 'myfile.csv' using (DELIMITER ',') as select col1, case when dec_col6 > 99999 then 99999 else dec_col6 end from thetable
(Note that this variant caps oversized values at 99999 instead of dropping the rows; add a WHERE clause as in the EXPORT example to skip them entirely.)
(See the Db2 documentation on external tables and the EXPORT command for details.)
Since you used the word "unload" I suppose you're either using a tool such as High Performance Unload or you are on a different flavor of Db2.
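If you are on Db2 for z/OS, the UNLOAD utility linked in the question does accept a row-selection condition via a WHEN clause, so a control statement along these lines should cover the first case. This is only a sketch; the tablespace, table, and column names are placeholders, and the exact syntax should be checked against the utility reference:
UNLOAD TABLESPACE MYDB.MYTS
FROM TABLE MYSCHEMA.SRCTBL
WHEN (ID <= 99999)
Rows where ID needs all 6 digits are then never written to the unload data set. The reverse direction is harmless: every DECIMAL(5) value fits into a DECIMAL(6) column, so a plain UNLOAD followed by LOAD works.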

Related

Trim/whitespace issue when load data from Db2 source to Postgresql DB using Talend Open source

We are seeing an issue in table values that are populated from DB2 (source) to Postgres (target).
I have included here all the job details for each component.
Once the data has been populated with the above approach, we run the queries below in the Postgres DB:
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where cust_gssn_cd='XY03666699' ;
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where cust_cntry_cd='847' ;
No records are returned; however, when we run the same queries with trim as below, they work:
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where trim(cust_gssn_cd)='XY03666699' ;
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where trim(cust_cntry_cd)='847' ;
Below are the ways we have tried to overcome this, but with no luck:
Used a tMap between the source and target components.
Used trim in the source component under Advanced settings.
Changed the data type of cust_cntry_cd in the Postgres DB from char(5) to character varying, which allows values without any length restriction.
Please suggest what is missing, as we have this issue in almost all the tables where we have character/varchar columns.
We are using TOS (Talend Open Studio).
The data type is probably character(5) in DB2.
That means that the trailing spaces are part of the column value and will be migrated. You have to compare with the value blank-padded to five characters:
cust_cntry_cd = '847  '
or cast the right argument to character(5):
cust_cntry_cd = CAST('847' AS character(5))
Maybe you could also delete the trailing spaces in the Advanced settings of the tDB2Input component.
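If you would rather fix the data on the Postgres side after the load, a one-time cleanup along these lines should work. This is a sketch assuming the table and column names from the question; note that converting char(5) to varchar drops the trailing blanks by itself:
ALTER TABLE VMRCTTA1.VMRRCUST_SUMM
ALTER COLUMN cust_cntry_cd TYPE varchar(5); -- the char-to-varchar cast removes trailing spaces
UPDATE VMRCTTA1.VMRRCUST_SUMM
SET cust_gssn_cd = rtrim(cust_gssn_cd); -- for columns that are already varchar but were loaded padded
After that, cust_cntry_cd = '847' matches without trim().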

How to add a date column which is 7 days later than an existing column in a Postgres table? [duplicate]

Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx
Postgres 12 or newer
STORED generated columns are introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. Or the similar "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
fiddle
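A quick usage sketch: you insert only the base columns and Postgres computes and stores product; trying to write to the generated column raises an error.
INSERT INTO tbl (int1, int2) VALUES (6, 7);
SELECT product FROM tbl; -- 42
-- INSERT INTO tbl (int1, int2, product) VALUES (1, 2, 3); -- error: cannot write to a generated column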
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15, yet).
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
Can also be supported with a matching expression index - provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
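To make the pattern concrete, here is a hypothetical version for the two-integer table from the Postgres 12 example above (minus the generated column); the function name product is just an illustration:
CREATE FUNCTION product(t tbl)
RETURNS bigint
LANGUAGE sql IMMUTABLE AS
$$SELECT t.int1::bigint * t.int2$$;
SELECT t.*, t.product FROM tbl t; -- attribute notation: t.product resolves to product(t)
CREATE INDEX tbl_product_idx ON tbl (product(tbl));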
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.
YES you can!! The solution should be easy, safe, and performant...
I'm new to PostgreSQL, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) will use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works; they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note, however, that you can't update/insert into a view, only into the source table, but if you really want, I believe you can redirect inserts and updates to the source table using rules (I could be wrong on that last point, as I've never tried it myself).
Edit: it seems that if the query involves competing indexes, the planner may sometimes not use the expression index at all. The choice seems to be data dependent.
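To see which plan you are getting (assuming the hypothetical objects created above):
EXPLAIN SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
If the expression index is used, the plan shows an index scan on some_string_field_md5_index instead of a sequential scan.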
One way to do this is with a trigger!
CREATE TABLE computed(
one SERIAL,
two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
BEGIN
NEW.two = NEW.one * 2;
RETURN NEW;
END
$BODY$;
CREATE TRIGGER computed_500
BEFORE INSERT OR UPDATE
ON computed
FOR EACH ROW
EXECUTE PROCEDURE computed_two_trg();
The trigger is fired before the row is inserted or updated. It sets the field that we want computed on the NEW record and then returns that record.
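A quick usage sketch. The trigger fills in two even though the column is NOT NULL, because BEFORE ROW triggers run before constraints are checked:
INSERT INTO computed (two) VALUES (999); -- whatever we pass is overwritten by the trigger
SELECT one, two FROM computed; -- two = one * 2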
PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);
db<>fiddle demo
Well, I'm not sure if this is what you mean, but Postgres normally supports "dummy" ETL syntax.
I created one empty column in the table and then needed to fill it with calculated values depending on other values in the row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so basic I suspect it is not what you are looking for.
Obviously it is not dynamic; you run it once. But there is no obstacle to putting it into a trigger.
Example of creating an empty virtual column in a query:
SELECT t.*
, (SELECT * FROM (VALUES ('')) A("virtual_col")) AS virtual_col
FROM some_table t;
Example of creating two virtual columns with values:
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc
I have code that works and uses the keyword calculated. I'm not on pure PostgreSQL, though; we run on PADB.
Here is how it's used:
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;
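For reference, standard PostgreSQL does not let you reference the output alias tkts_cnt in the same select list the way PADB's calculated keyword does; a plain-Postgres rewrite of the same (hypothetical) query wraps the aggregation in a subquery:
create table some_table as
select sub.*,
(case when tkts_cnt = 1 then 1 else 0 end) as one_way
from (
select category, txn_type, indiv_id, accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest) as true_destination,
max(id) as id,
count(id) as tkts_cnt
from some_rando_table
group by 1,2,3,4
) sub;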
A lightweight solution with a CHECK constraint (it pins the column to a constant value rather than computing it from other columns):
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);

Jasper server community edition installation issues for Postgres

I installed the WAR file distribution using the install scripts in buildomatic. The installation is successful, but when I boot the Tomcat server it shows some database exceptions:
https://gist.github.com/shruti-palshikar/5ae801674dbd2a537518
I checked that the latest Postgres driver exists in tomcat/lib.
I also checked that the database 'jasperserver' has all the necessary tables.
However, these tables are empty. Does anyone know which script loads data into the tables?
Any help is appreciated.
The actual error from PostgreSQL is:
relation "jiresourcefolder" does not exist
The query seems to be:
select this_.id as id5_0_, this_.version as version5_0_, this_.uri as uri5_0_, this_.hidden as hidden5_0_, this_.name as name5_0_, this_.label as label5_0_, this_.description as descript7_5_0_, this_.parent_folder as parent8_5_0_, this_.creation_date as creation9_5_0_, this_.update_date as update10_5_0_
from JIResourceFolder this_ where (this_.uri=?)
Typical ugly framework-generated SQL.
There are only two possibilities:
There is no table "jiresourcefolder", "JIResourceFolder" or any other variation in capitals.
The table was created with quotes to preserve its case and the query is not using quotes.
The following will work:
CREATE TABLE JiReSoUrCeFoLdEr ...
SELECT * FROM jiresourcefolder...
SELECT * FROM JIRESOURCEFOLDER...
SELECT * FROM JIresourceFolder...
Any unquoted table (or column) names are internally mapped to lower case, so they will all match.
If however you quote a created table:
CREATE TABLE "JIResourceFolder"
SELECT * FROM "JIResourceFolder" -- works
SELECT * FROM JIResourceFolder -- doesn't
Check your database schema and see if you have this table and whether it is all lower-case. Then, check the documentation for your java framework(s) and see if there is some flag that controls quoting of database tables. It seems likely that the flag is set in one place and not in another.
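A quick way to see how the name was actually stored (a sketch that works on any Postgres):
SELECT table_schema, table_name
FROM information_schema.tables
WHERE lower(table_name) = 'jiresourcefolder';
If this returns a mixed-case name such as JIResourceFolder, the table was created with quotes and every query must quote it the same way.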
I just had the same issue in Jasper Studio.
My problem was that a wrong Data Adapter (a DB that did not have such a table) was assigned to the Report.
I had to switch to the Design window and select the right Data Adapter in the upper right of that window, right beside "Settings".

Checksum Validation after migration from Oracle to SQL Server

I am migrating a large database from Oracle 11g to SQL Server 2008 R2 using SSIS. How can data integrity be validated for numeric data using a checksum?
In the past, I've always done this using a few application controls. It should be something that is easy to compute on both platforms.
Frequently, the end result is a query like:
select count(*)
, count(distinct col1) col1_dist_cnt
...
, count(distinct col99) col99_dist_cnt
, sum(col1) col1_sum
...
, sum(col99) col99_sum
from the_table
Spool the output to a file, Excel, or a database table and compare the outcomes. Save them for project management and auditors.
I originally wrote this approach for checks between various financial reporting systems for regulatory reporting, so it serves most cases.
If exactly one field value is wrong, it will always show up. Two errors might compensate for each other, though; for example, if row 1 col 1 and row 2 col 1 swap their values, the plain column sum is unchanged.
To detect that, multiply each value by something unique to the row, for instance a unique ID or identity column that is included in the migration too:
, sum(ID * col1) col1_sum
...
, sum(ID * col99) col99_sum
When you get number overflows, first try using the largest available precision (sometimes difficult, especially on SQL Server). If that is still not enough, use the mod function; only a few types of error are hidden by mod.
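A hedged sketch of the mod variant, with hypothetical table and column names; both statements compute the same value for non-negative data, so the outputs can be compared directly (watch the intermediate product id * col1 for very large values):
-- Oracle
SELECT MOD(SUM(MOD(id * col1, 1000000007)), 1000000007) AS col1_chk FROM the_table;
-- SQL Server (% is the modulo operator)
SELECT SUM(CAST(id AS bigint) * col1 % 1000000007) % 1000000007 AS col1_chk FROM the_table;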
The icing on the cake is to auto-generate these statements. On Oracle, look at user_tables and user_tab_columns; on SQL Server, look at syscolumns etc.
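For example, on Oracle something like this generates one SUM line per numeric column (the table name is a placeholder):
SELECT ', sum(' || lower(column_name) || ') ' || lower(column_name) || '_sum'
FROM user_tab_columns
WHERE table_name = 'THE_TABLE'
AND data_type = 'NUMBER'
ORDER BY column_id;
Paste the output into the SELECT skeleton above and run the result on both sides.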