Lift load Dateformat issue from csv file - db2

we are migrating db2 data to db2 on cloud. We are using below lift cli operation for migration.
Extracting a database table to a CSV file using lift extract from source database.
Then loading the extracted CSV file to db2 on cloud using 'lift load'
ISSUE:
We have created some tables using ddl on the target db2oncloud which have some columns with DATA TYPE "TIMESTAMP"
while load operation(lift load), we are getting below error"
"MESSAGE": "The field in row \"2\", column \"8\" which begins with
\"\"2018-08-08-04.35.58.597660\"\" does not match the user specified
DATEFORMAT, TIMEFORMAT, or TIMESTAMPFORMAT. The row will be
rejected.", "SQLCODE": "SQL3191W"

If you use db2 as a source database, then use either:
the following property during export (to export dates, times, timestamps as usual for db2 utilities - without double quotes):
source-database-type=db2
try to use the following property during load, if you have already
exported timestamps surrounded by double quotes:
timestamp-format="YYYY-MM-DD-HH24.MI.SS.FFFFFF"

If the data was extracted using lift extract then for sure you should load the data with source-database-type=db2. Using this parameter will preconfigure all the necessary load details automatically.

Related

Azure Data Factory schema mapping not working with SQL sink

I have a simple pipeline that loads data from a csv file to an Azure SQL db.
I have added a data flow where I have ensured all schema matches the SQL table. I have a specific field which contains numbers with leading zeros. The data type in the source - projection is set to string. The field is mapped to the SQL sink showing as string data-type. The field in SQL has nvarchar(50) data-type.
Once the pipeline is run, all the leading zeros are lost and the field appears to be treated as decimal:
Original data: 0012345
Inserted data: 12345.0
The CSV data shown in the data preview is showing correctly, however for some reason it loses its formatting during insert.
Any ideas how I can get it to insert correctly?
I had repro’d in my lab and was able to load as expected. Please see the below repro details.
Source file (CSV file):
Sink table (SQL table):
ADF:
Connect the data flow source to the CSV source file. As my file is in text format, all the source columns in the projection are in a string.
Source data preview:
Connect sink to Azure SQL database to load the data to the destination table.
Data in Azure SQL database table.
Note: You can all add derived columns before sink to convert the value to string as the sink data type is a string.
Thank you very much for your response.
As per your post the DF dataflow appears to be working correctly. I have finally discovered an issue with the transformation - I have an Azure batch service which runs a python script, which does a basic transformation and saves the output to a csv file.
Interestingly, when I preview the data in the dataflow, it looks as expected. However, the values stored in SQL are not.
For the sake of others having a similar issue, my existing python script used to convert a 'float' datatype column to string-type. Upon conversion, it used to retain 1 decimal number but as all of my numbers are integers, they were ending up with .0.
The solution was to convert values to integer and then to string:
df['col_name'] = df['col_name'].astype('Int64').astype('str')

How to transform data type in Azure Data Factory

I would like to copy the data from local csv file to sql server in Azure Data Factory. The table in sql server is created already. The local csv file is exported from mysql.
When I use copy data in Azure Data Factory, there is an error "Exception occurred when converting value 'NULL' for column name 'deleted' from type 'String' to type 'DateTime'. The string was not recognized as a valid DateTime.
What I have done:
I checked the original value from column name 'deleted' is NULL, without quotes(i.e. not 'NULL').
I cannot change the data type during file format settings. The data type for all column is preset to string as default.
I tried to create data flow instead of copy data. I can change the data type from source projection. But the sink dataset cannot select sql server.
What can I do to copy data from CSV file to sql server via Azure Data Factory?
Data Flow doesn't support on-premise SQL. We can't create the source and sink.
You can use copy active or copy data tool to do that. I made an example data which delete is NULL:
As you said the delete column data is Null or contains NULL, and ALL will be considered as String. The key is that your Sink SQL Server table schema if it allows NULL.
I tested many times and it all works well.

Meta Data Column doesn't match with source column names in Informatica Powercenter

I am loading data from a CSV Flat File into a target Oracle DB using Informatica. The Flat file contains Arabic Column Names and I happen to create the table with similar column names using double-quotes. But when running the map it gives me the above error i.e. Meta Data Column doesn't match with source column names.
If I change the names of the columns to English in the target table things are fine.
I found out that it is currently a bug in Informatica and will be resolved in the next release. We are using Informatica 10.2 and oracle 12c as the target.

Transforming data type in Azure Data Factory

I have a "Copy" step in my Azure Data Factory pipeline which copies data from CSV file to MSSQL.
Unfortunately, all columns in CSV comes as String data type. How can I change these data types to match the data type in SQL table.
Here is how the data is available in CSV file.
I would like to change data type of WIPStateKey to Integer and ReportDt to Timestamp. I do not seem to find an option to achieve this.
Yes as you said "all columns in CSV comes as String data type".
But when using a copy active, choose the csv file as the source, we can import the schema and change the column data type.
I created a demo.csv file for test:
I copy data from my demo.csv file to my Azure SQL database.
During file format setting, we can change the column data type:
Table mapping:
Column mapping:
Copy completed:
Hope this helps

How to populated the table via Pentaho Data Integration's table_output step?

I am performing an ETL job via Pentaho 7.1.
The job is to populate a table 'PRO_T_TICKETS' in PostgreSQL 9.2 via the Pentaho Jobs and transformations?
I have mapped the table fields with respect to the stream fields
Mapped Fields
My Table PRO_T_TICKETS contains the Schema (Column Names) in UPPERCASE.
Is this the reason I can't populate the table PRO_T_TICKETS with my ETL Job?
I duplicated the step TABLE_OUTPUT to PRO_T_TICKETS and changed the Target table field to 'PRO_T_TICKETS2'. Pentaho created a new table with lowercase schema and populated the data in it.
But I want this data to be uploaded in the table PRO_T_TICKETS only and with the UPPERCASE schema if possible.
I am attaching the whole job here and the error thrown by Pentaho. Pentaho Error I have also tried my query by adding double quotes to the column names as you can see in the error. But it didn't help.
What do you think I should do?
When you create (or modify) the connection, select Advanced on the left panel and click on the Force to upper case or Force to lower case or, even better, Preserve case of reserved words.
To know which option to choose, copy the 4th line of your error log, the line starting with INSERT INTO "public"."PRO_T_TICKETS("OID"... in your SQL-developer tool and change the connection advanced parameters until it works.
Also, at debug time, don't use batch updates, don't use lazy conversion on previous steps, and try with one (1) field rather than all (25).
Just as a complement: it worked for me following the tips from AlainD and using specific configurations that I'd like to share with you. I have a transformation streaming data from MySQL to PostgreSQL using a Table Input and Output. In both of DBs I have uppercase objects.
I did the following steps to work in the right way:
In the table input (MySQL) the objects are uppercase too, but I typed in lowercase and it worked and I didn't set any special option in the DB Connection.
In the table output (PostgreSQL) I typed everything in uppercase (schema, table name and columns) and I also set "specify the database fields" (clicking on "Get fields").
In the target DB Connection (PostgreSQL) I put the options (in "Advanced" section): "Quote all in database" and "Preserve case of reserved words".
PS: Ah, the last option is because I've found out that there was one more problem with my fields: there was a column called "Admin" (yes guys, they created a camelcase column using a reserved word!) and for that reason I must to put "Preserve case of reserved words" and type it as "Admin" (without quotes and in camelcase) in the Table Output.