db2 import csv with null date

I run this
db2 "IMPORT FROM C:\my.csv OF DEL MODIFIED BY COLDEL, LOBSINFILE DATEFORMAT=\"D/MM/YYYY\" SKIPCOUNT 1 REPLACE INTO scratch.table_name"
However, some of my rows have an empty date field, so I get this error:
SQL3191N which begins with """" does not match the user specified DATEFORMAT, TIMEFORMAT, or TIMESTAMPFORMAT. The row will be rejected.
My CSV file looks like this
"XX","25/10/1985"
"YY",""
"ZZ","25/10/1985"
I realise that if I inserted a character instead of a blank string I could use the NULL INDICATORS parameter.
However, I do not have access to change the CSV file. Is there a way to import a blank string as a null?

This is an error in your input file. DB2 differentiates between a NULL and a zero-length string. If you need to have NULL dates, a NULL would have no quotes at all, like:
"AA",
If you can't change the format of the input file, you have 2 options:
Insert your data into a staging table (changing the DATE column to a char) and then use SQL to populate the ultimate target table (a sketch follows below)
Write a program to parse ("fix") the input file and then import the resulting fixed data. You can often do this without having to write the entire file out to disk – your program could write to a named pipe, and the DB2 IMPORT (and LOAD) utility is capable of reading from named pipes.
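A minimal sketch of the first (staging table) option, assuming illustrative names scratch.staging and date_txt for the staging table and its character date column:
db2 "CREATE TABLE scratch.staging (code VARCHAR(10), date_txt VARCHAR(10))"
db2 "IMPORT FROM C:\my.csv OF DEL MODIFIED BY COLDEL, SKIPCOUNT 1 REPLACE INTO scratch.staging"
With DATEFORMAT dropped and the column defined as plain character data, the blank strings import as empty strings instead of being rejected; they can then be turned into NULLs with SQL when you populate the real table.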

I'm not aware of anything. Yes, ideally that date field should be null.
Probably the best thing to do would be to load the data into a scratch/temp table where that column isn't a date - just leave it as character data (it looks like you're already using a scratch table anyway). After that it should be trivial to use a CASE statement to transform the value into a null date when it is blank, as part of your INSERT into the real table, as sketched below.
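For example, a sketch of that INSERT, reusing the illustrative staging table from the answer above and assuming the target date column is called date_col:
INSERT INTO scratch.table_name (code, date_col)
SELECT code,
       CASE
         WHEN date_txt = '' THEN NULL
         ELSE DATE(TO_DATE(date_txt, 'DD/MM/YYYY'))
       END
FROM scratch.staging;
TO_DATE here is the DB2 built-in (a synonym for TIMESTAMP_FORMAT); wrapping it in DATE() strips the time portion.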

Related

Azure ADF Copy Activity with Trailing Column Delimiter

I have a strange source CSV file that contains a trailing column delimiter at the end of each record, just before the carriage return/new line.
When ADF is previewing this data, it displays only 2 columns without issue and all the data rows. However, when using the copy activity, it fails with the following exception.
ErrorCode=DelimitedTextColumnNameNotAllowNull,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The name of column index 3 is empty. Make sure column name is properly specified in the header
Now I understand why it's complaining about this due to the trailing delimiter, but my question is whether or not there is a way to deal with this condition. I've tried including the trailing comma in the record delimiter (,\r\n), but then it just pivots the data where all the columns become rows.
Is there a way to address this condition in the copy activity?
When previewing the data in the dataset, it seems correct:
But in the copy activity the data is actually split into 3 columns by the column delimiter ",", and the third column is empty or NULL. This causes the error.
If you use the Data Flow import projection from the source, you can see the third column:
For now, the copy activity doesn't support modifying the data schema. You must use a Data Flow Derived Column to create a new schema for the source. For example:
Then mapping the new column/schema to the sink will solve the problem.
HTH.
Use a different encoding for your CSV. CSV utf-8 will do the trick.

Getting Redshift error 1214 during copy

I have the following table in Redshift:
Column | Type
id     | integer
value  | varchar(255)
I'm trying to copy data in (using Data Pipeline's RedshiftCopyActivity), and the data has the line 1,maybe as the entry being added, but I get back the error 1214: Delimiter not found, and the raw_field_data value is maybe. Is there something I'm missing in the copy parameters?
The entire csv is three lines that goes:
1,maybe
2,no
3,yes
You may want to take a look at the similar question Redshift COPY command delimiter not found.
Make sure your RedshiftCopyActivity configuration includes FORMAT AS CSV from https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-format.html#copy-csv.
Be sure your input data has your configured delimiter between every field, even in the case of nulls.
Be sure you do not have any trailing blank lines.
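As a point of comparison, the roughly equivalent hand-written COPY for this table would look like the following sketch (table name, bucket, and IAM role are placeholders):
COPY my_table (id, value)
FROM 's3://my-bucket/data.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS CSV;
Without FORMAT AS CSV, COPY defaults to the pipe (|) delimiter, which is a common cause of the 1214 "Delimiter not found" error when loading comma-separated data.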
You can run the following SQL (from the linked question) to see more specific details of what row is causing the problem.
SELECT le.starttime,
d.query,
d.line_number,
d.colname,
d.value,
le.raw_line,
le.err_reason
FROM stl_loaderror_detail d
JOIN stl_load_errors le
ON d.query = le.query
ORDER BY le.starttime DESC;

Insert yyyyMMdd string into date column using Talend

I have the follow situation:
A PostgreSQL database with a table that contains a date type column called date.
A string from a delimited .txt file outputting: 20170101.
I want to insert the string into the date type column.
So far I have tried the following, with mixed results/errors:
row1.YYYYMMDD
Detail Message: Type mismatch: cannot convert from String to Date
Explanation: This one is fairly obvious.
TalendDate.parseDate("yyyyMMdd",row1.YYYYMMDD)
Batch entry 0 INSERT INTO "data" ("location_id","date","avg_winddirection","avg_windspeed","avg_temperature","min_temperature","max_temperature","total_hours_sun","avg_precipitation") VALUES (209,2017-01-01 00:00:00.000000 +01:00:00,207,7.7,NULL,NULL,NULL,NULL,NULL) was aborted. Call getNextException to see the cause.
I can see the string parsed into "2017-01-01 00:00:00.000000 +01:00:00".
When I try to execute the query directly I get a "SQL Error: 42601: ERROR: Syntax error at "00" position 194".
Other observations/attempts:
The funny thing is if I use '20170101' as a string in the query it works, see below.
INSERT INTO "data" ("location_id","date","avg_winddirection","avg_windspeed","avg_temperature","min_temperature","max_temperature","total_hours_sun","avg_precipitation") VALUES (209,'20170101',207,7.7,NULL,NULL,NULL,NULL,NULL)
I've also tried to change the schema of the database date column to string. It produces the following:
Batch entry 0 INSERT INTO "data" ("location_id","date","avg_winddirection","avg_windspeed","avg_temperature","min_temperature","max_temperature","total_hours_sun","avg_precipitation") VALUES (209,20170101,207,7.7,NULL,NULL,NULL,NULL,NULL) was aborted. Call getNextException to see the cause.
This query also doesn't work directly because the date isn't between single quotes.
What am I missing or not doing?
(I've started learning to use Talend 2-3 days ago)
EDIT//
Screenshots of my Job and tMap
http://imgur.com/a/kSFd0
EDIT// It doesn't appear to be a date formatting problem but a Talend-to-PostgreSQL connection problem
EDIT//
FIXED: It was a stupidly easy problem/solution of course. The database name and schema name fields were empty... so it basically didn't know where to connect
You don't have to do anything to insert a string like 20170101 into a date column. PostgreSQL will handle it for you; it's just the ISO 8601 date format.
CREATE TABLE foo ( x date );
INSERT INTO foo (x) VALUES ( '20170101' );
This is just a Talend problem, if anything.
[..] (209,2017-01-01 00:00:00.000000 +01:00:00,207,7.7,NULL,NULL,NULL,NULL,NULL)[..]
If Talend doesn't know by itself that a timestamp passed into the query has to be single-quoted, then you need to handle it yourself if possible.
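To illustrate the quoting difference with a trimmed-down column list (assuming the remaining columns are nullable):
-- accepted: the date value is a quoted literal
INSERT INTO "data" ("location_id", "date") VALUES (209, '2017-01-01');
-- rejected with a syntax error: the timestamp literal is not quoted
-- INSERT INTO "data" ("location_id", "date") VALUES (209, 2017-01-01 00:00:00);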
FIXED: It was a stupidly easy problem/solution of course. The database name and schema name fields were empty... so it basically didn't know where to connect. That's why I got the Batch entry 0 error, and when I went deeper while debugging I found it couldn't find the table, stating the relation didn't exist.
Try it like this.
The data in the input file is: 20170101 (in String format).
Then set the tMap like this:
The output is as follows:

load data to db2 in a single row (cell)

I need to load an entire file (containing only ASCII text) into the database (DB2 Express edition). The table has only two columns (ID, TEXT). The ID column is the PK, with auto-generated data, whereas TEXT is CLOB(5): I have no idea about the input parameter 5, it was entered by default in Data Studio.
Now I need to use the load utility to save a text file (containing 5 MB of data) in a single row, namely in the column TEXT. I do not want the text to be broken into different rows.
Thanks for your answers in advance!
Firstly, you may want to redefine your table: CLOB(5) means you expect 5 bytes in the column, which is hardly enough for a 5 MB file. After that you can use the DB2 IMPORT or LOAD commands with the lobsinfile modifier.
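For example, a redefinition along these lines gives the column room for a 5 MB file (the table and column names are only a guess at your schema):
CREATE TABLE yourtable (
  id   INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  text CLOB(10M)
);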
Create a text file and place LOB Location Specifiers (LLS) for each file you want to import, one per line.
LLS is a way to tell IMPORT where to find LOB data. It has this format: <file path>[.<offset>.<length>/], e.g. /tmp/lobsource.dta.0.100/ to indicate that the first 100 bytes of the file /tmp/lobsource.dta should be loaded into the particular LOB column. Notice also the trailing slash. If you want to import the entire file, skip the offset and length part. LLSes are placed in the input file instead of the actual data for each row and LOB column.
So, for example:
echo "/home/you/yourfile.txt" > /tmp/import.dat
Since you said the IDs will be generated in the input data, you don't need to enter them in the input file, just don't forget to use the appropriate command modifier: identitymissing or generatedmissing, depending on how the ID column is defined.
Now you can connect to the database and run the IMPORT command, e.g.
db2 "import from /tmp/import.dat of del
modified by lobsinfile identitymissing
method p (1)
insert into yourtable (yourclobcolumn)"
I split the command onto multiple lines for readability, but you should type it on a single line.
method p (1) means parse the input file and read the column in position 1.
More info in the manual

how can I ignore id column when importing into mySQL via phpMyAdmin?

I need to export data from a table in database A, then import it into an identically-structured table in database B. This needs to be done via phpMyAdmin. Here's the problem: no matter what format I choose for the export (CSV or SQL) ALL columns (including the auto-incremented ID field) get exported. Because there's already data in the table in database B, I can't import the ID field with the new records - I need it to import the records and assign new auto-incremented values to the records. What settings do I need to use in either the export (to be able to choose which columns to export) or the import (to tell it to ignore the ID column in the file)?
Or should I just export as CSV, then open in Excel and delete the ID column? Is there a way to tell phpMyAdmin that it should generate new auto-incremented IDs for the records being imported, without it telling me that there's an incorrect column count in the import file?
EDIT: to clarify, I'm exporting only data, not structure.
Excel is an option to remove the column and probably the fastest at this point.
But if these databases are on the same server and you have access, you can just do an INSERT INTO databaseB.table (column_list) SELECT column_list FROM databaseA.table.
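A sketch of that statement with hypothetical table and column names; the auto-increment id is simply left out of the column list, so MySQL assigns new values on insert:
INSERT INTO databaseB.my_table (name, email, created_at)
SELECT name, email, created_at
FROM databaseA.my_table;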
You can also run the SELECT statement to just get the desired columns and then export the results. This link should be available in the recent versions of PHPMyAdmin.
It is several years since the original question, but this still came out top in a google search so I'll comment on what worked for me:
If I delete the Id column in my CSV and then try to import I get the 'Invalid column count in CSV input on line 1.' error.
But if I keep the Id column and change all of the Id values to NULL in Excel (just typing NULL into the cell), then when I import this the id auto-increment fills in the new records with consecutive numbers (presumably starting with the highest existing record Id + 1).
I'm using PHPMyAdmin 4.7.0
Another way is:
Go to the import menu for that table
Add the CSV file (without an ID column)
Pick CSV in the Format section
In the section where you pick the format of the file (separated by which char, which char encloses fields, etc.) there's a field called Column names. Type the names of the columns you ARE including, separated by commas.