Picking up a File with the Latest Timestamp in Talend

I have designed a Talend job which picks up a file from an S3 location. I have written a COPY command in a tRedshiftRow component. The COPY command currently looks like this:
truncate Table A;
commit;
copy Table A
(
Column1
Column2
Column3
)
from 's3://*/*/*/*_2017-02-17_14-22-48.txt'
CREDENTIALS 'aws_access_key_id=x;aws_secret_access_key=x'
DATEFORMAT 'YYYYMMDD' TIMEFORMAT 'YYYYMMDD HH:MI:SS' delimiter '|' IGNOREHEADER 1 IGNOREBLANKLINES ACCEPTINVCHARS ;
The filename will be different every day: it follows the pattern x_2017-MM-DD_HH-MM-SS.txt, where x is the file name prefix and the date and time portion changes each day. Can anyone please help me pick up the file with the latest timestamp?
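One way to avoid hard-coding the timestamp (a sketch only, not necessarily the approach to be used here; the bucket, folder, and table names below are placeholders) relies on the fact that Redshift's COPY treats the FROM value as an S3 key prefix and loads every object whose key begins with it:

truncate table_a;
commit;

-- table_a, my-bucket and inbound are placeholders for the real names.
-- COPY loads every object under the given key prefix, so
-- 's3://my-bucket/inbound/x_' matches x_2017-02-17_14-22-48.txt,
-- x_2017-02-18_09-01-15.txt, and so on. This only picks up the latest
-- file if older files are archived or deleted from the folder before
-- the job runs.
copy table_a
(
column1,
column2,
column3
)
from 's3://my-bucket/inbound/x_'
CREDENTIALS 'aws_access_key_id=x;aws_secret_access_key=x'
DATEFORMAT 'YYYYMMDD' TIMEFORMAT 'YYYYMMDD HH:MI:SS' delimiter '|' IGNOREHEADER 1 IGNOREBLANKLINES ACCEPTINVCHARS;

If old files have to stay in the folder, the alternative is to build the full key (the 2017-MM-DD_HH-MM-SS part) inside the Talend job, for example in a context variable, and concatenate it into the query string of the tRedshiftRow component.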

Related

Pass a string column from a CSV file to an Oracle table date column in PowerShell

I am uploading a CSV file to an Oracle (11g) DB table from PowerShell.
The CSV file contains four VARCHAR2 columns and one date column. I am able to upload the CSV file if I declare the Time1 column of the table as VARCHAR2 in the Oracle DB, but an error occurs if I declare the Time1 column as TIMESTAMP(2). All columns allow NULL in Oracle.
PowerShell Version 5.1.19041.1320
Oracle 11g DB
Sample CSV data (MM/DD/YYYY HH:MM format):
Ithaca,,TRIANGLE,NY,6/1/1930 22:00
New York Worlds Fair,,LIGHT,NY,4/18/1933 19:00
Valley City,,DISK,ND,9/15/1934 15:30
,,,LA,8/15/1943 0:00
,,LIGHT,LA,8/15/1943 0:00
San Diego,,CIGAR,CA,1/1/1944 12:00
Wilderness,,DISK,WV,1/1/1944 12:00
PowerShell Commands:
Import-Csv $tgt_file_etl -Header 'City','Colors_Reported','Shape_Reported','State1','Time1' | %{
$cmd.CommandText = "insert into $table (CITY, COLORS_REPORTED, SHAPE_REPORTED, STATE1, TIME1)
values ('$($_.City)','$($_.Colors_Reported)','$($_.Shape_Reported)','$($_.State1)','$($_.Time1)')"
$cmd.ExecuteNonQuery() | Out-Null
}
I altered the date format in Oracle using the following commands.
ALTER SESSION SET NLS_DATE_FORMAT = 'MM-DD-RRRR'
ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'MM-DD-RRRR HH24.MI.SSXFF';
I don't know how to format the date column in PowerShell. Any help would be highly appreciated.
Thanks to all.
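A detail that may help (a sketch only; UFO_SIGHTINGS is a stand-in for whatever table $table actually points at): if the value is wrapped in TO_TIMESTAMP with a format mask that matches the CSV, the session NLS settings stop mattering, so a single row from the sample would be inserted like this:

-- ufo_sightings is a placeholder table name; the column list comes from
-- the question. The mask matches CSV values such as '6/1/1930 22:00';
-- Oracle accepts single-digit months, days and hours against
-- MM/DD/YYYY HH24:MI as long as FX is not used.
INSERT INTO ufo_sightings (city, colors_reported, shape_reported, state1, time1)
VALUES ('Ithaca', NULL, 'TRIANGLE', 'NY',
        TO_TIMESTAMP('6/1/1930 22:00', 'MM/DD/YYYY HH24:MI'));

In the PowerShell loop that would mean emitting TO_TIMESTAMP('$($_.Time1)', 'MM/DD/YYYY HH24:MI') in the VALUES list instead of the bare '$($_.Time1)' literal.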

Not all rows copy from CSV into PostgreSQL

I have a CSV file that contains over 2 million rows.
When I use the COPY statement in PostgreSQL, it loads a little over 1 million rows.
I am using the statement below:
copy table (
columns[1],
columns[2],
columns[3],
columns[4],
columns[5],
columns[6],
columns[7],
columns[8]
)
from 'C:\Temp\co11700t_bcp\co11700t_bcp.csv' with delimiter ']' quote '"' CSV;
I bulk-copied the data from a cmd file and used Windows Notepad to set the encoding to UTF-8.

Postgres: Error when using COPY from a CSV with timestamptz type

I am using Postgres 9.5.3 (on Ubuntu 16.04) and I have a table with some timestamptz fields:
...
datetime_received timestamptz NULL,
datetime_manufactured timestamptz NULL,
...
I used the following SQL command to generate CSV file:
COPY (select * from tmp_table limit 100000) TO '/tmp/aa.csv' DELIMITER ';' CSV HEADER;
and used:
COPY tmp_table FROM '/tmp/aa.csv' DELIMITER ';' CSV ENCODING 'UTF-8';
to import into the table.
The example of rows in the CSV file:
CM0030;;INV_AVAILABLE;2016-07-30 14:50:42.141+07;;2016-08-06 00:00:000+07;FAHCM00001;;123;;;;;1.000000;1.000000;;;;;;;;80000.000000;;;2016-07-30 14:59:08.959+07;2016-07-30 14:59:08.959+07;2016-07-30 14:59:08.959+07;2016-07-30 14:59:08.959+07;
But I encounter the following error when running the second command:
ERROR: invalid input syntax for type timestamp with time zone: "datetime_received"
CONTEXT: COPY inventory_item, line 1, column datetime_received: "datetime_received"
My database's timezone is:
show timezone;
TimeZone
-----------
localtime(GMT+7)
(1 row)
Is there any missing step or wrong configuration?
Any suggestions are appreciated!
The error you're seeing means that Postgres is trying (and failing) to convert the string 'datetime_received' to a timestamp value.
This is happening because COPY is trying to insert the header row into your table. You need to include a HEADER clause on the COPY FROM command, just like you did for the COPY TO.
More generally, when using COPY to move data around, you should make sure that the TO and FROM commands are using exactly the same options. Specifying ENCODING for one command and not the other can lead to errors, or silently corrupt data, if your client encoding is not UTF8.
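A minimal sketch of the matched pair of commands described above, with HEADER (and ENCODING) given on both sides:

-- Export: write a header row and pin the encoding explicitly.
COPY (SELECT * FROM tmp_table LIMIT 100000)
  TO '/tmp/aa.csv' DELIMITER ';' CSV HEADER ENCODING 'UTF-8';

-- Import: use exactly the same options, so the header row is skipped
-- and the file is read back in the encoding it was written with.
COPY tmp_table
  FROM '/tmp/aa.csv' DELIMITER ';' CSV HEADER ENCODING 'UTF-8';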

Which delimiter to use when loading CSV data into Postgres?

I've come across a problem with loading some CSV files into my Postgres tables. I have data that looks like this:
ID,IS_ALIVE,BODY_TEXT
123,true,Hi Joe, I am looking for a new vehicle, can you help me out?
Now, the problem here is that the text in what is supposed to be the BODY_TEXT column is unstructured email data and can contain any sort of characters. When I run the following COPY command, it fails because there are multiple , characters within BODY_TEXT.
COPY sent FROM 'my_file.csv' DELIMITER ',' CSV;
How can I resolve this so that everything in the BODY_TEXT column gets loaded as-is without the load command potentially using characters within it as separators?
In addition to fixing the source file format, you can handle it in PostgreSQL itself.
Load all lines from file to temporary table:
create temporary table t (x text);
copy t from 'foo.csv';
Then you can split each string using a regexp like:
select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') from t;
regexp_matches
---------------------------------------------------------------------------
{123,true,"Hi Joe, I am looking for a new vehicle, can you help me out?"}
{456,false,"Hello, honey, there is what I want to ask you."}
(2 rows)
You can use this query to load data to your destination table:
insert into sent(id, is_alive, body_text)
select x[1], x[2], x[3]
from (
select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') as x
from t) t
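One caveat (an assumption about the destination schema, which the question doesn't show): regexp_matches returns a text[] value, so if sent.id or sent.is_alive is declared as integer or boolean rather than text, the array elements need explicit casts, roughly like this:

insert into sent (id, is_alive, body_text)
select x[1]::integer,   -- array elements are text, so cast where the
       x[2]::boolean,   -- destination columns are typed
       x[3]
from (
  select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') as x
  from t
) s;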

Exporting to CSV from DB2 with no delimiter

I need to export the contents of a DB2 table to a CSV file.
I read that nochardel would prevent the separator from being placed between fields, but that is not happening.
Suppose I have a table
MY_TABLE
-----------------------
Field_A varchar(10)
Field_B varchar(10)
Field_C varchar(10)
I am using this command
export to myfile.csv of del modified by nochardel select * from MY_TABLE
I get this written to myfile.csv:
data1 ,data2 ,data3
but I would like no ',' separator, like below:
data1 data2 data3
Is there a way to do that?
You're asking how to eliminate the comma (,) in a comma separated values file? :-)
NOCHARDEL tells DB2 not to surround character-fields (CHAR and VARCHAR fields) with a character-field-delimiter (default is the double quote " character).
Anyway, when exporting from DB2 using the delimited format, you have to have some kind of column delimiter. There isn't a NOCOLDEL option for delimited files.
The EXPORT utility can't write fixed-length (positional) records; you would have to do that by one of the following:
Writing a program yourself,
Using a separate utility (IBM sells the High Performance Unload utility), or
Writing an SQL statement that concatenates the individual columns into a single string.
Here's an example for the last option:
export to file.del
of del
modified by nochardel
select
cast(col1 as char(20)) ||
cast(intcol as char(10)) ||
cast(deccol as char(30))
from my_table;
This last option can be a pain since DB2 doesn't have an sprintf() function to help format strings nicely.
Yes, there is another way of doing this. I always do it like this:
Put the select statement into a file (input.sql):
select
cast(col1 as char(20)),
cast(col2 as char(10)),
cast(col3 as char(30))
from my_table;
Call the DB2 CLP like this:
db2 -x -tf input.sql -r result.txt
This will work for you, because you need to cast varchar to char. Like Ian said, casting numbers or other data types to char might bring unexpected results.
PS: I think Ian is right about the difference between CSV and fixed-length format ;-)
Use "of asc" instead of "of del". Then you can specify the fixed column locations instead of delimiting.