Getting Redshift error 1214 during copy - amazon-redshift

I have the following table in redshift:
Column | Type
id     | integer
value  | varchar(255)
I'm trying to copy in (using Data Pipeline's RedshiftCopyActivity), and the data has the line 1,maybe as the entry being added, but I get back error 1214: Delimiter not found, and the raw_field_data value is maybe. Is there something I'm missing in the copy parameters?
The entire CSV is three lines:
1,maybe
2,no
3,yes

You may want to take a look at the similar question Redshift COPY command delimiter not found.
Make sure your RedshiftCopyActivity configuration includes FORMAT AS CSV (see https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-format.html#copy-csv).
Be sure your input data has your configured delimiter between every field, even in the case of nulls.
Be sure you do not have any trailing blank lines.
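For reference, with FORMAT AS CSV the underlying COPY issued by the activity should look roughly like the sketch below; the table name, S3 path, and IAM role here are placeholders, not values from the question.
-- Hypothetical sketch of the COPY behind RedshiftCopyActivity; FORMAT AS CSV
-- makes Redshift parse quoted CSV fields instead of relying on DELIMITER alone.
COPY my_table
FROM 's3://my-bucket/path/data.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
FORMAT AS CSV;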
You can run the following SQL (from the linked question) to see more specific details of what row is causing the problem.
SELECT le.starttime,
d.query,
d.line_number,
d.colname,
d.value,
le.raw_line,
le.err_reason
FROM stl_loaderror_detail d
JOIN stl_load_errors le
ON d.query = le.query
ORDER BY le.starttime DESC;

Related

Azure ADF Copy Activity with Trailing Column Delimiter

I have a strange source CSV file that contains a trailing column delimiter at the end of each record, just before the carriage return/new line.
When ADF is previewing this data, it displays only 2 columns without issue and all the data rows. However, when using the copy activity, it fails with the following exception.
ErrorCode=DelimitedTextColumnNameNotAllowNull,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The name of column index 3 is empty. Make sure column name is properly specified in the header
Now I understand why it's complaining about this, due to the trailing delimiter, but my question is whether there is a way to deal with this condition. I've tried including the trailing comma in the record delimiter (,\r\n), but then it just pivots the data so that all the columns become rows.
Is there a way to address this condition in the copy activity?
When previewing the data in the dataset, it seems correct:
But in the Copy activity, the data is actually split into 3 columns by the column delimiter ",", and the third column is empty or NULL. This is what causes the error.
If you use a Data Flow and import the projection from the source, you can see the third column:
For now, the Copy activity doesn't support modifying the data schema. You must use a Data Flow Derived Column transformation to create a new schema for the source. For example:
Then map the new columns/schema to the sink and the problem is solved.
HTH.
Use a different encoding for your CSV. CSV utf-8 will do the trick.

Importing issue in postgresql using pgadmin 4

The file is not importing after I created the table. The first line of code is for the table (COPY), the second line is for the path of the file (FROM), and I am not entirely sure about the WITH; I don't know whether a prior line of code needs to be entered for it to succeed, as it is not being highlighted in pink. The import should go through using either pgAdmin's built-in tool or the SQL, but neither of them generates the needed output. Here are some screenshots:
So I made another table, this time focusing on a single column and ensuring that the column name matched in both the table and the file, and it worked. The prior example had several columns whose names were spelled differently in the table and in the file:
You can try this sequence:
1. First create the csv file. The .csv file's column order is most important.
2. Consider the employee_info.csv file below, and consider your database table employee_info, which contains (emp_id [numeric], emp_name [character], emp_sal [numeric], emp_loc [character]).
3. Then execute the query below:
copy employee_info(emp_id,emp_name,emp_sal,emp_loc) from 'C:\Users\Zbook\Desktop\employee_info.csv' DELIMITER ',' CSV;
Note: ensure that no .csv row value is null (empty), like below...
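The employee_info.csv file and table in this answer were originally shown as screenshots; as a purely hypothetical reconstruction (invented values), they might look like this:
-- Hypothetical reconstruction of the employee_info table described above.
CREATE TABLE employee_info (
    emp_id   numeric,
    emp_name varchar(100),
    emp_sal  numeric,
    emp_loc  varchar(100)
);
-- employee_info.csv, with columns in the same order as the table:
-- 101,Asha,50000,Pune
-- 102,Ravi,62000,Mumbai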

How to assign csv field value to SQL query written inside table input step in Pentaho Spoon

I am pretty new to Pentaho, so my question might sound very novice.
I have written a transformation in which I am using a CSV file input step and a Table input step.
Steps I followed:
Initially, I created a parameter in the transformation properties. The parameter birthdate doesn't have any default value set.
I have used this parameter in the PostgreSQL query in the Table input step in the following manner:
select * from person where EXTRACT(YEAR FROM birthdate) > ${birthdate};
I am reading the CSV file using CSV file input step. How do I assign the birthdate value which is present in my CSV file to the parameter which I created in the transformation?
(OR)
Could you guide me through the process of assigning the CSV field value directly to the SQL query used in the Table input step, without the use of a parameter?
TLDR;
I recommend using a "database join" step like in my third suggestion below.
See the last image for reference
First idea - Using Table Input as originally asked
Well, you don't need any parameter for that, unless you are going to provide the value for that parameter when asking the transformation to run. If you need to read data from a CSV you can do that with this approach.
First, read your CSV and make sure your rows are ok.
After that, use a select values to keep only the columns to be used as parameters.
In the table input, use a placeholder (?) to determine where to place the data and ask it to run for each row that it receives from the source step.
Just keep in mind that the order of the columns received by the Table input (the columns coming out of the Select values step) is the same order in which they will be used for the placeholders (?). This should not be a problem in your case, since you use only one placeholder, but keep it in mind as you ramp up using Pentaho.
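As a sketch, reusing the person/birthdate names from the question, the Table input query with a positional placeholder would be:
-- Sketch of the Table input query; the ? is filled from the birthdate
-- field of each incoming row when "Execute for each row" is enabled.
-- If the CSV field arrives as a String you may need a cast such as
-- ?::integer, as shown in the other answer below.
SELECT *
FROM person
WHERE EXTRACT(YEAR FROM birthdate) > ?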
Second idea, using a Database Lookup
This is another approach where you can't customize the query made to the database, but you may get better performance because you can set an "Enable cache" flag. If you don't need to use a function in your WHERE clause, this is really recommended.
Third idea, using a Database Join
That is my recommended approach if you need a function in your WHERE clause. It looks a lot like the Table input approach, but you can skip the Select values step and select which columns to use, repeat the same column a bunch of times, and enable an "outer join" flag that also returns the rows for which the query produced no result.
Pro tip: if the transformation is running too slowly, try using multiple copies of the step (documentation here) and, obviously, make sure the table has the appropriate indexes in place.
Yes, there's a way of assigning it directly without the use of a parameter. Do as follows.
Use a "Block this step until steps finish" step to halt the Table input step until the CSV input step completes.
Following is how you configure each step.
Note:
The Postgres query should be: select * from person where EXTRACT(YEAR FROM birthdate) > ?::integer
Check Execute for each row and Replace variables in the Table input step.
Select only the birthday column in the CSV input step.

db2 import csv with null date

I run this
db2 "IMPORT FROM C:\my.csv OF DEL MODIFIED BY COLDEL, LOBSINFILE DATEFORMAT=\"D/MM/YYYY\" SKIPCOUNT 1 REPLACE INTO scratch.table_name"
However, some of my rows have an empty date field, so I get this error:
SQL3191N which begins with """" does not match the user specified DATEFORMAT, TIMEFORMAT, or TIMESTAMPFORMAT. The row will be rejected.
My CSV file looks like this
"XX","25/10/1985"
"YY",""
"ZZ","25/10/1985"
I realise that if I inserted a character instead of a blank string I could use the NULL INDICATORS parameter.
However, I do not have access to change the CSV file. Is there a way to import a blank string as a null?
This is an error in your input file. DB2 differentiates between a NULL and a zero-length string. If you need to have NULL dates, a NULL would have no quotes at all, like:
"AA",
If you can't change the format of the input file, you have 2 options:
Insert your data into a staging table (changing the DATE column to a char) and then use SQL to populate the ultimate target table.
Write a program to parse ("fix") the input file and then import the resulting fixed data. You can often do this without having to write the entire file out to disk – your program could write to a named pipe, and the DB2 IMPORT (and LOAD) utility is capable of reading from named pipes.
I'm not aware of anything. Yes, ideally that date field should be null.
Probably the best thing to do would be to load the data into a scratch/temp table where that column isn't a date column; just leave it as character data (it looks like you're already using a scratch table anyway). It should be trivial after that to use a CASE statement to transform the value into a null date when it is blank, while doing your INSERT into the real table.
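A minimal sketch of that CASE approach, with hypothetical staging/target table and column names (they are not from the question):
-- Hypothetical sketch: the date column was loaded into the staging table
-- as character data; blank strings become NULL on the way to the target.
INSERT INTO scratch.target_table (name, birth_date)
SELECT name,
       CASE WHEN date_char = '' THEN NULL
            ELSE DATE(TO_DATE(date_char, 'DD/MM/YYYY'))
       END
FROM scratch.staging_table;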

How to import file into sqlite?

On a Mac, I have a txt file with two columns, the first of which is meant to be filled by an autoincrement column in an sqlite table:
, "mytext1"
, "mytext2"
, "mytext3"
When I try to import this file, I get a datatype mismatch error:
.separator ","
.import mytextfile.txt mytable
How should the txt file be structured so that it uses the autoincrement?
Also, how do I enter in text that will have line breaks? For example:
"this is a description of the code below.
The text might have some line breaks and indents. Here's
the related code sample:
foreach (int i = 0; i < 5; i++){
//do some stuff here
}
this is a little more follow up text."
I need the above inserted into one row. Is there anything special I need to do to the formatting?
For one particular table, I want each of my rows as a file and import them that way. I'm guessing it is a matter of creating some sort of batch file that runs multiple imports.
Edit
That's exactly the syntax I posted, minus a tab since I'm using a comma. The missing line break in my post didn't make it as apparent. Anyways, that gives the mismatch error.
I was looking at the same problem. It looks like I've found an answer to the first part of your question, about importing a file into a table with an ID field.
So yes, create a temporary table without the ID, import your file into it, then do an INSERT...SELECT to copy its data into your target table. (Remove the leading commas from mytextfile.txt.)
-- assuming your table is called Strings and
-- was created like this:
-- create table Strings( ID integer primary key, Code text )
create table StringsImport( Code text );
.import mytextfile.txt StringsImport
insert into Strings ( Code ) select * from StringsImport;
drop table StringsImport;
Do not know what to do with newlines. I've read some mentions that importing in CSV mode will do the trick (.mode csv), but when I tried it did not seem to work.
In case anyone is still having issues with this, you can download an SQLite manager.
There are several that allow importing from a CSV file.
Here is one, but a Google search should reveal a few: http://sqlitemanager.en.softonic.com/
I'm in the process of moving data containing long text fields with various punctuation marks (they are actually articles on coding) into SQLite and I've been experimenting with various text imports.
I created a database in SQLite with a table:
CREATE TABLE test (id INTEGER PRIMARY KEY AUTOINCREMENT, textfield TEXT);
then do a backup with .dump.
I then add the text below the "CREATE TABLE" line manually in the resulting .dump file as such:
INSERT INTO test (id, textfield) VALUES (1,'Isn''t it great to have
really long text with various punctuation marks and
newlines');
Change any single quotes to two single quotes (change ' to ''). Note that an index number needs to be added manually (I'm sure there is an AWK/SED command to do it automatically). Change the auto increment number in the "sequence" line in the dump file to one above the last index number you added (I don't have SQLite in front of me to give you the exact line, but it should be obvious).
With the new file, I can then do a restore onto the database.
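For reference, the "sequence" line mentioned above is typically the sqlite_sequence insert that .dump emits for a table using AUTOINCREMENT; a sketch of what to adjust, using the table name from the example above:
-- In the edited .dump file, set the stored sequence value to the highest
-- id you added by hand (here 1, for the single row inserted above).
INSERT INTO "sqlite_sequence" VALUES('test',1);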