pgloader - quotes handling - PostgreSQL

I am new to PostgreSQL and just starting to use it. I am trying to load a file into a table and am facing some issues.
Sample data: the file file1.RPT contains data in the format below
"Bharath"|Kumar|Krishnan
abc"|def|ghi
qwerty|asdfgh|lkjhg
Below is the load script that is used
LOAD CSV
INTO table1
....
WITH truncate,
fields optionally enclosed by '"',
fields escaped by '"'
fields terminated by '|'
....
However, the above script does not work and loads no data into the table. I am not sure what the issue is. My understanding is that the first row should load successfully (since I specified optionally enclosed by) and the second row should also load (since I am trying to escape the double quote).
I would appreciate help getting this fixed.
Thank you.

The same character cannot be used both as the escape character and as the optional enclosing quote. If the double quote is part of the data, it can be treated as ordinary data by using the fields not enclosed option. The default option is fields optionally enclosed by double-quote.

Apparently, you're not escaping the quote in the second row. Either put a backslash (or another escape character) before it:
abc\"|def|ghi
or enclose the entire line in quotes.
Another alternative is to accept the quotes as part of the first field, in which case you should use the following in your load script:
fields not enclosed
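As a rough illustration of what "fields not enclosed" means, the sketch below uses Python's csv module (not pgloader itself) with quoting disabled, so every double quote in the sample rows is kept as ordinary data. The comparison with pgloader is an analogy only, and the function name is made up for this example.

```python
import csv
import io

# The three sample rows from file1.RPT, as given in the question.
SAMPLE = '"Bharath"|Kumar|Krishnan\nabc"|def|ghi\nqwerty|asdfgh|lkjhg\n'


def parse_without_enclosure(text):
    """Split pipe-delimited rows, treating double quotes as plain data."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


rows = parse_without_enclosure(SAMPLE)
for row in rows:
    print(row)
```

With quoting switched off, the first field of the first row keeps its surrounding quotes and the stray quote in the second row no longer matters to the parser.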

Related

What options would load an escape character into Redshift?

Having a tough time playing with Redshift's COPY options to load a field that has an escape character immediately followed by a delimiter ('|'). Data looks like this:
00b9e290000f8350b9c780832a210000|MY DATA\|AB
So that has 3 fields that I'm trying to load. When I run with just ESCAPE, Redshift seems to properly add \ to doubleescape, but then the pipe delimiter gets ignored. So Redshift ends up trying to load all of the following into the second field: MY DATA|AB. Error message is that the delimiter was not found, since that's read as the second field with no following delimiter
I've tried running COPY with just the ESCAPE option, the CSV + ESCAPE options and a few others with no luck. Is there anything else I should try? Or should I be adding some pre-process step to doubleescape?
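One possible pre-processing step, assuming every backslash in the source file is literal data rather than an intentional escape, is to double each backslash before uploading, so that COPY with ESCAPE reads \\ as a single backslash and still sees the following | as a delimiter. This is a sketch under that assumption, not a tested Redshift recipe, and the function name is hypothetical.

```python
def double_backslashes(line: str) -> str:
    """Double every backslash so an escape-aware loader reads it as one literal backslash."""
    return line.replace("\\", "\\\\")


# The sample row from the question (one literal backslash before the second pipe).
raw = "00b9e290000f8350b9c780832a210000|MY DATA\\|AB"
fixed = double_backslashes(raw)
print(fixed)
```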

Single quotes stored in a Postgres database

I've been working on an Express app that has a form designed to hold lines and quotes.
Some of the lines will have single quotes ('), but overall it's able to store the info and I'm able to back it up and store it without any problems. Now, when I run pg_dump and have the database written to an SQL file, the quotes seem to cause some things to appear a bit wonky in my text editor.
Would I have to create a method to change all the single quotation marks into double, or can I leave it as is and be able to upload it back to the database without causing major issues? I know people will continue to enter lines that contain either single or double quotation marks, so any solution or answer would help greatly.
Single quotes in character data types are no problem at all. You just need to escape them properly in string literals.
To write data with INSERT you need to quote all string literals according to SQL syntax rules. There are tools to do that for you ...
Insert text with single quotes in PostgreSQL
However, pg_dump takes care of escaping automatically. The default mode produces text output to be re-imported with COPY (much faster than INSERT), and single quotes have no special meaning there. And in (non-default) csv mode, the default quote character is double-quote (") and configurable. The manual:
QUOTE
Specifies the quoting character to be used when a data value is quoted. The default is double-quote. This must be a single one-byte character. This option is allowed only when using CSV format.
The format is defined by rules for COPY and not by SQL syntax rules.
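For the INSERT case, the quoting rule for string literals can be sketched in a few lines of Python: wrap the value in single quotes and double any single quote inside it. This assumes the PostgreSQL default standard_conforming_strings = on (so backslashes are not special), and the helper name is made up for illustration. In application code, parameter binding in the database driver is usually the safer choice.

```python
def sql_string_literal(value: str) -> str:
    """Render a Python string as a standard SQL string literal."""
    # A single quote inside the literal is escaped by doubling it.
    return "'" + value.replace("'", "''") + "'"


literal = sql_string_literal("It's a 'quoted' line")
print(literal)
```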

Uploading data to RedShift using COPY

I am trying to upload data to RedShift using COPY command.
On this row:
4072462|10013868|default|2015-10-14 21:23:18.0|0|'A=0
I am getting this error:
Delimited value missing end quote
This is the COPY command:
copy test
from 's3://test/test.gz'
credentials 'aws_access_key_id=xxx;aws_secret_access_key=xxx' removequotes escape gzip
First, I hope you know why you are getting the mentioned error: you have a single quote in one of the column values. For the removequotes option, the Redshift documentation clearly says that:
If a string has a beginning single or double quotation mark but no corresponding ending mark, the COPY command fails to load that row and returns an error.
One thing is certain: removequotes is not what you are looking for.
Second, so what are your options?
If preprocessing the S3 file is in your control, consider using the escape option. Per the documentation,
When this parameter is specified, the backslash character (\) in input data is treated as an escape character.
So your input row in S3 should change to something like:
4072462|10013868|default|2015-10-14 21:23:18.0|0|\'A=0
Alternatively, see if CSV DELIMITER '|' works for you. Check the Redshift COPY documentation for details.
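If you do control the file before it lands in S3, the escaping step could look roughly like the sketch below. It assumes existing backslashes are literal data (so they are doubled first) and that both quote characters should be escaped; the function name is hypothetical and this has not been run against Redshift.

```python
def escape_for_copy(line: str) -> str:
    """Backslash-escape backslashes and quote characters in one input row."""
    # Double existing backslashes first so the ones added below are not doubled again.
    line = line.replace("\\", "\\\\")
    for quote in ("'", '"'):
        line = line.replace(quote, "\\" + quote)
    return line


row = "4072462|10013868|default|2015-10-14 21:23:18.0|0|'A=0"
escaped = escape_for_copy(row)
print(escaped)
```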

SQLite update to change double-double quotes ("") to regular quotation marks (")?

I'm working on an iPhone app. I just imported some records into a SQLite database, and all my regular quote marks have been "doubled". An example:
Desired final format:
The song "ABC" will play at 3 PM.
The record is currently appearing in the database as:
The song ""ABC"" will play at 3 PM.
Does anyone know how to do a SQL update to change all "double-double" quotes to just regular quotation marks?
Just to clarify, I'm looking directly at the database, not via code. The code will just display these as "double-double" quotes just as they appear in the database, so I want to remove them. The "double-double" quotes are actually in the import file as well, but if I try to remove them, then the import fails. So I kept them there, and now that the records are successfully imported into the database, now I just want to correct the "double-double" quote thing with a mass SQL update if it's possible. Thanks in advance for any insight!
SQLite uses single quotes to delimit string literals. It escapes a single quote inside a literal by adding another single quote (likewise for double quotes). So technically, as long as your SQL is well constructed, the import process should work properly. The strings should be enclosed in single quotes, not double quotes. I suspect that your code may be constructing the SQL by hand instead of binding or properly escaping the values.
SQLite has a built-in function for quoting strings. It's called quote. Here are some sample inputs and the corresponding output:
sqlite> SELECT quote("foo");
'foo'
sqlite> SELECT quote("foo ""bar""");
'foo "bar"'
sqlite> SELECT quote("foo 'bar'");
'foo ''bar'''
So you could remove the twice escaped double quote before it even goes to SQLite using NSString methods.
[@"badString\"\"" stringByReplacingOccurrencesOfString:@"\"\"" withString:@"\""];
If the database already contains bad values, then you could run the following update SQL to clean it up:
UPDATE table SET column = REPLACE(column, '""', '"');
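To try the cleanup safely before touching the real data, the same REPLACE update can be run against a throwaway in-memory database. The sketch below uses Python's sqlite3 module; the table name songs and column name title are placeholders, not names from the question.

```python
import sqlite3

# In-memory database with a hypothetical table holding one badly imported row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (title TEXT)")
conn.execute(
    "INSERT INTO songs (title) VALUES (?)",
    ('The song ""ABC"" will play at 3 PM.',),
)

# Collapse every pair of double quotes into a single double quote.
conn.execute("UPDATE songs SET title = REPLACE(title, ?, ?)", ('""', '"'))

cleaned = conn.execute("SELECT title FROM songs").fetchone()[0]
print(cleaned)
conn.close()
```

Binding the two quote strings as parameters avoids having to nest quote characters inside the SQL text.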

Postgres using FOREIGN TABLE and data include "\"

My text file looks like:
\home\stanley:123456789
c:/kobe:213
\tej\home\ant:222312
and I create the FOREIGN TABLE with:
CREATE FOREIGN TABLE file_check(txt text) SERVER file_server OPTIONS (format 'text', filename '/home/stanley/check.txt');
After selecting from file_check (using: select * from file_check)
my console shows
homestanley:123456789
c:/kobe:213
ejhomeant:222312
Can anyone help me?
The file foreign-data-wrapper uses the same rules as COPY (presumably because it's the same code underneath). You've got to consider that backslash is an escape character...
http://www.postgresql.org/docs/9.2/static/sql-copy.html
Any other backslashed character that is not mentioned in the above table will be taken to represent itself. However, beware of adding backslashes unnecessarily, since that might accidentally produce a string matching the end-of-data marker (\.) or the null string (\N by default). These strings will be recognized before any other backslash processing is done.
So you'll either need to double up the backslashes or perhaps try it as a single-column csv file and see if that helps.
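To see why the output loses its backslashes, here is a simplified Python imitation of the backslash handling in COPY's text format. It is not PostgreSQL code: it covers only the single-letter sequences (\b, \f, \n, \r, \t, \v) plus the "any other character represents itself" rule, and ignores octal, hex, \N and \. handling. The function name is made up for this sketch.

```python
# Single-letter escapes recognised by COPY's text format.
SPECIAL = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def decode_copy_text(field: str) -> str:
    """Simplified imitation of backslash processing in COPY text format."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Known letters map to control characters; anything else stands for itself.
            out.append(SPECIAL.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(decode_copy_text(r"\home\stanley:123456789")))
print(repr(decode_copy_text(r"\tej\home\ant:222312")))
print(repr(decode_copy_text(r"\\home\\stanley:123456789")))
```

The first two calls mirror the rows from the question: \h and \s collapse to plain letters, and \t becomes a tab character. The third shows that doubled backslashes come through as single ones.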