DB2 LOAD from delimited files - escaping " in fields doesn't work - db2

I bet it's totally simple and I just don't see it, but I don't get it.
I execute the following command in the DB2 command line processor:
DB2 LOAD FROM "DB_ACC_PASS_REGEXP.del" OF DEL METHOD P (1, 2, 3, 4, 5) MESSAGES "DB_ACC_PASS_REGEXP.del.msg" INSERT INTO DB_ACC_PASS_REGEXP (APP_ID,APREGEXP,EXPLAIN_TEXT,ID,OPT_KZ) NONRECOVERABLE INDEXING MODE REBUILD
This loads the data in the following file into the database:
1,"[a-z]",,1,0
1,"[A-Z]",,2,0
1,"[0-9]",,3,0
1,"[!|\"|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
^
Here is the problem
The problem is that only three of these four rows are accepted. The last one is rejected, because DB2 LOAD does not recognize the escape character before the double quotation mark.
If I change the last line to:
1,"[!|x|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
^
Here is the changed character
there is no problem.
Why doesn't the escape character "\" work?
Edit:
Okay, I just tried it the Oracle way now and that works: I escape " with another ", so my line looks like
1,"[!|""|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
But that's only a workaround; it doesn't explain why IBM documents the backslash as an escape character (http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.cmd.doc%2Fdoc%2Fr0008305.html).

Using LOAD with ASCII/delimited files requires tuning the file type modifiers (see Table 6 and Table 8 of the documentation page you linked). I am not quite sure, but I can't remember backslash working as an escape character in DB2.
You can either use a character delimiter other than the double quote with the chardel modifier, or force no character delimiter at all with the nochardel modifier.
BUT ...
In your case the fields are regular expressions full of special characters, so you will always need to escape " with "" and ' with ''. I think there is no other way to get this working.
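For illustration, here is how the chardel modifier might be applied to your command, switching the string delimiter to a single quote given as the hex code point 0x27 (a sketch only, untested against your table):
DB2 LOAD FROM "DB_ACC_PASS_REGEXP.del" OF DEL MODIFIED BY chardel0x27 METHOD P (1, 2, 3, 4, 5) MESSAGES "DB_ACC_PASS_REGEXP.del.msg" INSERT INTO DB_ACC_PASS_REGEXP (APP_ID,APREGEXP,EXPLAIN_TEXT,ID,OPT_KZ) NONRECOVERABLE INDEXING MODE REBUILD
Since your regexp uses nearly every printable special character (including '), any replacement delimiter would collide sooner or later, which is why doubling the quote, as in your edit, remains the practical fix.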

Related

Use of column names in Redshift COPY command which is a reserved keyword

I have a table in Redshift where the column names are 'begin' and 'end'. They are Redshift keywords. I want to use them explicitly in the Redshift COPY command. Is there a workaround other than renaming the columns in the table? That would be my last option.
I tried to enclose them within single/double quotes, but it looks like the COPY command only accepts comma-separated column names.
The COPY command fails if you don't escape keywords such as begin or end when they are used as column names:
copy test1(col1,begin,end,col2) from 's3://example/file/data1.csv' credentials 'aws_access_key_id=XXXXXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXX' delimiter ',';
ERROR: syntax error at or near "end"
But it works fine if begin and end are enclosed in double quotes (") as below:
copy test1(col1,"begin","end",col2) from 's3://example/file/data1.csv' credentials 'aws_access_key_id=XXXXXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXX' delimiter ',';
I hope this helps. If you get a different error, please update your question.

Variable substitution of multiline list of strings in PostgreSQL

I'm trying to substitute the list in the following code:
kategori NOT IN ('Fors',
'Vattenfall',
'Markerad vinterled',
'Fångstarm till led',
'Ruskmarkering',
'Tält- och eldningsförbud, tidsbegränsat',
'Skidspår')
I found this question for the multiline part. However
SELECT ('Fors',
'Vattenfall',
'Markerad vinterled',
'Fångstarm till led',
'Ruskmarkering',
'Tält- och eldningsförbud, tidsbegränsat',
'Skidspår') exclude_fell \gset
gives
ERROR: column "fors" does not exist
LINE 1: SELECT (Fors,
^
, so I tried using triple quotes, dollar quoting and escape sequences. Nothing has worked to satisfaction. This is true even if I use a single-line variable and \set, so I must have misunderstood something about variable substitution. What is the best way of doing this?
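One sketch of an approach that works in psql (the table name leder is made up here): select the whole list as a single text value, using dollar quoting so the embedded single quotes and newlines survive, store it with \gset, and substitute it unquoted:
SELECT $$('Fors',
'Vattenfall',
'Markerad vinterled',
'Fångstarm till led',
'Ruskmarkering',
'Tält- och eldningsförbud, tidsbegränsat',
'Skidspår')$$ AS exclude_fell
\gset
SELECT * FROM leder WHERE kategori NOT IN :exclude_fell;
The NOT IN list is then pasted in literally wherever :exclude_fell appears.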

Vertica COPY + FLEX table

I want to load into a flex table a log in which each record is composed of some fields plus a JSON; the format is the following:
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z";"2017-01-31T00:00:04.714Z";"{"requestedFrom":"BUTTON","tripId":{"request":3003926837969,"mac":"v01162450701"}}"
and (after many tries) I'm using the COPY command with a CSV parser in this way:
COPY schema.flex_table from local 'C:\temp/test.log' parser fcsvparser(delimiter=';',header=false, trim=true, type='traditional')
In this way everything is loaded correctly except the JSON, which is skipped and left empty.
Is there a way to load also the JSON as a string?
HINT: just for test purposes, I noticed that if I put a '\' before every '"' in the JSON in the log, the loading runs smoothly, but unfortunately I cannot modify the content of the log.
Not without modifying the file beforehand - or writing your own UDParser function.
It clearly is a strange format: CSV (well, semicolon-delimited, with double quotes as string enclosers) until the children appear, which are stored with a leading and a trailing double quote, doubly nested with curly braces - JSON type, OK. But you have double quotes (not doubled) within the JSON encoding - any parser would go astray on those.
You'll have to write a program (ideally in C) to remove the curly braces and the column names in the JSON code and leave just a CSV line.
So, from this line (the backslash at the end marks an escaped newline: the three lines you see are actually one line, split for readability):
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z"; \
"2017-01-31T00:00:04.714Z"; \
"{"requestedFrom":"BUTTON","tripId":{"request":3003926837969,"mac":"v01162450701"}}"
you make this (a header line with column names, then the data line):
col1;col2;col3;timestampz1;\
timestampz2;requestedfrom;tripid_request;tripid_mac
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z"; \
"2017-01-31T00:00:04.714Z";"BUTTON";3003926837969;"v01162450701"
Finally, you'll be able to load it as a CSV file - and maybe you will then have to normalise everything again: tripId seems to be a dependent structure.
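If the JSON tail always has exactly this shape, the transformation may not even need C; a rough sed sketch for this one layout (key names taken from the sample, file names illustrative):
sed -E 's/"\{"requestedFrom":"([^"]*)","tripId":\{"request":([0-9]+),"mac":"([^"]*)"\}\}"/"\1";\2;"\3"/' test.log > test_flat.csv
That yields the flat CSV fields of the data line above; any variation in the JSON structure would break it, at which point a real parser is unavoidable.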
Good luck
Marco the Sane

Using ASCII 31 field separator character as Postgresql COPY delimiter

We are exporting data from Postgres 9.3 into a text file for ingestion by Spark.
We would like to use the ASCII 31 field separator character as a delimiter instead of \t so that we don't have to worry about escaping issues.
We can do so in a shell script like this:
#!/bin/bash
DELIMITER=$'\x1F'
echo "copy ( select * from table limit 1) to STDOUT WITH DELIMITER '${DELIMITER}'" | (psql ...) > /tmp/ascii31
But we're wondering, is it possible to specify a non-printable glyph as a delimiter in "pure" postgres?
Edit: we attempted to use the Postgres escaping convention per http://www.postgresql.org/docs/9.3/static/sql-syntax-lexical.html
warehouse=> copy ( select * from table limit 1) to STDOUT WITH DELIMITER '\x1f';
and received
ERROR: COPY delimiter must be a single one-byte character
Try prepending E before the sequence you're trying to use as a delimiter, for example E'\x1f' instead of '\x1f'. Without the E, PostgreSQL reads '\x1f' as four separate characters rather than a hexadecimal escape sequence, hence the error message.
See the PostgreSQL manual on "String Constants with C-style Escapes" for more information.
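Applied to the command from the question, that would look like this (a sketch; same query, only the delimiter literal changed):
warehouse=> copy ( select * from table limit 1) to STDOUT WITH DELIMITER E'\x1f';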
From my testing, both of the following work:
echo "copy (select 1 a, 2 b) to stdout with delimiter u&'\\001f'"| psql;
echo "copy (select 1 a, 2 b) to stdout with delimiter e'\\x1f'"| psql;
I've extracted a small file from Actian Matrix (a fork of Amazon Redshift; both are derivatives of Postgres), using this notation for ASCII character code 30, the "record separator".
unload ('SELECT btrim(class_cd) as class_cd, btrim(class_desc) as class_desc
FROM transport.stg.us_fmcsa_carrier_classes')
to '/tmp/us_fmcsa_carrier_classes_mk4.txt'
delimiter as '\036' leader;
This is an example of how this file looks in VI:
C^^Private Property
D^^Private Passenger Business
E^^Private Passenger Non-Business
I then moved this file over to a machine hosting PostgreSQL 9.5 via sftp, and used the following copy command, which seems to work well:
copy fmcsa.carrier_classes
from '/tmp/us_fmcsa_carrier_classes_mk4.txt'
delimiter u&'\001E';
Each derivative of Postgres, and Postgres itself, seems to prefer a slightly different notation. Too bad we don't have a single standard!

How to update a record with a literal percent sign (%) in PostgreSQL without saving it as "\%"

I need to update a record, which contains literal percent signs, using PostgreSQL in Railo. The query looks like
<cfquery>
update foo set bar = 'string with % in it %'
</cfquery>
It throws an error, as ColdFusion normally interprets % as a wildcard character. I can escape it using the following query:
<cfquery>
update foo set bar = 'string with escaped \% in it \%'
</cfquery>
However, the record now contains "\%" in the database and is displayed on the page as "\%".
I found documentation with an example of escaping a percent sign in a SELECT, but it does not work for me: syntax error at or near "ESCAPE".
SELECT emp_discount
FROM Benefits
WHERE emp_discount LIKE '10\%'
ESCAPE '\';
Is there a better way to achieve the same goal? The underlying database is PostgreSQL. Thanks!
Query parameters escape special characters. Yet another reason to use them.
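In Railo that means cfqueryparam; a minimal sketch of the update from the question (cf_sql_varchar assumed as the matching type):
<cfquery>
update foo set bar = <cfqueryparam value="string with % in it %" cfsqltype="cf_sql_varchar">
</cfquery>
The value travels to PostgreSQL as bound data rather than SQL text, so the percent signs need no escaping and are stored literally.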