mongoimport on CSV: bare " in non-quoted-field

Running mongoimport, I get the error:
Failed: read error on entry #2: line 3, column 25: bare " in non-quoted-field
This is line 3:
1,0xcrypton,"Hyderabad, India","'Hyderabad', ' India'",17.38405,78.45636
There are plenty of other questions about this error, but they all involve literal double quotes or escaped quotes, and there's none of that here. In fact, to be on the safe side I deleted all double quotes from the CSV. What's going on?
Edit: I also tried removing the entire rest of the CSV so it's just this line and the header line. Still getting the error.

Well, this happened because I created the CSV with pandas' to_csv and passed encoding='utf16'. Apparently doing so breaks the quotation marks: mongoimport expects UTF-8, and UTF-16 output starts with a byte-order mark and encodes every character in two bytes, so the quotes are no longer the plain ASCII bytes the parser looks for.
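For anyone hitting the same thing, the fix is to write the file as UTF-8. A minimal sketch (the DataFrame contents and file name here are placeholders):

import pandas as pd

df = pd.DataFrame({'id': [1], 'user': ['0xcrypton'],
                   'location': ['Hyderabad, India']})
# UTF-16 output starts with a byte-order mark and uses two bytes per
# character, so mongoimport (which expects UTF-8) no longer sees plain
# ASCII quote bytes. Writing UTF-8 keeps the quoting intact:
df.to_csv('users.csv', index=False, encoding='utf-8')

After that, mongoimport --type csv --headerline --file users.csv should import without the bare-quote error.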

Related

Error Code: 1582. Incorrect parameter count in the call to native function 'SUBSTRING_INDEX'

When I tried to copy the text from before the first comma in the first column to the second column using the following command:
UPDATE Table_Name
SET second_column = SUBSTRING_INDEX(first_column, ‘,’, 1);
I got the error message:
Error Code: 1582. Incorrect parameter count in the call to native function 'SUBSTRING_INDEX'
What is going wrong?
Thanks in advance.
The error came from copying and pasting the command from a non-programming text editor (e.g. Word):
The quotes around the delimiter were changed from two normal single quotes (') and (') to left and right single quotes (‘) and (’).
Replacing them with two normal single quotes around the delimiter solved the problem:
UPDATE Table_Name
SET second_column = SUBSTRING_INDEX(first_column, ',', 1);

Variable substitution of multiline list of strings in PostgreSQL

I'm trying to substitute the list in the following code:
kategori NOT IN ('Fors',
'Vattenfall',
'Markerad vinterled',
'Fångstarm till led',
'Ruskmarkering',
'Tält- och eldningsförbud, tidsbegränsat',
'Skidspår')
I found this question for the multiline part. However
SELECT ('Fors',
'Vattenfall',
'Markerad vinterled',
'Fångstarm till led',
'Ruskmarkering',
'Tält- och eldningsförbud, tidsbegränsat',
'Skidspår') exclude_fell \gset
gives
ERROR: column "fors" does not exist
LINE 1: SELECT (Fors,
^
so I tried triple quotes, dollar quoting and escape sequences. Nothing has worked satisfactorily. The same happens even if I use a single-line variable and \set, so I must have misunderstood something about variable substitution. What is the best way of doing this?
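One workaround, if client-side code is an option, is to bind the whole list as a single query parameter instead of a psql variable. A minimal sketch with psycopg2 (the table name leder and the connection string are placeholders; the tuple-to-list expansion is psycopg2 behavior, not psql substitution):

import psycopg2

exclude_fell = ('Fors', 'Vattenfall', 'Markerad vinterled',
                'Fångstarm till led', 'Ruskmarkering',
                'Tält- och eldningsförbud, tidsbegränsat', 'Skidspår')

conn = psycopg2.connect('dbname=mydb')
cur = conn.cursor()
# psycopg2 adapts a Python tuple to a parenthesized SQL list, so the
# query is sent as ... NOT IN ('Fors', 'Vattenfall', ...).
cur.execute('SELECT * FROM leder WHERE kategori NOT IN %s', (exclude_fell,))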

Vertica COPY + FLEX table

I want to load into a flex table a log in which each record is composed of some fields plus a JSON document; the format is the following:
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z";"2017-01-31T00:00:04.714Z";"{"requestedFrom":"BUTTON","tripId":{"request":3003926837969,"mac":"v01162450701"}}"
and (after many tries) I'm using the COPY command with a CSV parser in this way:
COPY schema.flex_table from local 'C:\temp/test.log' parser fcsvparser(delimiter=';',header=false, trim=true, type='traditional')
This way everything is loaded correctly except the JSON, which is skipped and left empty.
Is there a way to load also the JSON as a string?
HINT: just for test purposes, I noticed that if I put a '\' before every '"' in the JSON in the log, the loading runs smoothly, but unfortunately I cannot modify the content of the log.
Not without modifying the file beforehand - or writing your own UDParser function.
It clearly is a strange format: CSV (well, semicolon-delimited and with double quotes as string enclosers) until the nested part appears, which is stored with a leading and a trailing double quote and curly braces inside: JSON, ok. But you have double quotes (not doubled) within the JSON encoding, and any parser will go astray on those.
You'll have to write a program (ideally in C) to remove the curly braces and the column names in the JSON code and leave just a CSV line.
So, from this line (the trailing backslashes mark escaped newlines; the three lines you see are actually one line, broken up for readability):
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z"; \
"2017-01-31T00:00:04.714Z"; \
"{"requestedFrom":"BUTTON","tripId":{"request":3003926837969,"mac":"v01162450701"}}"
you make (a header line with column names, then the data line):
col1;col2;col3;timestampz1;\
timestampz2;requestedfrom;tripid_request;tripid_mac
"concorde-fe";"DETAILS.SHOWN";"1bcilyejs6d4w";"2017-01-31T00:00:04.801Z"; \
"2017-01-31T00:00:04.714Z";"BUTTON";3003926837969;"v01162450701"
Finally, you'll be able to load it as a CSV file - and maybe you will then have to normalise everything again: tripId seems to be a dependent structure ...
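Just to illustrate that preprocessing step, here is a minimal sketch in Python rather than C (it assumes the JSON blob is always the sixth field and contains no semicolons, as in the sample line):

import json

def flatten(line):
    # The first five fields are ordinary ;-separated values; the sixth
    # is the JSON document wrapped in one extra pair of double quotes.
    fields = line.rstrip('\n').split(';', 5)
    doc = json.loads(fields[5][1:-1])  # strip the enclosing quotes
    return ';'.join(fields[:5] + [
        '"%s"' % doc['requestedFrom'],
        str(doc['tripId']['request']),
        '"%s"' % doc['tripId']['mac'],
    ])

Applied to the sample record, it produces exactly the data line shown above.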
Good luck
Marco the Sane

Postgres: Inconsistent reported row numbers by failed Copy error

I have a file foo that I'd like to copy into a table.
\copy stage.from_csv FROM '/path/foo.dat' CSV;
If foo has an error like a column mismatch or a bad type, the reported error is normal:
CONTEXT: COPY from_csv, line 5, column report_year: "aa"
However, if the error is caused by an extraneous quotation mark, the reported line number is always one greater than the size of the file.
CONTEXT: COPY from_csv, line 11: "02,2004,"05","123","09228","00","SUSX","PR",30,,..."
The source file has 10 lines, and I placed the error in line 5. If you examine the information in the CONTEXT message, it contains the line 5 data, so I know postgres can identify the row itself. However, it cannot identify the row by number. I have done this with a few different file lengths and the returned line number behavior is consistent.
Anyone know the cause and/or how to get around this?
That is because the error manifests only at the end of the file.
Look at this CSV file:
"google","growing",1,2
"google","growing",2,3
"google","falling","3,4
"google","growing",4,5
"yahoo","growing",1,2
There is a bug in the third line, an extra " has been added before the 3.
Now, to parse a CSV file, you first read until you hit the next line break.
But be careful: line breaks that are within double quotes don't count!
Because of the stray quote, all the following line breaks fall inside a quoted string, so the line continues until the end of the file.
Now that we have read our line, we can continue parsing it and notice that the number of quotes is unbalanced. Hence the error message:
ERROR: unterminated CSV quoted field
CONTEXT: COPY stock, line 6: ""google","falling","3,4
"google","growing",4,5
"yahoo","growing",1,2
"
In a nutshell: the error does not occur until we have reached the end of file.
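You can see the quoted-line-break rule with any CSV reader. A small Python illustration (the data is inlined for brevity):

import csv, io

# A line break inside a quoted field belongs to the field, so this
# "file" holds two records, not three. A stray unbalanced quote turns
# the whole rest of the file into one unterminated field the same way.
data = 'a,"multi\nline",b\nd,e,f\n'
for row in csv.reader(io.StringIO(data)):
    print(row)
# ['a', 'multi\nline', 'b']
# ['d', 'e', 'f']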

DB2 LOAD from delimited files - escaping " in fields doesn't work

I bet it's totally simple and I just don't see it, but I don't get it...
I execute the following command in DB2 command line processor:
DB2 LOAD FROM "DB_ACC_PASS_REGEXP.del" OF DEL METHOD P (1, 2, 3, 4, 5) MESSAGES "DB_ACC_PASS_REGEXP.del.msg" INSERT INTO DB_ACC_PASS_REGEXP (APP_ID,APREGEXP,EXPLAIN_TEXT,ID,OPT_KZ) NONRECOVERABLE INDEXING MODE REBUILD
which loads the data in the following file into the database:
1,"[a-z]",,1,0
1,"[A-Z]",,2,0
1,"[0-9]",,3,0
1,"[!|\"|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
^
Here is the problem.
The problem is that only 3 of these 4 rows are accepted. The last one is rejected because DB2 LOAD doesn't recognize the escape character before the double quotation mark.
If I change the last line to:
1,"[!|x|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
^
Here is the changed character.
there is no problem.
WHY doesn't the escape character "\" work??
Edit
Okay, I just tried it the Oracle way now and that works: I escape " with another ", so my line looks like
1,"[!|""|§|$|%|&|/|(|)|=|?|`|´|*|+|~|'|#|-|_|.|:|,|;|µ|<|>| |°|^]",,4,0
But that's only one way to do it. It doesn't explain why IBM offers the backslash as an escape character (http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.cmd.doc%2Fdoc%2Fr0008305.html).
Using LOAD with ASCII / delimited files requires tuning the file type modifiers (look at Table 6 and Table 8 of the documentation page you linked). I am not quite sure, but I can't remember backslash working as an escape character in DB2.
You can either use a character delimiter other than double quotes with the chardel option, or force no character delimiter with the nochardel option.
BUT ...
In your case the fields contain special characters because they are regular expressions, so you will always need to escape " with "" and ' with ''. I think there is no other way to get this working.
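Doubling the quote character is simply the standard escaping convention for delimited files; for illustration, Python's csv writer produces the same form (a generic example, not DB2-specific):

import csv, sys

# An embedded double quote is escaped by doubling it - the same form
# that DB2's DEL format accepts.
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow([1, '[!|"|§|$|%]', '', 4, 0])
# Output: 1,"[!|""|§|$|%]","",4,0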