PostgreSQL COPY cannot read JSON from CSV file

I'm copying data from a CSV file into a PostgreSQL table using COPY.
My CSV file is simply:
0\"a string"
And my table "test" was created by the following:
create table test (
id integer,
data jsonb
);
My copy statement and the resulting error were the following:
williazz=# \copy test from 'test/test.csv' delimiters '\' CSV
ERROR: invalid input syntax for type json
DETAIL: Token "a" is invalid.
CONTEXT: JSON data, line 1: a...
COPY test, line 1, column data: "a string"
Interestingly, when I changed the JSON value in my CSV file to a number, it had no problem.
CSV:
0\1505
williazz=# \copy test from 'test/test.csv' delimiters '\' CSV
COPY 1
williazz=# select * from test;
id | data
----+------
0 | 1505
(1 row)
Furthermore, numbers in arrays also work:
CSV:
1\[0,1,2,3,4,5]
williazz=# select * from test;
id | data
----+---------------
0 | 1505
1 | [0,1,2,3,4,5]
(2 rows)
But as soon as I introduce a non-numeric string into the JSON, the COPY stops working:
0\[1,2,"three",4,5]
ERROR: invalid input syntax for type json
DETAIL: Token "three" is invalid.
CONTEXT: JSON data, line 1: [1, 2, three...
COPY test, line 1, column data: "[1, 2, three, 4, 5]"
I cannot get Postgres to read a non-numeric string in JSON format. I've also tried changing the data type of the "data" column from jsonb to json, and using basically every combination of single and double quotes.
Could someone please help me identify the problem? Thank you.

Because your file is CSV encoded, it does not contain what you think it does.
0\"a string"
With a delimiter of \, this is two values: the number 0 and the string a string. Note the lack of quotes: the quotes in the file are part of the CSV string formatting, not part of the value. a string is not valid JSON; the quotes are required.
Instead you need to include the JSON string quotes inside the CSV string quotes. Quotes in CSV are escaped by doubling them.
0\"""a string"""
Now that is the number 0 and the string "a string" including quotes.
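With the corrected file, the original \copy invocation should now succeed; the session would look something like this:
williazz=# \copy test from 'test/test.csv' delimiters '\' CSV
COPY 1
williazz=# select * from test;
 id |    data
----+------------
  0 | "a string"
(1 row)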
And as an observation, it would be simpler to remove the complication of embedding JSON into a CSV and use a pure JSON file.
[
[0, "a string"],
[1, "other string"]
]
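A sketch of one way to load such a file, assuming it is saved as /tmp/test.json and your role is allowed to call pg_read_file (jsonb_array_elements unnests the outer array, and each inner pair becomes one row):
insert into test (id, data)
select (elem->>0)::int, elem->1
from jsonb_array_elements(pg_read_file('/tmp/test.json')::jsonb) as elem;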

Related

Postgres CSV import - handle empty strings as integers

I have a ton of CSV files that I'm trying to import into Postgres. The CSV data is all quoted regardless of what the data type is. Here's an example:
"3971","14","34419","","","","","6/25/2010 9:07:02 PM","70.21.238.46 "
The first 4 columns are supposed to be integers. Postgres handles the cast from the string "3971" to the integer 3971 correctly, but it pukes at the empty string in the 4th column.
PG::InvalidTextRepresentation: ERROR: invalid input syntax for type integer: ""
This is the command I'm using:
copy "mytable" from '/path/to/file.csv' with delimiter ',' NULL as '' csv header
Is there a proper way to tell Postgres to treat empty strings as null?
Here is how to do it. Since I'm working in psql and using a file that the server user can't reach, I use \copy, but the principle is the same:
create table csv_test(col1 integer, col2 integer);
cat csv_test.csv
"1",""
"","2"
\copy csv_test from '/home/aklaver/csv_test.csv' with (format 'csv', force_null (col1, col2));
COPY 2
select * from csv_test ;
col1 | col2
------+------
1 | NULL
NULL | 2
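The server-side equivalent works the same way, provided the server process can read the file:
copy csv_test from '/home/aklaver/csv_test.csv' with (format 'csv', force_null (col1, col2));
FORCE_NULL matches the listed columns against the NULL string even when the value is quoted, so the quoted empty strings come back as NULL.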

How to import a CSV containing a jsonb column type

I'm trying to import data into a table with a jsonb column type, using a csv. I've read the csv specs that say any column value containing double quotes needs to:
be wrapped in quotes (double quotes at beginning and end)
double quotes escaped with a double quote (so if you want a double quote, you must use 2 double quotes instead of just 1 double quote)
My csv column value for the jsonb type looks like this (shortened for brevity):
"[
{
""day"": 0,
""schedule"": [
{
""open"": ""07:00"",
""close"": ""12:00""
}
]
}
]"
Note: I opened this csv in Notepad++ in case the editor was doing any special escaping, and all quotes are as they appear in the editor.
Now I was curious about what the QUOTE and ESCAPE values were in that PGAdmin error message, so here they are, copied/pasted:
QUOTE '\"'
ESCAPE '''';""
To upload through PGAdmin, do I need to put \" around each json token, as (possibly?) suggested by that QUOTE value in the error message?
I'm using Go's encoding/csv package to write the csv.
I can load your file into a json or jsonb typed column using:
copy j from '/tmp/foo.csv' csv;
or
copy j from '/tmp/foo.csv' with (format csv);
or their \copy equivalents.
Based on your truncated (incomplete) text-posted-as-image, it is hard to tell what you are actually doing. But if you do it right, it will work.
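For completeness, a minimal table that those commands load into (a sketch; either json or jsonb works):
create table j (data jsonb);
\copy j from '/tmp/foo.csv' with (format csv)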
The easiest workaround I've found is to copy the json data into a text column in a temporary staging table.
Then issue a query that follows the pattern:
insert into mytable (...) select ..., json_txtcol::json from staging_table
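A fuller sketch of that pattern, with hypothetical table and column names:
create temp table staging_table (json_txtcol text);
\copy staging_table from '/tmp/data.csv' with (format csv)
insert into mytable (jsoncol)
select json_txtcol::json from staging_table;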
You can process it through another command before PostgreSQL receives the data, replacing the double double-quotes with an escaped double-quote.
For example:
COPY tablename(col1, col2, col3)
FROM PROGRAM $$sed 's/""/\\"/g' myfile.csv$$
DELIMITER ',' ESCAPE '\' CSV HEADER;
Here's a working example:
/tmp/input.csv contains:
Clive Dunn, "[ { ""day"": 0, ""schedule"": [{""open"": ""07:00"", ""close"": ""12:00""}]}]", 3
In psql (but should work in PgAdmin):
postgres=# CREATE TABLE test (person text, examplejson jsonb, num int);
CREATE TABLE
postgres=# COPY test (person, examplejson, num) FROM PROGRAM $$sed 's/""/\\"/g' /tmp/input.csv$$ CSV DELIMITER ',' ESCAPE '\';
COPY 1
postgres=# SELECT * FROM test;
person | examplejson | num
------------+-----------------------------------------------------------------+-----
Clive Dunn | [{"day": 0, "schedule": [{"open": "07:00", "close": "12:00"}]}] | 3
(1 row)
Disclosure: I am an EnterpriseDB (EDB) employee.

kdb+: Save table with a column with a list of float into a csv file

I have a table "floats" with two columns: sym and prices. The sym elements are symbols and the prices elements are lists of floats.
q)LF:((3.0;1.0;2.0);(5.0;7.0;4.0);(2.0;8.0;9.0))
q)show floats:flip `sym`prices!(`6AH0`6AH6`6AH7;LF)
sym prices
-----------
6AH0 3 1 2
6AH6 5 7 4
6AH7 2 8 9
I want to export the table "floats" to a csv file, but I get this error:
q)save `:floats.csv
'type
[0] save `:floats.csv
I followed the post kdb+: Save table into a csv file, which solves the problem when the column is a list of strings. Unfortunately, when I try to convert the "prices" column to a list of chars and then save to CSV using the built-in function, the procedure returns errors:
q))#[`floats;`prices;" " sv']
'type
[7] #[`floats;`prices;" " sv']
^
q))#[`floats;`prices;string]
'noamend: `. `floats
[10] #[`floats;`prices;string]
^
q))#[`floats;string `prices;" " sv']
'noamend: `. `floats
[10] #[`floats;string `prices;" " sv']
^
Please help me convert the "prices" column to a list of chars and then save to CSV with the built-in function, or suggest valid alternatives for exporting the table to a text file.
First, you need to convert each float to a string, then join the strings with sv using the each-right adverb, denoted by /: .
floats: update " " sv/: string each prices from floats

Insert json string with field names enclosed in single quotes into postgresql as a jsonb field

I have strings representing jsons where field names and values are enclosed in single quotes, for example {'name': 'Grzegorz', 'age': 123}. Let's assume that I have also a table in postgres database:
CREATE TABLE item (
metadata jsonb
);
I'm trying to insert rows using JOOQ. JOOQ generates the following statement:
insert into Item values('{''name'': ''Grzegorz'', ''age'': 123}'::jsonb);
but an error is thrown:
ERROR: invalid input syntax for type json
LINE 1: insert into Item values('{''name'': ''Grzegorz'', ''age'': 1...
Token "'" is invalid.
JSON data, line 1: {'...
Is there any way to insert JSON with names enclosed in single quotes (') instead of double quotes ("), or do I have to convert every ' to "?
Thanks in advance!
Grzegorz
JSON syntax requires double quotes, so this is not a question of Postgres: the server accepts only valid JSON values.
You can use replace() to convert the quotes (note that this blind replacement will also mangle any single quotes that legitimately appear inside string values):
insert into item
values(replace('{''name'': ''Grzegorz'', ''age'': 123}', '''', '"')::jsonb);
select * from item;
metadata
----------------------------------
{"age": 123, "name": "Grzegorz"}
(1 row)

postgresql COPY and CSV data w/ double quotes

Example CSV line:
"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"
All values after "First" are numeric columns, and the many NULL values are just quoted empty strings.
Attempt at COPY:
copy mytable from 'myfile.csv' with csv header quote '"';
NOPE: ERROR: invalid input syntax for type numeric: ""
Well, yeah. It's a null value. Attempt 2 at COPY:
copy mytable from 'myfile.csv' with csv header quote '"' null '""';
NOPE: ERROR: CSV quote character must not appear in the NULL specification
What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.
While some database products treat an empty string as a NULL value, the standard says they are distinct, and PostgreSQL treats them as distinct.
It would be best if you could generate your CSV file with an unambiguous representation. You could use sed or something similar to filter the file into a good format, but the other option is to COPY the data into a table whose text columns can accept the empty strings, and then populate the target table from it. The NULLIF function may help with that: http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF -- it returns NULL if both arguments match, and the first value otherwise. So something like NULLIF(txtcol, '')::numeric might work for you.
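A sketch of that staging approach against the example row above (column names are placeholders, and the real file has more columns than shown):
create temp table staging (col1 text, col2 text, col3 text, col4 text);
\copy staging from 'myfile.csv' with (format csv, header)
insert into mytable
select col1::int, col2::int, col3::int, nullif(col4, '')::integer
from staging;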
As an alternative, filtering the file first with
sed 's/""//g' myfile.csv > myfile-formatted.csv
and then, in psql:
copy mytable from 'myfile-formatted.csv' with csv header;
works as well.
I think all you need to do here is tell COPY to read the quoted empty strings back as NULL, which is what the FORCE_NULL option does (PostgreSQL 9.4 and later; you must list the affected columns, and the names below are placeholders):
COPY mytable FROM '/dir/myfile.csv'
WITH (FORMAT csv, HEADER, FORCE_NULL (col1, col2, col3, col4));
This worked for me in Python 3.8.x (t_host, t_port, t_dbname, t_user, and t_pw are assumed to be defined earlier):
import psycopg2
import csv
from io import StringIO

db_conn = psycopg2.connect(host=t_host, port=t_port,
                           dbname=t_dbname, user=t_user, password=t_pw)
cur = db_conn.cursor()

csv.register_dialect('myDialect',
                     delimiter=',',
                     skipinitialspace=True,
                     quoting=csv.QUOTE_MINIMAL)

# Rewrite the CSV into an in-memory buffer, dropping the header row.
with open('files/emp.csv') as f:
    next(f)  # skip the header line
    reader = csv.reader(f, dialect='myDialect')
    buffer = StringIO()
    writer = csv.writer(buffer, dialect='myDialect')
    writer.writerows(reader)
    buffer.seek(0)

# copy_from streams the buffer to the server via COPY ... FROM STDIN.
cur.copy_from(buffer, 'personnes', sep=',',
              columns=('nom', 'prenom', 'telephone', 'email'))
db_conn.commit()