How can I import a jsonb column from a csv file using the COPY command? - postgresql

I am trying to import the following csv file into YugaByte DB YSQL. Note that the second entry in each row is a JSON object.
"15-06-2018","{\"file_name\": \"myfile1\", \"remote_ip\": \"X.X.X.X\"}"
"15-06-2018","{\"file_name\": \"myfile2\", \"remote_ip\": \"Y.Y.Y.Y\"}"
My table schema is:
postgres=# create table downloads_raw (request_date text, payload jsonb);
I want the JSON snippet in the imported file to become a JSONB value.
I tried doing the following:
postgres=# COPY downloads_raw FROM 'data.csv';
I get the following error:
ERROR: 22P04: missing data for column "payload"
CONTEXT: COPY downloads_raw, line 1: ""15-06-2018","{\"file_name\": \"myfile1\", \"remote_ip\": \"X.X.X.X\"}""
LOCATION: NextCopyFrom, copy.c:3443
Time: 2.439 ms

You need to specify FORMAT csv and ESCAPE '\'. Also, the format and escape options need to be enclosed in parentheses. This should work:
COPY downloads_raw FROM 'data.csv' WITH (FORMAT csv, ESCAPE '\');
The list of supported options for the COPY command can be found here:
https://docs.yugabyte.com/latest/api/ysql/commands/cmd_copy/
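To see why both options matter: without FORMAT csv, COPY uses its text format, whose default delimiter is a tab, so the whole line is read as a single column and "payload" is reported missing. The sketch below is not how COPY is implemented; it only uses Python's csv module, with the same quote and escape characters, to illustrate how the two sample lines split into a date and a JSON document. The names SAMPLE and parse_rows are made up for this illustration.

```python
import csv
import io
import json

# The two sample lines from the question's data.csv (raw strings keep the backslashes).
SAMPLE = (
    r'"15-06-2018","{\"file_name\": \"myfile1\", \"remote_ip\": \"X.X.X.X\"}"' + "\n"
    r'"15-06-2018","{\"file_name\": \"myfile2\", \"remote_ip\": \"Y.Y.Y.Y\"}"' + "\n"
)


def parse_rows(text):
    """Split each line on commas, honoring double quotes and backslash escapes."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=",",
        quotechar='"',
        escapechar="\\",    # plays the role of ESCAPE '\' in the COPY command
        doublequote=False,  # quotes inside a field are escaped, not doubled
    )
    # After unescaping, the second field is plain JSON text.
    return [(date, json.loads(payload)) for date, payload in reader]


rows = parse_rows(SAMPLE)
for request_date, payload in rows:
    print(request_date, payload["file_name"], payload["remote_ip"])
```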

Related

\copy PSQL command not working with POINT(...) geometry type

I have been trying to find the solution for quite a while...
The \copy command in Postgres is not working with the datatype geography(Point,4326).
The error it gives is:
ERROR: parse error - invalid geometry
HINT: "ST" <-- parse error at position 2 within geometry
CONTEXT: COPY data2, line 1, column loc: "ST_GeomFromText('POINT(62.0271954112486 87.9028962135794)')"
Here is the command I am using:
\copy data2(loc,d1,d2,d3,d4,d5,d6,d7,d8,d9,d10) from 'fake_data.csv' delimiter ',' csv;
I have inserted into the table using the exact same format as the csv file and it has been successful. It seems it is just something with the \copy command that doesn't like the format.
Here is an example row from my csv file:
ST_GeomFromEWKT('SRID=4326;POINT(28.872109890126964 160.10529558104636)'),24.237968,129.512386,227.032799,27.644993,60.959401,25.178026,201.229746,34.178728,250.975993,3.635878
You cannot use COPY to process an expression like this. Rather, the file has to contain the extended well-known binary (EWKB) format, which is what you get when you run
SELECT ST_GeomFromEWKT('SRID=4326;POINT(81.5538863138809 42.72341176405514)');
In your case, the CSV file will have to look like this:
0101000020E61000000C9D009842DF3C403DA0D6945E036440,24.237968,129.512386,227.032799,27.644993,60.959401,25.178026,201.229746,34.178728,250.975993,3.635878
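If the CSV file is generated by a script, the hex string can also be built without a database round trip. The following is a minimal sketch, assuming a 2D point, little-endian byte order, and the usual EWKB layout (byte-order flag, geometry type with the SRID flag set, SRID, then the two coordinates). The function names are made up for this illustration; compare the result against what ST_GeomFromEWKT returns before relying on it.

```python
import struct

POINT_TYPE = 1          # WKB geometry type code for a point
SRID_FLAG = 0x20000000  # EWKB flag meaning "an SRID follows the type"


def point_to_hex_ewkb(x, y, srid=4326):
    """Encode a 2D point as hex-encoded, little-endian EWKB."""
    # "<" = little-endian without padding; B = byte-order flag (1 = little-endian),
    # I = geometry type, I = SRID, d d = the two coordinates.
    raw = struct.pack("<BIIdd", 1, POINT_TYPE | SRID_FLAG, srid, x, y)
    return raw.hex().upper()


def hex_ewkb_to_point(hex_ewkb):
    """Decode a little-endian hex EWKB point back into (srid, x, y)."""
    _order, _geom_type, srid, x, y = struct.unpack("<BIIdd", bytes.fromhex(hex_ewkb))
    return srid, x, y


encoded = point_to_hex_ewkb(28.872109890126964, 160.10529558104636)
print(encoded)
print(hex_ewkb_to_point(encoded))
```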

Copy data from a txt file into database

I am using pgAdminIII and I want to copy data from a .txt file to my database. Let's say that we have a file called Address.txt and it has these values:
1,1970 Napa Ct.,Bothell,98011
2,9833 Mt. Dias Blv.,Bothell,98011
3,"7484, Roundtree Drive",Bothell,98011
4,9539 Glenside Dr,Bothell,98011
If I type
COPY myTable FROM 'C:\Address.txt' (DELIMITER(','));
I will get
ERROR: extra data after last expected column
CONTEXT: COPY address, line 3: "7484, Roundtree Drive",Bothell,98011
What do I need to add to the COPY command in order to ignore the , as a new column inside the " "?
You need to use the CSV format and specify the quote character, like this:
COPY mytable FROM 'C:\Address.txt' DELIMITER ',' QUOTE '"' csv;
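The difference the quote character makes can be seen outside the database as well. This sketch uses Python's csv module rather than COPY itself, so it only illustrates the parsing rule: a plain split on commas breaks line 3 into five pieces, while a quote-aware parser keeps the quoted address together. The variable names are made up for this illustration.

```python
import csv
import io

# Contents of Address.txt from the question.
ADDRESS_TXT = (
    "1,1970 Napa Ct.,Bothell,98011\n"
    "2,9833 Mt. Dias Blv.,Bothell,98011\n"
    '3,"7484, Roundtree Drive",Bothell,98011\n'
    "4,9539 Glenside Dr,Bothell,98011\n"
)

# Naive split: every comma is a delimiter, even inside the quotes.
naive = [line.split(",") for line in ADDRESS_TXT.splitlines()]

# Quote-aware parse: commas between double quotes belong to the field.
quoted = list(csv.reader(io.StringIO(ADDRESS_TXT, newline=""), delimiter=",", quotechar='"'))

print(len(naive[2]), naive[2])
print(len(quoted[2]), quoted[2])
```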

How to set the delimiter, Postgresql

I am wondering what the delimiter of this .csv file is. I am trying to import it with a COPY FROM statement, but it always throws an error. When I change the delimiter to E'\t' it throws one error; when I change it to '|' it throws a different one. I have been trying to import this .csv file for 3 days without success, and I really need your help.
My code on postgresql looks like this:
CREATE TABLE movie
(
imdib varchar NOT NULL,
name varchar NOT NULL,
year integer,
rating float ,
votes integer,
runtime varchar ,
directors varchar ,
actors varchar ,
genres varchar
);
My COPY statement:
COPY movie FROM '/home/max/Schreibtisch/imdb_top100t_2015-06-18.csv' (DELIMITER E'\t', FORMAT CSV, NULL '', ENCODING 'UTF8');
When I use SHOW SERVER_ENCODING it says "UTF8". So why can't PostgreSQL read the data from the columns? I really do not get it. I use 64-bit Ubuntu; the .csv file has all the permissions it needs, and so does PostgreSQL. Please help me.
These are my errors:
ERROR: missing data for column "name"
CONTEXT: COPY movie, line 1: "tt0468569,The Dark Knight,2008,9,1440667,152 mins.,Christopher Nolan,Christian Bale|Heath Ledger|Aar..."
********** Error **********
ERROR: missing data for column "name"
SQL state: 22P04
Context: COPY movie, line 1: "tt0468569,The Dark Knight,2008,9,1440667,152 mins.,Christopher Nolan,Christian Bale|Heath Ledger|Aar..."
Use this command instead; it works on Linux as well as on Windows:
\COPY movie(imdib,name,year,rating,votes,runtime,directors,actors,genres) FROM 'D:\test.csv' WITH DELIMITER '|' CSV HEADER;
One more thing: add a header row to your CSV file, as shown below:
imdib|name|year|rating|votes|runtime|directors|actors|genres
tt0111161|The Shawshank Redemption|1994|9.3|1468273|142 mins.|Frank Darabont|Tim Robbins|Morgan Freeman
Also use a single-byte delimiter such as ',' or '|'.
Hope this works for you!
The following works for me:
COPY movie (imdib,name,year,rating,votes,runtime,directors,actors,genres)
FROM 'imdb_top100t_2015-06-18.csv'
WITH (format csv, header false, delimiter E'\t', NULL '');
Unfortunately the file is invalid: on line 12011 the column year contains the value 2015 Video, so the import fails because this can't be converted to an integer. Further down (line 64155) there is an invalid value NA for the rating, which can't be converted to a float, and then one more for the votes.
But if you create the table with all varchar columns, the above command works.
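Problems like these can be found before running COPY by scanning the file for values that will not convert. The following is a minimal sketch, assuming a tab-delimited file with the nine columns of the movie table; the sample rows are invented for illustration and are not taken from the actual file. Empty strings are skipped because the COPY command above treats them as NULL.

```python
import csv
import io

COLUMNS = ["imdib", "name", "year", "rating", "votes",
           "runtime", "directors", "actors", "genres"]

# (column index, column name, converter) for the non-varchar columns.
CHECKS = [(2, "year", int), (3, "rating", float), (4, "votes", int)]


def find_bad_rows(text, delimiter="\t"):
    """Return (record number, column, value) for every value that will not convert."""
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for record_no, row in enumerate(reader, start=1):
        if len(row) != len(COLUMNS):
            problems.append((record_no, "column count", str(len(row))))
            continue
        for idx, name, convert in CHECKS:
            value = row[idx]
            if value == "":
                continue  # imported as NULL because of NULL ''
            try:
                convert(value)
            except ValueError:
                problems.append((record_no, name, value))
    return problems


# Hypothetical sample rows, made up for this illustration.
SAMPLE = (
    "tt0000001\tGood Movie\t2008\t9\t1440667\t152 mins.\tSome Director\tActor A|Actor B\tDrama\n"
    "tt0000002\tBad Year\t2015 Video\t7.1\t100\t90 mins.\tSome Director\tActor C\tComedy\n"
    "tt0000003\tBad Rating\t2001\tNA\t50\t80 mins.\tSome Director\tActor D\tDrama\n"
)

bad_rows = find_bad_rows(SAMPLE)
for problem in bad_rows:
    print(problem)
```

For a real file, the text would be read with open(path, encoding="utf-8", newline="") instead of being held in a string.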

Python PostgreSQL using copy_from to COPY list of objects to table

I'm using Python 2.7 and psycopg2 to connect to my DB server (PostgreSQL 9.3), and I have a list of objects (of class Product) that holds the items I want to insert:
products_list = []
products_list.append(product1)
products_list.append(product2)
I want to use copy_from to insert this list of products into the product table. I tried some tutorials, but I had a problem converting the list to CSV format because the values contain single quotes, newlines, tabs, and double quotes. For example, a product description:
<div class="product_desc">
Details :
Product's Name : name
</div>
The escaping corrupted the HTML by adding a single quote before every single quote. So I need a safe way to convert the list to CSV for COPY. Or is there another way to insert the list without converting it to CSV format?
I figured it out. First of all, I created a function to convert my object to a CSV row:
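import csv

# Shown as a plain function; add @staticmethod if you define it inside a class
def adding_product_to_csv(item, out):
    # csv.writer quotes any field that contains commas, double quotes, or newlines
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, quotechar='"', delimiter=',', lineterminator="\r\n")
    writer.writerow([item.name, item.description])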
Then I created a CSV file to hold the data for COPY and wrote every object to it using the previous function:
file_name = "/tmp/file.csv"
myfile = open(file_name, 'a')
for item in object_items:
    adding_product_to_csv(item, myfile)
Now the CSV file is ready to be copied using copy_expert, which exists in psycopg2:
# Close the file first so that all buffered rows are flushed to disk before it is read back
myfile.close()
cursor.copy_expert("COPY products(name, description) from stdin with delimiter as ',' csv QUOTE '\"' ESCAPE '\"' NULL 'null' ", open(file_name))
conn.commit()
# Clearing the file
open(file_name, 'w').close()
And it's working now.
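The temporary file is not strictly required: copy_expert accepts any file-like object, so the rows can be written to an in-memory buffer instead. The following is a minimal sketch, assuming Python 3 and the same products(name, description) table; Product here is a hypothetical stand-in for the real class, and the cursor is expected to be a psycopg2 cursor supplied by the caller.

```python
import csv
import io
from collections import namedtuple

# Hypothetical stand-in for the real Product class.
Product = namedtuple("Product", ["name", "description"])

COPY_SQL = "COPY products (name, description) FROM STDIN WITH (FORMAT csv)"


def products_to_csv_buffer(products):
    """Write the products as CSV into an in-memory buffer, rewound for reading."""
    buf = io.StringIO(newline="")
    # QUOTE_MINIMAL quotes fields containing commas, quotes, or newlines and
    # doubles embedded double quotes, which matches COPY's CSV defaults.
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for item in products:
        writer.writerow([item.name, item.description])
    buf.seek(0)
    return buf


def copy_products(cursor, products):
    """Stream the products to the server through the given psycopg2 cursor."""
    cursor.copy_expert(COPY_SQL, products_to_csv_buffer(products))
```

Calling copy_products(cursor, products_list) followed by conn.commit() would replace the file handling above. Because the buffer holds the whole CSV in memory, a file may still be the better choice for very large lists.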

postgresql how to have COPY interpret formatted numeric fields automatically?

I have an input CSV file containing something like:
SD-32MM-1001,"100.00",4/11/2012
SD-32MM-1001,"1,000.00",4/12/2012
I was trying to COPY import that into a postgresql table(varchar,float8,date) and ran into an error:
# copy foo from '/tmp/foo.csv' with header csv;
ERROR: invalid input syntax for type double precision: "1,000.00"
Time: 1.251 ms
Aside from preprocessing the input file, is there some setting in PG that will have it read a file like the one above and convert to numeric form in COPY? Something other than COPY?
If preprocessing is required, can it be set as part of the COPY command? (Not the psql \copy)?
Thanks a lot.
The alternative to preprocessing is to first copy the file into a temporary table with the numeric column declared as text. From there, insert into the final table using the to_number function:
select to_number('1,000.00', 'FM000,009.99')::double precision;
It's an odd CSV file that surrounds numeric values with double quotes, but leaves values like SD-32MM-1001 unquoted. In fact, I'm not sure I've ever seen a CSV file like that.
If I were in your shoes, I'd try copy against a file formatted like this.
"SD-32MM-1001",100.00,4/11/2012
"SD-32MM-1001",1000.00,4/12/2012
Note that numbers have no commas. I was able to import that file successfully with
copy test from '/fullpath/test.dat' with csv
I think your best bet is to get better formatted output from your source.
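If the source cannot be changed, the file can be rewritten into that shape before COPY runs. The following is a minimal sketch, assuming the second column is the only numeric one and that a comma inside it is always a thousands separator; the function name is made up for this illustration.

```python
import csv
import io


def strip_thousands_separators(text, numeric_columns=(1,)):
    """Return the CSV text with commas removed from the given numeric columns."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text, newline="")):
        for idx in numeric_columns:
            row[idx] = row[idx].replace(",", "")
        writer.writerow(row)
    return out.getvalue()


# The two sample lines from the question.
SAMPLE = (
    'SD-32MM-1001,"100.00",4/11/2012\n'
    'SD-32MM-1001,"1,000.00",4/12/2012\n'
)

cleaned = strip_thousands_separators(SAMPLE)
print(cleaned)
```

The cleaned text can then be written to a file and loaded with the plain copy ... with csv command.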