DROP table if exists legislators;
CREATE table legislators
(
...
)
;
COPY legislators
FROM 'C:\data\legislators.csv'
DELIMITER ',' -- DELIMITER is on a different line from the FROM clause, and this raises an error
CSV HEADER;
I am trying to import a CSV file into PostgreSQL using HeidiSQL v11. When I execute a query written across multiple lines, as above, it raises an error:
ERROR: syntax error at or near "CSV" LINE 1: CSV HEADER;
However, I found that if I write the FROM clause and DELIMITER ',' together on a single line, as below, it works fine.
COPY legislators
FROM 'C:\data\legislators.csv' DELIMITER ',' -- FROM and DELIMITER must be on the same line for this to work
CSV HEADER;
I know SQL basically ignores whitespace, so I am confused about why this happens.
I would appreciate it if someone could help me. Thanks.
That's just because HeidiSQL is not very smart about parsing PostgreSQL statements and gets confused: it executes the statement as two separate statements, which causes the error.
Use a different client with PostgreSQL.
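psql, for instance, splits input only on semicolons, so the original multi-line statement runs as one command. A sketch, where mydb stands in for your database name:

psql -d mydb -c "COPY legislators FROM 'C:\data\legislators.csv' DELIMITER ',' CSV HEADER;"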
Related
I want to write a CSV file to a table in Postgres via Airflow.
I came across this Airflow documentation, which notes that the hook already has a built-in function for CSV export.
And I used this thread to learn how to use it.
I have a python operator whose python_callable is as follows:
from datetime import date

from airflow.providers.postgres.hooks.postgres import PostgresHook

def copy_expert_csv():
    hook = PostgresHook(postgres_conn_id='warehouse', host='data-warehouse',
                        database='datalake',
                        user='root',
                        password='root',
                        port=9999)
    with hook.get_conn() as connection:
        hook.copy_expert("""COPY datalake.public.wcc_users FROM stdin WITH CSV HEADER
                         DELIMITER as ',' """,
                         'includes/cleaned_data/wwc/' + str(date.today()) + '_wwc_cleaned ')
        connection.commit()
The task finishes successfully.
And there are no errors in my database logs either:
materials-data-warehouse-1 | 2022-04-29 17:43:01.942 UTC [198] STATEMENT: COPY datalake.public.wcc_users FROM STDIN WITH (FORMAT CSV) HEADER
My file has around 1000 rows. However, when I select from the table, there are 0 rows.
The column names in the table differ from those in the file, and two columns have date and timestamp data types rather than text. Could that be the cause? And if so, why are no errors thrown?
It turned out the table definition was incorrect. This throws no errors but inserts nothing either.
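One way to make that kind of mismatch fail loudly is to name the target columns explicitly in the COPY statement, so the file-to-table mapping is spelled out rather than purely positional. A minimal sketch, with hypothetical column names since the real schema isn't shown:

COPY datalake.public.wcc_users (user_id, signup_date, last_seen)  -- hypothetical columns
FROM STDIN WITH (FORMAT CSV, HEADER, DELIMITER ',');

If the file's column count or order doesn't match the list, COPY raises an error instead of completing with nothing loaded.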
How can I fix the error I get when I try to copy a CSV into my personal database table?
The bigger picture is that I want to import a local CSV file into my database as an extract and dependent table. I don't know how to load the file directly as a table, so I first create an empty table and then copy the CSV file into it.
This is the command I used:
\copy persons (supervisor_lname, supervisor_fname, lname, fname, supervisor_id)
FROM '/Users/baoying/Downloads/sql.csv'
DELIMITER ','
CSV HEADER;
I'm very confused about this error I'm getting in the Query Tool in pgAdmin. I've been working on this for days and cannot find a way to fix it when attempting to upload this CSV file to my Postgres table.
ERROR: invalid input syntax for type numeric: "2021-02-14"
CONTEXT: COPY CardData, line 2, column sold_price: "2021-02-14"
SQL state: 22P02
Here is the code I am running in the Query Tool:
CREATE TABLE Public."CardData"(Title text, Sold_Price decimal, Bids int, Sold_Date date, Card_Link text, Image_Link text);
select * from Public."CardData";
COPY Public."CardData" FROM 'W:\Python_Projects\cardscrapper_project\ebay_api\card_data_test.csv' DELIMITER ',' CSV HEADER;
Here is a sample from the first row of my csv file.
Title,Sold_Date,Sold_Price,Bids,Card_Link,Image_Link
2018 Contenders Optic Sam Darnold #103 Red Rookie #/99 PSA 8 NM-MT AUTO 10,2021-02-14,104.5,26,https://www.ebay.com/itm/2018-Contenders-Optic-Sam-Darnold-103-Red-Rookie-99-PSA-8-NM-MT-AUTO-10/143935698791?hash=item21833c7767%3Ag%3AjewAAOSwNb9gGEvi&LH_Auction=1,https://i.ebayimg.com/thumbs/images/g/jewAAOSwNb9gGEvi/s-l225.jpg
The "Sold_Date" column is in the correct datetime format that is easy for Postgres to understand, but the error is calling on the "Sold-Price" column?
I'm very confused. Any help is greatly appreciated.
Notice that the columns are not in the same order in the CSV file and in the table.
You have to specify the proper column order:
COPY Public."CardData" (Title,Sold_Date,Sold_Price,Bids,Card_Link,Image_Link)
FROM 'W:\Python_Projects\cardscrapper_project\ebay_api\card_data_test.csv'
DELIMITER ',' CSV HEADER ;
You created the table with sold_price as the second column, so the COPY command expects a price/number as the second column in your CSV file. Your CSV file, however, has sold_date as the second column, which leads to the data type mismatch error you see.
Either re-define your CREATE TABLE statement with sold_date as the second column and sold_price as the fourth, or specify the column parsing order in your COPY statement as COPY Public."CardData" (<column order>).
Another option is to open up the CSV file in Excel and re-order the columns and do a Save As...
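For the first option, the re-defined table would simply mirror the CSV header order:

CREATE TABLE Public."CardData"(Title text, Sold_Date date, Sold_Price decimal, Bids int, Card_Link text, Image_Link text);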
I tried to import a CSV file (of size ~6 GB) from S3 to Redshift with the COPY command:
copy test.test_pat_temp from 's3://some_location/large_file.csv'
credentials 'aws_access_key_id=<access_key>;aws_secret_access_key=<Secret_Key>'
DELIMITER AS ','
EMPTYASNULL
BLANKSASNULL;
But I got the following error:
An error occurred when executing the SQL command:
copy test_qa.test_pat_temp from 's3://some_location/large_file.csv'
credentials 'aws_access_...
Amazon Invalid operation: Load into table 'test_pat_temp' failed. Check 'stl_load_errors' system table for details.;
Execution time: 42.34s
1 statement failed.
The reason for the error given in the 'stl_load_errors' table is "Extra column(s) found".
I checked the CSV file, and it has commas (,) in many cells of the name column, e.g. Lastname,Firstname.
How do I handle the commas while importing the CSV file into Redshift? I googled the error and only got the generic answer "handle the commas in the required column". Can anyone give me some details on how to do that?
There are 329 columns, and one of them is FULL_NAME, with values like "Last_name, First_name". The values in a row are separated by commas, so a row looks something like: 1,2,88,,"Last_name,First_name",Company,,,,stack,overflow,,,, and so on.
I managed to import the file by simply adding the REMOVEQUOTES option:
copy test.test_pat_temp from 's3://some_location/large_file.csv'
credentials 'aws_access_key_id=;aws_secret_access_key='
EMPTYASNULL
BLANKSASNULL
REMOVEQUOTES;
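For reference, Redshift's COPY also has a CSV format option that parses standard quoting itself, so commas inside quoted fields survive the load. A sketch (CSV and REMOVEQUOTES cannot be combined, so it is one or the other):

copy test.test_pat_temp from 's3://some_location/large_file.csv'
credentials 'aws_access_key_id=;aws_secret_access_key='
CSV
EMPTYASNULL
BLANKSASNULL;

The difference is that REMOVEQUOTES strips the quote characters while keeping the field intact, whereas CSV applies full quoting rules during parsing.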
I have created a table in my database named 'con', which has two columns named 'date' and 'kgs'. I am trying to extract data from the 'hi.rpt' file at 'H:Sir\data\reporting\hi.rpt' and store the values in the table 'con' in my database.
I have tried this code in pgAdmin. When I run:
COPY con (date,kgs)
FROM 'H:Sir\data\reporting\hi.rpt'
WITH DELIMITER ','
CSV HEADER
date AS 'Datum/Uhrzeit'
kgs AS 'Summe'
I get the error:
ERROR: syntax error at or near "date"
LINE 5: date AS 'Datum/Uhrzeit'
^
********** Error **********
ERROR: syntax error at or near "date"
SQL state: 42601
Character: 113
"hi.rpt" file from which i am reading the data look like this:
Datum/Uhrzeit,Sta.,Bez.,Unit,TBId,Batch,OrderNr,Mat1,Total1,Mat2,Total2,Mat3,Total3,Mat4,Total4,Mat5,Total5,Mat6,Total6,Summe
41521.512369(04.09.13 12:17:48),TB01,TB01,005,300,9553,,2,27010.47,0,0.00,0,0.00,3,1749.19,0,0.00,0,0.00,28759.66
41521.547592(04.09.13 13:08:31),TB01,TB01,005,300,9570,,2,27057.32,0,0.00,0,0.00,3,1753.34,0,0.00,0,0.00,28810.66
Is it possible to extract only two values from the 20 fields in this 'hi.rpt' file, or not?
Or is there just a mistake in the syntax I have written?
What is the correct way to write it?
I don't know where you got that syntax, but COPY doesn't take a list of column aliases like that. See the help:
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
(AS isn't one of the listed options; to see the full syntax run \h copy in psql, or look at the manual page for the COPY command online.)
There is no mapping facility in COPY that lets you read only some columns of the input CSV. It'd be really useful, but nobody's had the time/interest/funding to implement it yet. It's really only one of many data transform/filtering tasks people want anyway.
PostgreSQL expects the column-list given in COPY to be in the same order, left-to-right, as what's in the CSV file, and have the same number of entries as the CSV file has columns. So if you write:
COPY con (date,kgs)
then PostgreSQL will expect an input CSV with exactly two columns. It'll use the first CSV column for the "date" table column and the second CSV column for the "kgs" table column. It doesn't care what the CSV headers are; they're ignored if you specify WITH (FORMAT CSV, HEADER ON), or treated as a normal data row if you don't specify HEADER.
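For illustration, a hypothetical pre-filtered file that COPY con (date, kgs) would accept looks like:

Datum/Uhrzeit,Summe
2013-09-04 12:17:48,28759.66
2013-09-04 13:08:31,28810.66

(with the timestamp already extracted from the raw first field, since a value like "41521.512369(04.09.13 12:17:48)" won't parse as a date as-is).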
PostgreSQL 9.3 added FROM PROGRAM to COPY, so you can run a shell command to read and filter the file. A simple Python or Perl script would do the job.
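A sketch of that, assuming the server can run cut (on Windows you'd substitute an equivalent script); fields 1 and 20 of the file are Datum/Uhrzeit and Summe:

COPY con (date, kgs)
FROM PROGRAM 'cut -d, -f1,20 /path/to/hi.rpt'
WITH (FORMAT CSV, HEADER);

In practice the raw first field ("41521.512369(04.09.13 12:17:48)") still won't parse as a date, so the filter script would also need to clean it up, or you can fall back to the staging-table approach below.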
If it's a small file, just open a copy in the spreadsheet of your choice as a csv file, delete the unwanted columns, and save it, so only the date and kgs columns remain.
Alternately, COPY to a staging table that has all the same columns as the CSV, then do an INSERT INTO ... SELECT to transfer just the wanted data into the real target table.
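A sketch of that staging-table route for this file, with invented lowercase column names taken from the header and text for everything not needed:

CREATE TABLE con_staging (
    datum_uhrzeit text, sta text, bez text, unit text, tbid text,
    batch text, ordernr text,
    mat1 text, total1 text, mat2 text, total2 text, mat3 text, total3 text,
    mat4 text, total4 text, mat5 text, total5 text, mat6 text, total6 text,
    summe numeric
);

COPY con_staging FROM 'H:Sir\data\reporting\hi.rpt' WITH (FORMAT CSV, HEADER);

-- Pull the human-readable timestamp out of e.g. "41521.512369(04.09.13 12:17:48)";
-- it is cast on insert to whatever type con.date actually is.
INSERT INTO con (date, kgs)
SELECT to_timestamp(substring(datum_uhrzeit from '\((.*)\)'), 'DD.MM.YY HH24:MI:SS'),
       summe
FROM con_staging;

DROP TABLE con_staging;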