How to import CSV into PostgreSQL - postgresql

Respected,
I have problems with importing CSV into PostgreSQL via pgAdmin. No matter what I do, it shows the following error:
ERROR: extra data after last expected column.
Can anyone please help me and point me to a possible solution?
Thank you.
Milorad K.

Check that your data is formatted the way PostgreSQL expects it to be.
That error can be caused by specifying the wrong quote character or the wrong field separator, or your input file may be corrupt.
I've had corrupt CSV files from banks before, so don't trust anyone.
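One way to check the formatting before importing is to count the fields on every line. The following is only a minimal sketch using Python's standard csv module; the function name, the sample data, and the default delimiter and quote character are assumptions, so adjust them to match your file.

```python
import csv
import io
from collections import Counter

def field_count_report(text, delimiter=",", quotechar='"'):
    """Count fields per row and list rows whose count differs from the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    counts = Counter()
    suspects = []
    expected = None
    for row in reader:
        if expected is None:
            expected = len(row)  # the first row (usually the header) sets the expectation
        counts[len(row)] += 1
        if len(row) != expected:
            # line_num is the physical line where the row ended
            suspects.append((reader.line_num, len(row)))
    return counts, suspects

# Hypothetical sample: the last row has one field too many.
sample = "id,name,amount\n1,Smith,10.5\n2,Jones,7,oops\n"
counts, suspects = field_count_report(sample)
print(counts)    # how many rows have each field count
print(suspects)  # (line number, field count) for rows that do not match
```

If the report shows more than one field count, the lines listed as suspects are the ones likely to trigger "extra data after last expected column". If every line has the same unexpected count, the delimiter or quote character is probably wrong.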

Related

Question/Resolved - "extra data after last expected column" Error when trying to import a csv file into postgresql

Just posting this question and the solution since it took forever for me to figure this out.
Using a CSV file, I was trying to import data into PostgreSQL with pgAdmin. I kept running into the same error: "extra data after last expected column."
The solution that worked for me (instead of using the Import module): copy tablename (columns) FROM 'file location .csv' CSV HEADER
Since some cells contained commas, each of those commas was being counted as the start of a new column.
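To illustrate why commas inside a cell matter, here is a small sketch comparing a naive split on commas with a parser that honors CSV quoting. Python's csv module stands in for the CSV quoting rule here; the sample line is made up.

```python
import csv
import io

# Hypothetical line: the second cell contains two commas and is quoted.
line = '42,"Portland, OR, USA",active\n'

# Splitting on every comma ignores the quotes and finds too many fields.
naive = line.rstrip("\n").split(",")

# A CSV-aware parser treats the quoted cell as a single field.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 5
print(len(parsed))  # 3
print(parsed[1])    # Portland, OR, USA
```

This only works if cells containing commas are actually quoted in the file. If they are not, no import option can tell a separator from a comma that belongs to the data.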

COPY a csv file with 108 column into postgresql

I have a CSV file with 108 columns that I am trying to import into my PostgreSQL table. Obviously, I don't want to specify every column in my CREATE TABLE statement. But when I enter
\COPY 'table_name' FROM 'directory' DELIMITER ',' CSV HEADER; this error message shows up: "ERROR: Extra Data after Last Expected Column". With only a few columns I know how to fix this problem but, like I said, I don't want to specify all 108 columns. By the way, my table does not contain any columns at all. Any help on how I could do that? Thx!
When dealing with problems like this, I often cheat. Plenty of tools exist online for converting CSV to SQL, https://www.convertcsv.com/csv-to-sql.htm being one of them.
Copy/paste your CSV, then copy/paste the generated SQL. It's not the most elegant solution, but it will work for a one-off situation.
Now, if you're looking to repeat this process regularly (automated, I hope), then Python may be an interesting language to explore: you can quickly write a script that does this for you, then schedule it as a cron job or whatever method you prefer for invoking it automatically with the correct input (the CSV file).
Please feel free to let me know if I've misunderstood your original question, or if I can provide any more help give me a shout and I'll do my best!
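As a rough idea of what such a script could look like, the sketch below builds a CREATE TABLE statement from the CSV header line so the 108 columns never have to be typed by hand. The function name and the table name are hypothetical, and declaring every column as text is an assumption; the types can be changed after the load.

```python
import csv
import io

def create_table_sql(header_line, table_name):
    """Build a CREATE TABLE statement with one text column per CSV header field."""
    columns = next(csv.reader(io.StringIO(header_line)))

    def ident(name):
        # Double-quote the identifier and escape embedded double quotes.
        return '"' + name.strip().replace('"', '""') + '"'

    cols = ",\n".join(f"    {ident(c)} text" for c in columns)
    return f"CREATE TABLE {ident(table_name)} (\n{cols}\n);"

# Hypothetical header with three columns; a real file would supply all 108.
ddl = create_table_sql("id,first name,amount\n", "staging_import")
print(ddl)
```

Run the generated statement first, then the \COPY command from the question should find a table whose column count matches the file. Note that quoted identifiers are case-sensitive in PostgreSQL, so the column names must be written exactly as in the header when querying.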

boolean field in redshift copy

I am producing a comma-separated file in S3 that needs to be copied to a staging table in a redshift database using the postgres COPY command.
It has one boolean field. With every sensible way I can think of to represent the boolean value in the file, redshift copy complains, usually with "Unknown boolean format".
I'm going to give up and change the staging table field to a smallint so that I can proceed with the copy and translate the value on the load from staging to the final redshift table, but I'm curious if anyone knows the correct incantation.
A zero or one works just fine for us.
Check your loads carefully; it may well be another issue that's 'pushing' invalid data into your boolean column.
For instance, we had all kinds of crazy characters embedded in our data that would cause errors like that. I eventually settled on using the US character for the record separator.
Check to make sure you're excluding the headers during the COPY command.
I ran into the same problem, but adding the ignoreheader 1 option (ignores 1 header line during import) solved the issue.
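If you control the step that produces the file, one option is to rewrite the boolean field as 1 or 0 before uploading, since zero and one are reported above to load fine. This is only a sketch: the column name is_active, the function name, and the set of spellings treated as true or false are assumptions.

```python
import csv
import io

# Assumed spellings; extend these sets to match what your data actually contains.
TRUE_VALUES = {"true", "t", "yes", "y", "1"}
FALSE_VALUES = {"false", "f", "no", "n", "0"}

def normalize_boolean_column(text, column):
    """Rewrite one column of a CSV (with header) as 1, 0, or empty."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        value = row[column].strip().lower()
        if value in TRUE_VALUES:
            row[column] = "1"
        elif value in FALSE_VALUES:
            row[column] = "0"
        elif value == "":
            row[column] = ""  # leave missing values empty
        else:
            # Fail loudly so stray characters are noticed before the load.
            raise ValueError(f"line {reader.line_num}: unexpected value {row[column]!r}")
        writer.writerow(row)
    return out.getvalue()

sample = "id,is_active\n1,True\n2,no\n3,\n"
normalized = normalize_boolean_column(sample, "is_active")
print(normalized)
```

The output keeps the header line, so the ignoreheader 1 option mentioned above is still needed when copying it.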

Dealing with errors during a copy from

I have to import a file from an external source into a PostgreSQL table.
I tried to do it with \copy from, but I keep getting errors (additional columns) in the middle of the file.
Is there a way to tell PostgreSQL to ignore lines containing errors during a "\copy from"?
Thanks
Give it a try with PostgreSQL Loader instead.
No. Either all the data is correct or no data is loaded at all; those are the two options you have in PostgreSQL.
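Since the load is all or nothing, a common workaround is to filter the file before handing it to \copy. The sketch below separates rows with the expected number of fields from the rest; the function name and sample data are made up, and it assumes the "errors" are purely a wrong column count.

```python
import csv
import io

def split_rows(text, expected_fields, delimiter=","):
    """Return (clean CSV text, list of (line number, row) for rejected rows)."""
    good = io.StringIO()
    rejected = []
    writer = csv.writer(good, delimiter=delimiter, lineterminator="\n")
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        if len(row) == expected_fields:
            writer.writerow(row)
        else:
            # Keep the line number so the bad rows can be inspected later.
            rejected.append((reader.line_num, row))
    return good.getvalue(), rejected

# Hypothetical input: the second line has an additional column.
sample = "1,alice,10\n2,bob,20,unexpected\n3,carol,30\n"
clean, rejected = split_rows(sample, expected_fields=3)
print(clean)
print(rejected)
```

The clean output can then be loaded with \copy, while the rejected rows are kept aside for review instead of being silently dropped.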

OpenRowSet command in TSQL is returning NULLS

I've been investigating this for a while now and keep hitting a brick wall. I am importing from xls files into temp tables via the OPENROWSET command. My problem is with a certain column that holds a range of values, of which the following are the most common: some values are long numbers, e.g. 15598, and some are strings, e.g. 15598-E.
OPENROWSET reads the string version with no problem but reports the number version as NULL. I read (http://www.sqldts.com/254.aspx ) that OPENROWSET has that issue, and the author speaks of adding “HDR=YES;IMEX=1” to the query string, but that’s not working for me at all.
Have any of you guys ever encountered this?
Just some more info as well: I am not allowed to do this with the JET engine (Microsoft.Jet.OLEDB.4.0), so this is what my query looks like:
SELECT *
FROM
OPENROWSET('MSDASQL'
, 'Driver=Microsoft Excel Driver (*.xls);HDR=YES;IMEX=1;DBQ=C:\ImportFile.xls;'
, 'SELECT * FROM [Sheet1$]')
I notice you are using the Excel ODBC driver. Have you tried the JET OLEDB Provider with the equivalent connection string?
select * from openrowset(
'Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\ImportFile.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1"',
'SELECT * FROM [Sheet1$]')
EDIT: Sorry, just noticed your last paragraph. Surely the Excel ODBC driver still goes via the JET engine, so what difference would it make?
EDIT: I have looked at the KB194124 link, and the registry values it recommends are the default values on my machine, which I have never changed. I have used the above method several times myself without problems. Maybe it's an environmental issue?
If you don't mind opening the file in Excel, take the columns that have the problem, select the column, and do
Data -> Text to Columns -> Next -> Next -> Text
Save the spreadsheet and they should all come in as Text in OPENROWSET
I've found using .CSV files instead of Excel, opened by setting up a Linked Server, and setting up the format of the files in schema.ini a more practical approach for handling imports like this, with that method you can explicitly choose each column's format.
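For the schema.ini route, the sketch below generates a section that declares every column as Text, so values like 15598 and 15598-E are read the same way. The function name, the file name, and the column names are hypothetical, and the keys used (Format, ColNameHeader, ColN) are written from memory of the text driver's schema.ini format, so check them against the driver documentation before relying on them.

```python
def schema_ini_section(csv_filename, column_names):
    """Build a schema.ini section that declares every column as Text."""
    lines = [
        f"[{csv_filename}]",      # section name is the CSV file name
        "Format=CSVDelimited",    # comma-separated values
        "ColNameHeader=True",     # first line holds the column names
    ]
    for i, name in enumerate(column_names, start=1):
        # Names are quoted so that embedded spaces are allowed.
        lines.append(f'Col{i}="{name}" Text')
    return "\n".join(lines) + "\n"

# Hypothetical file and columns.
section = schema_ini_section("ImportFile.csv", ["PartNumber", "Description"])
print(section)
```

The resulting text would be saved as schema.ini in the same folder as the CSV file.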
We've come across the same issue. Unfortunately we've not found a solution either. There's more information here which indicates that there might be a registry fix.
I had the same problem. I fixed it by cutting a row that contains the string/numeric value (for example 123ABC) in that column and pasting it into the first row of the sheet. For some reason T-SQL reads the first row and assumes that all the values are numeric.
The response by SqlACID quoted above (Text to Columns in Excel, or .CSV files with schema.ini) worked great for me [https://wikigurus.com/Article/Show/185717/OpenRowSet-command-in-TSQL-is-returning-NULLS].