"Invalid Input Syntax for Integer" in pgAdmin - postgresql

I'm migrating data into Postgresql. I can generate my data into CSV or tab-delimited files, and I'm trying to import these files using pgAdmin.
An example CSV file looks exactly like this:
86,72,1,test
72,64,1,another test
The table I'm importing into looks like this:
CREATE TABLE common.category
(
    id integer NOT NULL,
    parent integer,
    user_id integer,
    name character varying(128),
    CONSTRAINT category_pkey PRIMARY KEY (id),
    CONSTRAINT category_parent_fkey FOREIGN KEY (parent)
        REFERENCES common.category (id) MATCH SIMPLE
        ON UPDATE CASCADE ON DELETE CASCADE
)
However, upon importing this example, pgAdmin complains about an Invalid Input Syntax for Integer: "86" on the first line.
What am I missing here? I've tried performing the same import with a tab-delimited file, and I've tried converting the file to both Windows and Unix EOLs.

Your sample has dependencies on the order in which the data is imported: there is a foreign key 'parent' referencing 'id'. With id 64 already in the table, and after changing the order of your sample lines, it imports just fine with:
COPY common.category
FROM 'd:\temp\importme.txt'
WITH CSV
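Alternatively, a minimal sketch (assuming you are free to recreate the constraint): make the self-referencing foreign key deferrable, so the order of rows within the file no longer matters. The referenced row (id 64 here) still has to exist by the time the transaction commits.

ALTER TABLE common.category
    DROP CONSTRAINT category_parent_fkey,
    ADD CONSTRAINT category_parent_fkey FOREIGN KEY (parent)
        REFERENCES common.category (id)
        ON UPDATE CASCADE ON DELETE CASCADE
        DEFERRABLE INITIALLY DEFERRED;

BEGIN;
COPY common.category FROM 'd:\temp\importme.txt' WITH CSV;
COMMIT;  -- the foreign key is only checked here, after all rows are loaded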

I came across the same problem. After two hours of googling, this is what solved it: I just re-added the first line of the CSV file, and everything works fine now.

I had the same error after creating a new text file in Windows Explorer and changing the file extension to .csv.
I copied columns from an existing CSV file into the new one, also in Excel. Reading @Litty's comment about it not being tab-delimited made me wonder if that was my problem.
Sure enough, opening the files in Excel hid the tab delimiting; when I opened the file in Notepad++ it was obvious. I had to use Export -> Change File Type -> CSV (Comma delimited) in Excel before I could import the file through pgAdmin as a default CSV file.
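If re-saving from Excel is not an option, another sketch is to tell COPY what the delimiter actually is instead of converting the file; the path and table here are the ones from the question above.

COPY common.category
FROM 'd:\temp\importme.txt'
WITH (FORMAT csv, DELIMITER E'\t');  -- treat the file as tab-delimited "CSV"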

Related

How to import data from csv to postgresql table that has an auto_increment first column or field

Hi, I am trying to import data into a PostgreSQL table from a CSV file, so I'd like to know: how do I exclude the first column, since it is an identity column that increments when data is inserted?
The error I get is here.
If you are using a script to add or modify data, you should simply skip that variable in the script (i.e. not write it in the INSERT statement), but if you do that you should also modify your CSV to remove that column (the data, if any, and the separator, usually a comma), since the number of values is now different.
Looking at your screenshot, I suppose you are using pgAdmin or a similar GUI. In pgAdmin, if you right-click the table and select the Import/Export Data... option, a window opens where you should select "Import"; then, in the upper menu of that window, click "Columns" and exclude the "ID" (or any other auto-increment column). This solution also requires you to remove the corresponding column from the CSV.
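What the Columns tab does corresponds roughly to listing only the non-identity columns in a COPY command. A minimal sketch, with made-up table, column and file names:

-- "id" is the auto-increment column; it is left out of the column list (and
-- out of the file), so PostgreSQL fills it from its sequence.
COPY products (name, category, price)
FROM '/path/to/products.csv'
WITH (FORMAT csv);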

Basic questions about Cloud SQL

I'm trying to populate a cloud sql database using a cloud storage bucket, but I'm getting some errors. The csv has the headers (or column names) as first row and does not have all the columns (some columns in the database can be null, so I'm loading the data I need for now).
The database is in postgresql and this is the first database in GCP I'm trying to configure and I'm a little bit confused.
Does it matter if the csv file has the column names?
Does the order of the columns in the csv file matter? (I guess it does if not all of them are present in the csv.)
The PK of the table is a serial number, which I'm not including in the csv file. Do I need to include the PK as well? I mean, since it's a serial number, it should be "auto-assigned", right?
Sorry for the noob questions and thanks in advance :)
This is all covered by the COPY documentation.
It matters in that you will have to specify the HEADER option so that the first line is skipped:
[...] on input, the first line is ignored.
The order matters, and if the CSV file does not contain all the columns in the same order as the table, you have to specify them with COPY:
COPY mytable (col12, col2, col4, ...) FROM '/dir/afile' WITH (...);
Same as above: if you omit a table column from the column list, it will be filled with its default value, which in this case is the auto-generated number.
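Putting the three answers together, a sketch using the hypothetical names from above: skip the header line, name the columns in the order they appear in the file, and leave out the serial PK so it gets its default.

COPY mytable (col2, col4, col12)
FROM '/dir/afile.csv'
WITH (FORMAT csv, HEADER);  -- HEADER makes COPY ignore the first line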

How to avoid OIDs column from table in PostgreSQL?

I am using PostgreSQL 9.6. I have created a table with create query.
But when I checked the left panel of pgAdmin, under the table I found six more columns named tableoid, cmax, xmax, cmin, xmin and ctid.
When I searched for this, I found that these are OID columns and do not affect the data in the other columns.
I have to import data into this table, so after selecting the table I chose the Import/Export option from the right-click menu and tried to import a .csv file.
But when I try to import the data into the table, I get an error like:
ERROR: column "tableoid" of relation "account" does not exist
Please suggest how to eliminate these OID columns from the table.
The table must be missing a column named "tableoid" that is present in the CSV.
In this case, a table matching the import file must be created first; if there is no such table, the import won't work. This may help:
http://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/
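Note that tableoid, cmax, xmax, cmin, xmin and ctid are system columns that every table has; they cannot be dropped and should simply not appear in the CSV (or in its header row). A sketch for listing only the real columns of the table, assuming it is the "account" table from the error message:

SELECT attname
FROM pg_attribute
WHERE attrelid = 'account'::regclass  -- add the schema if needed, e.g. 'public.account'
  AND attnum > 0                      -- system columns have attnum <= 0
  AND NOT attisdropped;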

Is it possible to insert and replace rows with pgloader?

My use case is the following: I have data coming from a csv file and I need to load it into a table (so far so good, nothing new here). It might happen that same data is sent with updated columns, in which case I would like to try to insert and replace in case of duplicate.
So my table is as follows:
CREATE TABLE codes (
    code TEXT NOT NULL,
    position_x INT,
    position_y INT,
    PRIMARY KEY (code)
);
And incoming csv file is like this:
TEST01,1,1
TEST02,1,2
TEST03,1,3
TEST04,1,4
It might happen that sometime in the future I get another csv file with:
TEST01,1,1000 <<<<< updated value
TEST05,1,5
TEST06,1,6
TEST07,1,7
Right now, when I run it for the first file everything is fine, but when I run it for the second one I get an error:
2017-04-26T10:33:51.306000+01:00 ERROR Database error 23505: duplicate key value violates unique constraint "codes_pkey"
DETAIL: Key (code)=(TEST01) already exists.
I load data using:
pgloader csv.load
And my csv.load file looks like this:
LOAD CSV
FROM 'codes.csv' (code, position_x, position_y)
INTO postgresql://localhost:5432/codes?tablename=codes (code, position_x, position_y)
WITH fields optionally enclosed by '"',
fields terminated by ',';
Is what I'm trying to do possible with pgloader?
I also tried dropping the primary key constraint, but then I end up with duplicate entries in the table.
Thanks a lot for your help.
No, you can't. As per the reference:
To work around that (load exceptions, eg PK violations), pgloader cuts the data into batches of 25000 rows
each, so that when a problem occurs it's only impacting that many rows
of data.
(The text in brackets is mine.)
The best you can do is load the CSV into a table with the same structure and then merge the data with a query (EXCEPT, OUTER JOIN ... WHERE NULL, and so on), as sketched below.
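A minimal sketch of that workaround, assuming PostgreSQL 9.5+ for ON CONFLICT; the staging table name is made up, and the pgloader script (or a plain COPY) would be pointed at it instead of at codes:

CREATE TABLE codes_staging (LIKE codes INCLUDING DEFAULTS);

-- after loading the CSV into codes_staging, merge it into codes,
-- replacing rows whose code already exists
INSERT INTO codes (code, position_x, position_y)
SELECT DISTINCT ON (code) code, position_x, position_y
FROM codes_staging
ON CONFLICT (code)
DO UPDATE SET position_x = EXCLUDED.position_x,
              position_y = EXCLUDED.position_y;

TRUNCATE codes_staging;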

Ignore duplicates when importing from CSV

I'm using a PostgreSQL database, and after creating my tables I have to populate them from a CSV file. However, the CSV file is corrupted and violates the primary key rule, so the database throws an error and I'm unable to populate the table. Any ideas how to tell the database to ignore the duplicates when importing from CSV? Writing a script to remove them from the CSV file is not acceptable. Any workarounds are welcome too. Thank you! :)
In PostgreSQL, duplicate rows are not permitted if they violate a unique constraint.
I think your best option is to import your CSV file into a temp table that has no constraints, delete the duplicate values from it, and finally import from this temp table into your final table.
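A minimal sketch of that approach, with placeholder table and file names, assuming PostgreSQL 9.5+ so ON CONFLICT DO NOTHING can skip the offending rows in one step instead of deleting duplicates by hand:

-- staging table with the same columns but no constraints
CREATE TEMP TABLE mytable_staging (LIKE mytable INCLUDING DEFAULTS);

COPY mytable_staging FROM '/path/to/data.csv' WITH (FORMAT csv);

-- rows that would violate the primary key / unique constraint are skipped
INSERT INTO mytable
SELECT * FROM mytable_staging
ON CONFLICT DO NOTHING;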