PostgreSQL COPY FROM file changed row order

I'm trying to save the contents of a text file into a PostgreSQL database. First I want to copy the file into a table with a single column, in order to iterate over it and save the values into specific tables.
I'm using PostgreSQL 11.5 on a Mac. I copied the file into a temp table (one line per row). Then I wrote a PL/pgSQL function that iterates over each row of the temp table and parses the values into other tables. It worked fine on a small dataset, but on a bigger one (approximately 6*10^5 lines) the function failed because it expected a specific row order (the same as in the file). After some investigation it turned out that the row order in the temp table differs from the line order in the file. More interestingly, the first difference occurs at the 455864th row.
CREATE TABLE "Temp"
(
data_row text
);
COPY "Temp"(data_row) FROM 'PATH_TO_FILE';
I expected the COPY FROM command to copy the data in the same order as it appears in the file.
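For what it's worth, rows in a SQL table have no inherent order, and COPY gives no guarantee that a later sequential scan returns them in file order. A minimal sketch of a common workaround, assuming the goal is simply to recover the file's line order (the line_no column is my own addition, not part of the original setup):

CREATE TABLE "Temp"
(
    line_no  bigserial,  -- filled from a sequence as each line is loaded
    data_row text
);
COPY "Temp"(data_row) FROM 'PATH_TO_FILE';

-- Iterate in file order by sorting on the explicit column:
SELECT data_row FROM "Temp" ORDER BY line_no;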

Related

Does Postgres COPY <table> FROM command check constraint(s) as each row is inserted?

I have a dump of a table as a CSV file. One of the columns in the table has a foreign key constraint on another column in the same table (think of it as a parent/child relationship).
I used the COPY <table> FROM command to load the data from the CSV file into the table. It worked fine. I thought it worked because the rows were ordered correctly in the CSV file, such that the parent rows came before the child rows.
To test that hypothesis, I relocated some rows in the CSV file so that some child rows came before their corresponding parent rows. To my surprise, the COPY command still worked and all rows were inserted correctly.
Is this because the COPY command checks the constraint(s) in bulk after all rows are inserted instead of checking each row one by one as they are inserted?
Note that I am performing these operations in Python 3 using the psycopg2 library.
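A self-contained sketch of the experiment being described, with a hypothetical table and file (in PostgreSQL, foreign keys are enforced by AFTER triggers that run at the end of the COPY statement, which would explain why the child-before-parent order still loads):

-- Hypothetical parent/child table with a self-referencing FK:
CREATE TABLE nodes (
    id        int PRIMARY KEY,
    parent_id int REFERENCES nodes(id)
);

-- Suppose /tmp/nodes.csv lists a child before its parent:
--   2,1
--   1,
COPY nodes (id, parent_id) FROM '/tmp/nodes.csv' WITH (FORMAT csv);
-- Succeeds: the FK is only checked once the whole statement has run.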

How to import data from a CSV into a PostgreSQL table that has an auto-increment first column

Hi, I am trying to import data into a PostgreSQL table from a CSV file, so I'd like to know how to exclude the first column, since it is an identity column that increments when data is inserted.
The error I get is shown here:
If you are using a script to add or modify data, you should simply skip that variable in the script (i.e. not write it in the INSERT statement). If you do so, you should also modify your CSV to delete that column (the data, if any, and the separator, usually a comma), since the number of values is now different.
Looking at your screenshot, I suppose you are using pgAdmin or some similar GUI. In the case of pgAdmin, if you click on the table and select the Import/Export Data... option, a window opens where you should select "Import"; then, in the upper window menu, click on "Columns" and exclude "ID" or any other auto-increment column. This solution also requires you to remove the corresponding CSV column.
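Plain SQL can also do this without a GUI: COPY accepts an explicit column list, so the identity column can simply be left out. A sketch with invented table and column names:

-- Table with an auto-increment first column:
CREATE TABLE items (
    id   serial PRIMARY KEY,
    name text,
    qty  int
);

-- List only the non-identity columns; id is filled from its sequence.
-- The CSV must then contain exactly these two columns.
COPY items (name, qty) FROM '/tmp/items.csv' WITH (FORMAT csv, HEADER);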

Why does querying from a binary file use more main memory than a text file through Postgres file_fdw?

I am using the Postgres foreign data wrapper file_fdw to run queries against a remote file. Interestingly, I saw that when Postgres runs a query against a binary file, it uses more main memory than against a text file such as a .csv. Can anyone confirm why this happens? Is there any way to optimize this?
My file contains two columns (id | geometry), where geometry represents a polygon, so my foreign table consists of two columns. I have tested with join queries using ST_Overlaps() and ST_Contains() on a CSV file and on a Postgres-compatible binary file.
SELECT COUNT(*) FROM ftable1 a, ftable2 b WHERE ST_Overlaps(a.geom, b.geom);
I checked PostgreSQL's memory usage with htop and saw that when Postgres runs the query against the binary file, it uses more main memory than against the CSV file. For example, if it is 500 MB for CSV, it is almost 1 GB for the binary file. Why?
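For context, a minimal sketch of the kind of setup the question implies (the server name, file paths, and column types are assumptions; file_fdw accepts the same format options as COPY, including csv and binary, and the geometry type requires PostGIS):

CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

-- CSV-backed foreign table:
CREATE FOREIGN TABLE ftable1 (id int, geom geometry)
    SERVER file_server
    OPTIONS (filename '/data/polygons1.csv', format 'csv');

-- Foreign table over a file written by COPY ... (FORMAT binary):
CREATE FOREIGN TABLE ftable2 (id int, geom geometry)
    SERVER file_server
    OPTIONS (filename '/data/polygons2.bin', format 'binary');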

PostgreSQL leaving headers blank while importing CSV

I'm trying to import a CSV file with the column names "Zip Code", "2010 Population", "Land-Sq-Mi" and "Density Per Sq Mile" into my test table, which is named derp. That's why I have the DROP statement at the beginning: so I don't replicate any rows and can start clean on each iteration.
Code is as follows:
DROP TABLE IF EXISTS derp;
CREATE TABLE public.derp(
    "Zip Code"            varchar,
    "2010 Population"     integer,
    "Land-Sq-Mi"          numeric,
    "Density Per Sq Mile" numeric
);
COPY derp("Zip Code", "2010 Population", "Land-Sq-Mi", "Density Per Sq Mile")
FROM '/home/michael/PycharmProjects/cmsDataProject/Zipcode-ZCTA-Population-Density-And-Area-Unsorted.csv'
DELIMITER ','
CSV HEADER;
This does a fine job of importing the actual data, but it leaves the column headers blank in the pgAdmin III data view. I looked at the source file in nano: the headers are there, and if they weren't, the query would have thrown an error telling me there was no relation for the column I was trying to import into.
Any ideas about what I'm doing wrong?
Edit: I would like the pgAdmin III data view to display the header names, and I would also like a way to verify that the columns are actually named, even if they aren't being imported or displayed. To reiterate: every row after the headers is intact and in view; only the header row is blank.
Edit 2: When I CREATE TABLE public.derp(); and then manually add the columns, they show correctly in the data view. Something about the multi-line query statement was causing the breakage.
So pgAdmin is not showing the column names, but it is showing the data?
If you open a table in pgAdmin and then alter the table while keeping the table window open, it seems to lose the column names.
Close the window with the table, click the Tables icon in the pgAdmin tree view, refresh the tables, and reopen the table window.
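Independently of what pgAdmin displays, the system catalogs can confirm that the columns really are named, which also answers the first edit's request for a verification method:

SELECT column_name, data_type
FROM   information_schema.columns
WHERE  table_schema = 'public'
AND    table_name   = 'derp'
ORDER  BY ordinal_position;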

How to UPDATE a table from a CSV file?

How do I update a table from a CSV file in PostgreSQL? (version 9.2.4)
The COPY command is for inserting, but I need to update a table. How can I update a table from a CSV file without a temp table?
I don't want to copy the CSV file into a temp table and then update the table from the temp table.
And is there no MERGE command, like in Oracle?
The simple and fast way is with a temporary staging table, as detailed in this closely related answer:
How to update selected rows with values from a CSV file in Postgres?
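In outline, that staging approach looks like the following (table and column names are invented for illustration):

-- Stage the CSV, then update the target from it:
CREATE TEMP TABLE staging (id int, val text);

COPY staging FROM '/path/to/file.csv' WITH (FORMAT csv, HEADER);

UPDATE target t
SET    val = s.val
FROM   staging s
WHERE  t.id = s.id;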
If you don't "want" that for some unknown reason, there are more ways:
A foreign data wrapper with file_fdw.
You can run UPDATE commands directly using this one.
pg_read_file(). For special use cases.
Details in this related answer:
Read data from a text file inside a trigger
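A rough sketch of the file_fdw route, reusing the invented names from the staging example above; the foreign table reads the file directly, so no intermediate copy is made:

CREATE EXTENSION file_fdw;
CREATE SERVER csv_server FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE staging_fdw (id int, val text)
    SERVER csv_server
    OPTIONS (filename '/path/to/file.csv', format 'csv', header 'true');

-- The UPDATE pulls rows straight from the file:
UPDATE target t
SET    val = s.val
FROM   staging_fdw s
WHERE  t.id = s.id;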
There is no MERGE command in Postgres (it only arrived much later, in PostgreSQL 15), let alone for COPY.
Discussion about whether and how to add it is ongoing; see the Postgres Wiki for details.