How to update a PostgreSQL table from a CSV file multiple times

I have a CSV file whose data needs to be imported into a Postgres database. I did this using the import function in pgAdmin III, but my CSV file changes frequently. How can I re-import the data from the CSV file so that it overwrites the data already in the database?

You can also benefit from a WAL-logging optimization by running TRUNCATE and COPY in the same transaction. The basic idea is to wipe the table with TRUNCATE and reimport the data with COPY. This doesn't need to be done manually in pgAdmin each time; it can be scripted with something like:
BEGIN;
-- The CSV file is 'mydata.csv' and the table is 'mydata'.
TRUNCATE mydata;
COPY mydata FROM 'mydata.csv' WITH (FORMAT csv);
COMMIT;
Note that COPY FROM a file requires superuser access, and the file path is read on the database server. The COPY command also takes various options, so you can adjust settings for NULLs, headers, and so on.
Finally, note that both statements ideally belong in the same transaction, as in the example above: if the COPY fails, the TRUNCATE rolls back with it and the old data is preserved. In many real-world cases where you're just copying in a CSV file this level of care isn't needed, but it's not hard to add if your situation calls for it.
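If you save those statements to a file, the whole refresh can be run non-interactively from a shell or a scheduled job, for example (the connection details and file name here are just placeholders):
psql -U postgres -d mydb -f refresh_mydata.sql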

Related

PostgreSQL - mounting csv / other file-type volumes/tablespaces

In a few other DB engines I can easily extract (part of) a table to a single file.
Then, if needed, I can 'mount' this file as a regular table. Querying it is obviously slow, but this is very useful.
I wonder if something similar is possible with PostgreSQL?
I know about COPY FROM/TO, but for bigger tables I have to wait ages for the records to be copied in from the CSV.
Yes, you can use file_fdw to access (read) a CSV file on the database server as if it were a table.
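A minimal sketch of that, assuming the CSV lives at /path/to/mydata.csv on the database server and has two columns (all names here are examples):
CREATE EXTENSION file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;
-- The column list must match the CSV layout.
CREATE FOREIGN TABLE mydata_csv (
    id   integer,
    name text
) SERVER csv_files
OPTIONS (filename '/path/to/mydata.csv', format 'csv', header 'true');
-- Queries now read the file directly; there is no separate import step.
SELECT * FROM mydata_csv WHERE id > 100;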

Using Foreign Data Wrappers in PostgreSQL (variable filename)

I'm running PostgreSQL 9.3 and want to import some daily generated csv files into specific tables.
I started playing with FDW (Foreign Data Wrapper) and pointed to a specific csv, where I can query and append/upsert to a table.
But I have two more needs:
- The file generation date and source branch are present in the filename, and only there.
I need to extract this information and insert it into the table as well.
- As expected, the file names are not fixed, so the FDW doesn't know where to get the information.
I thought about solving this with some Unix tools (although my Postgres runs on Windows): basically, for each file in a list (from a previously created index), a script would rename the file and pass the branch and date as parameters to a psql.exe command line, and the import itself would read from a fixed file name through the FDW.
This would work, but the script sounds a bit like a hack and not a very "elegant" solution.
Does anyone have a better suggestion?
Thanks!
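A minimal sketch of the scripted approach described in the question, assuming a foreign table sales_csv that points at a fixed path; every file name, table name, and variable here is hypothetical:
rem Copy the daily export to the fixed path the foreign table reads from,
rem then pass the metadata from the filename as psql variables.
copy "C:\exports\sales_BR01_20140115.csv" "C:\exports\current.csv"
psql.exe -U postgres -d mydb -v branch=BR01 -v filedate=2014-01-15 -f import_sales.sql
where import_sales.sql adds the metadata columns while reading through the FDW:
-- import_sales.sql (hypothetical table and columns)
INSERT INTO sales (branch, file_date, product, amount)
SELECT :'branch', :'filedate'::date, product, amount
FROM sales_csv;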

Dynamically create table from csv

I am faced with a situation where we get a lot of CSV files from different clients, but there is always some issue with the column count and column lengths that our target table is expecting.
What is the best way to handle frequently changing CSV files? My goal is to load these CSV files into a Postgres database.
I checked the \copy command in Postgres, but it doesn't have an option to create the table.
You could try creating a pg_dump-compatible file instead, one that has the appropriate "create table" section, and use that to load your data.
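A minimal sketch of such a file, runnable with psql -f; the table, columns, and rows are only examples:
-- load_client_data.sql (hypothetical): create the table, then load inline data.
CREATE TABLE client_data (
    id     integer,
    name   text,
    amount numeric
);
COPY client_data (id, name, amount) FROM stdin WITH (FORMAT csv);
1,Alice,10.50
2,Bob,22.00
\.
The "create table" section would be generated per client to match that client's columns.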
I recommend using an external ETL tool like CloverETL, Talend Studio, or Pentaho Kettle for data loading when you're having to massage different kinds of data.
\copy is really intended for importing well-formed data in a known structure.

Import to PostgreSQL from csv filtering each line

I have the following question (even thorough research couldn't help me):
I want to import data from a (rather large) CSV/TXT file into a PostgreSQL DB, but I want to filter each line (before importing it) based on specific criteria.
What command/solution can I use?
On a side note: if I am not reading from a file but from a data stream, what is the relevant command/procedure?
Thank you all in advance and sorry if this has been in some answer/doc that I have missed!
Petros
To explain the staging table approach, which is what I use myself:
- Create a table (it could be a temporary table) matching your CSV structure.
- Import into that table, doing no filtering.
- Process and import your data into the real tables using SQL to filter and transform as needed.
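A minimal sketch of those three steps in psql; the table names, columns, and filter condition are all examples:
-- 1. Staging table mirroring the CSV layout (text columns keep malformed rows loadable).
CREATE TEMP TABLE staging_events (event_time text, user_id text, amount text);
-- 2. Load everything, no filtering yet (\copy reads the file on the client side).
\copy staging_events FROM 'events.csv' CSV HEADER
-- 3. Filter and convert while moving the rows into the real table.
INSERT INTO events (event_time, user_id, amount)
SELECT event_time::timestamp, user_id::integer, amount::numeric
FROM staging_events
WHERE amount::numeric > 0;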
Now, in PostgreSQL, you could also use file_fdw to get direct SQL access to CSV files. In general the staging table solution will usually be cleaner, but you can do this by essentially letting PostgreSQL treat the file as a table, going through a foreign data wrapper.

PostgreSQL dump Temp table

I created a temp table in my PostgreSQL DB using the following query
SELECT * INTO TEMP TABLE tempdata FROM data WHERE id=2004;
Now I want to create a backup of this temp table tempdata.
So I use the following command-line invocation:
"C:\Program Files\PostgreSQL\9.0\bin\pg_dump.exe" -F t -a -U my_admin -t tempdata myDB >"e:\mydump.backup"
I get a message saying
pg_dump: No matching tables were found
Is it possible to create a dump of temp tables?
Am I doing it correctly?
P.S.: I would also want to restore the same. I don't want to use any extra components.
TIA.
I don't think you'll be able to use pg_dump for that temporary table. The problem is that temporary tables only exist within the session where they were created:
PostgreSQL instead requires each session to issue its own CREATE TEMPORARY TABLE command for each temporary table to be used. This allows different sessions to use the same temporary table name for different purposes, whereas the standard's approach constrains all instances of a given temporary table name to have the same table structure.
So you'd create the temporary table in one session but pg_dump would be using a different session that doesn't have your temporary table.
However, COPY should work:
COPY moves data between PostgreSQL tables and standard file-system files.
but you'll either be copying the data to the standard output or a file on the database server (which requires superuser access):
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server.
[...]
COPY naming a file is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
So using COPY to dump the temporary table straight to a file might not be an option. You can COPY to standard output, though; how well that works depends on how you're accessing the database.
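For example, psql's \copy runs the copy through the client, so it needs no superuser rights and writes to a file on your machine. Run it in the same psql session that created the temporary table so the table is still visible; the paths here are just examples:
SELECT * INTO TEMP TABLE tempdata FROM data WHERE id=2004;
\copy tempdata TO 'e:/mydump.csv' CSV HEADER
-- Later, restore into an ordinary table with the same structure:
-- CREATE TABLE tempdata_restored (LIKE data);
-- \copy tempdata_restored FROM 'e:/mydump.csv' CSV HEADER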
You might have better luck if you didn't use temporary tables. You would, of course, have to manage unique table names to avoid conflicts with other sessions and you'd have to take care to ensure that your non-temporary temporary tables were dropped when you were done with them.