Export Postgres Database into CSV file - postgresql

I want to export a Postgres database into a CSV file. Is this possible?
If it is possible, then how can I do it? I have seen that we can convert a particular table into a CSV file, but I don't know how to do it for a whole database.

I made this PL/pgSQL function to create one .csv file per table (excluding views, thanks to @tarikki):
CREATE OR REPLACE FUNCTION db_to_csv(path TEXT) RETURNS void AS $$
declare
tables RECORD;
statement TEXT;
begin
FOR tables IN
SELECT (table_schema || '.' || table_name) AS schema_table
FROM information_schema.tables t INNER JOIN information_schema.schemata s
ON s.schema_name = t.table_schema
WHERE t.table_schema NOT IN ('pg_catalog', 'information_schema')
AND t.table_type NOT IN ('VIEW')
ORDER BY schema_table
LOOP
statement := 'COPY ' || tables.schema_table || ' TO ''' || path || '/' || tables.schema_table || '.csv' ||''' DELIMITER '';'' CSV HEADER';
EXECUTE statement;
END LOOP;
return;
end;
$$ LANGUAGE plpgsql;
And I use it this way:
SELECT db_to_csv('/home/user/dir');
-- this will create one csv file per table, in /home/user/dir/

You can use this at the psql console:
\copy (SELECT foo,bar FROM whatever) TO '/tmp/file.csv' DELIMITER ',' CSV HEADER
Or from a bash console:
psql -P format=unaligned -P tuples_only -P fieldsep=\, -c "SELECT foo,bar FROM whatever" > output_file
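Note that the -P options above produce unaligned output but do not CSV-quote values; if your data can contain commas, quotes, or newlines, it may be safer to run \copy through -c instead. A minimal sketch, assuming a database named mydb:
psql -d mydb -c "\copy (SELECT foo,bar FROM whatever) TO '/tmp/file.csv' DELIMITER ',' CSV HEADER"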

Modified jlldoras' brilliant answer by adding one line to prevent the script from trying to copy views:
CREATE OR REPLACE FUNCTION db_to_csv(path TEXT) RETURNS void AS $$
declare
tables RECORD;
statement TEXT;
begin
FOR tables IN
SELECT (table_schema || '.' || table_name) AS schema_table
FROM information_schema.tables t INNER JOIN information_schema.schemata s
ON s.schema_name = t.table_schema
WHERE t.table_schema NOT IN ('pg_catalog', 'information_schema', 'configuration')
AND t.table_type NOT IN ('VIEW')
ORDER BY schema_table
LOOP
statement := 'COPY ' || tables.schema_table || ' TO ''' || path || '/' || tables.schema_table || '.csv' ||''' DELIMITER '';'' CSV HEADER';
EXECUTE statement;
END LOOP;
return;
end;
$$ LANGUAGE plpgsql;

If you want to specify the database and user while exporting, you can modify the answer given by Piotr as follows:
psql -U <USER> -d <DB_NAME> -P format=unaligned -P tuples_only -P fieldsep=\, -c "select * from tableName" > tableName_exp.csv

Do you want one big CSV file with data from all tables?
Probably not. You want separate files for each table, or one big file with more information than can be expressed in a CSV file header.
Separate files
Other answers show how to create separate files for each table. You can query the database for all tables with a query like this:
SELECT DISTINCT table_name
FROM information_schema.columns
WHERE table_schema='public'
AND position('_' in table_name) <> 1
ORDER BY 1
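You can then feed such a list to per-table \copy commands from bash; a sketch, assuming a database named mydb and /tmp as the output directory:
for t in $(psql -At -d mydb -c "SELECT table_name FROM information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE'"); do
  psql -d mydb -c "\copy $t TO '/tmp/$t.csv' CSV HEADER"   # one CSV per table
done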
One big file
One big file with all tables in the CSV format used by the PostgreSQL COPY command can be created with the pg_dump command. The output will also contain all the CREATE TABLE, CREATE FUNCTION, etc. statements, but with Python, Perl, or a similar language you can easily extract just the CSV data.
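For example, a minimal sketch (the database name mydb is an assumption):
pg_dump mydb > dump.sql
# the rows of each table sit between a "COPY ... FROM stdin;" line and a terminating "\."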

I downloaded a copy of RazorSQL, opened the database server, right-clicked on the database, and selected Export Tables; it gave me the option of CSV, Excel, SQL, etc.

Related

PostgreSQL Syntax error, can't find it

I have code which creates 6 templates, adds data to them, merges them, and exports the result. I can make it work by pressing F5 on the individual paragraphs, but I want to make the whole script run at once. Can someone help me? I am pretty new to this.
CREATE TEMP TABLE john1
(email VARCHAR(200));
COPY john1(email) from 'E:\WORK\FXJohn1.csv' DELIMITER ',' CSV HEADER;
CREATE TEMP TABLE john2
(email VARCHAR(200));
COPY john2(email) from 'E:\WORK\FXJohn2.csv' DELIMITER ',' CSV HEADER;
CREATE TEMP TABLE john3
(email VARCHAR(200));
COPY john3(email) from 'E:\WORK\FXJohn3.csv' DELIMITER ',' CSV HEADER;
CREATE TEMP TABLE john4
(email VARCHAR(200));
COPY john4(email) from 'E:\WORK\FXJohn4.csv' DELIMITER ',' CSV HEADER;
CREATE TEMP TABLE john5
(email VARCHAR(200));
COPY john5(email) from 'E:\WORK\FXJohn5.csv' DELIMITER ',' CSV HEADER;
COPY john6(email) from 'E:\WORK\FXJohn6.csv' DELIMITER ',' CSV HEADER;
CREATE TEMP TABLE john6
(email VARCHAR(200));
CREATE TABLE finished AS
(SELECT * FROM john1
UNION
SELECT * FROM john2
UNION
SELECT * FROM john3
UNION
SELECT * FROM john4
UNION
SELECT * FROM john5
UNION
SELECT * FROM john6);
DO $func$
BEGIN
EXECUTE $$
COPY public."finished" TO 'E:\$$ || to_char(CURRENT_DATE, 'YYYY_MM_DD') || $$.csv' DELIMITER ',' CSV HEADER;
$$;
END;
$func$ LANGUAGE plpgsql;
@Rupert
Sorry, but for some reason this script is not running for me. I get this error:
ERROR: syntax error at or near "for" LINE 1: for x in $(ls FXJohn1*.csv);
Am I changing the variables correctly?
for x in $(ls file_name*.csv);
do psql -c "copy table_name from
'/path/to/dir/$x' csv" db_name; done
I changed file_name to one of the .csv names in my folder, table_name to the name of the table I created, and /path/to/dir to E:\WORK (where all my csv files are).
Firstly, you can load multiple .csv files into the same table. So let's set that up first:
CREATE TABLE finished
(
email varchar(200)
);
Then you can load multiple files from the same folder using a simple bash script:
for x in $(ls file_name*.csv);
do psql -c "copy table_name from
'/path/to/dir/$x' csv" db_name; done
This saves you from doing multiple COPY commands and then the multiple UNIONs.
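With the file names from this question, the loop might look like this (a sketch: the database name mydb is an assumption, forward slashes are used so bash doesn't eat the backslashes, and header is added because the files contain header rows):
for x in $(ls FXJohn*.csv); do
  psql -c "copy finished from 'E:/WORK/$x' csv header" mydb
done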
Then you can run your script:
DO $func$
BEGIN
EXECUTE $$
COPY public."finished" TO 'E:\$$ || to_char(CURRENT_DATE,
'YYYY_MM_DD') || $$.csv'
DELIMITER ',' CSV HEADER;
$$;
END;
$func$ LANGUAGE plpgsql;

truncate function doesn't work in postgres

I have created the following function to truncate a bunch of tables whose names start with "irm_gtresult". There are no syntax errors in my function, but it doesn't truncate the tables when I run it. What could be wrong here?
My Postgres db version is 8.4.
create or replace function br()
RETURNS void
LANGUAGE plpgsql
AS
$$
DECLARE
row text;
BEGIN
FOR row IN
select table_name from information_schema.tables where table_name ILIKE 'irm_gtresult%'
LOOP
EXECUTE 'TRUNCATE TABLE ' || row;
END LOOP;
END;
$$;
Call:
select br();
Your code is valid. I tested it and it works for me in Postgres 9.4.
Using the outdated and unsupported version 8.4 (as you added) may be the problem. That version is just too old; consider upgrading to a current release.
However, I have a couple of suggestions:
Don't use the key word row as a variable name.
You don't need to loop; you can TRUNCATE all the tables in a single command. Faster, shorter.
You may need to add CASCADE if there are dependencies. Be aware of the effect.
CREATE OR REPLACE FUNCTION br()
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT 'TRUNCATE TABLE '
|| string_agg(format('%I.%I', schemaname, tablename), ',')
|| ' CASCADE'
FROM pg_tables t
WHERE tablename ILIKE 'irm_gtresult%'
AND schemaname = 'public'
-- AND tableowner = 'postgres' -- optionally restrict to one user
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT br();
I am using the view pg_tables from the system catalog. You can also use information_schema.tables like you did. Note the subtle differences:
How to check if a table exists in a given schema
Related answers with more explanation:
Can I truncate tables dynamically?
Truncating all tables in a Postgres database
To truncate a numeric value (as opposed to a table) in Postgres, you just have to use the TRUNC() function.
Example:
SELECT TRUNC(price, 0) AS truncated_price
FROM product;

postgresql - how to use a cursor or select statement to generate mulitple DML statements

New to postgres and I'm using Postgresql 9.3. Is there a way with postgresql to generate a file with multiple DML statements?
For example, I want to select table names where tablename is like '%_foo' and then rename all those tables to end in '_bar' instead. Do I need to do this in a cursor, or can I do it within a SELECT statement (like in Oracle)?
ALTER TABLE tst1_foo RENAME TO tst1_bar;
ALTER TABLE tst2_foo RENAME TO tst2_bar;
ALTER TABLE tst3_foo RENAME TO tst3_bar;
I'd like to print those out to a .sql file.
Please provide a basic example if possible. Thanks.
You can use psql and the pg_tables system view. Set the output to unaligned mode:
\a
Set the output to show only rows:
\t on
Send output to your file:
\o yourfile.sql
Run the query:
SELECT 'ALTER TABLE ' || tablename || ' RENAME TO ' ||
REGEXP_REPLACE ( tablename, '_foo$', '_bar' ) || ';'
FROM pg_tables
WHERE tablename LIKE '%_foo';
Close the file:
\o
and/or close psql:
\q
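The same session can be scripted; a minimal sketch, assuming a database named mydb:
psql -d mydb -At -o rename.sql <<'SQL'
SELECT 'ALTER TABLE ' || tablename || ' RENAME TO '
    || REGEXP_REPLACE(tablename, '_foo$', '_bar') || ';'
FROM pg_tables
WHERE tablename LIKE '%_foo';
SQL
psql -d mydb -f rename.sql   # apply the generated statements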

How to duplicate schemas in PostgreSQL

I have a database with schemas public and schema_a. I need to create a new schema schema_b with the same structure as schema_a.
I found the function below, the problem is that it does not copy the foreign key constraints.
CREATE OR REPLACE FUNCTION clone_schema(source_schema text, dest_schema text)
RETURNS void AS
$BODY$
DECLARE
object text;
buffer text;
default_ text;
column_ text;
BEGIN
EXECUTE 'CREATE SCHEMA ' || dest_schema ;
-- TODO: Find a way to make this sequence's owner is the correct table.
FOR object IN
SELECT sequence_name::text FROM information_schema.SEQUENCES WHERE sequence_schema = source_schema
LOOP
EXECUTE 'CREATE SEQUENCE ' || dest_schema || '.' || object;
END LOOP;
FOR object IN
SELECT table_name::text FROM information_schema.TABLES WHERE table_schema = source_schema
LOOP
buffer := dest_schema || '.' || object;
EXECUTE 'CREATE TABLE ' || buffer || ' (LIKE ' || source_schema || '.' || object || ' INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS)';
FOR column_, default_ IN
SELECT column_name::text, REPLACE(column_default::text, source_schema, dest_schema) FROM information_schema.COLUMNS WHERE table_schema = dest_schema AND table_name = object AND column_default LIKE 'nextval(%' || source_schema || '%::regclass)'
LOOP
EXECUTE 'ALTER TABLE ' || buffer || ' ALTER COLUMN ' || column_ || ' SET DEFAULT ' || default_;
END LOOP;
END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
How can I clone/copy schema_A with the foreign key constraints?
You can probably do it from the command line without using intermediate files:
pg_dump -U user --schema='fromschema' database | sed 's/fromschema/toschema/g' | psql -U user -d database
Note that this searches and replaces all occurrences of the string that is your schema name, so it may affect your data.
I would use pg_dump to dump the schema without data:
-s
--schema-only
Dump only the object definitions (schema), not data.
This option is the inverse of --data-only. It is similar to, but for historical reasons not identical to, specifying --section=pre-data --section=post-data.
(Do not confuse this with the --schema option, which uses the word "schema" in a different meaning.)
To exclude table data for only a subset of tables in the database, see --exclude-table-data.
pg_dump $DB -p $PORT -n $SCHEMA -s -f filename.pgsql
Then rename the schema in the dump (search & replace) and restore it with psql.
psql $DB -f filename.pgsql
Foreign key constraints referencing tables in other schemas are copied to point to the same schema.
References to tables within the same schema point to the respective tables within the copied schema.
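The rename step in the middle could be done with sed; a sketch, assuming the schema is renamed from ref_schema to new_schema (the \b word boundaries require GNU sed):
sed 's/\bref_schema\b/new_schema/g' filename.pgsql > renamed.pgsql
psql $DB -f renamed.pgsql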
I will share a solution for my problem, which was the same with a small addition: I needed to clone a schema, create a new database user, and assign ownership of all objects in the new schema to that user.
For the following example let's assume that the reference schema is called ref_schema and the target schema new_schema. The reference schema and all the objects within are owned by a user called ref_user.
1. Dump the reference schema with pg_dump:
pg_dump -n ref_schema -f dump.sql database_name
2. Create a new database user with the name new_user:
CREATE USER new_user;
3. Rename the schema ref_schema to new_schema:
ALTER SCHEMA ref_schema RENAME TO new_schema;
4. Change ownership of all objects in the renamed schema to the new user:
REASSIGN OWNED BY ref_user TO new_user;
5. Restore the original reference schema from the dump:
psql -f dump.sql database_name
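Strung together as a shell session, the five steps might look like this (a sketch using the same assumed names):
pg_dump -n ref_schema -f dump.sql database_name
psql database_name <<'SQL'
CREATE USER new_user;
ALTER SCHEMA ref_schema RENAME TO new_schema;
REASSIGN OWNED BY ref_user TO new_user;
SQL
psql -f dump.sql database_name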
I hope someone finds this helpful.
A bit late to the party, but some SQL here could help you along your way.
Get the schema's oid:
SELECT oid AS namespace_id
FROM pg_namespace
WHERE nspname = '<schema name>';
Get a table's oid:
SELECT oid AS table_id
FROM pg_class
WHERE relnamespace = <namespace_id> AND relname = '<table_name>';
Get its foreign key constraints:
SELECT con.conname, pg_catalog.pg_get_constraintdef(con.oid) AS condef
FROM pg_catalog.pg_constraint AS con
JOIN pg_class AS cl ON cl.relnamespace = con.connamespace AND cl.oid = con.conrelid
WHERE con.conrelid = '<table_id>'::pg_catalog.oid AND con.contype = 'f';
A good resource for the PostgreSQL system tables can be found here. Additionally, you can learn more about the internal queries pg_dump makes to gather dump information by viewing its source code.
Probably the easiest way to see how pg_dump gathers all your data would be to use strace on it, like so:
$ strace -f -e sendto -s8192 -o pg_dump.trace pg_dump -s -n <schema>
$ grep -oP '(SET|SELECT)\s.+(?=\\0)' pg_dump.trace
You'll still have to sort through the morass of statements, but it should help you piece together a cloning tool programmatically and avoid having to drop to a shell to invoke pg_dump.
I just ran into the same problem. Sometimes I miss remap_schema :)
The problem: none of the approaches above use the -Fc custom format, which is crucial for large schemas.
So I came up with something that uses it.
Pseudocode below; it should work.
It requires renaming the source schema for the duration of pg_dump, which, of course, might not be an option :(
Source:
pg_dump --section=pre-data in SQL format
psql: rename source to target
pg_dump -Fc --data-only
psql: rename back
pg_dump --section=post-data in SQL format
Target:
sed source_schema -> target_schema on the pre-data SQL | psql
pg_restore the -Fc dump
sed source_schema -> target_schema on the post-data SQL | psql
The sed step will usually include other manipulations as well (say, different user names between source and target), but it will be much faster because the data is not part of the file being edited.
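Spelled out as commands, the plan might look like this (a sketch; the database and schema names are placeholders):
# source side
pg_dump -d sourcedb -n source_schema --section=pre-data -f pre.sql
psql -d sourcedb -c 'ALTER SCHEMA source_schema RENAME TO target_schema'
pg_dump -d sourcedb -n target_schema -Fc --data-only -f data.dump
psql -d sourcedb -c 'ALTER SCHEMA target_schema RENAME TO source_schema'
pg_dump -d sourcedb -n source_schema --section=post-data -f post.sql
# target side
sed 's/source_schema/target_schema/g' pre.sql | psql -d targetdb
pg_restore -d targetdb data.dump
sed 's/source_schema/target_schema/g' post.sql | psql -d targetdb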

How to copy from CSV file to PostgreSQL table with headers in CSV file?

I want to copy a CSV file to a Postgres table. There are about 100 columns in this table, so I do not want to rewrite them if I don't have to.
I am using the \copy table from 'table.csv' delimiter ',' csv; command but without a table created I get ERROR: relation "table" does not exist. If I add a blank table I get no error, but nothing happens. I tried this command two or three times and there was no output or messages, but the table was not updated when I checked it through pgAdmin.
Is there a way to import a table with headers included like I am trying to do?
This worked. The first row had column names in it.
COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER
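If the file lives on the client machine rather than the server, the same line works through psql's \copy; a sketch, assuming a database named mydb:
psql -d mydb -c "\copy wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER"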
With the Python library pandas, you can easily create column names and infer data types from a csv file.
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)
The if_exists parameter can be set to replace or append to an existing table, e.g. df.to_sql('pandas_db', engine, if_exists='replace'). This works for additional input file types as well, docs here and here.
Alternative via the terminal, with no file permission
The PostgreSQL documentation, in the Notes section for COPY, says:
The path will be interpreted relative to the working directory of the server process (normally the cluster's data directory), not the client's working directory.
So, generally, using psql or any client, even against a local server, you will have problems. And if you're writing a COPY command for other users, e.g. in a GitHub README, the reader will have problems...
The only way to express a relative path with the client's permissions is by using STDIN:
When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.
as remembered here:
psql -h remotehost -d remote_mydb -U myuser -c \
"copy mytable (column1, column2) from STDIN with delimiter as ','" \
< ./relative_path/file.csv
I have been using this function for a while with no problems. You just need to provide the number of columns in the CSV file, and it will take the header names from the first row and create the table for you:
create or replace function data.load_csv_file
(
target_table text, -- name of the table that will be created
csv_file_path text,
col_count integer
)
returns void
as $$
declare
iter integer; -- dummy integer to iterate columns with
col text; -- to keep column names in each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet
begin
set schema 'data';
create table temp_table ();
-- add just enough number of columns
for iter in 1..col_count
loop
execute format ('alter table temp_table add column col_%s text;', iter);
end loop;
-- copy the data from csv file
execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);
iter := 1;
col_first := (select col_1
from temp_table
limit 1);
-- update the column names based on the first row which has the column names
for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
loop
execute format ('alter table temp_table rename column col_%s to %s', iter, col);
iter := iter + 1;
end loop;
-- delete the columns row // using quote_ident or %I does not work here!?
execute format ('delete from temp_table where %s = %L', col_first, col_first);
-- change the temp table name to the name given as parameter, if not blank
if length (target_table) > 0 then
execute format ('alter table temp_table rename to %I', target_table);
end if;
end;
$$ language plpgsql;
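A call might then look like this (a sketch; the database name, table name, and path are hypothetical):
psql -d mydb -c "SELECT data.load_csv_file('my_table', '/path/to/file.csv', 100);"
# the path must be readable by the server process, since the function uses server-side COPY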
## csv with header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV, header);"
## csv without header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV);"
## csv without header, specify column
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
-c "\COPY TB_NAME(COL1,COL2) FROM 'data_sample.csv' WITH (FORMAT CSV);"
All columns in the CSV must match the table's columns (or the specified column list).
More about COPY:
https://www.postgresql.org/docs/9.2/sql-copy.html
You can use d6tstack, which creates the table for you and is faster than pd.to_sql() because it uses native DB import commands. It supports Postgres as well as MySQL and MS SQL.
import d6tstack.utils
import pandas as pd
df = pd.read_csv('table.csv')
uri_psql = 'postgresql+psycopg2://usr:pwd@localhost/db'
d6tstack.utils.pd_to_psql(df, uri_psql, 'table')
It is also useful for importing multiple CSVs, handling data schema changes, and/or preprocessing with pandas (e.g. for dates) before writing to the db; see further down in the examples notebook:
import glob
import d6tstack.combine_csv
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
apply_after_read=apply_fun).to_psql_combine(uri_psql, 'table')  # apply_fun: a user-supplied preprocessing function