Backup complete postgis database with geometry transformation - postgresql

I have a PostGIS enabled PostgreSQL database that is no longer needed in production. I would like to back it up, but the geometry columns of PostGIS should be transformed to a simple, long-term stable text format like WKT.
I'm aware of the ST_AsText function.
SELECT road_id, ST_AsText(road_geom) AS geom, road_name FROM roads;
But how do I apply this to back up the complete database, which has several tables with geometries and many without?

That's the backup strategy I finally applied:
1. Normal dump of the PostGIS-Postgres database with pg_dump. Plain-text format to improve readability.
2. CSV backup of all tables
For this I created a copy of my initial database ("la_as_text") and transformed all the geometry columns to text with the following script (thanks @Jim Jones). It's very specific to my case, but I decided to post it like this anyway, just in case somebody runs into the same issues as I did with views and gist indexes that depend on geometry columns. The views won't work if you change the data type to text, and text columns are not valid for gist indexing either.
#!/bin/bash
# has to be run by a user who has owner permissions for the database
# remove leftovers from a previous run (-f: no error if the files do not exist)
rm -f delete_views.txt delete_views_mod.txt alter_script.txt alter_script_mod.txt
# delete all views - not necessary in long-term-backup and cause problems when altering geometry columns
echo "select 'drop view ' ||table_name|| ';' from information_schema.views where table_schema not in ('pg_catalog', 'information_schema') and table_name "'!'"~ '^pg_';" | psql la_as_text >> delete_views.txt
cat delete_views.txt | tail -n +3 | head -n -2 >> delete_views_mod.txt
cat delete_views_mod.txt | psql la_as_text
# delete one particular gist index that depends on a geometry column -- text can't be indexed with gist
echo "drop index idx_combined_points_the_geom;" | psql la_as_text
# change data type geometry (EWKB) to human readable text (WKT)
echo "select 'alter table '||table_schema||'.'||table_name||' alter column '||column_name||' type text USING ST_AsText('||column_name||');' from information_schema.columns where table_schema = 'public' and udt_name = 'geometry';" | psql la_as_text >> alter_script.txt
cat alter_script.txt | tail -n +3 | head -n -2 >> alter_script_mod.txt
cat alter_script_mod.txt | psql la_as_text
The resulting database lacks a lot of its initial functionality due to the missing geometry data type, but it's human readable.
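One thing to keep in mind with the script above: ST_AsText emits plain WKT, which does not carry the SRID. If the coordinate reference system should survive in the text column, ST_AsEWKT (which prefixes the value with SRID=...;) could be used instead. A minimal sketch, assuming a hypothetical helper named alter_to_ewkt that only prints the statement for one column:

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): print an ALTER TABLE
# statement that converts one geometry column to text. ST_AsEWKT is used
# instead of ST_AsText so that the SRID is kept inside the text value.
alter_to_ewkt() {
  local schema=$1 table=$2 column=$3
  printf 'ALTER TABLE %s.%s ALTER COLUMN %s TYPE text USING ST_AsEWKT(%s);\n' \
    "$schema" "$table" "$column" "$column"
}

# Example: the roads table from the question.
alter_to_ewkt public roads road_geom
```

The printed statement can be piped to psql la_as_text in the same way as the generated script above.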
Instead of a normal dump I exported all the individual tables as ';'-separated text files with the following script:
#!/bin/bash
SCHEMA="public"
DB="la_as_text"
psql -Atc "select tablename from pg_tables where schemaname='$SCHEMA'" $DB |\
while read TBL; do
psql -c "copy $SCHEMA.$TBL to stdout with csv delimiter ';'" $DB > $TBL.csv
done
I'm pretty confident that this backup can be reconstructed in the future -- if necessary.
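For the reconstruction, a sketch of how one of these files could be loaded back. It assumes the target table already exists (for example created from the plain-text dump, with the geometry columns declared as text) and that the file is named after the table, as in the export script. The helper name restore_cmd and the database name la_restored are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the psql \copy command that loads one
# ';'-separated file back into an existing table of the same name.
# The export above wrote no header line, so none is expected here.
restore_cmd() {
  local schema=$1 file=$2
  local table
  table=$(basename "$file" .csv)
  printf '%s\n' "\\copy $schema.$table from '$file' with csv delimiter ';'"
}

# Example: command for roads.csv in the current directory.
restore_cmd public roads.csv
```

Usage would be along the lines of restore_cmd public roads.csv | psql la_restored, run from the directory that holds the CSV files.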

Related

What is the difference between => and ->?

The command prompt sometimes switches between => and -> when using the interactive terminal psql. It's not clear to me what this indicates.
For example,
$ psql postgres
psql (9.5.10)
Type "help" for help.
postgres=> /h
postgres->
When you see the ->, it is letting the user know that the current line is a continuation of an incomplete statement. The previous command was not properly ended. In other words, it is waiting for you to end the statement with a ;. See this example of a query broken up into three different lines. It doesn't run my query until I end the statement with a semi-colon.
test=> select * from
test-> pg_catalog.pg_tables
test-> where tablename='test';
schemaname | tablename | tableowner | tablespace | hasindexes
------------+-----------+------------+------------+-----------
(0 rows)

psql non-select: how to remove formatting and show only certain columns?

I'm looking to remove all line drawing characters from:
PGPASSWORD="..." psql -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
node_id | hostname | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
0 | 10.20.30.40 | 5432 | 2 | 0.500000 | primary
1 | 10.20.30.41 | 5432 | 2 | 0.500000 | standby
(2 rows)
Adding the -t option gets rid of the header and footer, but the vertical bars are still present:
PGPASSWORD="..." psql -t -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
0 | 10.20.30.40 | 5432 | 2 | 0.500000 | primary
1 | 10.20.30.41 | 5432 | 2 | 0.500000 | standby
Note that this question is specific to show pool_nodes and other similar non-select SQL statements.
My present workaround is to involve the Linux cut command:
<previous command> | cut -d '|' -f 4
The question has two parts:
How using psql only can the vertical bars above be removed?
How using psql only can only a specific column (for example, status) or columns be shown? For example, the result might be just two lines, each showing the number 2.
I'm using psql version psql (PostgreSQL) 9.2.18 on a CentOS 7 server.
For scripting psql use psql -qAtX:
-q quiet
-A unaligned output
-t tuples-only
-X do not read .psqlrc
To filter columns you must name them in the SELECT list. psql always outputs the full result set it gets from the server. E.g. SELECT status FROM pool_nodes.
Or you can cut to extract ordinal column numbers e.g.
psql -qAtX -c 'whatever' | cut -d '|' -f 1,2-4
(I have no idea how show pool_nodes can produce the output you show here, since SHOW returns a single scalar value...)
To change the delimiter from a pipe | to something else, use -F e.g. -F ','. But be warned, the delimiter is not escaped when it appears in output, this isn't CSV. You might want to consider a tab as a useful option; you have to enter a quoted literal tab to do this. (If doing it in an interactive shell, search for "how to enter literal tab in bash" when you get stuck).
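In bash, one way to get the literal tab is ANSI-C quoting, $'\t'. The sketch below simulates tab-separated rows with printf (the values are the ones from the question) so that it runs without a server; with a real connection the rows would come from psql -qAtX -F "$TAB" -c '...' instead:

```shell
#!/bin/bash
# $'\t' is bash ANSI-C quoting for a literal tab character.
TAB=$'\t'

# Simulated tab-separated rows, standing in for real psql output.
rows=$(printf '0\t10.20.30.40\t5432\t2\n1\t10.20.30.41\t5432\t2\n')

# cut splits on tabs by default, so no -d option is needed for field 4.
printf '%s\n' "$rows" | cut -f 4
```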
Example showing all the above, given dummy data:
CREATE TABLE dummy_table (
a integer,
b integer,
c text,
d text
);
INSERT INTO dummy_table
VALUES
(1,1,'chicken','turkey'),
(2,2,'goat','cow'),
(3,3,'mantis','cricket');
query, with single space as the column delimiter (so you'd better not have spaces in your data!):
psql -qAtX -F ' ' -c 'SELECT a, b, d FROM dummy_table'
If for some reason you cannot generate a column-list for SELECT you can instead filter by column-ordinal with cut:
psql -qAtX -F '^' -c 'TABLE dummy_table' | cut -d '^' -f 1-2,4
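Applied to the pool_nodes output from the question, the same cut approach might look like the sketch below. The psql call is replaced by a function that prints what the unaligned, tuples-only output would presumably look like, so the example runs without a pgpool server:

```shell
#!/bin/bash
# Stand-in for: psql -qAtX -c 'show pool_nodes'
# (unaligned rows separated by '|'; values taken from the question)
pool_nodes() {
  printf '%s\n' \
    '0|10.20.30.40|5432|2|0.500000|primary' \
    '1|10.20.30.41|5432|2|0.500000|standby'
}

# Field 4 is the status column.
pool_nodes | cut -d '|' -f 4
```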

PostgreSQL - copy first 5 rows of all tables and output to a file

I need to grab the first 5 rows of every table in PostgreSQL and output them to my computer in .csv and (preferably) .sql. There are 275 total tables.
Is this possible to do via CLI in a single scripted command?
So far I'm able to copy a single table at a time, but it's taking forever.
\COPY (SELECT * from table-name limit 5) TO '/vagrant/testexport.csv' DELIMITER ',' CSV HEADER;
bash file:
tables=$(psql -d a -tXa -c "COPY(select concat(schemaname,'.',tablename) as tables from pg_tables) to '/tmp/tlist'")
for i in $(cat /tmp/tlist); do
psql -d a -tXa -c "\COPY (SELECT * from $i limit 5) TO '/tmp/$i.csv' DELIMITER ',' CSV HEADER;";
done
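Note that the server-side COPY ... to '/tmp/tlist' above writes the list on the database server and needs the corresponding rights. A variant that stays on the client could read the table names straight from psql and generate one \copy command per table. A sketch, assuming plain lowercase table names that need no quoting; sample_cmd is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: print a \copy command that writes the first rows
# of one table to a CSV file named after the table (default: 5 rows).
sample_cmd() {
  local table=$1 outdir=$2 limit=${3:-5}
  printf '%s\n' "\\copy (SELECT * FROM $table LIMIT $limit) TO '$outdir/$table.csv' CSV HEADER"
}

# Sketch of the loop (needs a reachable database, called "a" as above):
#   psql -d a -qAtX -c "select schemaname||'.'||tablename from pg_tables where schemaname = 'public'" |
#   while read -r t; do sample_cmd "$t" /tmp | psql -d a -qX; done

# Example: the command generated for one table.
sample_cmd public.roads /tmp
```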

Export to CSV and Compress with GZIP in postgres

I need to export a big table to csv file and compress it.
I can export it using COPY command from postgres like -
COPY foo_table to '/tmp/foo_table.csv' delimiters',' CSV HEADER;
And then can compress it using gzip like -
gzip -c foo_table.csv > foo.gz
The problem with this approach is, I need to create this intermediate csv file, which itself is huge, before I get my final compressed file.
Is there a way of exporting the table to CSV and compressing the file in one step?
Regards,
Sujit
The trick is to make COPY send its output to stdout, then pipe the output through gzip:
psql -c "COPY foo_table TO stdout DELIMITER ',' CSV HEADER" \
| gzip > foo_table.csv.gz
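To check the pipe without a database, the psql call can be replaced by a stand-in that prints a few CSV lines. The sketch below does that (fake_copy and its rows are made up) and then verifies the archive and reads it back:

```shell
#!/bin/bash
# Stand-in for: psql -c "COPY foo_table TO stdout DELIMITER ',' CSV HEADER"
fake_copy() {
  printf 'id,name\n1,alpha\n2,beta\n'
}

out=$(mktemp)

# Stream straight into gzip; no uncompressed file is written to disk.
fake_copy | gzip > "$out"

# Check the archive's integrity, then print its contents without unpacking.
gzip -t "$out" && gunzip -c "$out"
```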
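You can also let the server compress the output directly with COPY ... TO PROGRAM, as per the docs (https://www.postgresql.org/docs/9.4/sql-copy.html). Note that the program runs on the database server, so the file ends up there, and the command needs superuser (or equivalent) rights:
COPY foo_table to PROGRAM 'gzip > /tmp/foo_table.csv.gz' delimiter ',' CSV HEADER;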
Expanding a bit on @Joey's answer, the command below adds support for a couple more features available in the manual.
psql -c "COPY \"Foo_table\" (column1, column2) TO stdout DELIMITER ',' CSV HEADER" \
| gzip > foo_table.csv.gz
If you have capital letters in your table name (woe be unto you), you need the \" before and after the table name.
The second thing I've added is column listing.
Also note from the docs:
This operation is not as efficient as the SQL COPY command because all data must pass through the client/server connection. For large amounts of data the SQL command might be preferable.
PostgreSQL 13.4
The psql command \copy also works combined with SELECT column_1, column_2, ... and with a timestamp from date +"%Y-%m-%d_%H%M%S" in the file name of the dump. The whole \copy command has to be on a single line, because psql takes the rest of the line as its arguments:
\copy (SELECT id, column_1, column_2, ... FROM foo_table) TO PROGRAM 'gzip > ~/Downloads/foo_table_dump_`date +"%Y-%m-%d_%H%M%S"`.csv.gz' DELIMITER ',' CSV HEADER ;

Copy permissions from another table

Is it possible to copy the user permissions from one table in a PostgreSQL database to another table? Is it just a matter of updating the pg_class.relacl column value for the target table to the value for the source table, as in:
UPDATE pg_class
SET relacl=(SELECT relacl FROM pg_class WHERE relname='source_table')
WHERE relname='target_table';
This seems to work, but am I missing anything else that may need to be done or other 'gotchas' with this method?
If you can use command-line instead of SQL then a safer approach would be to use pg_dump:
pg_dump dbname -t oldtablename -s \
| grep -E '^(GRANT|REVOKE)' \
| sed 's/oldtablename/newtablename/' \
| psql dbname
I assume a unix server. On Windows I'd use pg_dump -s to a file, manually edit it and then import it to a database.
Maybe you'll also need to copy permissions to sequences owned by this table - pg_dump will work.
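To see what the filter keeps before pointing it at a real database, the pipeline can be tried on a few sample lines. In the sketch below the pg_dump call is replaced by a function that prints made-up dump lines (the role name reader is hypothetical):

```shell
#!/bin/bash
# Stand-in for: pg_dump dbname -t oldtablename -s
sample_dump() {
  cat <<'EOF'
CREATE TABLE oldtablename (id integer);
REVOKE ALL ON TABLE oldtablename FROM PUBLIC;
GRANT SELECT ON TABLE oldtablename TO reader;
EOF
}

# Keep only the GRANT/REVOKE lines and rename the table in them.
sample_dump | grep -E '^(GRANT|REVOKE)' | sed 's/oldtablename/newtablename/'
```

The sed pattern replaces the first occurrence of the old name anywhere on the line, so it is worth checking the output by eye before piping it to psql.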
The pg_dump approach is nice and simple, however, it doesn't work with tables in other schemas, as the output doesn't qualify the table with schema name. Instead it generates:
SET search_path = foo, pg_catalog;
...
GRANT SELECT ON foo_table to foo_user;
and will fail to grant privileges to a nonexistent public.foo_table relation.
Also, if you have relations with the same name in different schemas, you need to ensure that you only rename the table in the specified schema. I began to hack a bash script based on the above to take care of this, but it started to become a bit unwieldy, so I switched to perl.
Usage: transfer-acl old-qualified-relation=new-qualified-relation
e.g. transfer-acl foo.foo_table=foo.bar_table will apply the grants on foo.foo_table to the foo.bar_table. I didn't implement any REVOKE rewriting because I wasn't able to get a dump to emit any.
#! /usr/bin/perl
use strict;
use warnings;
# map each old qualified name to its new qualified name (old=new arguments)
my %rename = map {(split '=')} @ARGV;
open my $dump, '-|', qw(pg_dump customer -s), map {('-t', $_)} keys %rename
or die "Cannot open pipe from pg_dump: $!\n";
my $schema = 'public';
while (<$dump>) {
if (/^SET search_path = (\w+)/) {
$schema = $1;
}
elsif (/^(GRANT .*? ON TABLE )(\w+)( TO (?:[^;]+);)$/) {
my $fq_table = "$schema." . $2; # fully-qualified schema.table
print "$1$rename{$fq_table}$3\n" if exists $rename{$fq_table};
}
}
Pipe the results of this to psql and you're set.